Increasing AI Intelligence

=Reviews=

* 2024-12: [https://arxiv.org/abs/2412.11936 A Survey of Mathematical Reasoning in the Era of Multimodal Large Language Model: Benchmark, Method & Challenges]
* Links to papers: [https://github.com/hijkzzz/Awesome-LLM-Strawberry Awesome LLM Strawberry (OpenAI o1)]

=Prompt Engineering=

=Fine Tuning=

=Proactive Search=

Compute expended after training, but before inference.

==Training Data (Data Refinement, Synthetic Data)==

==Generate consistent plans/thoughts==

* 2024-08: Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers (code)
** (Microsoft) rStar is a self-play mutual reasoning approach: a small model augments MCTS using a set of defined reasoning heuristics, and mutually consistent trajectories are emphasized (see the sketch after this list).
* 2024-09: Self-Harmonized Chain of Thought
** Produces refined chain-of-thought solutions/prompts for diverse problems. Given a large set of problems/questions, first aggregate them semantically, then apply zero-shot chain-of-thought to each problem. Then cross-pollinate between proposed solutions to similar problems, looking for refined and generalized solutions.
* 2024-11: LLMs Do Not Think Step-by-step In Implicit Reasoning
** They argue that models trained to reproduce CoT outputs do not internally perform stepwise reasoning (with intermediate representations); this suggests that explicit CoT could be superior to implicit CoT.
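
The shared mutual-consistency idea (sample many reasoning trajectories, prefer the answer that the most trajectories agree on) can be illustrated with a minimal sketch. Here <code>sample_reasoning</code> is a hypothetical placeholder for an LLM call, not an API from any of the papers above:

<syntaxhighlight lang="python">
# Minimal sketch of consistency-based answer selection (cf. self-consistency
# and rStar-style mutual reasoning). `sample_reasoning` is a hypothetical
# placeholder for an LLM call.
from collections import Counter

def sample_reasoning(question: str) -> tuple[str, str]:
    """Hypothetical LLM call returning (reasoning_trace, final_answer)."""
    raise NotImplementedError("replace with a real model call")

def most_consistent_answer(question: str, n_samples: int = 16) -> str:
    """Sample several reasoning trajectories and return the final answer
    that the largest number of trajectories agree on."""
    answers = [sample_reasoning(question)[1].strip() for _ in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]
</syntaxhighlight>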
 
==Sampling==

==Automated prompt generation==

==Distill inference-time-compute into model==

==CoT reasoning model==

* 2024-12: [https://arxiv.org/abs/2412.18319 Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search]
* 2024-12: [https://arxiv.org/abs/2412.14135 Scaling of Search and Learning: A Roadmap to Reproduce o1 from Reinforcement Learning Perspective]
* 2025-01: [https://arxiv.org/abs/2501.01904 Virgo: A Preliminary Exploration on Reproducing o1-like MLLM]
  
 
===Scaling===

=Inference Time Compute=

==Methods==

===Review===

===In context learning (ICL), search, and other inference-time methods===

===Inference-time Sampling===

===Inference-time Gradient===

===Self-prompting===

===Retrieval or Memory===

===In-context thought===

===Naive multi-LLM (verification, majority voting, best-of-N, etc.)===
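
The "naive" end of this spectrum fits in a few lines. A sketch of best-of-N with a verifier follows; <code>generate</code> and <code>score</code> are hypothetical placeholders for an LLM sampler and a verifier/reward model:

<syntaxhighlight lang="python">
# Sketch of best-of-N selection. `generate` and `score` are hypothetical
# placeholders (an LLM sampler and a verifier/reward model, respectively).
def generate(prompt: str) -> str:
    raise NotImplementedError("replace with an LLM sampling call")

def score(prompt: str, candidate: str) -> float:
    raise NotImplementedError("replace with a verifier/reward model")

def best_of_n(prompt: str, n: int = 8) -> str:
    """Sample n candidate answers; keep the one the verifier rates highest."""
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda c: score(prompt, c))
</syntaxhighlight>

Majority voting replaces the verifier with agreement counting over the candidates, and simple verification corresponds to n=1 plus an accept/reject check.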
 
===Multi-LLM (multiple comparisons, branching, etc.)===

===Iteration (e.g. neural-like layered blocks)===

===Iterative reasoning via graphs===

===Monte Carlo Tree Search (MCTS)===
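
A compact UCT-style sketch of MCTS over reasoning steps, where a state is a partial solution (the prompt plus reasoning steps so far); <code>propose_steps</code> and <code>evaluate</code> are hypothetical placeholders for LLM calls:

<syntaxhighlight lang="python">
# Compact UCT-style MCTS over reasoning steps. `propose_steps` and `evaluate`
# are hypothetical placeholders for LLM calls (step proposer and verifier).
import math
import random

class Node:
    def __init__(self, state: str, parent=None):
        self.state, self.parent = state, parent
        self.children: list["Node"] = []
        self.visits, self.value = 0, 0.0

def propose_steps(state: str) -> list[str]:
    """Hypothetical: ask an LLM for candidate next reasoning steps."""
    raise NotImplementedError

def evaluate(state: str) -> float:
    """Hypothetical: rollout/verifier score in [0, 1] for a partial solution."""
    raise NotImplementedError

def uct_select(node: Node, c: float = 1.4) -> Node:
    # Standard UCT: exploit high average value, explore rarely-visited children.
    return max(node.children,
               key=lambda ch: ch.value / (ch.visits + 1e-9)
               + c * math.sqrt(math.log(node.visits + 1) / (ch.visits + 1e-9)))

def mcts(root_state: str, iterations: int = 100) -> str:
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        while node.children:                          # selection
            node = uct_select(node)
        for step in propose_steps(node.state):        # expansion
            node.children.append(Node(node.state + "\n" + step, parent=node))
        leaf = random.choice(node.children) if node.children else node
        reward = evaluate(leaf.state)                 # simulation / scoring
        while leaf is not None:                       # backpropagation
            leaf.visits += 1
            leaf.value += reward
            leaf = leaf.parent
    return max(root.children, key=lambda ch: ch.visits).state  # most-visited
</syntaxhighlight>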
 
===Other Search===

===Chain-of-Thought Reasoning===

==Analysis==

==Scaling==

===Theory===

===Expending compute works===

[[Image:Compute.png|600px]]

* 2024-09-16: [https://www.oneusefulthing.org/p/scaling-the-state-of-play-in-ai Scaling: The State of Play in AI]

===Pitfalls===

* 2024-12: [https://arxiv.org/abs/2412.21187 Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs]
  
 
==Pragmatics==

==Code for Inference-time Compute==

* optillm: Inference proxy that implements state-of-the-art techniques to improve the accuracy and performance of LLMs (improving reasoning on coding, logical, and mathematical queries)

==Memory==

=Tool Use=

* 2024-11: [https://arxiv.org/abs/2411.01747 DynaSaur: Large Language Agents Beyond Predefined Actions]: writes functions/code to increase its capabilities; the general pattern is sketched below.
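
The underlying pattern (an agent that grows its own action space by writing, registering, and reusing code) can be sketched as follows. This is not the paper's implementation; <code>llm_write_function</code> is a hypothetical placeholder:

<syntaxhighlight lang="python">
# Sketch of an agent extending its own action space by writing Python
# functions (the general pattern behind DynaSaur-style agents; not the
# paper's implementation). `llm_write_function` is a hypothetical LLM call.
from typing import Callable

ACTIONS: dict[str, Callable] = {}  # library of previously synthesized actions

def llm_write_function(task: str) -> str:
    """Hypothetical: ask an LLM for Python source defining `action(*args)`."""
    raise NotImplementedError

def acquire_action(task: str) -> Callable:
    """Generate, compile, and register a new action for an unseen task."""
    source = llm_write_function(task)
    namespace: dict = {}
    exec(source, namespace)  # CAUTION: sandbox untrusted generated code
    ACTIONS[task] = namespace["action"]
    return ACTIONS[task]

def act(task: str, *args):
    """Reuse a stored action if one exists; otherwise synthesize it first."""
    action = ACTIONS.get(task) or acquire_action(task)
    return action(*args)
</syntaxhighlight>
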
==Integrated==

* 2018-08: [https://arxiv.org/abs/1808.00508 Neural Arithmetic Logic Units] (forward pass sketched after this list)
* 2023-01: [https://arxiv.org/abs/2301.05062 Tracr: Compiled Transformers as a Laboratory for Interpretability] ([https://github.com/google-deepmind/tracr code])
* 2024-05: [https://openreview.net/pdf?id=W77TygnBN5 Augmenting Language Models with Composable Differentiable Libraries] ([https://openreview.net/pdf/0ab6ab86a6adf52751f35b725056d5011ecc575d.pdf pdf])
* 2024-07: [https://arxiv.org/abs/2407.04899 Algorithmic Language Models with Neurally Compiled Libraries]
* 2024-10: [https://arxiv.org/abs/2410.18077 ALTA: Compiler-Based Analysis of Transformers]
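
As a concrete example of integrating arithmetic into a network, the NALU forward pass condenses to a few lines of NumPy. This is a rendering of the 2018 paper's equations, not the authors' code; the parameter matrices W_hat, M_hat, and G would be learned:

<syntaxhighlight lang="python">
# NALU forward pass (Trask et al., 2018), rendered in plain NumPy.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def nalu_forward(x, W_hat, M_hat, G, eps=1e-7):
    """x: input vector; W_hat, M_hat, G: learned parameter matrices."""
    W = np.tanh(W_hat) * sigmoid(M_hat)       # weights biased toward {-1, 0, 1}
    a = W @ x                                 # additive (NAC) path: +/-
    m = np.exp(W @ np.log(np.abs(x) + eps))   # multiplicative path via log-space
    g = sigmoid(G @ x)                        # learned gate between the paths
    return g * a + (1.0 - g) * m
</syntaxhighlight>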
  
 
=Multi-agent Effort (and Emergent Intelligence)=

=ML-like Optimization of LLM Setup=
 
* 2024-06: [https://arxiv.org/abs/2406.07496 TextGrad: Automatic "Differentiation" via Text] (gradient backpropagation through text; the idea is sketched after this list)
* 2024-06: [https://arxiv.org/abs/2406.18532 Symbolic Learning Enables Self-Evolving Agents] (optimize LLM frameworks)
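
The "textual gradient" idea can be caricatured as a critique-and-revise loop: a critic LLM produces natural-language feedback (the "gradient"), and an update step rewrites the candidate using that feedback. A minimal sketch, not TextGrad's actual API (<code>llm</code> is a hypothetical model call):

<syntaxhighlight lang="python">
# Minimal sketch of textual "gradient descent" (cf. TextGrad). `llm` is a
# hypothetical model call; this is not TextGrad's API.
def llm(prompt: str) -> str:
    raise NotImplementedError("replace with a real model call")

def textual_gradient_descent(task: str, candidate: str, steps: int = 5) -> str:
    for _ in range(steps):
        feedback = llm(  # "backward pass": critique w.r.t. the objective
            f"Task: {task}\nCandidate: {candidate}\n"
            "Critique this candidate and state concretely how to improve it.")
        candidate = llm(  # "update step": apply the textual gradient
            f"Task: {task}\nCandidate: {candidate}\nFeedback: {feedback}\n"
            "Rewrite the candidate, applying the feedback.")
    return candidate
</syntaxhighlight>
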
=See Also=

* [[AI]]
* [[AI Agents]]
