Increasing AI Intelligence

From GISAXS
 
=Reviews=

* 2025-02: [https://arxiv.org/abs/2502.03671 Advancing Reasoning in Large Language Models: Promising Methods and Approaches]
* 2025-02: [https://arxiv.org/abs/2502.09100 Logical Reasoning in Large Language Models: A Survey]
* 2025-02: [https://arxiv.org/abs/2502.21321 LLM Post-Training: A Deep Dive into Reasoning Large Language Models]
* Links to papers: [https://github.com/hijkzzz/Awesome-LLM-Strawberry Awesome LLM Strawberry (OpenAI o1)]
==Automatic Prompt Optimization==

* 2023-09: [https://arxiv.org/abs/2309.16797 Promptbreeder: Self-Referential Self-Improvement Via Prompt Evolution]
* 2025-02: [https://arxiv.org/abs/2502.16923 A Systematic Survey of Automatic Prompt Optimization Techniques]
* 2025-02: [https://arxiv.org/abs/2502.18746 Automatic Prompt Optimization via Heuristic Search: A Survey]
 
=Fine Tuning=
 
===Training Data (Data Refinement, Synthetic Data)===

* 2025-01: [https://arxiv.org/abs/2501.18845 Text Data Augmentation for Large Language Models: A Comprehensive Survey of Methods, Challenges, and Opportunities]
* 2025-02: [https://arxiv.org/abs/2502.01718 ACECODER: Acing Coder RL via Automated Test-Case Synthesis]
* 2025-02: [https://arxiv.org/abs/2502.15588 Improving the Scaling Laws of Synthetic Data with Deliberate Practice]
* Updating list of links: [https://github.com/wasiahmad/Awesome-LLM-Synthetic-Data Synthetic Data of LLMs, by LLMs, for LLMs]

====Re-captioning====

* 2023-10: [https://arxiv.org/abs/2310.16656 A Picture is Worth a Thousand Words: Principled Recaptioning Improves Image Generation]
* 2024-07: [https://arxiv.org/abs/2407.06723 Graph-Based Captioning: Enhancing Visual Descriptions by Interconnecting Region Captions]
  
 
===Generate consistent plans/thoughts===
 
* 2025-02: [https://arxiv.org/abs/2502.03373 Demystifying Long Chain-of-Thought Reasoning in LLMs]
* 2025-02: [https://arxiv.org/abs/2502.05171 Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach] ([https://huggingface.co/tomg-group-umd/huginn-0125 Huginn-0125])
* 2025-02: [https://arxiv.org/abs/2502.20339 Thinking Slow, Fast: Scaling Inference Compute with Distilled Reasoners]
  
 
===Scaling===
 
* 2024-08: [https://arxiv.org/abs/2408.16737 Smaller, Weaker, Yet Better: Training LLM Reasoners via Compute-Optimal Sampling] (Google DeepMind)
* 2024-11: [https://arxiv.org/abs/2411.04434 Scaling Laws for Pre-training Agents and World Models]
* 2025-02: [https://arxiv.org/abs/2502.20339 Thinking Slow, Fast: Scaling Inference Compute with Distilled Reasoners]
  
 
=Inference Time Compute=
 
====Usage of Reasoning Compute====

* 2025-02: [https://arxiv.org/abs/2502.04463 Training Language Models to Reason Efficiently]
* 2025-02: [https://arxiv.org/abs/2502.08235 The Danger of Overthinking: Examining the Reasoning-Action Dilemma in Agentic Tasks]
* 2025-03: [https://arxiv.org/abs/2503.01141 How Well do LLMs Compress Their Own Chain-of-Thought? A Token Complexity Approach]
  
 
====Usage of Training Data====
 
=ML-like Optimization of LLM Setup=

* 2023-10: [https://arxiv.org/abs/2310.03714 DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines] ([https://github.com/stanfordnlp/dspy code]: Programming—not prompting—Foundation Models)
* 2023-05: [https://arxiv.org/abs/2305.03495 Automatic Prompt Optimization with "Gradient Descent" and Beam Search]
* 2024-06: [https://arxiv.org/abs/2406.07496 TextGrad: Automatic "Differentiation" via Text] (gradient backpropagation through text, analogous to gradient descent)
* 2024-06: [https://arxiv.org/abs/2406.18532 Symbolic Learning Enables Self-Evolving Agents] (optimize LLM frameworks)
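The search loop behind methods like Automatic Prompt Optimization can be sketched as a simple beam search over prompt variants. This is a toy illustration, not the paper's method: <code>toy_mutate</code> and <code>toy_score</code> are stand-ins for the LLM-written critiques ("textual gradients") and dev-set metric that the actual approach uses.

```python
def optimize_prompt(seed_prompt, mutate, score, beam_width=3, rounds=4):
    """Beam search over prompt variants.

    mutate(prompt) -> list of candidate rewrites (in APO these come from
    LLM-written critiques; here, any user-supplied function).
    score(prompt)  -> float, e.g. accuracy on a small dev set.
    """
    beam = [seed_prompt]
    for _ in range(rounds):
        candidates = set(beam)
        for p in beam:
            candidates.update(mutate(p))
        # Keep only the top-scoring prompts for the next round.
        beam = sorted(candidates, key=score, reverse=True)[:beam_width]
    return beam[0]

# Toy stand-ins (assumptions for illustration, not from the paper):
KEYWORDS = ("step by step", "check your answer")

def toy_score(prompt):
    # Reward prompts that contain the desired instructions.
    return sum(kw in prompt.lower() for kw in KEYWORDS)

def toy_mutate(prompt):
    # Deterministic "edits": append one candidate instruction.
    return [prompt + " Think step by step.",
            prompt + " Check your answer."]
```

Calling <code>optimize_prompt("Solve the problem.", toy_mutate, toy_score)</code> returns, after a few rounds, a prompt containing both target instructions.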
  

Latest revision as of 11:45, 5 March 2025

Reviews

Prompt Engineering

Thought Templates

Automatic Prompt Optimization

Fine Tuning

Proactive Search

Compute expended after training, but before inference.

Training Data (Data Refinement, Synthetic Data)

Re-captioning

Generate consistent plans/thoughts

  • 2024-08: Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers (code)
    • (Microsoft) rStar is a self-play mutual reasoning approach: a small model augments MCTS with a set of defined reasoning heuristics, and mutually consistent trajectories are emphasized.
  • 2024-09: Self-Harmonized Chain of Thought
    • Produce refined chain-of-thought solutions/prompts for diverse problems: given a large set of problems/questions, first aggregate them semantically, then apply zero-shot chain-of-thought to each problem; finally, cross-pollinate between proposed solutions to similar problems, seeking refined and generalized solutions.
  • 2024-11: LLMs Do Not Think Step-by-step In Implicit Reasoning
    • They argue that models trained to reproduce CoT outputs do not, internally, perform stepwise reasoning (with intermediate representations); this suggests that explicit CoT could be superior to implicit CoT.
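The aggregation step described above (for Self-Harmonized CoT) can be sketched as follows. This is a hedged toy version: word-overlap (Jaccard) similarity and a greedy threshold stand in for the semantic clustering the paper actually uses, and the threshold value is an arbitrary assumption.

```python
def jaccard(a, b):
    """Crude word-overlap similarity between two questions."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb)

def cluster_questions(questions, threshold=0.3):
    """Greedily group semantically similar questions: each question joins
    the first cluster whose representative (first member) is similar
    enough; otherwise it starts a new cluster."""
    clusters = []
    for q in questions:
        for cluster in clusters:
            if jaccard(q, cluster[0]) >= threshold:
                cluster.append(q)
                break
        else:
            clusters.append([q])
    return clusters
```

Zero-shot chain-of-thought would then be applied to each cluster, with solutions cross-pollinated between similar clusters.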

Sampling

Automated prompt generation

Distill inference-time-compute into model

CoT reasoning model

See also: AI tools > LLM > Open-weights LLM > Reasoning

Scaling

Inference Time Compute

Methods

Review

In context learning (ICL), search, and other inference-time methods

Inference-time Sampling

Inference-time Gradient

Self-prompting

Retrieval or Memory

In-context thought

Naive multi-LLM (verification, majority voting, best-of-N, etc.)
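As a minimal sketch of these naive multi-LLM strategies (assuming candidate answers have already been sampled from one or more models), majority voting and best-of-N selection reduce to:

```python
from collections import Counter

def majority_vote(answers):
    """Self-consistency: return the most common of N sampled answers."""
    return Counter(answers).most_common(1)[0][0]

def best_of_n(candidates, verifier):
    """Best-of-N: keep the candidate that a verifier (any scoring
    callable, e.g. a reward model) rates highest."""
    return max(candidates, key=verifier)
```

In practice the sampling step dominates the cost; the verifier in best-of-N can be anything from a unit-test harness to a learned reward model.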

Multi-LLM (multiple comparisons, branching, etc.)

Iteration (e.g. neural-like layered blocks)

Iterative reasoning via graphs

Monte Carlo Tree Search (MCTS)
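A minimal UCT-style sketch of how MCTS can steer multi-step generation. The action set and reward function here are toy assumptions standing in for LLM-proposed reasoning steps and a verifier score, respectively.

```python
import math
import random

class Node:
    def __init__(self, state, parent=None):
        self.state = state      # partial solution, e.g. tuple of steps
        self.parent = parent
        self.children = {}      # action -> Node
        self.visits = 0
        self.value = 0.0        # sum of rollout rewards

def mcts(actions, reward, depth, iters=500, c=1.4, seed=0):
    """Toy UCT search; 'actions' stand in for candidate next reasoning
    steps and 'reward' for a verifier score on a complete solution."""
    rng = random.Random(seed)
    root = Node(())
    for _ in range(iters):
        node = root
        # 1. Selection: descend while the node is fully expanded.
        while len(node.state) < depth and len(node.children) == len(actions):
            node = max(node.children.values(),
                       key=lambda n: n.value / n.visits
                       + c * math.sqrt(math.log(node.visits) / n.visits))
        # 2. Expansion: add one untried action.
        if len(node.state) < depth:
            a = rng.choice([a for a in actions if a not in node.children])
            node.children[a] = Node(node.state + (a,), node)
            node = node.children[a]
        # 3. Rollout: complete the solution randomly.
        state = node.state
        while len(state) < depth:
            state = state + (rng.choice(actions),)
        r = reward(state)
        # 4. Backpropagation.
        while node:
            node.visits += 1
            node.value += r
            node = node.parent
    # Return the most-visited path from the root.
    path, node = [], root
    while node.children:
        node = max(node.children.values(), key=lambda n: n.visits)
        path.append(node.state[-1])
    return tuple(path)
```

For example, with binary actions and a reward equal to the fraction of 1s in the final state, the search concentrates visits on the all-ones path.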

Other Search

Chain-of-Thought Reasoning

Meta-methods

Analysis

Scaling

Usage of Reasoning Compute

Usage of Training Data

  • 2025-02: LIMO: Less is More for Reasoning (surprisingly easy generalization, from very few reasoning training examples; model can go from knowledge-retrieval to diverse reasoning using curated examples)

Theory

Expending compute works

[[File:Compute.png]]

Pragmatics

Code for Inference-time Compute

  • optillm: Inference proxy implementing state-of-the-art techniques to improve the accuracy and performance of LLMs (improving reasoning on coding, logical, and mathematical queries)

Interact with Environment

Memory

Tool Use

Integrated

Multi-agent Effort (and Emergent Intelligence)

ML-like Optimization of LLM Setup

Limitations/Requirements

Creativity

See Also