Difference between revisions of "AI understanding"

From GISAXS
=Interpretability=

* 2017-01: [https://arxiv.org/abs/1704.01444 Learning to Generate Reviews and Discovering Sentiment]
* 2025-02: [https://arxiv.org/abs/2502.11639 Neural Interpretable Reasoning]

==Mechanistic Interpretability==
  
 
===Capturing Physics===

* 2020-09: [https://arxiv.org/abs/2009.08292 Learning to Identify Physical Parameters from Video Using Differentiable Physics]
* 2022-07: [https://arxiv.org/abs/2207.00419 Self-Supervised Learning for Videos: A Survey]
* 2025-02: FAIR at Meta: [https://arxiv.org/abs/2502.11831 Intuitive physics understanding emerges from self-supervised pretraining on natural videos]
 
* 2024-12: [https://arxiv.org/abs/2412.11521 On the Ability of Deep Networks to Learn Symmetries from Data: A Neural Kernel Theory]
* 2025-01: [https://arxiv.org/abs/2501.12391 Physics of Skill Learning]

==Hidden State==

* 2025-02: [https://arxiv.org/abs/2502.06258 Emergent Response Planning in LLM]: They show that the latent representation contains information beyond that needed for the next token (i.e. the model learns to "plan ahead" and encode information relevant to future tokens)
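One standard way to test this kind of claim is to train a simple probe on frozen hidden states and check whether it predicts a property of the not-yet-generated response (e.g., its eventual length). The sketch below shows that general recipe, not the paper's specific setup; the arrays are synthetic stand-ins for cached hidden states and measured response properties.

<syntaxhighlight lang="python">
# Minimal linear-probe sketch (illustrative, not the paper's code): test whether
# a hidden state taken *before* generation predicts a property of the *future*
# response, e.g. the eventual response length. In practice H would be cached
# hidden states (one vector per prompt, last prompt token, some chosen layer)
# and y the measured property of the generated response.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_prompts, d_model = 2000, 256

H = rng.normal(size=(n_prompts, d_model))           # stand-in hidden states
w_true = rng.normal(size=d_model)                   # pretend "planning" direction
y = H @ w_true + 0.5 * rng.normal(size=n_prompts)   # stand-in response lengths

H_tr, H_te, y_tr, y_te = train_test_split(H, y, test_size=0.25, random_state=0)

probe = Ridge(alpha=1.0).fit(H_tr, y_tr)
r2 = probe.score(H_te, y_te)

# Predicting the mean gives R^2 ~ 0, so R^2 well above 0 on real cached states
# would indicate the pre-generation hidden state carries information about the
# not-yet-generated response, i.e. evidence of "planning ahead".
print(f"probe R^2 on held-out prompts: {r2:.3f}")
</syntaxhighlight>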
  
 
=Failure Modes=

* 2025-01: [https://arxiv.org/abs/2501.18009 Large Language Models Think Too Fast To Explore Effectively]
* 2025-01: [https://arxiv.org/abs/2501.18585 Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs]
* 2025-01: [https://arxiv.org/abs/2501.08156 Are DeepSeek R1 And Other Reasoning Models More Faithful?]: reasoning models can provide faithful explanations for why their reasoning is correct
  
 
=See Also=

Latest revision as of 08:47, 19 February 2025

Interpretability

Mechanistic Interpretability

Semanticity

Counter-Results

Reward Functions

Symbolic and Notation

Mathematical

Geometric

Topography

Challenges


Heuristic Understanding

Emergent Internal Model Building

Semantic Directions

Directions in embedding space, e.g.: f(king) − f(man) + f(woman) ≈ f(queen), or f(sushi) − f(Japan) + f(Italy) ≈ f(pizza)
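A toy illustration of how such directional arithmetic is used in practice (not taken from the cited work): the analogy is answered by forming the offset vector and returning the cosine nearest neighbor, excluding the query words. The 4-dimensional vectors below are hand-made stand-ins; with real embeddings (word2vec, GloVe, etc.) the same routine applies.

<syntaxhighlight lang="python">
# Toy vector-arithmetic analogy solver: offset vector + cosine nearest neighbor.
import numpy as np

emb = {
    "king":  np.array([0.9, 0.8, 0.1, 0.0]),
    "queen": np.array([0.9, 0.1, 0.8, 0.0]),
    "man":   np.array([0.1, 0.9, 0.1, 0.0]),
    "woman": np.array([0.1, 0.1, 0.9, 0.0]),
    "apple": np.array([0.0, 0.1, 0.1, 0.9]),
}

def analogy(a, b, c, emb):
    """Return the word d such that a - b + c is closest to d, excluding a, b, c."""
    target = emb[a] - emb[b] + emb[c]
    def cos(u, v):
        return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
    candidates = {w: cos(target, v) for w, v in emb.items() if w not in (a, b, c)}
    return max(candidates, key=candidates.get)

print(analogy("king", "man", "woman", emb))   # -> "queen" for these toy vectors
</syntaxhighlight>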

Task vectors:

Feature Geometry Reproduces Problem-space

Capturing Physics

Theory of Mind

Skeptical

Information Processing

Generalization

Grokking

Tests of Resilience to Dropouts/etc.

  • 2024-02: Explorations of Self-Repair in Language Models
  • 2024-06: What Matters in Transformers? Not All Attention is Needed
    • Removing entire transformer blocks leads to significant performance degradation
    • Removing MLP layers results in significant performance degradation
    • Removing attention layers causes almost no performance degradation
    • E.g., deleting half of the attention layers (a 48% speed-up) leads to only a 2.4% decrease on benchmarks (a minimal layer-ablation sketch follows this list)
  • 2024-06: The Remarkable Robustness of LLMs: Stages of Inference?
    • They intentionally break the network (swapping layers), yet it continues to work remarkably well. This suggests LLMs are quite robust, and the interventions let the authors identify distinct stages of processing.
    • They also use these interventions to infer what different layers are doing, breaking the transformer's layers into four stages:
      • Detokenization: Raw tokens are converted into meaningful entities that take into account local context (especially nearby tokens).
      • Feature engineering: Features are progressively refined; factual knowledge is leveraged.
      • Prediction ensembling: Predictions (for the ultimately-selected next token) emerge. A sort of consensus voting is used, with "prediction neurons" and "suppression neurons" playing a major role in upvoting/downvoting candidates.
      • Residual sharpening: Semantic representations are collapsed into specific next-token predictions, with a strong emphasis on suppression neurons eliminating options; confidence is calibrated.
    • This structure can be thought of as two roughly dual halves: the first half broadens (goes from distinct tokens to a rich, elaborate concept space), and the second half collapses (goes from rich concepts to concrete token predictions).
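As a concrete picture of what "removing attention layers" means, here is a minimal sketch on a toy pre-norm transformer (not the papers' models or evaluations): an attention sub-layer is ablated by skipping its residual update, and the ablated forward pass is compared to the full one. On a trained LLM the comparison would be benchmark accuracy, as in the numbers quoted above.

<syntaxhighlight lang="python">
# Toy layer-ablation sketch: skip chosen attention sub-layers and compare outputs.
import torch
import torch.nn as nn

class Block(nn.Module):
    def __init__(self, d, heads):
        super().__init__()
        self.ln1, self.ln2 = nn.LayerNorm(d), nn.LayerNorm(d)
        self.attn = nn.MultiheadAttention(d, heads, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(d, 4 * d), nn.GELU(), nn.Linear(4 * d, d))

    def forward(self, x, skip_attn=False):
        if not skip_attn:                      # ablation: drop the attention update
            h = self.ln1(x)
            x = x + self.attn(h, h, h, need_weights=False)[0]
        x = x + self.mlp(self.ln2(x))          # residual MLP update
        return x

class TinyTransformer(nn.Module):
    def __init__(self, n_layers=8, d=64, heads=4):
        super().__init__()
        self.blocks = nn.ModuleList(Block(d, heads) for _ in range(n_layers))

    def forward(self, x, skip_attn_layers=()):
        for i, blk in enumerate(self.blocks):
            x = blk(x, skip_attn=i in skip_attn_layers)
        return x

torch.manual_seed(0)
model = TinyTransformer().eval()
x = torch.randn(2, 16, 64)                     # (batch, seq, d_model)

with torch.no_grad():
    full = model(x)
    half_attn = model(x, skip_attn_layers=set(range(0, 8, 2)))  # drop every other attn

rel_change = (full - half_attn).norm() / full.norm()
# On a trained LLM this comparison is done on benchmark accuracy rather than raw
# activations; the papers report surprisingly small drops for attention removal.
print(f"relative output change with half the attention layers removed: {rel_change:.3f}")
</syntaxhighlight>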

Other

Scaling Laws

Information Processing/Storage

Tokenization

For numbers/math

Learning/Training

Hidden State

  • 2025-02: Emergent Response Planning in LLM: They show that the latent representation contains information beyond that needed for the next token (i.e. the model learns to "plan ahead" and encode information relevant to future tokens)

Failure Modes

Psychology

Allow LLM to think

In-context Learning

Reasoning (CoT, etc.)

See Also