Difference between revisions of "AI understanding"

Line 6:
 
* 2021-12Dec: Anthropic: [https://transformer-circuits.pub/2021/framework/index.html A Mathematical Framework for Transformer Circuits]
 
* 2022-09Sep: [https://arxiv.org/abs/2211.00593 Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 Small]
 
+ * 2023-01Jan: [https://arxiv.org/abs/2301.05062 Tracr: Compiled Transformers as a Laboratory for Interpretability] ([https://github.com/google-deepmind/tracr code])
 
* 2024-07Jul: Anthropic: [https://transformer-circuits.pub/2024/july-update/index.html Circuits Update]
 
Line 79 → 80:
 
* [https://iopscience.iop.org/article/10.1088/1748-9326/ad2891 Reliable precipitation nowcasting using probabilistic diffusion models]. The generated precipitation maps are predictive of actual future weather, which suggests the model has learned scientifically relevant structure.
 
* [https://arxiv.org/abs/2405.07987 The Platonic Representation Hypothesis]: Different models (including across modalities) appear to be converging toward a consistent world model; see the representation-similarity sketch below this list.
 
+ * [https://arxiv.org/abs/2501.00070 ICLR: In-Context Learning of Representations]
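One way to make the convergence claim above concrete is to compare activations from two different models on the same inputs using a representational-similarity metric. The sketch below uses linear CKA as a simple, common choice; it illustrates the general approach rather than the specific alignment metric used in the paper, and the random feature matrices stand in for activations you would extract from real models.

<syntaxhighlight lang="python">
import numpy as np

def linear_cka(X: np.ndarray, Y: np.ndarray) -> float:
    """Linear Centered Kernel Alignment between two feature matrices.

    X: (n_samples, d1) activations from model A on a shared set of inputs.
    Y: (n_samples, d2) activations from model B on the same inputs.
    Returns a similarity score; higher means more aligned representations.
    """
    X = X - X.mean(axis=0, keepdims=True)  # center each feature dimension
    Y = Y - Y.mean(axis=0, keepdims=True)
    # ||X^T Y||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    num = np.linalg.norm(X.T @ Y, ord="fro") ** 2
    den = np.linalg.norm(X.T @ X, ord="fro") * np.linalg.norm(Y.T @ Y, ord="fro")
    return float(num / den)

# Toy check: two "models" whose features are different linear views of the same
# latent variables score much higher than features with no shared structure.
rng = np.random.default_rng(0)
latent = rng.normal(size=(512, 32))
feats_a = latent @ rng.normal(size=(32, 64))    # stand-in for model A activations
feats_b = latent @ rng.normal(size=(32, 128))   # stand-in for model B activations
unrelated = rng.normal(size=(512, 96))
print(linear_cka(feats_a, feats_b))    # substantially higher (shared latent structure)
print(linear_cka(feats_a, unrelated))  # near zero (no shared structure)
</syntaxhighlight>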
  
 
===Theory of Mind===
 
Line 93 → 95:
 
* 2024-02: [https://arxiv.org/abs/2402.12875 Chain of Thought Empowers Transformers to Solve Inherently Serial Problems]: Shows that, given enough intermediate (chain-of-thought) tokens, fixed-depth transformers can carry out inherently serial computations that are out of reach in a single forward pass (toy illustration after this list)
 
* 2024-11: [https://arxiv.org/abs/2411.01992 Ask, and it shall be given: Turing completeness of prompting]
 
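A toy illustration of the chain-of-thought result above (plain Python, no model involved): parity of a bit string is an inherently serial task, and emitting one intermediate token per input bit reduces each step to a constant-size update, so the serial work is paid for in sequence length rather than in network depth.

<syntaxhighlight lang="python">
def parity_with_chain_of_thought(bits: str) -> tuple[list[str], int]:
    """Compute the parity of a bit string by emitting intermediate tokens.

    Each emitted token is the running parity after consuming one more bit,
    i.e. a constant-size local update of the kind a single decoding step
    can perform; the chain of intermediate tokens carries the serial state.
    """
    chain = []
    running = 0
    for b in bits:
        running ^= int(b)           # constant work per step
        chain.append(str(running))  # the "intermediate token" for this step
    return chain, running

chain, answer = parity_with_chain_of_thought("1101001")
print(" ".join(chain), "->", answer)  # 1 0 0 1 1 1 0 -> 0
</syntaxhighlight>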
+
+ ===Generalization===
+ * 2024-06: [https://arxiv.org/abs/2406.14546 Connecting the Dots: LLMs can Infer and Verbalize Latent Structure from Disparate Training Data]
  
 
===Tests of Resilience to Dropouts/etc.===
 
Line 115 → 120:
 
* 2024-11: [https://arxiv.org/abs/2411.12580 Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models]: LLMs learn reasoning by extracting procedures from training data, not by memorizing specific answers
 
* 2024-11: [https://arxiv.org/abs/2411.15862 LLMs Do Not Think Step-by-step In Implicit Reasoning]
 
+ * 2024-12: [https://arxiv.org/abs/2412.09810 The Complexity Dynamics of Grokking]
  
 
===Scaling Laws===
 
Line 150 → 156:
 
=Psychology=
 
 
* 2023-04: [https://arxiv.org/abs/2304.11111 Inducing anxiety in large language models can induce bias]
 
+
+ ==Allow LLM to think==
+ * 2024-12: [https://arxiv.org/abs/2412.11536 Let your LLM generate a few tokens and you will reduce the need for retrieval]
+
+ ===In-context Learning===
+ * 2021-10: [https://arxiv.org/abs/2110.15943 MetaICL: Learning to Learn In Context]
+ * 2022-02: [https://arxiv.org/abs/2202.12837 Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?]
+ * 2022-08: [https://arxiv.org/abs/2208.01066 What Can Transformers Learn In-Context? A Case Study of Simple Function Classes]
+ * 2022-11: [https://arxiv.org/abs/2211.15661 What learning algorithm is in-context learning? Investigations with linear models]
+ * 2022-12: [https://arxiv.org/abs/2212.07677 Transformers learn in-context by gradient descent]
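Several of the in-context-learning papers above use a synthetic linear-regression setup: the prompt interleaves (x, w·x) example pairs with a final query x, and the transformer's prediction for the query is compared against classical estimators such as least squares. A minimal sketch of that setup (NumPy only, no transformer); the dimensions and the least-squares comparator are illustrative assumptions, not any single paper's exact protocol.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
d, n_context = 8, 20

# One synthetic in-context task: a hidden linear function y = w . x
w = rng.normal(size=d)
xs = rng.normal(size=(n_context, d))   # in-context demonstration inputs
ys = xs @ w                            # their labels
x_query = rng.normal(size=d)           # the query the model must answer

# "Prompt" as these papers construct it: (x_1, y_1, ..., x_n, y_n, x_query).
# Not used further here; a transformer would be trained to map it to y_query.
prompt = [tok for x, y in zip(xs, ys) for tok in (x, y)] + [x_query]

# Classical comparator: least-squares fit to the in-context examples only.
w_hat, *_ = np.linalg.lstsq(xs, ys, rcond=None)
prediction = x_query @ w_hat           # what an "ideal in-context learner" outputs
print(prediction, x_query @ w)         # essentially identical when n_context >= d
</syntaxhighlight>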
  
 
=See Also=
 

Latest revision as of 13:24, 6 January 2025

Interpretability

Mechanistic Interpretability

Semanticity

Reward Functions

Symbolic and Notation

Mathematical

Geometric

Challenges

[image: GYe31yXXQAABwaZ.jpeg]

Heuristic Understanding

Emergent Internal Model Building

Semantic Directions

Directions, e.g.: f(king)-f(man)+f(woman)=f(queen) or f(sushi)-f(Japan)+f(Italy)=f(pizza)

Task vectors:
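A minimal sketch of the analogy arithmetic above, assuming a hypothetical dictionary "vectors" that maps words to embedding arrays (e.g. loaded from word2vec/GloVe, or read off a model's embedding matrix). In practice the relation is approximate: one takes the nearest neighbour of the offset vector rather than an exact equality.

<syntaxhighlight lang="python">
import numpy as np

def nearest(vectors: dict[str, np.ndarray], query: np.ndarray, exclude: set[str]) -> str:
    """Return the word whose embedding has the highest cosine similarity to `query`."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return max((w for w in vectors if w not in exclude), key=lambda w: cos(vectors[w], query))

def analogy(vectors: dict[str, np.ndarray], a: str, b: str, c: str) -> str:
    """Solve a - b + c ~ ?, e.g. king - man + woman ~ queen."""
    query = vectors[a] - vectors[b] + vectors[c]
    return nearest(vectors, query, exclude={a, b, c})

# Hypothetical usage, once `vectors` has been populated from real embeddings:
# print(analogy(vectors, "king", "man", "woman"))     # expected: "queen"
# print(analogy(vectors, "sushi", "Japan", "Italy"))  # expected: "pizza"
</syntaxhighlight>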

Feature Geometry Reproduces Problem-space

Theory of Mind

Information Processing

Generalization

Tests of Resilience to Dropouts/etc.

  • 2024-02: Explorations of Self-Repair in Language Models
  • 2024-06: What Matters in Transformers? Not All Attention is Needed
    • Removing entire transformer blocks leads to significant performance degradation
    • Removing MLP layers results in significant performance degradation
    • Removing attention layers causes almost no performance degradation
    • E.g. deleting half of the attention layers yields a 48% speed-up at the cost of only a 2.4% drop on benchmarks (see the sketch after this list)
  • 2024-06: The Remarkable Robustness of LLMs: Stages of Inference?
    • They intentionally break the network (by swapping layers), yet it continues to work remarkably well. This suggests LLMs are quite robust, and it allows the authors to identify distinct stages of processing.
    • They also use these interventions to infer what different layers are doing, dividing the transformer's layers into four stages:
      • Detokenization: Raw tokens are converted into meaningful entities that take local context into account (especially nearby tokens).
      • Feature engineering: Features are progressively refined, and factual knowledge is brought to bear.
      • Prediction ensembling: Candidate next-token predictions emerge; a kind of consensus voting takes place, with "prediction neurons" and "suppression neurons" playing a major role in upvoting/downvoting options.
      • Residual sharpening: Semantic representations are collapsed into specific next-token predictions, with a strong emphasis on suppression neurons eliminating options; confidence is calibrated.
    • This structure can be thought of as two roughly dual halves: the first half broadens (from distinct tokens to a rich concept space) and the second half collapses (from rich concepts back to concrete token predictions).
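The interventions described above (deleting attention sublayers, swapping layers) are easy to prototype on a toy pre-norm transformer. The PyTorch sketch below illustrates the mechanics only; it is not the papers' actual code or models: setting use_attn = False removes a block's attention sublayer while leaving its MLP and the residual stream intact, and adjacent blocks can simply be swapped in the module list. On a real pretrained LLM one would additionally measure benchmark performance before and after the intervention.

<syntaxhighlight lang="python">
import torch
import torch.nn as nn

class Block(nn.Module):
    """Minimal pre-norm transformer block with a switchable attention sublayer."""
    def __init__(self, d_model: int, n_heads: int, use_attn: bool = True):
        super().__init__()
        self.use_attn = use_attn
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ln2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                                 nn.Linear(4 * d_model, d_model))

    def forward(self, x):
        if self.use_attn:                     # ablation: skip the attention sublayer entirely
            h = self.ln1(x)
            attn_out, _ = self.attn(h, h, h)  # causal masking omitted for brevity
            x = x + attn_out
        return x + self.mlp(self.ln2(x))      # residual stream is preserved either way

# Build a small stack, then apply the two kinds of intervention described above.
blocks = nn.ModuleList([Block(d_model=64, n_heads=4) for _ in range(8)])

# (1) "Remove attention layers": disable the attention sublayer in half the blocks.
for blk in list(blocks)[::2]:
    blk.use_attn = False

# (2) "Swap layers": exchange two adjacent blocks.
blocks[3], blocks[4] = blocks[4], blocks[3]

x = torch.randn(2, 16, 64)                    # (batch, sequence, d_model)
for blk in blocks:
    x = blk(x)
print(x.shape)                                # the forward pass still runs end to end
</syntaxhighlight>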

Other

Scaling Laws

Information Processing/Storage

Tokenization

For numbers/math

Learning/Training

Failure Modes

Psychology

Allow LLM to think

In-context Learning

See Also