AI research trends

=Novel Tokenization and/or Sampling=
* 2024-04: [https://arxiv.org/abs/2404.19737 Better & Faster Large Language Models via Multi-token Prediction]
* 2024-10: [https://github.com/xjdr-alt/entropix entropix: Entropy Based Sampling and Parallel CoT Decoding] (a toy sketch of the entropy-gating idea appears below)
* 2024-10: [https://arxiv.org/abs/2410.01104 softmax is not enough (for sharp out-of-distribution)]
* 2024-12: [https://arxiv.org/abs/2412.06676 I Don't Know: Explicit Modeling of Uncertainty with an <nowiki>[IDK]</nowiki> Token]
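The entropix idea is to gate the sampling strategy on the entropy of the next-token distribution: sample greedily when the model is confident, and sample more exploratively (or branch into extra chain-of-thought) when entropy is high. The following is a minimal sketch of that gating logic, not the entropix implementation; the thresholds and the high-entropy fallback are illustrative assumptions.

<syntaxhighlight lang="python">
import numpy as np

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax over a 1-D logit vector.
    z = (logits - logits.max()) / temperature
    p = np.exp(z)
    return p / p.sum()

def entropy_gated_sample(logits, rng, low=0.5, high=3.0):
    """Choose a next token based on the entropy (in nats) of the predictive distribution.

    The thresholds `low` and `high` are illustrative, not values from the entropix repo.
    """
    probs = softmax(logits)
    entropy = -np.sum(probs * np.log(probs + 1e-12))
    if entropy < low:
        # Confident: take the argmax token.
        return int(np.argmax(probs)), "greedy"
    if entropy < high:
        # Moderate uncertainty: ordinary sampling from the distribution.
        return int(rng.choice(len(probs), p=probs)), "sample"
    # High uncertainty: here we simply resample at a higher temperature;
    # entropix would instead branch / inject chain-of-thought tokens.
    hot = softmax(logits, temperature=1.5)
    return int(rng.choice(len(hot), p=hot)), "explore"

rng = np.random.default_rng(0)
logits = rng.normal(size=32)  # stand-in for a model's next-token logits
print(entropy_gated_sample(logits, rng))
</syntaxhighlight>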
  
 
=System 2 Reasoning=
See: Increasing AI Agent Intelligence

=Memory=

==Context Length==
 
* 2024-Apr-12: Meta et al. demonstrate [https://arxiv.org/abs/2404.08801 Megalodon], which enables infinite context via a more efficient architecture
* 2024-Apr-14: Google presents [https://arxiv.org/abs/2404.09173 TransformerFAM], which leverages a feedback loop so that the model attends to its own latent representations, acting as working memory and providing effectively infinite context (a toy sketch of the feedback loop appears below)
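TransformerFAM's feedback loop can be pictured as a small set of memory embeddings that (i) are appended to each block's keys/values so current tokens can read them, and (ii) are themselves updated by attending to the block and to their own previous state. The NumPy toy below illustrates only that loop; the single-head attention, the zero-initialized memory, and the shapes are simplifying assumptions, not the paper's architecture.

<syntaxhighlight lang="python">
import numpy as np

def attention(q, k, v):
    # Single-head scaled dot-product attention; q: (m, d), k and v: (n, d).
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

def process_with_feedback_memory(token_embs, block_size=16, mem_slots=4):
    """Process a long sequence block by block while carrying a feedback memory.

    token_embs: (seq_len, d) array standing in for a layer's hidden states.
    """
    d = token_embs.shape[-1]
    memory = np.zeros((mem_slots, d))  # persistent "working memory" embeddings
    outputs = []
    for start in range(0, len(token_embs), block_size):
        block = token_embs[start:start + block_size]
        kv = np.concatenate([memory, block], axis=0)
        # Tokens in the block attend to the block itself plus the carried memory,
        # so they see a compressed summary of everything before the block.
        outputs.append(attention(block, kv, kv))
        # Feedback step: the memory attends to the block and to its own previous
        # state, updating the summary; this loop is what extends the usable context.
        memory = attention(memory, kv, kv)
    return np.concatenate(outputs, axis=0)

rng = np.random.default_rng(0)
hidden = rng.normal(size=(64, 8))   # toy "hidden states" for 64 tokens
print(process_with_feedback_memory(hidden).shape)  # (64, 8)
</syntaxhighlight>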

==Retrieval beyond RAG==
See also: [[AI_tools#Retrieval_Augmented_Generation_.28RAG.29|AI tools: Retrieval Augmented Generation (RAG)]]
* 2024-12: [https://arxiv.org/abs/2412.11536 Let your LLM generate a few tokens and you will reduce the need for retrieval] (a skeleton of this retrieval gate appears below)
* 2024-12: [https://arxiv.org/abs/2412.11919 RetroLLM: Empowering Large Language Models to Retrieve Fine-grained Evidence within Generation]
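The "generate a few tokens first" result suggests gating retrieval on whether the model's own short draft already looks confident, invoking the retriever only for uncertain cases. The sketch below is a hypothetical skeleton of such a gate, not the paper's method: <code>draft_with_confidence</code>, <code>retrieve</code>, and <code>generate</code> are placeholder callables, and the confidence threshold is arbitrary.

<syntaxhighlight lang="python">
from typing import Callable, List, Tuple

def answer_with_retrieval_gate(
    question: str,
    draft_with_confidence: Callable[[str, int], Tuple[str, float]],
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str], str],
    n_draft_tokens: int = 16,
    min_confidence: float = 0.8,
) -> str:
    """Retrieve only when a short draft of the answer looks uncertain.

    All three callables are hypothetical stand-ins for an LLM / retriever stack:
    draft_with_confidence(question, n) -> (draft text, mean token probability).
    """
    draft, confidence = draft_with_confidence(question, n_draft_tokens)
    if confidence >= min_confidence:
        # The model already "knows": finish the answer without any retrieval call.
        return generate(f"Question: {question}\nAnswer: {draft}")
    # Low confidence: fall back to retrieval-augmented generation, reusing the
    # draft as an extra query signal for the retriever.
    context = "\n".join(retrieve(f"{question} {draft}"))
    return generate(f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")

# Toy demo with dummy stand-ins (no real LLM or retriever involved):
print(answer_with_retrieval_gate(
    "What is the capital of France?",
    draft_with_confidence=lambda q, n: ("Paris", 0.95),
    retrieve=lambda q: ["Paris is the capital of France."],
    generate=lambda prompt: prompt.rsplit("Answer:", 1)[-1].strip() or "Paris",
))
</syntaxhighlight>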
  
 
==Working Memory==
 

==Episodic Memory==

=Neural (non-token) Latent Representation=