Revision as of 10:46, 27 December 2024

Novel Tokenization and/or Sampling

TBD

2024-11: Microsoft: DroidSpeak: KV Cache Sharing for Cross-LLM Communication and Multi-LLM Serving: LLMs communicate by sharing KV-cache data with one another instead of exchanging natural-language text
2024-12: Meta: Training Large Language Models to Reason in a Continuous Latent Space: feeds the latent representation directly back into the model as the next input, instead of decoding intermediate thoughts into tokens (Chain of Continuous Thought, a.k.a. Coconut)
2024-12: Meta: Large Concept Models: Language Modeling in a Sentence Representation Space: trains a model that operates at a higher level of abstraction than typical word/token LLMs, working in a space of concept embeddings (closer to full sentences than to individual words)
2024-12: Meta: Byte Latent Transformer: Patches Scale Better Than Tokens: instead of tokenizing, dynamically groups the input byte stream into patches, yielding gains in compute efficiency with minimal loss in performance
2024-12: Compressed Chain of Thought: Efficient Reasoning Through Dense Representations
2024-12: Google DeepMind: Deliberation in Latent Space via Differentiable Cache Augmentation
2024-12: LANG-JEPA: Learning to Think in Latent Space

@@ Line 6: / Line 6: @@
 See: [[AI_Agents#Increasing_AI_Agent_Intelligence|Increasing AI Agent Intelligence]]
-=Episodic Memory=
+=Memory=
+==Context Length==
+TBD
+==Working Memory==
+* 2024-12: [https://www.arxiv.org/abs/2412.18069 Improving Factuality with Explicit Working Memory]
+==Episodic Memory==
 * 2024-03: [https://arxiv.org/abs/2403.11901 Larimar: Large Language Models with Episodic Memory Control]