Revision as of 16:53, 15 January 2025

ERI

Research Thrusts

How to adapt frontier methods and foundation models to science?

Topical fine-tuning
Tool-use
Advanced retrieval-augmented generation (RAG++)
- Novel: Pre-generation: Agents continually add content to RAG corpus. ("Pre-thinking" across many vectors.)
Science-adapted tokenization/embedding (xVal, [IDK])
Specialized sampling
- Entropy sampling: measure uncertainty of CoT trajectories
- Novel: Handoff sampling:
  - Useful for:
    - text-to-text (specialization, creativity, etc.)
    - text-to-tool (e.g. math)
    - text-to-field (integrate non-textual FM)
  - Implementation:
    - MI-SAE on both spaces, find matches (or maybe just "analogies"?)
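
The entropy-sampling item above can be illustrated with a minimal sketch. It assumes that each chain-of-thought (CoT) trajectory exposes the model's next-token probability distribution at every step, and it uses mean per-token Shannon entropy as the uncertainty score. That scoring choice, and the names `token_entropy`, `trajectory_uncertainty`, and `pick_most_confident`, are illustrative assumptions and not something these notes specify.

```python
import math
from typing import Dict, Sequence


def token_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a single next-token distribution."""
    # Zero-probability tokens contribute nothing to the sum.
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def trajectory_uncertainty(step_distributions: Sequence[Sequence[float]]) -> float:
    """Mean per-token entropy over one CoT trajectory (assumed scoring rule)."""
    if not step_distributions:
        raise ValueError("trajectory has no steps")
    entropies = [token_entropy(d) for d in step_distributions]
    return sum(entropies) / len(entropies)


def pick_most_confident(trajectories: Dict[str, Sequence[Sequence[float]]]) -> str:
    """Return the name of the trajectory with the lowest mean entropy."""
    return min(trajectories, key=lambda name: trajectory_uncertainty(trajectories[name]))


if __name__ == "__main__":
    # Toy distributions over a 2-token vocabulary; purely hypothetical data.
    candidates = {
        "confident": [[0.9, 0.1], [0.95, 0.05]],
        "uncertain": [[0.5, 0.5], [0.6, 0.4]],
    }
    print(pick_most_confident(candidates))
```

A high score could also be used the other way around, for example as a trigger to branch, resample, or hand off to another model at the uncertain step.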

How to make AI agents smarter?

Iteration schemes (loops, blocks)
1. Thinking:
  - Blocky/neural: Define architecture, allow system to pick hyper-parameters
2. Autonomous ideation:
  - Novel: Treat ideation as an AE problem in a semantic embedding space.
Memory
Thought-templates, thought-flows

What is the right architecture for AI swarms?

What software architecture is needed?

How to implement inference-time compute for exocortex?

What should the HCI be?

@@ Line 45: / Line 45: @@
 ## Treat interaction graph as ML optimization problem
 ## '''Novel:''' Map-spatial: Use a map (e.g. of BNL) to localize docs/resources/etc.
-## '''Novel:''' Pseudo-spatial: Use position in embedding space to localize everything
+## '''Novel:''' Pseudo-spatial: Use position in embedding space to localize everything. Evolving state (velocity/momentum) of agent carries information.
 ## '''Novel:''' Dynamic-pseudo-spatial: Allow the space to be learned and updated; directions in embedding space can dictate information flow
 # Establish benchmarks/challenges/validations
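
The pseudo-spatial idea in the revised line can be sketched as follows. The sketch assumes an agent's "position" is the embedding of its current context, its "velocity" is the difference between successive positions, and its "momentum" is an exponential moving average of velocity. The class `AgentState`, the smoothing factor `beta`, and the `heading_toward` score are hypothetical names and choices made for illustration, not definitions from these notes.

```python
import math
from typing import List, Sequence


class AgentState:
    """Tracks an agent's position, velocity, and momentum in embedding space."""

    def __init__(self, position: Sequence[float], beta: float = 0.9) -> None:
        self.position: List[float] = list(position)
        self.velocity: List[float] = [0.0] * len(self.position)
        self.momentum: List[float] = [0.0] * len(self.position)
        self.beta = beta  # assumed smoothing factor for the moving average

    def update(self, new_position: Sequence[float]) -> None:
        """Move to a new embedding and refresh velocity and momentum."""
        if len(new_position) != len(self.position):
            raise ValueError("embedding dimension changed")
        self.velocity = [n - p for n, p in zip(new_position, self.position)]
        self.momentum = [
            self.beta * m + (1.0 - self.beta) * v
            for m, v in zip(self.momentum, self.velocity)
        ]
        self.position = list(new_position)

    def heading_toward(self, target: Sequence[float]) -> float:
        """Cosine similarity between momentum and the direction to a target.

        Returns 0.0 when the agent is stationary or already at the target.
        """
        direction = [t - p for t, p in zip(target, self.position)]
        norm_m = math.sqrt(sum(m * m for m in self.momentum))
        norm_d = math.sqrt(sum(d * d for d in direction))
        if norm_m == 0.0 or norm_d == 0.0:
            return 0.0
        dot = sum(m * d for m, d in zip(self.momentum, direction))
        return dot / (norm_m * norm_d)


if __name__ == "__main__":
    agent = AgentState([0.0, 0.0], beta=0.5)
    agent.update([1.0, 0.0])  # toy 2-D "embedding"; real ones are much larger
    print(agent.velocity, agent.momentum, agent.heading_toward([2.0, 0.0]))
```

Under these assumptions, a resource whose embedding lies along the agent's momentum could be surfaced before the agent reaches it, which is one way the evolving state might carry information.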