Difference between revisions of "ERI"
KevinYager (Created page with "'''ERI''' =Research Thrusts= ==Models== '''How to adapt frontier methods and foundation models to science?''' # Topical fine-tuning # Tool-use # Advanced retrieval-augmented...")
KevinYager (→Research Thrusts)
Revision as of 15:58, 11 January 2025
ERI
Research Thrusts
Models
How to adapt frontier methods and foundation models to science?
- Topical fine-tuning
- Tool-use
- Advanced retrieval-augmented generation (RAG++)
  - Novel: Pre-generation: Agents continually add content to the RAG corpus. ("Pre-thinking" across many vectors.)
- Science-adapted tokenization/embedding (xVal, [IDK])
- Specialized sampling
  - Entropy sampling: measure uncertainty of CoT trajectories
  - Novel: Handoff sampling:
    - text-to-text (specialization, creativity, etc.)
    - text-to-tool (e.g. math)
    - text-to-field (integrate non-textual FM)
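One possible reading of the entropy-sampling item above: sample several CoT trajectories for the same prompt, reduce each to its final answer, and compute the Shannon entropy of those answers. The sketch below assumes exactly that. The name `answer_entropy` is hypothetical, and measuring entropy over final answers (rather than over token-level probabilities) is an assumption the outline does not specify.

```python
import math
from collections import Counter


def answer_entropy(final_answers):
    """Shannon entropy (in bits) of the empirical distribution of final answers.

    `final_answers` holds one final answer per sampled CoT trajectory.
    0.0 means every trajectory agreed; larger values mean more disagreement.
    """
    if not final_answers:
        raise ValueError("need at least one trajectory")
    counts = Counter(final_answers)
    total = len(final_answers)
    # Sum p * log2(1/p) over the distinct answers, with p = count / total.
    return sum((c / total) * math.log2(total / c) for c in counts.values())
```

A high value could flag a prompt as uncertain and worth extra sampling or a different strategy; where to set that threshold is left open here.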
Agents
How to make AI agents smarter?
- Iteration schemes (loops, blocks)
  - Autonomous ideation:
    - Novel: Treat ideation as an AE problem in a semantic embedding space.
- Memory
- Thought-templates, thought-flows
Exocortex
What is the right architecture for AI swarms?
- Interaction schemes
  - Test options, identify match between science task and scheme
  - Treat interaction graph as ML optimization problem
  - Novel: Map-spatial: Use a map (e.g. of BNL) to localize docs/resources/etc.
  - Novel: Pseudo-spatial: Use position in embedding space to localize everything
  - Novel: Dynamic-pseudo-spatial: Allow the space to be learned and updated
- Establish benchmarks/challenges/validations
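The pseudo-spatial idea above can be illustrated with a toy lookup: every document, tool, or agent is assigned a position (an embedding vector), and "nearby" resources are found by similarity to a query position. This is a minimal sketch under stated assumptions: the names `cosine_similarity` and `nearest` are hypothetical, the vectors are supplied by the caller, and cosine similarity is an assumed distance measure that the outline does not prescribe.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest(query_vec, located, k=3):
    """Return the names of the k resources positioned closest to query_vec.

    `located` maps a resource name to its position in the embedding space.
    """
    ranked = sorted(
        located.items(),
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]
```

In the dynamic variant, the positions in `located` would be updated over time rather than fixed; that update rule is not sketched here.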
Infrastructure
Architecture
What software architecture is needed?
- Code for scaffolding
- Scheme for inter-agent messaging (plain English w/ pointers, etc.)
- Data management
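As an illustration of the "plain English w/ pointers" messaging item above, a message could carry free-text content plus a list of references to shared resources instead of embedding the data itself. The sketch below is an assumption, not a design the outline commits to: the class `AgentMessage`, its field names, and the JSON wire format are all hypothetical.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class AgentMessage:
    """One inter-agent message: plain-English body plus pointers to resources."""

    sender: str
    recipient: str
    body: str  # plain-English content, readable by agents and humans alike
    pointers: list = field(default_factory=list)  # identifiers of shared data

    def to_json(self):
        """Serialize the message to a JSON string."""
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, raw):
        """Rebuild a message from a JSON string produced by to_json."""
        return cls(**json.loads(raw))
```

Keeping bulky data behind pointers keeps messages short and leaves storage to the data-management layer; how pointers are resolved is outside this sketch.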
Hardware
How to implement inference-time compute for exocortex?
- Heterogeneous hardware
- Elastic (combine local & cloud)
- Workflow management
Human-Computer Interaction (HCI)
What should the HCI be?