Difference between revisions of "ERI"

From GISAXS

Revision as of 15:58, 11 January 2025

ERI

Research Thrusts

Models

How to adapt frontier methods and foundation models to science?

  1. Topical fine-tuning
  2. Tool-use
  3. Advanced retrieval-augmented generation (RAG++)
    • Novel: Pre-generation: Agents continually add content to RAG corpus. ("Pre-thinking" across many vectors.)
  4. Science-adapted tokenization/embedding (xVal, [IDK])
  5. Specialized sampling
    • Entropy sampling: measure uncertainty of CoT trajectories
    • Novel: Handoff sampling:
      • text-to-text (specialization, creativity, etc.)
      • text-to-tool (e.g. math)
      • text-to-field (integrate non-textual FM)
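
As an illustration of the entropy-sampling item, one simple way to gauge uncertainty across CoT trajectories is to sample several trajectories for the same prompt and compute the Shannon entropy of their final answers. This is only a minimal sketch under that assumption; the function name `answer_entropy` and the example answers are hypothetical, and the outline itself does not specify how uncertainty should be measured.

```python
import math
from collections import Counter


def answer_entropy(final_answers):
    """Shannon entropy (in bits) of the empirical distribution of final answers.

    Assumption: each sampled CoT trajectory has already been reduced to a
    single final-answer string. 0 bits means every trajectory agrees.
    """
    counts = Counter(final_answers)
    total = len(final_answers)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical final answers extracted from five sampled trajectories.
samples = ["4.2 nm", "4.2 nm", "4.2 nm", "3.9 nm", "4.2 nm"]
print(answer_entropy(samples))
```

A low value suggests the trajectories converge on one answer; a high value could be used as a trigger for further sampling or for a handoff to another model or tool.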

Agents

How to make AI agents smarter?

  1. Iteration schemes (loops, blocks)
    1. Autonomous ideation:
      • Novel: Treat ideation as an AE problem in a semantic embedding space.
  2. Memory
  3. Thought-templates, thought-flows

Exocortex

What is the right architecture for AI swarms?

  1. Interaction schemes
    1. Test options, identify match between science task and scheme
    2. Treat interaction graph as ML optimization problem
    3. Novel: Map-spatial: Use a map (e.g. of BNL) to localize docs/resources/etc.
    4. Novel: Pseudo-spatial: Use position in embedding space to localize everything
    5. Novel: Dynamic-pseudo-spatial: Allow the space to be learned and updated
  2. Establish benchmarks/challenges/validations
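
The pseudo-spatial idea above can be sketched as a nearest-neighbor lookup: every agent, document, or resource gets a position in an embedding space, and "nearby" items are those with the highest similarity to a query position. The sketch below assumes cosine similarity and uses made-up three-dimensional positions; the names `cosine`, `nearest`, and the entries in `positions` are hypothetical and not taken from the outline.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest(query, positions, k=2):
    """Return the names of the k items whose positions are most similar to query."""
    ranked = sorted(positions, key=lambda name: cosine(query, positions[name]), reverse=True)
    return ranked[:k]


# Hypothetical positions; in practice these would come from an embedding model.
positions = {
    "scattering-agent": [0.9, 0.1, 0.0],
    "beamline-manual": [0.8, 0.2, 0.1],
    "synthesis-agent": [0.0, 0.1, 0.9],
}
print(nearest([1.0, 0.0, 0.0], positions))
```

In the dynamic variant, the stored positions would be updated over time rather than fixed, which this sketch does not attempt.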


Infrastructure

Architecture

What software architecture is needed?

  1. Code for scaffolding
  2. Scheme for inter-agent messaging (plain English w/ pointers, etc.)
  3. Data management
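
One possible shape for the "plain English w/ pointers" messaging scheme is a small record that carries free-text content plus a list of references to data or documents, serialized as JSON for transport. This is a sketch under assumed conventions: the class `AgentMessage`, its field names, and the example pointer strings are all hypothetical.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class AgentMessage:
    """A plain-English message plus pointers to resources (hypothetical schema)."""
    sender: str
    recipient: str
    text: str
    pointers: list = field(default_factory=list)  # e.g. URIs or file paths

    def to_json(self):
        """Serialize the message to a JSON string."""
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, raw):
        """Rebuild a message from a JSON string produced by to_json."""
        return cls(**json.loads(raw))


msg = AgentMessage(
    sender="analysis-agent",
    recipient="planning-agent",
    text="The latest scan shows a new peak; see the linked dataset.",
    pointers=["data://example/scan-0001"],  # made-up pointer for illustration
)
print(msg.to_json())
```

Keeping the body in plain English leaves it readable by both humans and language-model agents, while the pointers avoid copying large data into the message itself.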

Hardware

How to implement inference-time compute for exocortex?

  1. Heterogeneous hardware
  2. Elastic (combine local & cloud)
  3. Workflow management

Human-Computer Interaction (HCI)

What should the HCI be?