Difference between revisions of "Science Agents"

From GISAXS
(Commercial)
(Science Benchmarks)
 
Line 93:
  * 2025-01: [https://agi.safe.ai/ Humanity's Last Exam]
  * [https://github.com/OSU-NLP-Group/ScienceAgentBench ScienceAgentBench]
+ * 2025-02: [https://arxiv.org/abs/2502.20309 EAIRA: Establishing a Methodology for Evaluating AI Models as Scientific Research Assistants]
  * 2025-03: [https://huggingface.co/datasets/futurehouse/BixBench BixBench]: Novel hypotheses (accept/reject)

Latest revision as of 12:38, 13 March 2025

AI Use-cases for Science

Literature

LLM extraction of data from papers

AI finding links in literature

Explanation

Autonomous Ideation

Adapting LLMs to Science

AI/ML Methods tailored to Science

Regression (Data Fitting)

Tabular Classification/Regression

Symbolic Regression

Literature Discovery

Commercial

AI/ML Methods in Science

Chemistry

Biology

Successes

AI/ML Methods co-opted for Science

Mechanistic Interpretability

Train a large model on science data, then apply mechanistic interpretability methods (e.g. sparse autoencoders, SAEs) to its feature/activation space.
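As a rough illustration of the second step, the sketch below fits a one-layer sparse autoencoder to a matrix of activations using plain NumPy gradient descent. Everything in it is an assumption made for illustration: the function names `train_sae` and `encode`, the hyperparameters, and the use of random numbers as a stand-in for activations from a real science model. It is not taken from any specific paper or library.

```python
import numpy as np

def train_sae(acts, n_features=16, l1=1e-3, lr=1e-2, steps=300, seed=0):
    """Fit a one-layer sparse autoencoder to activations of shape (n_samples, d_model).

    Hypothetical helper for illustration; minimizes mean squared
    reconstruction error plus an L1 penalty on the feature activations.
    """
    rng = np.random.default_rng(seed)
    n, d = acts.shape
    params = {
        "W_enc": rng.normal(0.0, 0.1, (d, n_features)),
        "b_enc": np.zeros(n_features),
        "W_dec": rng.normal(0.0, 0.1, (n_features, d)),
        "b_dec": np.zeros(d),
    }
    losses = []
    for _ in range(steps):
        # Forward pass: ReLU encoder, linear decoder.
        pre = acts @ params["W_enc"] + params["b_enc"]
        feats = np.maximum(pre, 0.0)
        recon = feats @ params["W_dec"] + params["b_dec"]
        err = recon - acts
        loss = (err ** 2).sum(axis=1).mean() + l1 * np.abs(feats).sum(axis=1).mean()
        losses.append(float(loss))

        # Backward pass (manual gradients of the loss above).
        g_recon = 2.0 * err / n
        g_feats = g_recon @ params["W_dec"].T + l1 * np.sign(feats) / n
        g_pre = g_feats * (pre > 0)
        grads = {
            "W_dec": feats.T @ g_recon,
            "b_dec": g_recon.sum(axis=0),
            "W_enc": acts.T @ g_pre,
            "b_enc": g_pre.sum(axis=0),
        }
        for name in params:
            params[name] -= lr * grads[name]
    return params, losses

def encode(acts, params):
    """Return the non-negative sparse feature activations for each sample."""
    return np.maximum(acts @ params["W_enc"] + params["b_enc"], 0.0)

if __name__ == "__main__":
    # Synthetic stand-in for activations extracted from a trained model.
    acts = np.random.default_rng(1).normal(size=(256, 8))
    params, losses = train_sae(acts)
    feats = encode(acts, params)
    print("first loss:", losses[0], "last loss:", losses[-1])
    print("fraction of active features:", float((feats > 0).mean()))
```

In practice the activations would come from a chosen layer of the trained model, the dictionary would typically be much wider than the activation dimension, and the learned features would then be inspected (e.g. by looking at the inputs that most strongly activate each one) for scientifically meaningful structure.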

Uncertainty

Science Benchmarks

Science Agents

Reviews

Specific

Science Multi-Agent Setups

AI Science Systems

Inorganic Materials Discovery

Chemistry

Impact of AI in Science

Related Tools

Literature Search

Data Visualization

See Also