Difference between revisions of "Science Agents"

From GISAXS
Jump to: navigation, search
(AI Science Systems)
(AI/ML Methods tailored to Science)
 
(5 intermediate revisions by the same user not shown)
Line 56: Line 56:
  
 
==AI/ML Methods tailored to Science==
 
==AI/ML Methods tailored to Science==
 +
===Science Foundation Models===
 +
* 2025-08: [https://arxiv.org/abs/2508.15763 Intern-S1: A Scientific Multimodal Foundation Model]
 +
 
===Regression (Data Fitting)===
 
===Regression (Data Fitting)===
 
* 2024-06: [https://arxiv.org/abs/2406.14546 Connecting the Dots: LLMs can Infer and Verbalize Latent Structure from Disparate Training Data]: training on (x,y) pairs enables inferring underlying function (define it in code, invert it, compose it)
 
* 2024-06: [https://arxiv.org/abs/2406.14546 Connecting the Dots: LLMs can Infer and Verbalize Latent Structure from Disparate Training Data]: training on (x,y) pairs enables inferring underlying function (define it in code, invert it, compose it)
Line 99: Line 102:
 
* 2025-02: [https://www.nature.com/articles/s42256-025-00994-z Large language models for scientific discovery in molecular property prediction]
 
* 2025-02: [https://www.nature.com/articles/s42256-025-00994-z Large language models for scientific discovery in molecular property prediction]
 
* [https://x.com/vant_ai/status/1903070297991110657 2025-03]: [https://www.vant.ai/ Vant AI] [https://www.vant.ai/neo-1 Neo-1]: atomistic foundation model (small molecules, proteins, etc.)
 
* [https://x.com/vant_ai/status/1903070297991110657 2025-03]: [https://www.vant.ai/ Vant AI] [https://www.vant.ai/neo-1 Neo-1]: atomistic foundation model (small molecules, proteins, etc.)
 +
* 2025-04: [https://arxiv.org/abs/2504.08051 Compositional Flows for 3D Molecule and Synthesis Pathway Co-design]
 
* 2025-07: [https://arxiv.org/abs/2507.07456 General purpose models for the chemical sciences]
 
* 2025-07: [https://arxiv.org/abs/2507.07456 General purpose models for the chemical sciences]
  
Line 115: Line 119:
 
* [https://x.com/vant_ai/status/1903070297991110657 2025-03]: [https://www.vant.ai/ Vant AI] [https://www.vant.ai/neo-1 Neo-1]: atomistic foundation model (small molecules, proteins, etc.)
 
* [https://x.com/vant_ai/status/1903070297991110657 2025-03]: [https://www.vant.ai/ Vant AI] [https://www.vant.ai/neo-1 Neo-1]: atomistic foundation model (small molecules, proteins, etc.)
 
* 2025-03: [https://arxiv.org/abs/2503.16351 Lyra: An Efficient and Expressive Subquadratic Architecture for Modeling Biological Sequences]
 
* 2025-03: [https://arxiv.org/abs/2503.16351 Lyra: An Efficient and Expressive Subquadratic Architecture for Modeling Biological Sequences]
 +
* 2025-08: RosettaFold 3: [https://www.biorxiv.org/content/10.1101/2025.08.14.670328v2 Accelerating Biomolecular Modeling with AtomWorks and RF3]
  
 
===Medicine===
 
===Medicine===
Line 157: Line 162:
 
* 2025-01: [https://pubs.rsc.org/en/content/articlehtml/2024/sc/d4sc03921a A review of large language models and autonomous agents in chemistry] ([https://github.com/ur-whitelab/LLMs-in-science github])
 
* 2025-01: [https://pubs.rsc.org/en/content/articlehtml/2024/sc/d4sc03921a A review of large language models and autonomous agents in chemistry] ([https://github.com/ur-whitelab/LLMs-in-science github])
 
* 2025-07: [https://arxiv.org/abs/2507.01903 AI4Research: A Survey of Artificial Intelligence for Scientific Research]
 
* 2025-07: [https://arxiv.org/abs/2507.01903 AI4Research: A Survey of Artificial Intelligence for Scientific Research]
 +
* 2025-08: [https://arxiv.org/abs/2508.14111 From AI for Science to Agentic Science: A Survey on Autonomous Scientific Discovery]
  
 
==Specific==
 
==Specific==
Line 198: Line 204:
 
===Chemistry===
 
===Chemistry===
 
* 2023-12: [https://doi.org/10.1038/s41586-023-06792-0 Autonomous chemical research with large language models] (Coscientist)
 
* 2023-12: [https://doi.org/10.1038/s41586-023-06792-0 Autonomous chemical research with large language models] (Coscientist)
 +
* 2024-09: [https://www.pnnl.gov/main/publications/external/technical_reports/PNNL-36692.pdf PNNL ChemAIst V0.2]
 
* 2024-11: [https://www.nature.com/articles/s41467-024-54457-x An automatic end-to-end chemical synthesis development platform powered by large language models]
 
* 2024-11: [https://www.nature.com/articles/s41467-024-54457-x An automatic end-to-end chemical synthesis development platform powered by large language models]
 
* 2025-06: [https://paper.ether0.ai/ Training a Scientific Reasoning Model for Chemistry]
 
* 2025-06: [https://paper.ether0.ai/ Training a Scientific Reasoning Model for Chemistry]
Line 233: Line 240:
  
 
=Science Datasets=
 
=Science Datasets=
 +
* [https://datasetsearch.research.google.com/ Google Dataset Search]
 
* [https://github.com/blaiszik/awesome-matchem-datasets/ Awesome Materials & Chemistry Datasets]
 
* [https://github.com/blaiszik/awesome-matchem-datasets/ Awesome Materials & Chemistry Datasets]
 
* NIST [https://jarvis.nist.gov/ Jarvis] (simulations)
 
* NIST [https://jarvis.nist.gov/ Jarvis] (simulations)

Latest revision as of 12:07, 23 August 2025

AI Use-cases for Science

Literature

LLM extract data from papers

AI finding links in literature

(Pre) Generate Articles

Explanation

Autonomous Ideation

Adapting LLMs to Science

AI/LLM Control of Scientific Instruments/Facilities

AI/ML Methods tailored to Science

Science Foundation Models

Regression (Data Fitting)

Tabular Classification/Regression

Symbolic Regression

Literature Discovery

Commercial

Bio

AI/ML Methods in Science

Imaging

Materials

Chemistry

Biology

Medicine

See: AI_Agents#Medicine

Successes

AI/ML Methods co-opted for Science

Mechanistic Interpretability

Train large model on science data. Then apply mechanistic interpretability (e.g. sparse autoencoders, SAE) to the feature/activation space.

Uncertainty

Science Benchmarks

Science Agents

Reviews

Specific

Science Multi-Agent Setups

AI Science Systems

Inorganic Materials Discovery

Materials Characterization

Chemistry

Bio

LLMs Optimized for Science

Impact of AI in Science

Related Tools

Literature Search

Data Visualization

Generative

Chemistry

Science Datasets

See Also