Difference between revisions of "Science Agents"

From GISAXS
(Science Datasets)
(Genuine Discoveries)
 
(20 intermediate revisions by the same user not shown)
Line 56: Line 56:
  
 
==AI/ML Methods tailored to Science==
 +
===Science Foundation Models===
 +
* 2025-08: [https://arxiv.org/abs/2508.15763 Intern-S1: A Scientific Multimodal Foundation Model]
 +
* 2025-11: [https://pubs.aip.org/aip/jcp/article/163/18/184110/3372267/A-foundation-model-for-atomistic-materials A foundation model for atomistic materials chemistry]
 +
* 2025-11: [https://arxiv.org/abs/2511.15684 Walrus: A Cross-Domain Foundation Model for Continuum Dynamics]
 +
 
===Regression (Data Fitting)===

* 2024-06: [https://arxiv.org/abs/2406.14546 Connecting the Dots: LLMs can Infer and Verbalize Latent Structure from Disparate Training Data]: training on (x,y) pairs enables inferring underlying function (define it in code, invert it, compose it)
Line 80: Line 85:
 
* [https://www.radical-ai.com/ Radical AI]: Material simulation/design

* [https://www.autoscience.ai/ Autoscience] ([https://www.autoscience.ai/blog/meet-carl-the-first-ai-system-to-produce-academically-peer-reviewed-research Carl])
 +
* [https://periodic.com/ Periodic Labs]
 +
* [https://edisonscientific.com/articles/announcing-edison-scientific Edison Scientific] (drug discovery, spinoff from [https://www.futurehouse.org/ FutureHouse])
 +
 
====Bio====

* [https://www.bioptimus.com/ Bioptimus]
Line 101: Line 109:
 
* 2025-04: [https://arxiv.org/abs/2504.08051 Compositional Flows for 3D Molecule and Synthesis Pathway Co-design]

* 2025-07: [https://arxiv.org/abs/2507.07456 General purpose models for the chemical sciences]
 +
* 2025-11: [https://chemrxiv.org/engage/chemrxiv/article-details/690357d9a482cba122e366b6 ChemTorch: A Deep Learning Framework for Benchmarking and Developing Chemical Reaction Property Prediction Models]
  
 
===Biology===
Line 117: Line 126:
 
* 2025-03: [https://arxiv.org/abs/2503.16351 Lyra: An Efficient and Expressive Subquadratic Architecture for Modeling Biological Sequences]

* 2025-08: RosettaFold 3: [https://www.biorxiv.org/content/10.1101/2025.08.14.670328v2 Accelerating Biomolecular Modeling with AtomWorks and RF3]
 +
* 2025-09: [https://www.biorxiv.org/content/10.1101/2025.09.12.675911v1 Generative design of novel bacteriophages with genome language models]
 +
* 2025-10: [https://www.science.org/doi/10.1126/science.adu8578 Strengthening nucleic acid biosecurity screening against generative protein design tools]
  
 
===Medicine===
Line 159: Line 170:
 
* 2025-01: [https://pubs.rsc.org/en/content/articlehtml/2024/sc/d4sc03921a A review of large language models and autonomous agents in chemistry] ([https://github.com/ur-whitelab/LLMs-in-science github])

* 2025-07: [https://arxiv.org/abs/2507.01903 AI4Research: A Survey of Artificial Intelligence for Scientific Research]
 +
* 2025-08: [https://arxiv.org/abs/2508.14111 From AI for Science to Agentic Science: A Survey on Autonomous Scientific Discovery]
  
 
==Specific==
Line 173: Line 185:
 
* 2025-04-08: Sakana: [https://pub.sakana.ai/ai-scientist-v2/paper/paper.pdf The AI Scientist-v2: Workshop-Level Automated Scientific Discovery via Agentic Tree Search] ([https://github.com/SakanaAI/AI-Scientist-v2 code])

* 2025-07: [https://arxiv.org/abs/2507.14267 DREAMS: Density Functional Theory Based Research Engine for Agentic Materials Simulation]
 +
* 2025-11: [https://arxiv.org/abs/2511.02824 Kosmos: An AI Scientist for Autonomous Discovery]
 +
* 2025-11: [https://arxiv.org/abs/2511.08151 SciAgent: A Unified Multi-Agent System for Generalistic Scientific Reasoning]
  
 
==Science Multi-Agent Setups==
Line 200: Line 214:
 
===Chemistry===

* 2023-12: [https://doi.org/10.1038/s41586-023-06792-0 Autonomous chemical research with large language models] (Coscientist)
 +
* 2024-09: [https://www.pnnl.gov/main/publications/external/technical_reports/PNNL-36692.pdf PNNL ChemAIst V0.2]
 
* 2024-11: [https://www.nature.com/articles/s41467-024-54457-x An automatic end-to-end chemical synthesis development platform powered by large language models]

* 2025-06: [https://paper.ether0.ai/ Training a Scientific Reasoning Model for Chemistry]
Line 209: Line 224:
 
==LLMs Optimized for Science==

* 2022-11: [https://arxiv.org/abs/2211.09085 Galactica: A Large Language Model for Science]
* 2024-12: [https://www.nature.com/articles/s41467-024-54639-7 Crystal structure generation with autoregressive large language modeling
+
* 2024-12: [https://www.nature.com/articles/s41467-024-54639-7 Crystal structure generation with autoregressive large language modeling]
 
* 2025-02: [https://arxiv.org/abs/2502.13107 MatterChat: A Multi-Modal LLM for Material Science]

* 2025-03: [https://arxiv.org/abs/2503.17604 OmniScience: A Domain-Specialized LLM for Scientific Reasoning and Discovery]
Line 238: Line 253:
 
* [https://github.com/blaiszik/awesome-matchem-datasets/ Awesome Materials & Chemistry Datasets]

* NIST [https://jarvis.nist.gov/ Jarvis] (simulations)
 +
 +
=Genuine Discoveries=
 +
* 2025-11: [https://cdn.openai.com/pdf/4a25f921-e4e0-479a-9b38-5367b47e8fd0/early-science-acceleration-experiments-with-gpt-5.pdf Early science acceleration experiments with GPT-5]
 +
 +
* '''Math:'''
 +
** 2023-07: [https://www.nature.com/articles/s41586-023-06004-9?utm_source=chatgpt.com Faster sorting algorithms discovered using deep reinforcement learning]
 +
** 2025-11: [https://arxiv.org/abs/2511.02864 Mathematical exploration and discovery at scale]
 +
** 2025-11: [https://www.nature.com/articles/s41586-025-09833-y Olympiad-level formal mathematical reasoning with reinforcement learning]
 +
* '''Physics assistance:'''
 +
** 2025-03: [https://arxiv.org/abs/2503.23758 Exact solution of the frustrated Potts model with next-nearest-neighbor interactions in one dimension via AI bootstrapping]
 +
* '''Literature exploration:'''
 +
** 2025-11: [https://arxiv.org/abs/2511.02824 Kosmos: An AI Scientist for Autonomous Discovery]
 +
*** [https://platform.edisonscientific.com/kosmos/c4bdef64-5e9b-43b9-a365-592dd1ed7587 Nucleotide metabolism in hypothermia]
 +
*** [https://platform.edisonscientific.com/kosmos/1fdbf827-be65-4d97-9b66-bf0da600091a Determinant of perovskite solar-cell failure]
 +
*** [https://platform.edisonscientific.com/kosmos/4fb3fbdb-c449-4064-9aa6-ff4ec53131d8 Log-normal connectivity in neural networks]
 +
*** [https://platform.edisonscientific.com/kosmos/c6849232-5858-4634-adf5-83780afbe3db SOD2 as driver of myocardial fibrosis]
 +
*** [https://platform.edisonscientific.com/kosmos/abac07da-a6bb-458f-b0ba-ef08f1be617e Protective variant of SSR1 in type 2 diabetes]
 +
*** [https://platform.edisonscientific.com/kosmos/a770052b-2334-4bbe-b086-5149e0f03d99 Temporal ordering in Alzheimer’s disease]
 +
*** [https://platform.edisonscientific.com/kosmos/28c427d2-be31-48b5-b272-28d5a1e3ea5c Mechanism of neuron vulnerability in aging]
 +
* '''Bio design:'''
 +
** 2023-07: [https://www.nature.com/articles/s41586-023-06415-8 De novo design of protein structure and function with RFdiffusion]
 +
** 2025-11: [https://www.nature.com/articles/s41586-025-09721-5 Atomically accurate de novo design of antibodies with RFdiffusion]
 +
** 2025-11: [https://x.com/GoogleDeepMind/status/1993350293703016451?s=20 ]
 +
* '''Material Discovery:'''
 +
** 2023-11: [https://deepmind.google/blog/alphafold-five-years-of-impact/ AlphaFold: Five years of impact]
  
 
=See Also=

* [[AI agents]]

* [https://nanobot.chat/ Nanobot.chat]: Intelligent AI for the labnetwork @ mtl.mit.edu forum

Latest revision as of 10:03, 26 November 2025

AI Use-cases for Science

Literature

LLM extract data from papers

AI finding links in literature

(Pre) Generate Articles

Explanation

Autonomous Ideation

Adapting LLMs to Science

AI/LLM Control of Scientific Instruments/Facilities

AI/ML Methods tailored to Science

Science Foundation Models

Regression (Data Fitting)

Tabular Classification/Regression

Symbolic Regression

Literature Discovery

Commercial

Bio

AI/ML Methods in Science

Imaging

Materials

Chemistry

Biology

Medicine

See: AI_Agents#Medicine

Successes

AI/ML Methods co-opted for Science

Mechanistic Interpretability

Train a large model on science data, then apply mechanistic interpretability (e.g. sparse autoencoders, SAEs) to its feature/activation space.
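The sparse-autoencoder step can be illustrated with a minimal pure-Python sketch. It is not taken from any of the works listed on this page: the function names (`sae_forward`, `sae_loss`), the `l1_coeff` parameter, and the toy weights are all hypothetical, and a real SAE would be trained on activations collected from the science model rather than hand-set.

```python
# Hypothetical sketch of a sparse autoencoder (SAE) forward pass and loss.
# An SAE maps a model activation vector x to an overcomplete, mostly-zero
# feature vector f, then reconstructs x from f.

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def sae_forward(x, W_enc, b_enc, W_dec, b_dec):
    """Return (features, reconstruction) for one activation vector x."""
    pre = [h + b for h, b in zip(matvec(W_enc, x), b_enc)]
    f = [max(0.0, p) for p in pre]  # ReLU keeps features non-negative
    x_hat = [r + b for r, b in zip(matvec(W_dec, f), b_dec)]
    return f, x_hat

def sae_loss(x, f, x_hat, l1_coeff=1e-3):
    """Reconstruction error plus an L1 penalty that encourages sparsity."""
    mse = sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)
    l1 = sum(abs(v) for v in f)
    return mse + l1_coeff * l1

# Toy example (assumed values): 2-dimensional activation, 4 features.
x = [1.0, -2.0]
W_enc = [[1, 0], [0, 1], [-1, 0], [0, -1]]
b_enc = [0.0, 0.0, 0.0, 0.0]
W_dec = [[1, 0, -1, 0], [0, 1, 0, -1]]
b_dec = [0.0, 0.0]

f, x_hat = sae_forward(x, W_enc, b_enc, W_dec, b_dec)
print(f, x_hat, sae_loss(x, f, x_hat))
```

In this toy setup only two of the four features are active for the input, which is the kind of sparse, inspectable representation that interpretability analysis then examines.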

Uncertainty

Science Benchmarks

Science Agents

Reviews

Specific

Science Multi-Agent Setups

AI Science Systems

Inorganic Materials Discovery

Materials Characterization

Chemistry

Bio

LLMs Optimized for Science

Impact of AI in Science

Related Tools

Literature Search

Data Visualization

Generative

Chemistry

Science Datasets

Genuine Discoveries

See Also