Difference between revisions of "Science Agents"
| KevinYager (talk | contribs)  (→Science Benchmarks) | KevinYager (talk | contribs)   (→LLMs Optimized for Science) | ||
| (44 intermediate revisions by the same user not shown) | |||
| Line 19: | Line 19: | ||
| * 2024-02: Wikipedia style: [https://arxiv.org/abs/2402.14207 Assisting in Writing Wikipedia-like Articles From Scratch with Large Language Models] | * 2024-02: Wikipedia style: [https://arxiv.org/abs/2402.14207 Assisting in Writing Wikipedia-like Articles From Scratch with Large Language Models] | ||
| * 2024-08: [https://arxiv.org/abs/2408.07055 LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs] ([https://github.com/THUDM/LongWriter code]) | * 2024-08: [https://arxiv.org/abs/2408.07055 LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs] ([https://github.com/THUDM/LongWriter code]) | ||
| − | * 2024-08: Scientific papers: [The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery] | + | * 2024-08: Scientific papers: [https://arxiv.org/abs/2408.06292 The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery] | 
| * 2024-09: PaperQA2: [https://paper.wikicrow.ai/ Language Models Achieve Superhuman Synthesis of Scientific Knowledge] ([https://x.com/SGRodriques/status/1833908643856818443 𝕏 post], [https://github.com/Future-House/paper-qa code]) | * 2024-09: PaperQA2: [https://paper.wikicrow.ai/ Language Models Achieve Superhuman Synthesis of Scientific Knowledge] ([https://x.com/SGRodriques/status/1833908643856818443 𝕏 post], [https://github.com/Future-House/paper-qa code]) | ||
| * 2025-03: [https://arxiv.org/abs/2503.18866 Reasoning to Learn from Latent Thoughts] | * 2025-03: [https://arxiv.org/abs/2503.18866 Reasoning to Learn from Latent Thoughts] | ||
| * 2025-03: [https://arxiv.org/abs/2503.19065 WikiAutoGen: Towards Multi-Modal Wikipedia-Style Article Generation] | * 2025-03: [https://arxiv.org/abs/2503.19065 WikiAutoGen: Towards Multi-Modal Wikipedia-Style Article Generation] | ||
| + | * 2025-04: [https://arxiv.org/abs/2504.13171 Sleep-time Compute: Beyond Inference Scaling at Test-time] | ||
| ==Explanation== | ==Explanation== | ||
| − | * [https://tiger-ai-lab.github.io/TheoremExplainAgent/ TheoremExplainAgent: Towards Multimodal Explanations for LLM Theorem Understanding] ([https://arxiv.org/abs/2502.19400 preprint]) | + | * 2025-02: [https://tiger-ai-lab.github.io/TheoremExplainAgent/ TheoremExplainAgent: Towards Multimodal Explanations for LLM Theorem Understanding] ([https://arxiv.org/abs/2502.19400 preprint]) | 
| + | * 2025-04: [https://arxiv.org/abs/2504.02822 Do Two AI Scientists Agree?] | ||
| ==Autonomous Ideation== | ==Autonomous Ideation== | ||
| + | * 2024-04: [https://arxiv.org/abs/2404.07738 ResearchAgent: Iterative Research Idea Generation over Scientific Literature with Large Language Models] | ||
| * 2024-09: [https://arxiv.org/abs/2409.14202 Mining Causality: AI-Assisted Search for Instrumental Variables] | * 2024-09: [https://arxiv.org/abs/2409.14202 Mining Causality: AI-Assisted Search for Instrumental Variables] | ||
| * 2024-12: [https://arxiv.org/abs/2412.07977 Thinking Fast and Laterally: Multi-Agentic Approach for Reasoning about Uncertain Emerging Events] | * 2024-12: [https://arxiv.org/abs/2412.07977 Thinking Fast and Laterally: Multi-Agentic Approach for Reasoning about Uncertain Emerging Events] | ||
| * 2024-12: [https://arxiv.org/abs/2412.14141 LLMs can realize combinatorial creativity: generating creative ideas via LLMs for scientific research] | * 2024-12: [https://arxiv.org/abs/2412.14141 LLMs can realize combinatorial creativity: generating creative ideas via LLMs for scientific research] | ||
| * 2024-12: [https://arxiv.org/abs/2412.17596 LiveIdeaBench: Evaluating LLMs' Scientific Creativity and Idea Generation with Minimal Context] | * 2024-12: [https://arxiv.org/abs/2412.17596 LiveIdeaBench: Evaluating LLMs' Scientific Creativity and Idea Generation with Minimal Context] | ||
| + | * 2025-01: [https://arxiv.org/abs/2501.13299 Hypothesis Generation for Materials Discovery and Design Using Goal-Driven and Constraint-Guided LLM Agents] | ||
| * 2025-02: [https://arxiv.org/abs/2502.13025 Agentic Deep Graph Reasoning Yields Self-Organizing Knowledge Networks] | * 2025-02: [https://arxiv.org/abs/2502.13025 Agentic Deep Graph Reasoning Yields Self-Organizing Knowledge Networks] | ||
| + | * 2025-06: [https://arxiv.org/abs/2506.00794 Predicting Empirical AI Research Outcomes with Language Models] | ||
| + | * 2025-06: [https://arxiv.org/abs/2506.20803 The Ideation-Execution Gap: Execution Outcomes of LLM-Generated versus Human Research Ideas] | ||
| ==Adapting LLMs to Science== | ==Adapting LLMs to Science== | ||
| Line 47: | Line 53: | ||
| * 2024-12: [https://arxiv.org/abs/2412.18161 VISION: A Modular AI Assistant for Natural Human-Instrument Interaction at Scientific User Facilities] | * 2024-12: [https://arxiv.org/abs/2412.18161 VISION: A Modular AI Assistant for Natural Human-Instrument Interaction at Scientific User Facilities] | ||
| * 2025-01: [https://www.science.org/doi/10.1126/sciadv.adr4173 Large language models for human-machine collaborative particle accelerator tuning through natural language] | * 2025-01: [https://www.science.org/doi/10.1126/sciadv.adr4173 Large language models for human-machine collaborative particle accelerator tuning through natural language] | ||
| + | * 2025-04: [https://openreview.net/forum?id=iA9UN1dEgJ Operating Robotic Laboratories with Large Language Models and Teachable Agents] | ||
| ==AI/ML Methods tailored to Science== | ==AI/ML Methods tailored to Science== | ||
| + | ===Science Foundation Models=== | ||
| + | * 2025-08: [https://arxiv.org/abs/2508.15763 Intern-S1: A Scientific Multimodal Foundation Model] | ||
| + | |||
| ===Regression (Data Fitting)=== | ===Regression (Data Fitting)=== | ||
| * 2024-06: [https://arxiv.org/abs/2406.14546 Connecting the Dots: LLMs can Infer and Verbalize Latent Structure from Disparate Training Data]: training on (x,y) pairs enables inferring underlying function (define it in code, invert it, compose it) | * 2024-06: [https://arxiv.org/abs/2406.14546 Connecting the Dots: LLMs can Infer and Verbalize Latent Structure from Disparate Training Data]: training on (x,y) pairs enables inferring underlying function (define it in code, invert it, compose it) | ||
| Line 73: | Line 83: | ||
| * [https://www.radical-ai.com/ Radical AI]: Material simulation/design | * [https://www.radical-ai.com/ Radical AI]: Material simulation/design | ||
| * [https://www.autoscience.ai/ Autoscience] ([https://www.autoscience.ai/blog/meet-carl-the-first-ai-system-to-produce-academically-peer-reviewed-research Carl]) | * [https://www.autoscience.ai/ Autoscience] ([https://www.autoscience.ai/blog/meet-carl-the-first-ai-system-to-produce-academically-peer-reviewed-research Carl]) | ||
| + | * [https://periodic.com/ Periodic Labs] | ||
| + | |||
| + | ====Bio==== | ||
| + | * [https://www.bioptimus.com/ Bioptimus] | ||
| + | * [https://www.evolutionaryscale.ai/ EvolutionaryScale] | ||
| ==AI/ML Methods in Science== | ==AI/ML Methods in Science== | ||
| + | * 2025-07: [https://www.mdpi.com/2313-433X/11/8/252 Synthetic Scientific Image Generation with VAE, GAN, and Diffusion Model Architectures] | ||
| + | |||
| + | ===Imaging=== | ||
| + | * 2025-05: [https://arxiv.org/abs/2505.08176 Behind the Noise: Conformal Quantile Regression Reveals Emergent Representations] (blog: [https://phzwart.github.io/behindthenoise/ Behind the Noise]) | ||
| + | |||
| + | ===Materials=== | ||
| + | * 2024-12: [https://www.nature.com/articles/s41467-024-54639-7 Crystal structure generation with autoregressive large language modeling] | ||
| + | * 2025-03: [https://arxiv.org/abs/2503.03965 All-atom Diffusion Transformers: Unified generative modelling of molecules and materials] | ||
| + | |||
| ===Chemistry=== | ===Chemistry=== | ||
| * 2025-01: [https://www.nature.com/articles/s41578-025-00772-8 Large language models for reticular chemistry] | * 2025-01: [https://www.nature.com/articles/s41578-025-00772-8 Large language models for reticular chemistry] | ||
| Line 80: | Line 104: | ||
| * 2025-02: [https://www.nature.com/articles/s42256-025-00994-z Large language models for scientific discovery in molecular property prediction] | * 2025-02: [https://www.nature.com/articles/s42256-025-00994-z Large language models for scientific discovery in molecular property prediction] | ||
| * [https://x.com/vant_ai/status/1903070297991110657 2025-03]: [https://www.vant.ai/ Vant AI] [https://www.vant.ai/neo-1 Neo-1]: atomistic foundation model (small molecules, proteins, etc.) | * [https://x.com/vant_ai/status/1903070297991110657 2025-03]: [https://www.vant.ai/ Vant AI] [https://www.vant.ai/neo-1 Neo-1]: atomistic foundation model (small molecules, proteins, etc.) | ||
| + | * 2025-04: [https://arxiv.org/abs/2504.08051 Compositional Flows for 3D Molecule and Synthesis Pathway Co-design] | ||
| + | * 2025-07: [https://arxiv.org/abs/2507.07456 General purpose models for the chemical sciences] | ||
| ===Biology=== | ===Biology=== | ||
| Line 95: | Line 121: | ||
| * [https://x.com/vant_ai/status/1903070297991110657 2025-03]: [https://www.vant.ai/ Vant AI] [https://www.vant.ai/neo-1 Neo-1]: atomistic foundation model (small molecules, proteins, etc.) | * [https://x.com/vant_ai/status/1903070297991110657 2025-03]: [https://www.vant.ai/ Vant AI] [https://www.vant.ai/neo-1 Neo-1]: atomistic foundation model (small molecules, proteins, etc.) | ||
| * 2025-03: [https://arxiv.org/abs/2503.16351 Lyra: An Efficient and Expressive Subquadratic Architecture for Modeling Biological Sequences] | * 2025-03: [https://arxiv.org/abs/2503.16351 Lyra: An Efficient and Expressive Subquadratic Architecture for Modeling Biological Sequences] | ||
| + | * 2025-08: RosettaFold 3: [https://www.biorxiv.org/content/10.1101/2025.08.14.670328v2 Accelerating Biomolecular Modeling with AtomWorks and RF3] | ||
| + | * 2025-09: [https://www.biorxiv.org/content/10.1101/2025.09.12.675911v1 Generative design of novel bacteriophages with genome language models] | ||
| + | * 2025-10: [https://www.science.org/doi/10.1126/science.adu8578 Strengthening nucleic acid biosecurity screening against generative protein design tools] | ||
| + | |||
| + | ===Medicine=== | ||
| + | See: [[AI_Agents#Medicine]] | ||
| ===Successes=== | ===Successes=== | ||
| Line 133: | Line 165: | ||
| * 2024-10: [https://www.cell.com/cell/fulltext/S0092-8674(24)01070-5?target=_blank Empowering biomedical discovery with AI agents] | * 2024-10: [https://www.cell.com/cell/fulltext/S0092-8674(24)01070-5?target=_blank Empowering biomedical discovery with AI agents] | ||
| * 2025-01: [https://pubs.rsc.org/en/content/articlehtml/2024/sc/d4sc03921a A review of large language models and autonomous agents in chemistry] ([https://github.com/ur-whitelab/LLMs-in-science github]) | * 2025-01: [https://pubs.rsc.org/en/content/articlehtml/2024/sc/d4sc03921a A review of large language models and autonomous agents in chemistry] ([https://github.com/ur-whitelab/LLMs-in-science github]) | ||
| + | * 2025-07: [https://arxiv.org/abs/2507.01903 AI4Research: A Survey of Artificial Intelligence for Scientific Research] | ||
| + | * 2025-08: [https://arxiv.org/abs/2508.14111 From AI for Science to Agentic Science: A Survey on Autonomous Scientific Discovery] | ||
| ==Specific== | ==Specific== | ||
| Line 145: | Line 179: | ||
| * 2024-12-30: [https://arxiv.org/abs/2412.21154 Aviary: training language agents on challenging scientific tasks] | * 2024-12-30: [https://arxiv.org/abs/2412.21154 Aviary: training language agents on challenging scientific tasks] | ||
| * See also: [[AI_Agents#Deep_Research|AI Agents > Deep Research]] | * See also: [[AI_Agents#Deep_Research|AI Agents > Deep Research]] | ||
| + | * 2025-04-08: Sakana: [https://pub.sakana.ai/ai-scientist-v2/paper/paper.pdf The AI Scientist-v2: Workshop-Level Automated Scientific Discovery via Agentic Tree Search] ([https://github.com/SakanaAI/AI-Scientist-v2 code]) | ||
| + | * 2025-07: [https://arxiv.org/abs/2507.14267 DREAMS: Density Functional Theory Based Research Engine for Agentic Materials Simulation] | ||
| ==Science Multi-Agent Setups== | ==Science Multi-Agent Setups== | ||
| * 2025-01: [https://arxiv.org/abs/2501.04227 Agent Laboratory: Using LLM Agents as Research Assistants] | * 2025-01: [https://arxiv.org/abs/2501.04227 Agent Laboratory: Using LLM Agents as Research Assistants] | ||
| + | * 2025-04: [https://www.nature.com/articles/s41551-025-01363-2 Coordinated AI agents for advancing healthcare] ([https://www.nature.com/articles/s41551-025-01363-2.epdf?sharing_token=CIYP3J8LZE4BX31fV3WxUdRgN0jAjWel9jnR3ZoTv0O9iD-yhgqzRaz_7VASayWRePPhWDD2xFyfuOpSXbdPaOtt7oH4nfXo7telALzNwY3V1p9SxoqBEJy2OuaJ_cA35-CYQC1XgjCNTZUw46dh1KX-Dj8e7-1Vk_RlZKFLrc8%3D pdf]) | ||
| =AI Science Systems= | =AI Science Systems= | ||
| * 2025-01: [https://arxiv.org/abs/2501.03916 Dolphin: Closed-loop Open-ended Auto-research through Thinking, Practice, and Feedback] | * 2025-01: [https://arxiv.org/abs/2501.03916 Dolphin: Closed-loop Open-ended Auto-research through Thinking, Practice, and Feedback] | ||
| + | * 2025-01: [https://arxiv.org/abs/2501.13299 Hypothesis Generation for Materials Discovery and Design Using Goal-Driven and Constraint-Guided LLM Agents] | ||
| * 2025-02: [https://storage.googleapis.com/coscientist_paper/ai_coscientist.pdf Towards an AI co-scientist] (Google blog post: [https://research.google/blog/accelerating-scientific-breakthroughs-with-an-ai-co-scientist/ Accelerating scientific breakthroughs with an AI co-scientist]) | * 2025-02: [https://storage.googleapis.com/coscientist_paper/ai_coscientist.pdf Towards an AI co-scientist] (Google blog post: [https://research.google/blog/accelerating-scientific-breakthroughs-with-an-ai-co-scientist/ Accelerating scientific breakthroughs with an AI co-scientist]) | ||
| + | * 2025-06: [https://zenodo.org/records/15693353 The Discovery Engine] | ||
| + | ** 2025-07: [https://arxiv.org/abs/2507.00964 Benchmarking the Discovery Engine] ([https://www.leap-labs.com/blog/how-we-replicated-five-peer-reviewed-papers-in-five-hours blog]) | ||
| + | * 2025-07: [https://www.preprints.org/manuscript/202507.1951/v1 Autonomous Scientific Discovery Through Hierarchical AI Scientist Systems] | ||
| ===Inorganic Materials Discovery=== | ===Inorganic Materials Discovery=== | ||
| * 2023-11: [https://doi.org/10.1038/s41586-023-06735-9 Scaling deep learning for materials discovery] | * 2023-11: [https://doi.org/10.1038/s41586-023-06735-9 Scaling deep learning for materials discovery] | ||
| * 2023-11: [https://doi.org/10.1038/s41586-023-06734-w An autonomous laboratory for the accelerated synthesis of novel materials] | * 2023-11: [https://doi.org/10.1038/s41586-023-06734-w An autonomous laboratory for the accelerated synthesis of novel materials] | ||
| + | * 2024-09: [https://arxiv.org/abs/2409.00135 HoneyComb: A Flexible LLM-Based Agent System for Materials Science] | ||
| * 2024-10: [https://arxiv.org/abs/2410.12771 Open Materials 2024 (OMat24) Inorganic Materials Dataset and Models] ([https://github.com/FAIR-Chem/fairchem code], [https://huggingface.co/datasets/fairchem/OMAT24 datasets], [https://huggingface.co/fairchem/OMAT24 checkpoints], [https://ai.meta.com/blog/fair-news-segment-anything-2-1-meta-spirit-lm-layer-skip-salsa-sona/ blogpost]) | * 2024-10: [https://arxiv.org/abs/2410.12771 Open Materials 2024 (OMat24) Inorganic Materials Dataset and Models] ([https://github.com/FAIR-Chem/fairchem code], [https://huggingface.co/datasets/fairchem/OMAT24 datasets], [https://huggingface.co/fairchem/OMAT24 checkpoints], [https://ai.meta.com/blog/fair-news-segment-anything-2-1-meta-spirit-lm-layer-skip-salsa-sona/ blogpost]) | ||
| * 2025-01: [https://www.nature.com/articles/s41586-025-08628-5 A generative model for inorganic materials design] | * 2025-01: [https://www.nature.com/articles/s41586-025-08628-5 A generative model for inorganic materials design] | ||
| + | * 2025-04: [https://arxiv.org/abs/2504.14110 System of Agentic AI for the Discovery of Metal-Organic Frameworks] | ||
| + | * 2025-05: [https://arxiv.org/abs/2505.08762 The Open Molecules 2025 (OMol25) Dataset, Evaluations, and Models] | ||
| + | |||
| + | ===Materials Characterization=== | ||
| + | * 2025-08: [https://arxiv.org/abs/2508.06569 Operationalizing Serendipity: Multi-Agent AI Workflows for Enhanced Materials Characterization with Theory-in-the-Loop] | ||
| ===Chemistry=== | ===Chemistry=== | ||
| * 2023-12: [https://doi.org/10.1038/s41586-023-06792-0 Autonomous chemical research with large language models] (Coscientist) | * 2023-12: [https://doi.org/10.1038/s41586-023-06792-0 Autonomous chemical research with large language models] (Coscientist) | ||
| + | * 2024-09: [https://www.pnnl.gov/main/publications/external/technical_reports/PNNL-36692.pdf PNNL ChemAIst V0.2] | ||
| * 2024-11: [https://www.nature.com/articles/s41467-024-54457-x An automatic end-to-end chemical synthesis development platform powered by large language models] | * 2024-11: [https://www.nature.com/articles/s41467-024-54457-x An automatic end-to-end chemical synthesis development platform powered by large language models] | ||
| + | * 2025-06: [https://paper.ether0.ai/ Training a Scientific Reasoning Model for Chemistry] | ||
| + | * 2025-06: [https://arxiv.org/abs/2506.06363 ChemGraph: An Agentic Framework for Computational Chemistry Workflows] ([https://github.com/argonne-lcf/ChemGraph code]) | ||
| + | |||
| + | ===Bio=== | ||
| + | * 2025-07: [https://arxiv.org/abs/2507.01485 BioMARS: A Multi-Agent Robotic System for Autonomous Biological Experiments] | ||
| ==LLMs Optimized for Science== | ==LLMs Optimized for Science== | ||
| * 2022-11: [https://arxiv.org/abs/2211.09085 Galactica: A Large Language Model for Science] | * 2022-11: [https://arxiv.org/abs/2211.09085 Galactica: A Large Language Model for Science] | ||
| + | * 2024-12: [https://www.nature.com/articles/s41467-024-54639-7 Crystal structure generation with autoregressive large language modeling] | ||
| + | * 2025-02: [https://arxiv.org/abs/2502.13107 MatterChat: A Multi-Modal LLM for Material Science] | ||
| * 2025-03: [https://arxiv.org/abs/2503.17604 OmniScience: A Domain-Specialized LLM for Scientific Reasoning and Discovery] | * 2025-03: [https://arxiv.org/abs/2503.17604 OmniScience: A Domain-Specialized LLM for Scientific Reasoning and Discovery] | ||
| * 2025-03: Google [https://huggingface.co/collections/google/txgemma-release-67dd92e931c857d15e4d1e87 TxGemma] (2B, 9B, 27B): [https://developers.googleblog.com/en/introducing-txgemma-open-models-improving-therapeutics-development/ drug development] | * 2025-03: Google [https://huggingface.co/collections/google/txgemma-release-67dd92e931c857d15e4d1e87 TxGemma] (2B, 9B, 27B): [https://developers.googleblog.com/en/introducing-txgemma-open-models-improving-therapeutics-development/ drug development] | ||
| =Impact of AI in Science= | =Impact of AI in Science= | ||
| − | * 2024-11: [https://aidantr.github.io/files/AI_innovation.pdf Artificial Intelligence, Scientific Discovery, and Product Innovation] | + | * 2024-11: <strike>[https://aidantr.github.io/files/AI_innovation.pdf Artificial Intelligence, Scientific Discovery, and Product Innovation]</strike> | 
| + | ** 2025-05: Retraction: [https://economics.mit.edu/news/assuring-accurate-research-record Assuring an accurate research record] | ||
| * 2025-02: [https://arxiv.org/abs/2502.05151 Transforming Science with Large Language Models: A Survey on AI-assisted Scientific Discovery, Experimentation, Content Generation, and Evaluation] | * 2025-02: [https://arxiv.org/abs/2502.05151 Transforming Science with Large Language Models: A Survey on AI-assisted Scientific Discovery, Experimentation, Content Generation, and Evaluation] | ||
| Line 188: | Line 244: | ||
| =Science Datasets= | =Science Datasets= | ||
| + | * [https://datasetsearch.research.google.com/ Google Dataset Search] | ||
| * [https://github.com/blaiszik/awesome-matchem-datasets/ Awesome Materials & Chemistry Datasets] | * [https://github.com/blaiszik/awesome-matchem-datasets/ Awesome Materials & Chemistry Datasets] | ||
| + | * NIST [https://jarvis.nist.gov/ Jarvis] (simulations) | ||
| =See Also= | =See Also= | ||
| * [[AI agents]] | * [[AI agents]] | ||
| * [https://nanobot.chat/ Nanobot.chat]: Intelligent AI for the labnetwork @ mtl.mit.edu forum | * [https://nanobot.chat/ Nanobot.chat]: Intelligent AI for the labnetwork @ mtl.mit.edu forum | ||
Latest revision as of 08:22, 14 October 2025
Contents
- 1 AI Use-cases for Science
- 2 Science Benchmarks
- 3 Science Agents
- 4 AI Science Systems
- 5 Impact of AI in Science
- 6 Related Tools
- 7 Science Datasets
- 8 See Also
AI Use-cases for Science
Literature
- alphaXiv | Explore: Understand arXiv papers
LLM extract data from papers
AI finding links in literature
- 2019-07: Unsupervised word embeddings capture latent knowledge from materials science literature
- 2024-11: Large language models surpass human experts in predicting neuroscience results
(Pre) Generate Articles
- 2022-12: Re3: Generating Longer Stories With Recursive Reprompting and Revision
- 2023-03: English essays: Artificial intelligence (AI) technology in OpenAI ChatGPT application: A review of ChatGPT in writing English essay
- 2023-01: Journalism: Collaborating With ChatGPT: Considering the Implications of Generative Artificial Intelligence for Journalism and Media Education
- 2023-07: Science writing: Artificial intelligence in scientific writing: a friend or a foe?
- 2024-02: Wikipedia style: Assisting in Writing Wikipedia-like Articles From Scratch with Large Language Models
- 2024-08: LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs (code)
- 2024-08: Scientific papers: The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery
- 2024-09: PaperQA2: Language Models Achieve Superhuman Synthesis of Scientific Knowledge (𝕏 post, code)
- 2025-03: Reasoning to Learn from Latent Thoughts
- 2025-03: WikiAutoGen: Towards Multi-Modal Wikipedia-Style Article Generation
- 2025-04: Sleep-time Compute: Beyond Inference Scaling at Test-time
Explanation
- 2025-02: TheoremExplainAgent: Towards Multimodal Explanations for LLM Theorem Understanding (preprint)
- 2025-04: Do Two AI Scientists Agree?
Autonomous Ideation
- 2024-04: ResearchAgent: Iterative Research Idea Generation over Scientific Literature with Large Language Models
- 2024-09: Mining Causality: AI-Assisted Search for Instrumental Variables
- 2024-12: Thinking Fast and Laterally: Multi-Agentic Approach for Reasoning about Uncertain Emerging Events
- 2024-12: LLMs can realize combinatorial creativity: generating creative ideas via LLMs for scientific research
- 2024-12: LiveIdeaBench: Evaluating LLMs' Scientific Creativity and Idea Generation with Minimal Context
- 2025-01: Hypothesis Generation for Materials Discovery and Design Using Goal-Driven and Constraint-Guided LLM Agents
- 2025-02: Agentic Deep Graph Reasoning Yields Self-Organizing Knowledge Networks
- 2025-06: Predicting Empirical AI Research Outcomes with Language Models
- 2025-06: The Ideation-Execution Gap: Execution Outcomes of LLM-Generated versus Human Research Ideas
Adapting LLMs to Science
- 2023-06: Domain-specific chatbots for science using embeddings
- 2024-10: Personalization of Large Language Models: A Survey
- 2024-11: Adapting While Learning: Grounding LLMs for Scientific Problems with Intelligent Tool Usage Adaptation
AI/LLM Control of Scientific Instruments/Facilities
- 2023-12: Opportunities for retrieval and tool augmented large language models in scientific facilities
- 2023-12: Virtual Scientific Companion for Synchrotron Beamlines: A Prototype
- 2023-12: Autonomous chemical research with large language models
- 2024-01: Synergizing Human Expertise and AI Efficiency with Language Model for Microscopy Operation and Automated Experiment Design
- 2024-06: From Text to Test: AI-Generated Control Software for Materials Science Instruments
- 2024-12: VISION: A Modular AI Assistant for Natural Human-Instrument Interaction at Scientific User Facilities
- 2025-01: Large language models for human-machine collaborative particle accelerator tuning through natural language
- 2025-04: Operating Robotic Laboratories with Large Language Models and Teachable Agents
AI/ML Methods tailored to Science
Science Foundation Models
- 2025-08: Intern-S1: A Scientific Multimodal Foundation Model
Regression (Data Fitting)
- 2024-06: Connecting the Dots: LLMs can Infer and Verbalize Latent Structure from Disparate Training Data: training on (x,y) pairs enables inferring underlying function (define it in code, invert it, compose it)
- 2024-12: OmniPred: Language Models as Universal Regressors
Tabular Classification/Regression
Symbolic Regression
Literature Discovery
- FutureHouse
- Lumina
- Automated-AI-Web-Researcher-Ollama
- 2025-01: Search-o1: Agentic Search-Enhanced Large Reasoning Models (project, code)
Commercial
- Sakana AI
- Cusp AI: Materials/AI
- Lila AI: Life sciences
- Radical AI: Material simulation/design
- Autoscience (Carl)
- Periodic Labs
Bio
- Bioptimus
- EvolutionaryScale
AI/ML Methods in Science
- 2025-07: Synthetic Scientific Image Generation with VAE, GAN, and Diffusion Model Architectures
Imaging
- 2025-05: Behind the Noise: Conformal Quantile Regression Reveals Emergent Representations (blog: Behind the Noise)
Materials
- 2024-12: Crystal structure generation with autoregressive large language modeling
- 2025-03: All-atom Diffusion Transformers: Unified generative modelling of molecules and materials
Chemistry
- 2025-01: Large language models for reticular chemistry
- 2025-02: Image-based generation for molecule design with SketchMol
- 2025-02: Large language models for scientific discovery in molecular property prediction
- 2025-03: Vant AI Neo-1: atomistic foundation model (small molecules, proteins, etc.)
- 2025-04: Compositional Flows for 3D Molecule and Synthesis Pathway Co-design
- 2025-07: General purpose models for the chemical sciences
Biology
- 2018: AlphaFold
- 2021-07: AlphaFold 2
- 2024-05: AlphaFold 3
- 2023-03: Evolutionary-scale prediction of atomic-level protein structure with a language model (ESMFold)
- 2023-11: Illuminating protein space with a programmable generative model
- 2024-11: Sequence modeling and design from molecular to genome scale with Evo (Evo)
- 2025-01: Targeting protein–ligand neosurfaces with a generalizable deep learning tool (Chroma)
- 2025-01: Simulating 500 million years of evolution with a language model (ESM 3 model)
- 2025-02: Genome modeling and design across all domains of life with Evo 2
- 2025-02: Exploring the structural changes driving protein function with BioEmu-1
- 2025-02: Protein Large Language Models: A Comprehensive Survey
- 2025-03: Vant AI Neo-1: atomistic foundation model (small molecules, proteins, etc.)
- 2025-03: Lyra: An Efficient and Expressive Subquadratic Architecture for Modeling Biological Sequences
- 2025-08: RosettaFold 3: Accelerating Biomolecular Modeling with AtomWorks and RF3
- 2025-09: Generative design of novel bacteriophages with genome language models
- 2025-10: Strengthening nucleic acid biosecurity screening against generative protein design tools
Medicine
See: AI Agents > Medicine
Successes
AI/ML Methods co-opted for Science
Mechanistic Interpretability
Train a large model on science data, then apply mechanistic interpretability (e.g. sparse autoencoders, SAEs) to the feature/activation space.
- Mechanistic interpretability for protein language models (visualizer, code, SAE)
- Markov Bio: Through a Glass Darkly: Mechanistic Interpretability as the Bridge to End-to-End Biology (quick description, background info on recent bio progress)
- 2023-01: Tracr: Compiled Transformers as a Laboratory for Interpretability (code)
- 2024-10: An X-Ray Is Worth 15 Features: Sparse Autoencoders for Interpretable Radiology Report Generation
- 2024-12: Towards scientific discovery with dictionary learning: Extracting biological concepts from microscopy foundation models
- 2024-12: InterPLM: Discovering Interpretable Features in Protein Language Models via Sparse Autoencoders
- 2025-01: Insights on Galaxy Evolution from Interpretable Sparse Feature Networks
- 2025-02: From Mechanistic Interpretability to Mechanistic Biology: Training, Evaluating, and Interpreting Sparse Autoencoders on Protein Language Models
- 2025-02: Interpreting Evo 2: Arc Institute's Next-Generation Genomic Foundation Model
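The note at the top of this section (fit a sparse autoencoder to a trained model's activations) can be illustrated with a minimal sketch of the SAE forward pass and objective. This is not taken from any of the works listed above: the function names, the layer sizes, the random weights, and the `l1_coeff` value are all hypothetical, and a random vector stands in for a real model activation. It shows only the usual form of the objective (reconstruction error plus an L1 sparsity penalty on the features), with no training loop.

```python
import random

def relu(v):
    return [max(0.0, x) for x in v]

def matvec(W, v):
    # W is a list of rows; returns the matrix-vector product W @ v.
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def sae_forward(x, W_enc, b_enc, W_dec, b_dec):
    """Encode an activation vector into non-negative features, then reconstruct it."""
    pre = [p + b for p, b in zip(matvec(W_enc, x), b_enc)]
    features = relu(pre)
    recon = [r + b for r, b in zip(matvec(W_dec, features), b_dec)]
    return features, recon

def sae_loss(x, features, recon, l1_coeff):
    """Mean squared reconstruction error plus an L1 sparsity penalty on the features."""
    mse = sum((a - b) ** 2 for a, b in zip(x, recon)) / len(x)
    l1 = sum(abs(f) for f in features)
    return mse + l1_coeff * l1

def init_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

if __name__ == "__main__":
    rng = random.Random(0)
    d_model, d_features = 4, 16  # hypothetical sizes; more features than inputs
    W_enc = init_matrix(d_features, d_model, rng)
    W_dec = init_matrix(d_model, d_features, rng)
    b_enc = [0.0] * d_features
    b_dec = [0.0] * d_model
    x = [rng.gauss(0.0, 1.0) for _ in range(d_model)]  # stand-in for a model activation
    features, recon = sae_forward(x, W_enc, b_enc, W_dec, b_dec)
    print(sae_loss(x, features, recon, l1_coeff=1e-3))
```

In practice the weights would be trained to minimize this loss over many activations collected from the science model, and the learned feature directions would then be inspected for interpretable concepts.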
Uncertainty
- 2024-10: entropix: Entropy Based Sampling and Parallel CoT Decoding
- 2024-10: Taming Overconfidence in LLMs: Reward Calibration in RLHF
Science Benchmarks
- 2024-07: SciCode: A Research Coding Benchmark Curated by Scientists (project)
- 2024-11: AidanBench: Evaluating Novel Idea Generation on Open-Ended Questions (code)
- 2024-12: LiveIdeaBench: Evaluating LLMs' Scientific Creativity and Idea Generation with Minimal Context
- 2025-01: Humanity's Last Exam
- ScienceAgentBench
- 2025-02: EAIRA: Establishing a Methodology for Evaluating AI Models as Scientific Research Assistants
- 2025-03: BixBench: Novel hypotheses (accept/reject)
- 2025-04: Google: Evaluating progress of LLMs on scientific problem-solving
Science Agents
Reviews
- 2024-10: Empowering biomedical discovery with AI agents
- 2025-01: A review of large language models and autonomous agents in chemistry (github)
- 2025-07: AI4Research: A Survey of Artificial Intelligence for Scientific Research
- 2025-08: From AI for Science to Agentic Science: A Survey on Autonomous Scientific Discovery
Specific
- 2024-01-13: ORGANA: A Robotic Assistant for Automated Chemistry Experimentation and Characterization (video)
- 2024-06-19: LLMatDesign: Autonomous Materials Discovery with Large Language Models
- 2024-08-12: Sakana AI: AI Scientist; The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery (code)
- 2024-09-09: SciAgents: Automating scientific discovery through multi-agent intelligent graph reasoning (code)
- 2024-09-11: PaperQA2: Language Models Achieve Superhuman Synthesis of Scientific Knowledge (𝕏 post, code)
- 2024-10-17: Rapid and Automated Alloy Design with Graph Neural Network-Powered LLM-Driven Multi-Agent Systems
- 2024-10-28: Large Language Model-Guided Prediction Toward Quantum Materials Synthesis
- 2024-12-06: The Virtual Lab: AI Agents Design New SARS-CoV-2 Nanobodies with Experimental Validation (writeup: Virtual lab powered by ‘AI scientists’ super-charges biomedical research: Could human–AI collaborations be the future of interdisciplinary studies?)
- 2024-12-30: Aviary: training language agents on challenging scientific tasks
- See also: AI Agents > Deep Research
- 2025-04-08: Sakana: The AI Scientist-v2: Workshop-Level Automated Scientific Discovery via Agentic Tree Search (code)
- 2025-07: DREAMS: Density Functional Theory Based Research Engine for Agentic Materials Simulation
Science Multi-Agent Setups
- 2025-01: Agent Laboratory: Using LLM Agents as Research Assistants
- 2025-04: Coordinated AI agents for advancing healthcare (pdf)
AI Science Systems
- 2025-01: Dolphin: Closed-loop Open-ended Auto-research through Thinking, Practice, and Feedback
- 2025-01: Hypothesis Generation for Materials Discovery and Design Using Goal-Driven and Constraint-Guided LLM Agents
- 2025-02: Towards an AI co-scientist (Google blog post: Accelerating scientific breakthroughs with an AI co-scientist)
- 2025-06: The Discovery Engine
- 2025-07: Benchmarking the Discovery Engine (blog)
 
- 2025-07: Autonomous Scientific Discovery Through Hierarchical AI Scientist Systems
Inorganic Materials Discovery
- 2023-11: Scaling deep learning for materials discovery
- 2023-11: An autonomous laboratory for the accelerated synthesis of novel materials
- 2024-09: HoneyComb: A Flexible LLM-Based Agent System for Materials Science
- 2024-10: Open Materials 2024 (OMat24) Inorganic Materials Dataset and Models (code, datasets, checkpoints, blogpost)
- 2025-01: A generative model for inorganic materials design
- 2025-04: System of Agentic AI for the Discovery of Metal-Organic Frameworks
- 2025-05: The Open Molecules 2025 (OMol25) Dataset, Evaluations, and Models
Materials Characterization
- 2025-08: Operationalizing Serendipity: Multi-Agent AI Workflows for Enhanced Materials Characterization with Theory-in-the-Loop
Chemistry
- 2023-12: Autonomous chemical research with large language models (Coscientist)
- 2024-09: PNNL ChemAIst V0.2
- 2024-11: An automatic end-to-end chemical synthesis development platform powered by large language models
- 2025-06: Training a Scientific Reasoning Model for Chemistry
- 2025-06: ChemGraph: An Agentic Framework for Computational Chemistry Workflows (code)
Bio
- 2025-07: BioMARS: A Multi-Agent Robotic System for Autonomous Biological Experiments
LLMs Optimized for Science
- 2022-11: Galactica: A Large Language Model for Science
- 2024-12: Crystal structure generation with autoregressive large language modeling
- 2025-02: MatterChat: A Multi-Modal LLM for Material Science
- 2025-03: OmniScience: A Domain-Specialized LLM for Scientific Reasoning and Discovery
- 2025-03: Google TxGemma (2B, 9B, 27B): drug development
Impact of AI in Science
- 2024-11: Artificial Intelligence, Scientific Discovery, and Product Innovation (retracted)
- 2025-05: Retraction: Assuring an accurate research record
- 2025-02: Transforming Science with Large Language Models: A Survey on AI-assisted Scientific Discovery, Experimentation, Content Generation, and Evaluation
Related Tools
Literature Search
Data Visualization
- 2024-10: Microsoft Data Formulator: Create Rich Visualization with AI iteratively (video, code)
- Julius AI: Analyze your data with computational AI
Generative
- 2025-03: StarVector 1B, 8B: text or image to SVG
Chemistry
Science Datasets
- Google Dataset Search
- Awesome Materials & Chemistry Datasets
- NIST Jarvis (simulations)
See Also
- AI agents
- Nanobot.chat: Intelligent AI for the labnetwork @ mtl.mit.edu forum

