Latest revision as of 08:22, 14 October 2025

@@ Line 19: / Line 19: @@
 * 2024-02: Wikipedia style: [https://arxiv.org/abs/2402.14207 Assisting in Writing Wikipedia-like Articles From Scratch with Large Language Models]
 * 2024-02: [https://arxiv.org/abs/2408.07055 LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs] ([https://github.com/THUDM/LongWriter code])
-* 2024-08: Scientific papers: [The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery]
+* 2024-08: Scientific papers: [https://arxiv.org/abs/2408.06292 The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery]
 * 2024-09: PaperQA2: [https://paper.wikicrow.ai/ Language Models Achieve Superhuman Synthesis of Scientific Knowledge] ([https://x.com/SGRodriques/status/1833908643856818443 𝕏 post], [https://github.com/Future-House/paper-qa code])
 * 2025-03: [https://arxiv.org/abs/2503.18866 Reasoning to Learn from Latent Thoughts]
 * 2025-03: [https://arxiv.org/abs/2503.19065 WikiAutoGen: Towards Multi-Modal Wikipedia-Style Article Generation]
+* 2025-04: [https://arxiv.org/abs/2504.13171 Sleep-time Compute: Beyond Inference Scaling at Test-time]
 ==Explanation==
@@ Line 29: / Line 30: @@
 ==Autonomous Ideation==
+* 2024-04: [https://arxiv.org/abs/2404.07738 ResearchAgent: Iterative Research Idea Generation over Scientific Literature with Large Language Models]
 * 2024-09: [https://arxiv.org/abs/2409.14202 Mining Causality: AI-Assisted Search for Instrumental Variables]
 * 2024-12: [https://arxiv.org/abs/2412.07977 Thinking Fast and Laterally: Multi-Agentic Approach for Reasoning about Uncertain Emerging Events]
 * 2024-12: [https://arxiv.org/abs/2412.14141 LLMs can realize combinatorial creativity: generating creative ideas via LLMs for scientific research]
 * 2024-12: [https://arxiv.org/abs/2412.17596 LiveIdeaBench: Evaluating LLMs' Scientific Creativity and Idea Generation with Minimal Context]
+* 2025-01: [https://arxiv.org/abs/2501.13299 Hypothesis Generation for Materials Discovery and Design Using Goal-Driven and Constraint-Guided LLM Agents]
 * 2025-02: [https://arxiv.org/abs/2502.13025 Agentic Deep Graph Reasoning Yields Self-Organizing Knowledge Networks]
+* 2025-06: [https://arxiv.org/abs/2506.00794 Predicting Empirical AI Research Outcomes with Language Models]
+* 2025-06: [https://arxiv.org/abs/2506.20803 The Ideation-Execution Gap: Execution Outcomes of LLM-Generated versus Human Research Ideas]
 ==Adapting LLMs to Science==
@@ Line 51: / Line 56: @@
 ==AI/ML Methods tailored to Science==
+===Science Foundation Models===
+* 2025-08: [https://arxiv.org/abs/2508.15763 Intern-S1: A Scientific Multimodal Foundation Model]
 ===Regression (Data Fitting)===
 * 2024-06: [https://arxiv.org/abs/2406.14546 Connecting the Dots: LLMs can Infer and Verbalize Latent Structure from Disparate Training Data]: training on (x,y) pairs enables inferring underlying function (define it in code, invert it, compose it)
@@ Line 75: / Line 83: @@
 * [https://www.radical-ai.com/ Radical AI]: Material simulation/design
 * [https://www.autoscience.ai/ Autoscience] ([https://www.autoscience.ai/blog/meet-carl-the-first-ai-system-to-produce-academically-peer-reviewed-research Carl])
+* [https://periodic.com/ Periodic Labs]
+====Bio====
+* [https://www.bioptimus.com/ Bioptimus]
+* [https://www.evolutionaryscale.ai/ EvolutionaryScale]
 ==AI/ML Methods in Science==
+* 2025-07: [https://www.mdpi.com/2313-433X/11/8/252 Synthetic Scientific Image Generation with VAE, GAN, and Diffusion Model Architectures]
+===Imaging===
+* 2025-05: [https://arxiv.org/abs/2505.08176 Behind the Noise: Conformal Quantile Regression Reveals Emergent Representations] (blog: [https://phzwart.github.io/behindthenoise/ Behind the Noise])
+===Materials===
+* 2024-12: [https://www.nature.com/articles/s41467-024-54639-7 Crystal structure generation with autoregressive large language modeling
+* 2025-03: [https://arxiv.org/abs/2503.03965 All-atom Diffusion Transformers: Unified generative modelling of molecules and materials]
 ===Chemistry===
 * 2025-01: [https://www.nature.com/articles/s41578-025-00772-8 Large language models for reticular chemistry]
@@ Line 82: / Line 104: @@
 * 2025-02: [https://www.nature.com/articles/s42256-025-00994-z Large language models for scientific discovery in molecular property prediction]
 * [https://x.com/vant_ai/status/1903070297991110657 2025-03]: [https://www.vant.ai/ Vant AI] [https://www.vant.ai/neo-1 Neo-1]: atomistic foundation model (small molecules, proteins, etc.)
+* 2025-04: [https://arxiv.org/abs/2504.08051 Compositional Flows for 3D Molecule and Synthesis Pathway Co-design]
+* 2025-07: [https://arxiv.org/abs/2507.07456 General purpose models for the chemical sciences]
 ===Biology===
@@ Line 97: / Line 121: @@
 * [https://x.com/vant_ai/status/1903070297991110657 2025-03]: [https://www.vant.ai/ Vant AI] [https://www.vant.ai/neo-1 Neo-1]: atomistic foundation model (small molecules, proteins, etc.)
 * 2025-03: [https://arxiv.org/abs/2503.16351 Lyra: An Efficient and Expressive Subquadratic Architecture for Modeling Biological Sequences]
+* 2025-08: RosettaFold 3: [https://www.biorxiv.org/content/10.1101/2025.08.14.670328v2 Accelerating Biomolecular Modeling with AtomWorks and RF3]
+* 2025-09: [https://www.biorxiv.org/content/10.1101/2025.09.12.675911v1 Generative design of novel bacteriophages with genome language models]
+* 2025-10: [https://www.science.org/doi/10.1126/science.adu8578 Strengthening nucleic acid biosecurity screening against generative protein design tools]
 ===Medicine===
@@ Line 138: / Line 165: @@
 * 2024-10: [https://www.cell.com/cell/fulltext/S0092-8674(24)01070-5?target=_blank Empowering biomedical discovery with AI agents]
 * 2025-01: [https://pubs.rsc.org/en/content/articlehtml/2024/sc/d4sc03921a A review of large language models and autonomous agents in chemistry] ([https://github.com/ur-whitelab/LLMs-in-science github])
+* 2025-07: [https://arxiv.org/abs/2507.01903 AI4Research: A Survey of Artificial Intelligence for Scientific Research]
+* 2025-08: [https://arxiv.org/abs/2508.14111 From AI for Science to Agentic Science: A Survey on Autonomous Scientific Discovery]
 ==Specific==
@@ Line 151: / Line 180: @@
 * See also: [[AI_Agents#Deep_Research|AI Agents > Deep Research]]
 * 2025-04-08: Sakana: [https://pub.sakana.ai/ai-scientist-v2/paper/paper.pdf The AI Scientist-v2: Workshop-Level Automated Scientific Discovery via Agentic Tree Search] ([https://github.com/SakanaAI/AI-Scientist-v2 code])
+* 2025-07: [https://arxiv.org/abs/2507.14267 DREAMS: Density Functional Theory Based Research Engine for Agentic Materials Simulation]
 ==Science Multi-Agent Setups==
@@ Line 158: / Line 188: @@
 =AI Science Systems=
 * 2025-01: [https://arxiv.org/abs/2501.03916 Dolphin: Closed-loop Open-ended Auto-research through Thinking, Practice, and Feedback]
+* 2025-01: [https://arxiv.org/abs/2501.13299 Hypothesis Generation for Materials Discovery and Design Using Goal-Driven and Constraint-Guided LLM Agents]
 * 2025-02: [https://storage.googleapis.com/coscientist_paper/ai_coscientist.pdf Towards an AI co-scientist] (Google blog post: [https://research.google/blog/accelerating-scientific-breakthroughs-with-an-ai-co-scientist/ Accelerating scientific breakthroughs with an AI co-scientist])
+* 2025-06: [https://zenodo.org/records/15693353 The Discovery Engine]
+** 2025-07: [https://arxiv.org/abs/2507.00964 Benchmarking the Discovery Engine] ([https://www.leap-labs.com/blog/how-we-replicated-five-peer-reviewed-papers-in-five-hours blog])
+* 2025-07: [https://www.preprints.org/manuscript/202507.1951/v1 Autonomous Scientific Discovery Through Hierarchical AI Scientist Systems]
 ===Inorganic Materials Discovery===
 * 2023-11: [https://doi.org/10.1038/s41586-023-06735-9 Scaling deep learning for materials discovery]
 * 2023-11: [https://doi.org/10.1038/s41586-023-06734-w An autonomous laboratory for the accelerated synthesis of novel materials]
+* 2024-09: [https://arxiv.org/abs/2409.00135 HoneyComb: A Flexible LLM-Based Agent System for Materials Science]
 * 2024-10: [https://arxiv.org/abs/2410.12771 Open Materials 2024 (OMat24) Inorganic Materials Dataset and Models] ([https://github.com/FAIR-Chem/fairchem code], [https://huggingface.co/datasets/fairchem/OMAT24 datasets], [https://huggingface.co/fairchem/OMAT24 checkpoints], [https://ai.meta.com/blog/fair-news-segment-anything-2-1-meta-spirit-lm-layer-skip-salsa-sona/ blogpost])
 * 2025-01: [https://www.nature.com/articles/s41586-025-08628-5 A generative model for inorganic materials design]
+* 2025-04: [https://arxiv.org/abs/2504.14110 System of Agentic AI for the Discovery of Metal-Organic Frameworks]
+* 2025-05: [https://arxiv.org/abs/2505.08762 The Open Molecules 2025 (OMol25) Dataset, Evaluations, and Models]
+===Materials Characterization===
+* 2025-08: [https://arxiv.org/abs/2508.06569 Operationalizing Serendipity: Multi-Agent AI Workflows for Enhanced Materials Characterization with Theory-in-the-Loop]
 ===Chemistry===
 * 2023-12: [https://doi.org/10.1038/s41586-023-06792-0 Autonomous chemical research with large language models] (Coscientist)
+* 2024-09: [https://www.pnnl.gov/main/publications/external/technical_reports/PNNL-36692.pdf PNNL ChemAIst V0.2]
 * 2024-11: [https://www.nature.com/articles/s41467-024-54457-x An automatic end-to-end chemical synthesis development platform powered by large language models]
+* 2025-06: [https://paper.ether0.ai/ Training a Scientific Reasoning Model for Chemistry]
+* 2025-06: [https://arxiv.org/abs/2506.06363 ChemGraph: An Agentic Framework for Computational Chemistry Workflows] ([https://github.com/argonne-lcf/ChemGraph code])
+===Bio===
+* 2025-07: [https://arxiv.org/abs/2507.01485 BioMARS: A Multi-Agent Robotic System for Autonomous Biological Experiments]
 ==LLMs Optimized for Science==
 * 2022-11: [https://arxiv.org/abs/2211.09085 Galactica: A Large Language Model for Science]
+* 2024-12: [https://www.nature.com/articles/s41467-024-54639-7 Crystal structure generation with autoregressive large language modeling]
+* 2025-02: [https://arxiv.org/abs/2502.13107 MatterChat: A Multi-Modal LLM for Material Science]
 * 2025-03: [https://arxiv.org/abs/2503.17604 OmniScience: A Domain-Specialized LLM for Scientific Reasoning and Discovery]
 * 2025-03: Google [https://huggingface.co/collections/google/txgemma-release-67dd92e931c857d15e4d1e87 TxGemma] (2B, 9B, 27B): [https://developers.googleblog.com/en/introducing-txgemma-open-models-improving-therapeutics-development/ drug development]
 =Impact of AI in Science=
-* 2024-11: [https://aidantr.github.io/files/AI_innovation.pdf Artificial Intelligence, Scientific Discovery, and Product Innovation]
+* 2024-11: <strike>[https://aidantr.github.io/files/AI_innovation.pdf Artificial Intelligence, Scientific Discovery, and Product Innovation]</strike>
+** 2025-05: Retraction: [https://economics.mit.edu/news/assuring-accurate-research-record Assuring an accurate research record]
 * 2025-02: [https://arxiv.org/abs/2502.05151 Transforming Science with Large Language Models: A Survey on AI-assisted Scientific Discovery, Experimentation, Content Generation, and Evaluation]
@@ Line 195: / Line 244: @@
 =Science Datasets=
+* [https://datasetsearch.research.google.com/ Google Dataset Search]
 * [https://github.com/blaiszik/awesome-matchem-datasets/ Awesome Materials & Chemistry Datasets]
+* NIST [https://jarvis.nist.gov/ Jarvis] (simulations)
 =See Also=
 * [[AI agents]]
 * [https://nanobot.chat/ Nanobot.chat]: Intelligent AI for the labnetwork @ mtl.mit.edu forum

Difference between revisions of "Science Agents"

Latest revision as of 08:22, 14 October 2025

Contents

AI Use-cases for Science

Literature

LLM extract data from papers

AI finding links in literature

(Pre) Generate Articles

Explanation

Autonomous Ideation

Adapting LLMs to Science

AI/LLM Control of Scientific Instruments/Facilities

AI/ML Methods tailored to Science

Science Foundation Models

Regression (Data Fitting)

Tabular Classification/Regression

Symbolic Regression

Literature Discovery

Commercial

Bio

AI/ML Methods in Science

Imaging

Materials

Chemistry

Biology

Medicine

Successes

AI/ML Methods co-opted for Science

Mechanistic Interpretability

Uncertainty

Science Benchmarks

Science Agents

Reviews

Specific

Science Multi-Agent Setups

AI Science Systems

Inorganic Materials Discovery

Materials Characterization

Chemistry

Bio

LLMs Optimized for Science

Impact of AI in Science

Related Tools

Literature Search

Data Visualization

Generative

Chemistry

Science Datasets

See Also

Navigation menu

Search