Difference between revisions of "Science Agents"

From GISAXS
(Biology)
(Science Benchmarks)
 
(4 intermediate revisions by the same user not shown)
Line 168: Line 168:
 
** 2024-07: [https://arxiv.org/abs/2407.09413 SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers]
 
 
** 2024-10: [https://neurips.cc/virtual/2024/98540 FEABench: Evaluating Language Models on Real World Physics Reasoning Ability]
 
 +
* 2026-02: [https://edisonscientific.com/ Edison]: [https://lab-bench.ai/ LABBench 2]
  
 
=Science Agents=
 
Line 194: Line 195:
 
* 2025-11: [https://arxiv.org/abs/2511.02824 Kosmos: An AI Scientist for Autonomous Discovery]
 
 
* 2025-11: [https://arxiv.org/abs/2511.08151 SciAgent: A Unified Multi-Agent System for Generalistic Scientific Reasoning]
 
 +
* 2026-02: [https://arxiv.org/abs/2601.23265 PaperBanana: Automating Academic Illustration for AI Scientists]
  
 
==Science Multi-Agent Setups==
 
Line 245: Line 247:
 
** 2025-05: Retraction: [https://economics.mit.edu/news/assuring-accurate-research-record Assuring an accurate research record]
 
 
* 2025-02: [https://arxiv.org/abs/2502.05151 Transforming Science with Large Language Models: A Survey on AI-assisted Scientific Discovery, Experimentation, Content Generation, and Evaluation]
 
 +
* 2026-02: [https://arxiv.org/abs/2602.03837 Accelerating Scientific Research with Gemini: Case Studies and Common Techniques]
  
 
=Related Tools=
 
Line 280: Line 283:
 
*** 2026-01: [https://www.erdosproblems.com/205 Erdős Problem #205] solved by Aristotle using ChatGPT 5.2 Pro
 
 
*** 2026-01: [https://www.erdosproblems.com/forum/thread/281 Erdős Problem #281] [https://x.com/neelsomani/status/2012695714187325745?s=20 solved] by [https://neelsomani.com/ Neel Somani] using ChatGPT 5.2 Pro
 
 +
*** 2026-01: Google DeepMind: [https://arxiv.org/abs/2601.21442 Irrationality of rapidly converging series: a problem of Erdős and Graham]
 +
**** [https://www.erdosproblems.com/1051 Erdős Problem #1051] [https://x.com/slow_developer/status/2018321002623901885?s=20 solved] by Google DeepMind Aletheia agent
 +
*** 2026-01: Google DeepMind: [https://arxiv.org/abs/2601.22401 Semi-Autonomous Mathematics Discovery with Gemini: A Case Study on the Erdős Problems]
 +
**** Attempted 700 problems, solved 13 open Erdős problems: 5 novel autonomous solutions, 8 through existing literature.
 
** 2026-01: [https://arxiv.org/abs/2601.07222 The motivic class of the space of genus 0 maps to the flag variety]
 
 
* '''Physics assistance:'''
 

Latest revision as of 10:24, 6 February 2026

AI Use-cases for Science

Literature

LLM extraction of data from papers

AI finding links in literature

(Pre) Generate Articles

Explanation

Autonomous Ideation

Adapting LLMs to Science

AI/LLM Control of Scientific Instruments/Facilities

AI/ML Methods tailored to Science

Science Foundation Models

Regression (Data Fitting)

Tabular Classification/Regression

Symbolic Regression

Literature Discovery

Commercial

Bio

AI/ML Methods in Science

Imaging

Materials

Chemistry

Biology

Medicine

See: AI_Agents#Medicine

Successes

AI/ML Methods co-opted for Science

Mechanistic Interpretability

Train a large model on science data, then apply mechanistic interpretability (e.g. sparse autoencoders, SAEs) to the feature/activation space.
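As a rough illustration of the second step, the sketch below fits a small sparse autoencoder to a matrix of activations using plain NumPy. Everything here is an assumption made for illustration: the activations are synthetic stand-ins (sparse mixtures of random directions) rather than outputs of a real science model, and the names <code>train_sae</code> and <code>encode</code>, the layer sizes, and the hyperparameters are hypothetical. Refinements that practical SAEs often use, such as decoder-norm constraints or minibatch optimizers, are left out.

```python
import numpy as np


def train_sae(acts, n_features, l1=1e-3, lr=0.02, epochs=300, seed=0):
    """Fit a one-hidden-layer sparse autoencoder to activations of shape (n, d).

    Returns (W_enc, b_enc, W_dec, losses). Hyperparameters are illustrative.
    """
    rng = np.random.default_rng(seed)
    n, d = acts.shape
    W_enc = rng.normal(0.0, 0.1, size=(d, n_features))
    b_enc = np.zeros(n_features)
    W_dec = rng.normal(0.0, 0.1, size=(n_features, d))
    losses = []
    for _ in range(epochs):
        # Forward pass: ReLU encoder, linear decoder.
        pre = acts @ W_enc + b_enc
        feats = np.maximum(pre, 0.0)
        err = feats @ W_dec - acts
        # Loss: squared reconstruction error plus an L1 sparsity penalty
        # (feats is non-negative, so its sum equals its L1 norm).
        loss = (err ** 2).sum(axis=1).mean() + l1 * feats.sum(axis=1).mean()
        losses.append(float(loss))
        # Backward pass: manual gradients of the loss above.
        g_recon = 2.0 * err / n
        g_W_dec = feats.T @ g_recon
        g_feats = g_recon @ W_dec.T + l1 / n
        g_pre = g_feats * (pre > 0)
        # Full-batch gradient descent step.
        W_dec -= lr * g_W_dec
        W_enc -= lr * (acts.T @ g_pre)
        b_enc -= lr * g_pre.sum(axis=0)
    return W_enc, b_enc, W_dec, losses


def encode(acts, W_enc, b_enc):
    """Map activations to non-negative SAE feature activations."""
    return np.maximum(acts @ W_enc + b_enc, 0.0)


# Synthetic stand-in for model activations (assumption, not real data):
# each sample is a sparse combination of 12 random unit directions in 8-D.
rng = np.random.default_rng(1)
true_dirs = rng.normal(size=(12, 8))
true_dirs /= np.linalg.norm(true_dirs, axis=1, keepdims=True)
codes = rng.uniform(0.5, 1.5, size=(256, 12)) * (rng.random((256, 12)) < 0.15)
acts = codes @ true_dirs
acts -= acts.mean(axis=0)  # center, since the sketch has no decoder bias

W_enc, b_enc, W_dec, losses = train_sae(acts, n_features=16)
feats = encode(acts, W_enc, b_enc)
print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
print("mean active features per sample:", (feats > 0).sum(axis=1).mean())
```

In an actual workflow, <code>acts</code> would be replaced by activations recorded from a chosen layer of the trained model, and the rows of <code>W_dec</code> would then be inspected as candidate interpretable feature directions.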

Uncertainty

Science Benchmarks

Science Agents

Reviews

Challenges

Specific

Science Multi-Agent Setups

AI Science Systems

Inorganic Materials Discovery

Materials Characterization

Chemistry

Bio

Physics

LLMs Optimized for Science

Impact of AI in Science

Related Tools

Literature Search

Data Visualization

Generative

Chemistry

Science Datasets

Genuine Discoveries

See Also