Difference between revisions of "Science Agents"
KevinYager (talk | contribs) (→Mechanistic Interpretability) |
KevinYager (talk | contribs) (→Science Benchmarks) |
(13 intermediate revisions by the same user not shown) | |||
Line 3: | Line 3: | ||
==Literature== | ==Literature== | ||
+ | ===LLMs extract data from papers=== | ||
+ | * 2024-14: [https://pubs.rsc.org/en/content/articlelanding/2025/cs/d4cs00913d From text to insight: large language models for chemical data extraction] | ||
+ | |||
===AI finding links in literature=== | ===AI finding links in literature=== | ||
* 2019-07: [https://doi.org/10.1038/s41586-019-1335-8 Unsupervised word embeddings capture latent knowledge from materials science literature] | * 2019-07: [https://doi.org/10.1038/s41586-019-1335-8 Unsupervised word embeddings capture latent knowledge from materials science literature] | ||
Line 11: | Line 14: | ||
* 2024-12: [https://arxiv.org/abs/2412.07977 Thinking Fast and Laterally: Multi-Agentic Approach for Reasoning about Uncertain Emerging Events] | * 2024-12: [https://arxiv.org/abs/2412.07977 Thinking Fast and Laterally: Multi-Agentic Approach for Reasoning about Uncertain Emerging Events] | ||
* 2024-12: [https://arxiv.org/abs/2412.14141 LLMs can realize combinatorial creativity: generating creative ideas via LLMs for scientific research] | * 2024-12: [https://arxiv.org/abs/2412.14141 LLMs can realize combinatorial creativity: generating creative ideas via LLMs for scientific research] | ||
+ | * 2024-12: [https://arxiv.org/abs/2412.17596 LiveIdeaBench: Evaluating LLMs' Scientific Creativity and Idea Generation with Minimal Context] | ||
==Adapting LLMs to Science== | ==Adapting LLMs to Science== | ||
Line 21: | Line 25: | ||
* 2024-06: [https://arxiv.org/abs/2406.14546 Connecting the Dots: LLMs can Infer and Verbalize Latent Structure from Disparate Training Data]: training on (x,y) pairs enables inferring underlying function (define it in code, invert it, compose it) | * 2024-06: [https://arxiv.org/abs/2406.14546 Connecting the Dots: LLMs can Infer and Verbalize Latent Structure from Disparate Training Data]: training on (x,y) pairs enables inferring underlying function (define it in code, invert it, compose it) | ||
* 2024-02: [https://arxiv.org/abs/2402.14547 OmniPred: Language Models as Universal Regressors] | * 2024-02: [https://arxiv.org/abs/2402.14547 OmniPred: Language Models as Universal Regressors] | ||
+ | |||
+ | ===Tabular Classification/Regression=== | ||
+ | * 2025-01: [https://www.nature.com/articles/s41586-024-08328-6 Accurate predictions on small data with a tabular foundation model] ([https://github.com/PriorLabs/TabPFN code]) | ||
===Symbolic Regression=== | ===Symbolic Regression=== | ||
Line 31: | Line 38: | ||
* [https://lumina.sh/ Lumina] | * [https://lumina.sh/ Lumina] | ||
* [https://github.com/TheBlewish/Automated-AI-Web-Researcher-Ollama Automated-AI-Web-Researcher-Ollama] | * [https://github.com/TheBlewish/Automated-AI-Web-Researcher-Ollama Automated-AI-Web-Researcher-Ollama] | ||
+ | * 2025-01: [https://arxiv.org/abs/2501.05366 Search-o1: Agentic Search-Enhanced Large Reasoning Models] ([https://search-o1.github.io/ project], [https://github.com/sunnynexus/Search-o1 code]) | ||
===Commercial=== | ===Commercial=== | ||
Line 48: | Line 56: | ||
* 2024-10: [https://github.com/xjdr-alt/entropix entropix: Entropy Based Sampling and Parallel CoT Decoding] | * 2024-10: [https://github.com/xjdr-alt/entropix entropix: Entropy Based Sampling and Parallel CoT Decoding] | ||
* 2024-10: [https://arxiv.org/abs/2410.09724 Taming Overconfidence in LLMs: Reward Calibration in RLHF] | * 2024-10: [https://arxiv.org/abs/2410.09724 Taming Overconfidence in LLMs: Reward Calibration in RLHF] | ||
+ | |||
+ | =Science Benchmarks= | ||
+ | * 2024-07: [https://arxiv.org/abs/2407.13168 SciCode: A Research Coding Benchmark Curated by Scientists] ([http://scicode-bench.github.io/ project]) | ||
+ | * 2024-11: [https://openreview.net/pdf?id=fz969ahcvJ AidanBench: Evaluating Novel Idea Generation on Open-Ended Questions] ([https://github.com/aidanmclaughlin/AidanBench code]) | ||
+ | * 2024-12: [https://arxiv.org/abs/2412.17596 LiveIdeaBench: Evaluating LLMs' Scientific Creativity and Idea Generation with Minimal Context] | ||
+ | * 2025-01: [https://agi.safe.ai/ Humanity's Last Exam] | ||
=Science Agents= | =Science Agents= | ||
+ | ==Reviews== | ||
+ | * 2024-10: [https://www.cell.com/cell/fulltext/S0092-8674(24)01070-5?target=_blank Empowering biomedical discovery with AI agents] | ||
+ | * 2025-01: [https://pubs.rsc.org/en/content/articlehtml/2024/sc/d4sc03921a A review of large language models and autonomous agents in chemistry] ([https://github.com/ur-whitelab/LLMs-in-science github]) | ||
+ | |||
+ | ==Specific== | ||
* 2024-01-13: [https://arxiv.org/abs/2401.06949 ORGANA: A Robotic Assistant for Automated Chemistry Experimentation and Characterization] ([https://www.youtube.com/watch?v=N6qMMwJ8hKQ video]) | * 2024-01-13: [https://arxiv.org/abs/2401.06949 ORGANA: A Robotic Assistant for Automated Chemistry Experimentation and Characterization] ([https://www.youtube.com/watch?v=N6qMMwJ8hKQ video]) | ||
* 2024-06-19: [https://arxiv.org/abs/2406.13163 LLMatDesign: Autonomous Materials Discovery with Large Language Models] | * 2024-06-19: [https://arxiv.org/abs/2406.13163 LLMatDesign: Autonomous Materials Discovery with Large Language Models] | ||
Line 60: | Line 79: | ||
* 2024-12-11: Google [https://blog.google/products/gemini/google-gemini-deep-research/ Deep Research] | * 2024-12-11: Google [https://blog.google/products/gemini/google-gemini-deep-research/ Deep Research] | ||
* 2024-12-30: [https://arxiv.org/abs/2412.21154 Aviary: training language agents on challenging scientific tasks] | * 2024-12-30: [https://arxiv.org/abs/2412.21154 Aviary: training language agents on challenging scientific tasks] | ||
+ | |||
+ | ==Science Multi-Agent Setups== | ||
+ | * 2025-01: [https://arxiv.org/abs/2501.04227 Agent Laboratory: Using LLM Agents as Research Assistants] | ||
=AI Science Systems= | =AI Science Systems= | ||
Line 69: | Line 91: | ||
===Chemistry=== | ===Chemistry=== | ||
* 2023-12: [https://doi.org/10.1038/s41586-023-06792-0 Autonomous chemical research with large language models] (Coscientist) | * 2023-12: [https://doi.org/10.1038/s41586-023-06792-0 Autonomous chemical research with large language models] (Coscientist) | ||
+ | * 2024-11: [https://www.nature.com/articles/s41467-024-54457-x An automatic end-to-end chemical synthesis development platform powered by large language models] | ||
+ | * 2025-01: [https://www.nature.com/articles/s41578-025-00772-8 Large language models for reticular chemistry] | ||
=Impact of AI in Science= | =Impact of AI in Science= | ||
Line 74: | Line 98: | ||
=Related Tools= | =Related Tools= | ||
+ | ==Literature Search== | ||
+ | * [https://www.perplexity.ai/ Perplexity] | ||
+ | * [https://www.arxival.xyz/ ArXival] | ||
+ | |||
==Data Visualization== | ==Data Visualization== | ||
* 2024-10: [https://www.microsoft.com/en-us/research/blog/data-formulator-exploring-how-ai-can-help-analysts-create-rich-data-visualizations/ Data Formulator: Create Rich Visualization with AI iteratively] ([https://www.microsoft.com/en-us/research/video/data-formulator-create-rich-visualization-with-ai-iteratively/ video], [https://github.com/microsoft/data-formulator code]) | * 2024-10: [https://www.microsoft.com/en-us/research/blog/data-formulator-exploring-how-ai-can-help-analysts-create-rich-data-visualizations/ Data Formulator: Create Rich Visualization with AI iteratively] ([https://www.microsoft.com/en-us/research/video/data-formulator-create-rich-visualization-with-ai-iteratively/ video], [https://github.com/microsoft/data-formulator code]) |
Latest revision as of 09:35, 3 February 2025
AI Use-cases for Science
Literature
LLMs extract data from papers
AI finding links in literature
- 2019-07: Unsupervised word embeddings capture latent knowledge from materials science literature
- 2024-11: Large language models surpass human experts in predicting neuroscience results
Autonomous Ideation
- 2024-09: Mining Causality: AI-Assisted Search for Instrumental Variables
- 2024-12: Thinking Fast and Laterally: Multi-Agentic Approach for Reasoning about Uncertain Emerging Events
- 2024-12: LLMs can realize combinatorial creativity: generating creative ideas via LLMs for scientific research
- 2024-12: LiveIdeaBench: Evaluating LLMs' Scientific Creativity and Idea Generation with Minimal Context
Adapting LLMs to Science
- 2023-06: Domain-specific chatbots for science using embeddings
- 2024-10: Personalization of Large Language Models: A Survey
- 2024-11: Adapting While Learning: Grounding LLMs for Scientific Problems with Intelligent Tool Usage Adaptation
AI/ML Methods tailored to Science
Regression (Data Fitting)
- 2024-06: Connecting the Dots: LLMs can Infer and Verbalize Latent Structure from Disparate Training Data: training on (x,y) pairs enables inferring underlying function (define it in code, invert it, compose it)
- 2024-02: OmniPred: Language Models as Universal Regressors
Tabular Classification/Regression
Symbolic Regression
Literature Discovery
- FutureHouse
- Lumina
- Automated-AI-Web-Researcher-Ollama
- 2025-01: Search-o1: Agentic Search-Enhanced Large Reasoning Models (project, code)
Commercial
- Cusp AI: Materials/AI
AI/ML Methods co-opted for Science
Mechanistic Interpretability
Train a large model on science data, then apply mechanistic interpretability methods (e.g. sparse autoencoders, SAEs) to its feature/activation space.
- Mechanistic interpretability for protein language models (visualizer, code, SAE)
- Markov Bio: Through a Glass Darkly: Mechanistic Interpretability as the Bridge to End-to-End Biology (quick description, background info on recent bio progress)
- 2023-01: Tracr: Compiled Transformers as a Laboratory for Interpretability (code)
- 2024-12: Towards scientific discovery with dictionary learning: Extracting biological concepts from microscopy foundation models
- 2024-12: InterPLM: Discovering Interpretable Features in Protein Language Models via Sparse Autoencoders
- 2025-01: Insights on Galaxy Evolution from Interpretable Sparse Feature Networks
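The SAE step described at the top of this section can be illustrated with a minimal sketch. It is not the method of any paper listed here: the activations are synthetic stand-ins for a real model's hidden states, the function names (`make_toy_activations`, `train_sae`, `encode`) are hypothetical, and training uses plain full-batch gradient descent with hand-written gradients rather than a deep learning framework.

```python
import numpy as np


def make_toy_activations(n=256, d=16, n_true=24, seed=0):
    """Synthetic stand-in for model activations: sparse codes times a random dictionary."""
    rng = np.random.default_rng(seed)
    dictionary = rng.normal(size=(n_true, d))
    dictionary /= np.linalg.norm(dictionary, axis=1, keepdims=True)
    # Each sample activates roughly 10% of the "true" features.
    codes = rng.random((n, n_true)) * (rng.random((n, n_true)) < 0.1)
    return codes @ dictionary + 0.01 * rng.normal(size=(n, d))


def train_sae(acts, n_features=32, l1=1e-3, lr=0.1, steps=1000, seed=0):
    """Fit a ReLU sparse autoencoder with full-batch gradient descent.

    Loss = mean squared reconstruction error + l1 * mean L1 norm of the features.
    The decoder bias is fixed to the data mean (an assumption to keep the sketch short).
    """
    rng = np.random.default_rng(seed)
    n, d = acts.shape
    b_dec = acts.mean(axis=0)
    x = acts - b_dec
    W_enc = rng.normal(scale=0.1, size=(d, n_features))
    b_enc = np.zeros(n_features)
    W_dec = rng.normal(scale=0.1, size=(n_features, d))
    losses = []
    for _ in range(steps):
        pre = x @ W_enc + b_enc
        feats = np.maximum(pre, 0.0)  # non-negative, ideally sparse, feature activations
        err = feats @ W_dec - x
        losses.append((err ** 2).sum(axis=1).mean() + l1 * feats.sum(axis=1).mean())
        # Backpropagate by hand through the decoder, the L1 penalty, and the ReLU.
        g_recon = 2.0 * err / n
        g_feats = g_recon @ W_dec.T + l1 * (feats > 0) / n
        g_pre = g_feats * (pre > 0)
        W_dec -= lr * (feats.T @ g_recon)
        W_enc -= lr * (x.T @ g_pre)
        b_enc -= lr * g_pre.sum(axis=0)
    return {"W_enc": W_enc, "b_enc": b_enc, "W_dec": W_dec, "b_dec": b_dec}, losses


def encode(acts, params):
    """Map activations to SAE feature activations."""
    return np.maximum((acts - params["b_dec"]) @ params["W_enc"] + params["b_enc"], 0.0)


acts = make_toy_activations()
params, losses = train_sae(acts)
features = encode(acts, params)
print(f"loss: {losses[0]:.4f} -> {losses[-1]:.4f}")
print(f"fraction of active features: {(features > 0).mean():.2f}")
```

In practice `acts` would be replaced by activations collected from a chosen layer of the science-trained model, and the rows of `W_dec` would then be inspected as candidate interpretable feature directions.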
Uncertainty
- 2024-10: entropix: Entropy Based Sampling and Parallel CoT Decoding
- 2024-10: Taming Overconfidence in LLMs: Reward Calibration in RLHF
Science Benchmarks
- 2024-07: SciCode: A Research Coding Benchmark Curated by Scientists (project)
- 2024-11: AidanBench: Evaluating Novel Idea Generation on Open-Ended Questions (code)
- 2024-12: LiveIdeaBench: Evaluating LLMs' Scientific Creativity and Idea Generation with Minimal Context
- 2025-01: Humanity's Last Exam
Science Agents
Reviews
- 2024-10: Empowering biomedical discovery with AI agents
- 2025-01: A review of large language models and autonomous agents in chemistry (github)
Specific
- 2024-01-13: ORGANA: A Robotic Assistant for Automated Chemistry Experimentation and Characterization (video)
- 2024-06-19: LLMatDesign: Autonomous Materials Discovery with Large Language Models
- 2024-08-12: Sakana AI: AI Scientist; The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery (code)
- 2024-09-09: SciAgents: Automating scientific discovery through multi-agent intelligent graph reasoning (code)
- 2024-09-11: PaperQA2: Language Models Achieve Superhuman Synthesis of Scientific Knowledge (𝕏 post, code)
- 2024-10-17: Rapid and Automated Alloy Design with Graph Neural Network-Powered LLM-Driven Multi-Agent Systems
- 2024-10-28: Large Language Model-Guided Prediction Toward Quantum Materials Synthesis
- 2024-12-06: The Virtual Lab: AI Agents Design New SARS-CoV-2 Nanobodies with Experimental Validation (writeup: Virtual lab powered by ‘AI scientists’ super-charges biomedical research: Could human–AI collaborations be the future of interdisciplinary studies?)
- 2024-12-11: Google Deep Research
- 2024-12-30: Aviary: training language agents on challenging scientific tasks
Science Multi-Agent Setups
AI Science Systems
Inorganic Materials Discovery
- 2023-11: Scaling deep learning for materials discovery
- 2023-11: An autonomous laboratory for the accelerated synthesis of novel materials
- 2024-10: Open Materials 2024 (OMat24) Inorganic Materials Dataset and Models (code, datasets, checkpoints, blogpost)
Chemistry
- 2023-12: Autonomous chemical research with large language models (Coscientist)
- 2024-11: An automatic end-to-end chemical synthesis development platform powered by large language models
- 2025-01: Large language models for reticular chemistry
Impact of AI in Science
Related Tools
Literature Search
Data Visualization
- 2024-10: Data Formulator: Create Rich Visualization with AI iteratively (video, code)
- Julius AI: Analyze your data with computational AI
See Also
- AI agents
- Nanobot.chat: Intelligent AI for the labnetwork @ mtl.mit.edu forum