Difference between revisions of "AI Agents"

From GISAXS
(Agentic Systems)
(Open Source Frameworks)
 
(8 intermediate revisions by the same user not shown)
Line 11: Line 11:
 
* [https://github.com/open-thought/system-2-research OpenThought - System 2 Research Links]
 
* [https://github.com/hijkzzz/Awesome-LLM-Strawberry Awesome LLM Strawberry (OpenAI o1): Collection of research papers & blogs for OpenAI Strawberry(o1) and Reasoning]
 
 +
* [https://github.com/e2b-dev/awesome-ai-agents Awesome AI Agents]
  
 
===Analysis/Opinions===
 
Line 49: Line 50:
 
** '''Tools:'''
 
*** [https://github.com/jlowin/fastmcp FastMCP]: The fast, Pythonic way to build MCP servers
 
 +
*** [https://github.com/fleuristes/fleur/ Fleur]: A desktop app marketplace for Claude Desktop
 
** '''Servers''' ([https://github.com/modelcontextprotocol/servers full list here]):
 
**# [https://github.com/modelcontextprotocol/servers/tree/main/src/github Github MCP server]
 
Line 79: Line 81:
 
===Science Agents===
 
See [[Science Agents]].
 
 +
 +
===Medicine===
 +
* 2025-03: [https://news.microsoft.com/2025/03/03/microsoft-dragon-copilot-provides-the-healthcare-industrys-first-unified-voice-ai-assistant-that-enables-clinicians-to-streamline-clinical-documentation-surface-information-and-automate-task/ Microsoft Dragon Copilot]: streamline clinical workflows and paperwork
  
 
===LLM-as-judge===
 
Line 196: Line 201:
 
=Multi-agent orchestration=
 
==Research==
 
 +
===Organization Schemes===
 +
* 2025-03: [https://arxiv.org/abs/2503.02390 ReSo: A Reward-driven Self-organizing LLM-based Multi-Agent System for Reasoning Tasks]
 +
 
===Societies and Communities of AI agents===
 
* 2024-12: [https://arxiv.org/abs/2412.10270 Cultural Evolution of Cooperation among LLM Agents]
 
Line 224: Line 232:
 
* 2024-10: [https://arxiv.org/abs/2410.08164 Agent S: An Open Agentic Framework that Uses Computers Like a Human] ([https://github.com/simular-ai/Agent-S code])
 
* 2024-10: [https://arxiv.org/abs/2410.20424 AutoKaggle: A Multi-Agent Framework for Autonomous Data Science Competitions]
 
 +
* 2025-02: [https://arxiv.org/abs/2502.16111 PlanGEN: A Multi-Agent Framework for Generating Planning and Reasoning Trajectories for Complex Problem Solving]
  
 
===Related work===
 
Line 250: Line 259:
 
* [https://github.com/HKUDS/AutoAgent AutoAgent]: Fully-Automated & Zero-Code LLM Agent Framework
 
* [https://mastra.ai/ Mastra] ([https://github.com/mastra-ai/mastra github]): opinionated Typescript framework for AI applications (primitives for workflows, agents, RAG, integrations and evals)
 
 +
* [https://github.com/orra-dev/orra Orra]: multi-agent applications with complex real-world interactions
 +
* [https://github.com/gensx-inc/gensx/blob/main/README.md GenSX]
 +
* Cloudflare [https://developers.cloudflare.com/agents/ agents-sdk] ([https://blog.cloudflare.com/build-ai-agents-on-cloudflare/ info], [https://github.com/cloudflare/agents code])
 +
* OpenAI [https://platform.openai.com/docs/api-reference/responses responses API] and [https://platform.openai.com/docs/guides/agents agents SDK]
  
 
==Open Source Systems==
 
Line 340: Line 353:
 
=See Also=
 
* [[Science Agents]]
 
 +
* [[Increasing AI Intelligence]]
 
* [[AI tools]]
 
* [[AI understanding]]
 
* [[Robots]]
 
* [[Exocortex]]
 

Latest revision as of 09:20, 12 March 2025

Reviews & Perspectives

Published

Continually updating

Analysis/Opinions

Guides

AI Assistants

Components of AI Assistants

Agent Internal Workflow Management

Information Retrieval (Memory)

Contextual Memory

  • Memobase: user profile-based memory (long-term user memory for genAI applications)

Control (tool-use, computer use, etc.)

Open-source

Personalities/Personas

Specific Uses for AI Assistants

Computer Use

Software Engineering

Science Agents

See Science Agents.

Medicine

LLM-as-judge

Deep Research

Advanced Workflows

Streamline Administrative Tasks

Author Research Articles

Software Development Workflows

Several paradigms of AI-assisted coding have arisen:

  1. Manual, human driven
  2. AI-aided through chat/dialogue, where the human asks for code and then copies it into the project
    1. OpenAI ChatGPT
    2. Anthropic Claude
  3. API calls to an LLM, which generates code and inserts the file into the project
  4. LLM-integration into the IDE
    1. Copilot
    2. Qodo (Codium) & AlphaCodium (preprint, code)
    3. Cursor
    4. Codeium Windsurf (with "Cascade" AI Agent)
    5. ByteDance Trae AI
    6. Tabnine
    7. Traycer
    8. IDX: free
    9. Aide: open-source AI-native code editor (fork of VS Code)
    10. continue.dev: open-source code assistant
    11. Pear AI: open-source code editor
    12. Haystack Editor: canvas UI
    13. Onlook: for designers
  5. AI-assisted IDE, where the AI generates and manages the dev environment
    1. Replit
    2. Aider (code): pair programming on the command line
    3. Pythagora
    4. StackBlitz bolt.new
    5. Cline (formerly Claude Dev)
  6. Prompt-to-product
    1. GitHub Spark (demo video)
    2. Create.xyz: text-to-app, replicate product from link
    3. a0.dev: generate mobile apps (from your phone)
    4. Softgen: web app developer
    5. wrapifai: build form-based apps
    6. Lovable: web app (from text, screenshot, etc.)
    7. Vercel v0
    8. MarsX (John Rush): SaaS builder
    9. Webdraw: turn sketches into web apps
    10. Tempo Labs: build React apps
    11. Databutton: no-code software development
    12. base44: no-code dashboard apps
    13. Origin AI
  7. Semi-autonomous software engineer agents
    1. Devin (Cognition AI)
    2. Amazon Q (and CodeWhisperer)
    3. Honeycomb
    4. Claude Code
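
Paradigm 3 in the list above (API calls to an LLM that generates code and inserts the file into the project) can be illustrated with a minimal Python sketch. The names `insert_generated_file`, `generate_code`, and `fake_llm` are hypothetical and introduced only for illustration; `fake_llm` stands in for a real LLM API call, which is not shown here. The path check is an added safety assumption, not something the list describes.

```python
import tempfile
from pathlib import Path
from typing import Callable


def insert_generated_file(
    project_root: str,
    relative_path: str,
    request: str,
    generate_code: Callable[[str], str],
) -> Path:
    """Ask a code generator for source text and write it into the project.

    `generate_code` is a hypothetical stand-in for an LLM API call.
    """
    root = Path(project_root).resolve()
    target = (root / relative_path).resolve()
    # Assumed safety rule: refuse to write outside the project directory.
    if root not in target.parents:
        raise ValueError(f"{relative_path!r} escapes the project root")
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(generate_code(request), encoding="utf-8")
    return target


def fake_llm(request: str) -> str:
    """Placeholder generator; a real system would call an LLM here."""
    return f"# Request: {request}\nprint('hello')\n"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as project:
        path = insert_generated_file(project, "src/hello.py", "say hello", fake_llm)
        print(path.read_text(encoding="utf-8"))
```

Passing the generator as a callable keeps the file-handling logic independent of any particular model provider.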

For a review of the current state of software-engineering agentic approaches, see:

Corporate AI Agent Ventures

Mundane Workflows and Capabilities

Inference-compute Reasoning

AI Assistant

Agentic Systems

Increasing AI Agent Intelligence

See: Increasing AI Intelligence

Multi-agent orchestration

Research

Organization Schemes

Societies and Communities of AI agents

Domain-specific

Research demos

Related work

Inter-agent communications

Architectures

Open Source Frameworks

Open Source Systems

Commercial Automation Frameworks

Spreadsheet

Cloud solutions

Frameworks

Optimization

Metrics, Benchmarks

Evaluation Schemes

Multi-agent

Agent Challenges

  • Aidan-Bench: Tests creativity by having an LLM generate a long sequence of outputs (each meant to be different) and measuring how long it can continue before duplicates appear.
  • Pictionary: An LLM suggests a prompt, multiple LLMs generate outputs, and an LLM judges them; this allows ranking of the models' generation abilities.
  • MC-bench: Asks LLMs to build an elaborate structure in Minecraft; outputs can be A/B tested by human judges.
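
The scoring ideas behind the challenges above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the benchmarks' actual implementations: `count_until_duplicate` treats two outputs as duplicates when they match exactly after whitespace and case normalization (a real benchmark may use a different similarity criterion), and `rank_by_wins` ranks models by a simple tally of pairwise wins from a judge, whether an LLM or a human. All function names are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(text.lower().split())


def count_until_duplicate(outputs: Iterable[str]) -> int:
    """Return how many outputs were produced before the first repeat."""
    seen = set()
    count = 0
    for output in outputs:
        key = normalize(output)
        if key in seen:
            break  # first duplicate ends the run
        seen.add(key)
        count += 1
    return count


def rank_by_wins(judgments: Iterable[Tuple[str, str]]) -> List[str]:
    """Rank models from (winner, loser) pairs by number of wins.

    Ties are broken alphabetically so the result is deterministic.
    """
    wins = Counter()
    for winner, loser in judgments:
        wins[winner] += 1
        wins[loser] += 0  # make sure models with no wins still appear
    return sorted(wins, key=lambda model: (-wins[model], model))


if __name__ == "__main__":
    print(count_until_duplicate(["a red fox", "a blue bird", "A  red fox"]))
    print(rank_by_wins([("m1", "m2"), ("m1", "m3"), ("m3", "m2")]))
```

A win tally is the simplest possible aggregation; rating schemes that account for opponent strength would be a natural refinement.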

Automated Improvement

See Also