(One intermediate revision by the same user not shown) |
Line 1: |
− | =LLM=
| + | Up-to-date list [http://gisaxs.com/CS/index.php/AI_tools here]. |
− | ==Open-weights LLM==
| |
− | * [https://blogs.nvidia.com/blog/nemotron-4-synthetic-data-generation-llm-training/ Nemotron-4 340B]
| |
− | | |
− | ==Cloud LLM==
| |
− | * [https://groq.com/ Groq] [https://wow.groq.com/ cloud] (very fast inference)
| |
− | | |
− | ===Multi-modal: Audio===
| |
− | * [https://kyutai.org/ Kyutai Open Science AI Lab] chatbot [https://www.us.moshi.chat/?queue_id=talktomoshi Moshi]
| |
− | | |
− | ==Triage==
| |
− | * [https://arxiv.org/abs/2406.18665 RouteLLM: Learning to Route LLMs with Preference Data]
| |
− | | |
− | ==Retrieval Augmented Generation (RAG)==
| |
− | * GraphRAG ([https://arxiv.org/abs/2404.16130 preprint], [https://github.com/microsoft/graphrag code])
| |
− | | |
− | ==Automatic Optimization==
| |
− | ===Analogous to Gradient Descent===
| |
− | * [https://arxiv.org/abs/2406.07496 TextGrad: Automatic "Differentiation" via Text]
| |
− | * [https://arxiv.org/abs/2406.18532 Symbolic Learning Enables Self-Evolving Agents]
| |
− | | |
− | ==LLM for scoring/ranking==
| |
− | * [https://arxiv.org/abs/2302.04166 GPTScore: Evaluate as You Desire]
| |
− | * [https://arxiv.org/abs/2306.17563 Large Language Models are Effective Text Rankers with Pairwise Ranking Prompting]
| |
− | * [https://doi.org/10.1039/D3DD00112A Domain-specific chatbots for science using embeddings]
| |
− | * [https://arxiv.org/abs/2407.02977 Large Language Models as Evaluators for Scientific Synthesis]
| |
− | | |
− | =LLM Agents=
| |
− | | |
− | ==Multi-agent orchestration==
| |
− | ===Research demos===
| |
− | * [https://github.com/camel-ai/camel Camel]
| |
− | * [https://github.com/farizrahman4u/loopgpt/tree/main LoopGPT]
| |
− | * [https://github.com/microsoft/JARVIS JARVIS]
| |
− | * [https://github.com/agiresearch/OpenAGI OpenAGI]
| |
− | * [https://github.com/microsoft/autogen AutoGen]
| |
− | * [https://github.com/microsoft/TaskWeaver TaskWeaver]
| |
− | * [https://github.com/geekan/MetaGPT MetaGPT]
| |
− | | |
− | ===Architectures===
| |
− | * [https://arxiv.org/abs/2406.04692 Mixture-of-Agents Enhances Large Language Model Capabilities]
| |
− | | |
− | ===Cloud solutions===
| |
− | * [https://numbersstation.ai/ Numbers Station] [https://numbersstation.ai/introducing-meadow-llm-agents-for-data-tasks/ Meadow]: agentic framework for data workflows ([https://github.com/NumbersStationAI/meadow code]).
| |
− | * [https://www.crewai.com/ CrewAI] advertises multi-agent automations ([https://github.com/joaomdmoura/crewAI code]).
| |
− | * [https://www.langchain.com/ LangChain] introduced [https://www.langchain.com/langgraph?ref=blog.langchain.dev LangGraph] to help build agents, and [https://blog.langchain.dev/langgraph-cloud/ LangGraph Cloud] as a service for running those agents.
| |
− | | |
− | ==Optimization==
| |
− | ===Metrics, Benchmarks===
| |
− | * [https://arxiv.org/abs/2407.01502 AI Agents That Matter]
| |
− | | |
− | ===Evolution Improvement===
| |
− | * [https://arxiv.org/abs/2406.14228 EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms]
| |
− | | |
− | =Text-to-speech (TTS)=
| |
− | ==Open Source==
| |
− | * [https://github.com/huggingface/parler-tts Parler TTS]
| |
− | * [https://github.com/DigitalPhonetics/IMS-Toucan Toucan] ([https://huggingface.co/spaces/Flux9665/MassivelyMultilingualTTS demo])
| |
− | * [https://tts.themetavoice.xyz/ MetaVoice] ([https://github.com/metavoiceio/metavoice-src github])
| |
− | * [https://github.com/2noise/ChatTTS ChatTTS]
| |
− | * [https://www.camb.ai/ Camb.ai] [https://github.com/Camb-ai/MARS5-TTS MARS5-TTS]
| |
− | | |
− | ==Cloud==
| |
− | * [https://elevenlabs.io/ ElevenLabs]
| |
− | ** [https://elevenlabs.io/voice-isolator voice isolator]
| |
− | * [https://cartesia.ai/ Cartesia] [https://cartesia.ai/sonic Sonic]
| |
− | | |
− | =Conversational Audio Chatbot=
| |
− | * Swift, a fast AI voice assistant ([https://github.com/ai-ng/swift code], [https://swift-ai.vercel.app/ live demo]), uses:
| |
− | ** [https://groq.com/ Groq] cloud running [https://github.com/openai/whisper OpenAI Whisper] for fast speech transcription.
| |
− | ** [https://cartesia.ai/ Cartesia] [https://cartesia.ai/sonic Sonic] for fast speech synthesis
| |
− | ** [https://www.vad.ricky0123.com/ VAD] to detect when the user is talking
| |
− | ** [https://vercel.com/ Vercel] for app deployment
| |
− | * [https://kyutai.org/ Kyutai] Moshi chatbot ([https://us.moshi.chat/ demo])
| |
− | | |
− | =Vision=
| |
− | * [https://github.com/roboflow/supervision Supervision]
| |
− | * [https://arxiv.org/abs/2311.06242 Florence-2]
| |