Difference between revisions of "AI:tools"

=LLM=
An up-to-date list is maintained [http://gisaxs.com/CS/index.php/AI_tools here].
==Open-weights LLM==
 
* [https://blogs.nvidia.com/blog/nemotron-4-synthetic-data-generation-llm-training/ Nemotron-4 340B]
 
 
 
==Cloud LLM==
 
* [https://groq.com/ Groq] [https://wow.groq.com/ cloud] (very fast inference)
 
 
 
===Multi-modal: Audio===
 
* [https://kyutai.org/ Kyutai Open Science AI Lab] chatbot [https://www.us.moshi.chat/?queue_id=talktomoshi Moshi]
 
 
 
==Triage==
 
* [https://arxiv.org/abs/2406.18665 RouteLLM: Learning to Route LLMs with Preference Data]
 
 
 
==Retrieval Augmented Generation (RAG)==
 
* GraphRAG ([https://arxiv.org/abs/2404.16130 preprint], [https://github.com/microsoft/graphrag code])
 
 
 
==Automatic Optimization==
 
===Analogous to Gradient Descent===
 
* [https://arxiv.org/abs/2406.07496 TextGrad: Automatic "Differentiation" via Text]
 
* [https://arxiv.org/abs/2406.18532 Symbolic Learning Enables Self-Evolving Agents]
 
 
 
==LLM for scoring/ranking==
 
* [https://arxiv.org/abs/2302.04166 GPTScore: Evaluate as You Desire]
 
* [https://arxiv.org/abs/2306.17563 Large Language Models are Effective Text Rankers with Pairwise Ranking Prompting]
 
* [https://doi.org/10.1039/D3DD00112A Domain-specific chatbots for science using embeddings]
 
* [https://arxiv.org/abs/2407.02977 Large Language Models as Evaluators for Scientific Synthesis]
 
 
 
=LLM Agents=
 
 
 
==Multi-agent orchestration==
 
===Research demos===
 
* [https://github.com/camel-ai/camel Camel]
 
* [https://github.com/farizrahman4u/loopgpt/tree/main LoopGPT]
 
* [https://github.com/microsoft/JARVIS JARVIS]
 
* [https://github.com/agiresearch/OpenAGI OpenAGI]
 
* [https://github.com/microsoft/autogen AutoGen]
 
* [https://github.com/microsoft/TaskWeaver TaskWeaver]
 
* [https://github.com/geekan/MetaGPT MetaGPT]
 
 
 
===Architectures===
 
* [https://arxiv.org/abs/2406.04692 Mixture-of-Agents Enhances Large Language Model Capabilities]
 
 
 
===Cloud solutions===
 
* [https://numbersstation.ai/ Numbers Station] [https://numbersstation.ai/introducing-meadow-llm-agents-for-data-tasks/ Meadow]: agentic framework for data workflows ([https://github.com/NumbersStationAI/meadow code]).
 
* [https://www.crewai.com/ CrewAI] says they provide multi-agent automations ([https://github.com/joaomdmoura/crewAI code]).
 
* [https://www.langchain.com/ LangChain] introduced [https://www.langchain.com/langgraph?ref=blog.langchain.dev LangGraph] to help build agents, and [https://blog.langchain.dev/langgraph-cloud/ LangGraph Cloud] as a service for running those agents.
 
 
 
==Optimization==
 
===Metrics, Benchmarks===
 
* [https://arxiv.org/abs/2407.01502 AI Agents That Matter]
 
 
 
===Evolution Improvement===
 
* [https://arxiv.org/abs/2406.14228 EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms]
 
 
 
=Text-to-speech (TTS)=
 
==Open Source==
 
* [https://github.com/huggingface/parler-tts Parler TTS]
 
* [https://github.com/DigitalPhonetics/IMS-Toucan Toucan] ([https://huggingface.co/spaces/Flux9665/MassivelyMultilingualTTS demo])
 
* [https://tts.themetavoice.xyz/ MetaVoice] ([https://github.com/metavoiceio/metavoice-src github])
 
* [https://github.com/2noise/ChatTTS ChatTTS]
 
* [https://www.camb.ai/ Camb.ai] [https://github.com/Camb-ai/MARS5-TTS MARS5-TTS]
 
 
 
==Cloud==
 
* [https://elevenlabs.io/ ElevenLabs]
 
** [https://elevenlabs.io/voice-isolator voice isolator]
 
* [https://cartesia.ai/ Cartesia] [https://cartesia.ai/sonic Sonic]
 
 
 
=Conversational Audio Chatbot=
 
* Swift, a fast AI voice assistant ([https://github.com/ai-ng/swift code], [https://swift-ai.vercel.app/ live demo]), uses:
 
** [https://groq.com/ Groq] cloud running [https://github.com/openai/whisper OpenAI Whisper] for fast speech transcription.
 
** [https://cartesia.ai/ Cartesia] [https://cartesia.ai/sonic Sonic] for fast speech synthesis
 
** [https://www.vad.ricky0123.com/ VAD] to detect when user is talking
 
** [https://vercel.com/ Vercel] for app deployment
 
* [https://kyutai.org/ Kyutai] Moshi chatbot ([https://us.moshi.chat/ demo])
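
The stages listed for Swift above (voice activity detection, speech transcription, speech synthesis) can be pictured as a simple per-turn pipeline. The following is only a rough sketch under stated assumptions, not Swift's actual code: the class <code>VoicePipeline</code>, its field names, and the <code>fake_*</code> stand-in functions are all hypothetical, and the text-reply stage between transcription and synthesis is assumed rather than taken from the list. In a real deployment each callable would wrap the corresponding service.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class VoicePipeline:
    """Hypothetical per-turn voice pipeline; each stage is a pluggable callable."""
    is_speech: Callable[[bytes], bool]   # VAD stage: is the user talking in this frame?
    transcribe: Callable[[bytes], str]   # speech-to-text stage
    reply: Callable[[str], str]          # text response stage (assumed, not in the list above)
    synthesize: Callable[[str], bytes]   # text-to-speech stage

    def run_turn(self, frames: List[bytes]) -> bytes:
        """Keep only voiced frames, then transcribe, generate a reply, and synthesize it."""
        voiced = b"".join(f for f in frames if self.is_speech(f))
        if not voiced:
            return b""  # nothing was said, so there is nothing to answer
        text = self.transcribe(voiced)
        return self.synthesize(self.reply(text))


# Stand-in stages so the sketch runs without any network access.
def fake_vad(frame: bytes) -> bool:
    return any(frame)  # treat any non-zero byte as "speech"


def fake_transcribe(audio: bytes) -> str:
    return f"{len(audio)} bytes of speech"


def fake_reply(text: str) -> str:
    return f"You said: {text}"


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = VoicePipeline(fake_vad, fake_transcribe, fake_reply, fake_synthesize)
```

Keeping each stage behind a plain callable means any one service can be swapped without touching the turn logic.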
 
 
 
=Vision=
 
* [https://github.com/roboflow/supervision Supervision]
 
* [https://arxiv.org/abs/2311.06242 Florence-2]
 

Latest revision as of 10:59, 22 August 2024
