Difference between revisions of "AI and Humans"

From GISAXS
(Medical)
 
(13 intermediate revisions by the same user not shown)
Line 100: Line 100:
 
* 2024-11: [https://www.astralcodexten.com/p/how-did-you-do-on-the-ai-art-turing How Did You Do On The AI Art Turing Test?]
 
  
===Marketing===
+
===Business & Marketing===
 
* 2023-11: [https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4597899 The power of generative marketing: Can generative AI create superhuman visual marketing content?]
 
 +
* 2024-02: [https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4714776 Generative Artificial Intelligence and Evaluating Strategic Decisions]
  
 
===Professions===
 
 
* [https://agi.safe.ai/submit Humanity's Last Exam]
 
 
** [https://x.com/alexandr_wang/status/1835738937719140440 Effort to build] a dataset of challenging (but resolvable) questions in specific domain areas, to act as a benchmark to test whether AIs are improving in these challenging topics.
 
 +
 +
====Coding====
 +
* 2025-02: [https://arxiv.org/abs/2502.06807 Competitive Programming with Large Reasoning Models]
  
 
====Medical====
 
Line 129: Line 133:
 
* 2025-04: [https://www.nature.com/articles/s41586-025-08866-7?linkId=13898052 Towards conversational diagnostic artificial intelligence]
 
 
* 2025-04: [https://www.nature.com/articles/s41586-025-08869-4?linkId=13898054 Towards accurate differential diagnosis with large language models]
 
 +
 +
====Bio====
 +
* 2025-04: [https://www.virologytest.ai/vct_paper.pdf Virology Capabilities Test (VCT): A Multimodal Virology Q&A Benchmark]
 +
** Time: [https://time.com/7279010/ai-virus-lab-biohazard-study/ Exclusive: AI Outsmarts Virus Experts in the Lab, Raising Biohazard Fears]
 +
** AI Frontiers: [https://www.ai-frontiers.org/articles/ais-are-disseminating-expert-level-virology-skills AIs Are Disseminating Expert-Level Virology Skills]
  
 
====Therapy====
 
Line 162: Line 171:
  
 
===Medical===
 
* 2025-03L: [https://www.medrxiv.org/content/10.1101/2025.02.28.25323115v1.full Medical Hallucination in Foundation Models and Their Impact on Healthcare]
+
* 2025-03: [https://www.medrxiv.org/content/10.1101/2025.02.28.25323115v1.full Medical Hallucination in Foundation Models and Their Impact on Healthcare]
 +
* 2025-03: [https://journals.lww.com/international-journal-of-surgery/fulltext/2025/03000/chatgpt_s_role_in_alleviating_anxiety_in_total.20.aspx ChatGPT’s role in alleviating anxiety in total knee arthroplasty consent process: a randomized controlled trial pilot study]
  
 
===Translation===
 
Line 185: Line 195:
 
* 2024-07: [https://arxiv.org/abs/2407.19096 AI Companions Reduce Loneliness]
 
 
* 2025-03: [https://dam-prod2.media.mit.edu/x/2025/03/21/Randomized_Control_Study_on_Chatbot_Psychosocial_Effect.pdf How AI and Human Behaviors Shape Psychosocial Effects of Chatbot Use: A Longitudinal Controlled Study]
 
 +
 +
==AI worse than humans==
 +
* 2025-04: [https://spinup-000d1a-wp-offload-media.s3.amazonaws.com/faculty/wp-content/uploads/sites/27/2025/03/AI-debt-collection-20250331.pdf How Good is AI at Twisting Arms? Experiments in Debt Collection]
  
 
==Human Perceptions of AI==
 
Line 198: Line 211:
 
* 2024-07: [https://arxiv.org/abs/2407.08853 GPT-4 is judged more human than humans in displaced and inverted Turing tests]
 
 
* 2025-03: [https://arxiv.org/abs/2503.23674 Large Language Models Pass the Turing Test]
 
 +
* 2025-04: [https://www.sciencedirect.com/science/article/abs/pii/S0022103117303980 A Minimal Turing Test]
 +
 
'''Art'''
 
 
* 2024-11: [https://www.astralcodexten.com/p/how-did-you-do-on-the-ai-art-turing How Did You Do On The AI Art Turing Test?] Differentiation was only slightly above random (60%). AI art was often ranked higher than human-made.
 
Line 225: Line 240:
 
==Usage For==
 
 
* 2024-12: [https://assets.anthropic.com/m/7e1ab885d1b24176/original/Clio-Privacy-Preserving-Insights-into-Real-World-AI-Use.pdf Clio: A system for privacy-preserving insights into real-world AI use] (Anthropic [https://www.anthropic.com/research/clio Clio])
 
 +
* 2025-03: [https://learn.filtered.com/hubfs/The%202025%20Top-100%20Gen%20AI%20Use%20Case%20Report.pdf How People are Really Using Generative AI Now] ([https://hbr.org/2025/04/how-people-are-really-using-gen-ai-in-2025 writeup])
 +
* 2025-04: [https://www.anthropic.com/news/anthropic-education-report-how-university-students-use-claude Anthropic Education Report: How University Students Use Claude]
  
 
=Sentiment=
 
Line 247: Line 264:
 
* 2024-10: [https://www.pnas.org/doi/10.1073/pnas.2407639121 Large Language Models based on historical text could offer informative tools for behavioral science]
 
 
* 2025-04: [https://arxiv.org/abs/2504.02234 LLM Social Simulations Are a Promising Research Method]
 
 +
* 2025-04: [https://www.nber.org/papers/w33662 Measuring Human Leadership Skills with AI Agents]
 +
* 2025-04: [https://arxiv.org/abs/2504.10157 SocioVerse: A World Model for Social Simulation Powered by LLM Agents and A Pool of 10 Million Real-World Users]
  
 
=See Also=
 
 
* [https://www.google.com/books/edition/_/cKnYEAAAQBAJ?hl=en&gbpv=1&pg=PA2 UNESCO. Guidance for Generative AI in Education and Research]
 
 
* [[AI]]
 

Latest revision as of 09:28, 28 April 2025

AI in Education

Survey/study of

AI improves learning/education

AI harms learning

Software/systems

LLMs

Individual tools

Systems

AI for grading

Detection

AI Text Detectors Don't Work

AI/human

Capabilities

Writing

AI out-performs humans

Tests

Creativity

Art

Business & Marketing

Professions

  • Humanity's Last Exam
    • Effort to build a dataset of challenging (but resolvable) questions in specific domain areas, to act as a benchmark to test whether AIs are improving in these challenging topics.

Coding

Medical

Bio

Therapy

Financial

AI improves human work

Coding

Forecasting

Finance

Law

Medical

Translation

Customer service

  • 2023-11: Generative AI at Work: Improvements for workers and clients (though also a ceiling to improvement)

Creativity

Equity

Counter loneliness

AI worse than humans

Human Perceptions of AI

AI passes Turing Test

Text Dialog

Art

Psychological Effects of AI Usage

Uptake

Usage For

Sentiment

Persuasion

(AI can update beliefs, change opinions, tackle conspiracy theories, etc.)

Simulate Humans

See Also