Difference between revisions of "AI and Humans"

From GISAXS
(AI improves human work)
(Software/systems)
 
(18 intermediate revisions by the same user not shown)
Line 20: Line 20:
 
** 6 weeks of after-school AI tutoring = 2 years of typical learning gains
 
 
** outperforms 80% of other educational interventions
 
 +
* [https://arxiv.org/abs/2409.09047 AI Meets the Classroom: When Do Large Language Models Harm Learning?]
 +
** Outcomes depend on usage
 +
* [https://www.deeplearning.ai/the-batch/gpt-4-boosts-remote-tutors-performance-in-real-time-study-finds/ LLM Support for Tutors: GPT-4 boosts remote tutors’ performance in real time, study finds]
 +
** [https://arxiv.org/abs/2410.03017 Tutor CoPilot: A Human-AI Approach for Scaling Real-Time Expertise]
  
 
==AI harms learning==
 
Line 40: Line 44:
 
* [https://notebooklm.google.com/ NotebookLM]: Enables one to "chat with documents".
 
 
* Google [https://learning.google.com/experiments/learn-about/signup Learn About]
 
 +
 +
===Systems===
 +
* [https://www.anthropic.com/news/introducing-claude-for-education Anthropic] [https://www.anthropic.com/education Claude for Education]
  
 
==AI for grading==
 
Line 55: Line 62:
  
 
=AI/human=
 
 +
==Capabilities==
 +
===Writing===
 +
 +
* 2022-12: [https://aclanthology.org/2022.emnlp-main.296/ Re3: Generating Longer Stories With Recursive Reprompting and Revision]
 +
* 2023-03: English essays: [https://journal.unnes.ac.id/sju/index.php/elt/article/view/64069 Artificial intelligence (AI) technology in OpenAI ChatGPT application: A review of ChatGPT in writing English essay]
 +
* 2023-01: Journalism: [https://journals.sagepub.com/doi/10.1177/10776958221149577 Collaborating With ChatGPT: Considering the Implications of Generative Artificial Intelligence for Journalism and Media Education]
 +
* 2023-07: Science writing: [https://www.rbmojournal.com/article/S1472-6483(23)00219-5/fulltext Artificial intelligence in scientific writing: a friend or a foe?]
 +
* 2024-02: Wikipedia style: [https://arxiv.org/abs/2402.14207 Assisting in Writing Wikipedia-like Articles From Scratch with Large Language Models]
 +
* 2024-08: [https://arxiv.org/abs/2408.07055 LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs] ([https://github.com/THUDM/LongWriter code])
 +
* 2024-08: Scientific papers: [The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery]
 +
* 2024-09: PaperQA2: [https://paper.wikicrow.ai/ Language Models Achieve Superhuman Synthesis of Scientific Knowledge] ([https://x.com/SGRodriques/status/1833908643856818443 𝕏 post], [https://github.com/Future-House/paper-qa code])
 +
* 2025-03: [https://arxiv.org/abs/2503.19065 WikiAutoGen: Towards Multi-Modal Wikipedia-Style Article Generation]
 +
 
==AI out-performs humans==
 
 
===Tests===
 
Line 96: Line 116:
 
** 2024-01: [https://arxiv.org/abs/2401.05654 Towards Conversational Diagnostic AI] ([https://research.google/blog/amie-a-research-ai-system-for-diagnostic-medical-reasoning-and-conversations/ blog]: Articulate Medical Intelligence Explorer, AMIE)
 
 
** 2025-03: [https://www.gstatic.com/amie/towards_conversational_ai_for_disease_management.pdf Towards Conversational AI for Disease Management] ([https://research.google/blog/from-diagnosis-to-treatment-advancing-amie-for-longitudinal-disease-management/ blog])
 
 +
* 2025-02: [https://arxiv.org/abs/2502.19655 Med-RLVR: Emerging Medical Reasoning from a 3B base model via reinforcement Learning]
 +
* 2025-03: [https://arxiv.org/abs/2503.13939 Med-R1: Reinforcement Learning for Generalizable Medical Reasoning in Vision-Language Models]
  
 
====Therapy====
 
 
* 2025-02: [https://journals.plos.org/mentalhealth/article?id=10.1371/journal.pmen.0000145 When ELIZA meets therapists: A Turing test for the heart and mind]
 
 +
* 2025-03: Therabot: [https://ai.nejm.org/doi/full/10.1056/AIoa2400802 Randomized Trial of a Generative AI Chatbot for Mental Health Treatment]
  
 
====Financial====
 
Line 110: Line 133:
 
* 2023-12: [https://arxiv.org/abs/2312.05481 Artificial Intelligence in the Knowledge Economy]: Non-autonomous AI (chatbot) benefits least knowledgeable workers; autonomous agents benefit the most knowledgeable workers
 
 
* 2024-07: [https://www.microsoft.com/en-us/research/publication/generative-ai-in-real-world-workplaces/ Generative AI in Real-World Workplaces: The Second Microsoft Report on AI and Productivity Research]
 
 +
* 2025-03: [https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5188231 The Cybernetic Teammate: A Field Experiment on Generative AI Reshaping Teamwork and Expertise]
 +
** 2025-03: Ethan Mollick: [https://www.oneusefulthing.org/p/the-cybernetic-teammate The Cybernetic Teammate]: Having an AI on your team can increase performance, provide expertise, and improve your experience
  
 
===Coding===
 
Line 138: Line 163:
 
* 2024-11: [https://conference.nber.org/conf_papers/f210475.pdf Artificial Intelligence, Scientific Discovery, and Product Innovation]: diffusion model increases "innovation" (patents), boosts the best performers, but also removes some enjoyable tasks.
 
 
* 2024-12: [https://doi.org/10.1080/10400419.2024.2440691 Using AI to Generate Visual Art: Do Individual Differences in Creativity Predict AI-Assisted Art Quality?] ([https://osf.io/preprints/psyarxiv/ygzw6 preprint]): shows that more creative humans produce more creative genAI outputs
 
 +
* 2025-01: [https://arxiv.org/abs/2501.11433 One Does Not Simply Meme Alone: Evaluating Co-Creativity Between LLMs and Humans in the Generation of Humor]
  
 
===Equity===
 
Line 144: Line 170:
 
===Counter loneliness===
 
 
* 2024-07: [https://arxiv.org/abs/2407.19096 AI Companions Reduce Loneliness]
 
 +
* 2025-03: [https://dam-prod2.media.mit.edu/x/2025/03/21/Randomized_Control_Study_on_Chatbot_Psychosocial_Effect.pdf How AI and Human Behaviors Shape Psychosocial Effects of Chatbot Use: A Longitudinal Controlled Study]
  
 
==Human Perceptions of AI==
 
 
* 2023-09: [https://www.nature.com/articles/d41586-023-02980-0 AI and science: what 1,600 researchers think. A Nature survey finds that scientists are concerned, as well as excited, by the increasing use of artificial-intelligence tools in research.]
 
 
* 2024-11: [https://doi.org/10.1016/S2589-7500(24)00202-4 Attitudes and perceptions of medical researchers towards the use of artificial intelligence chatbots in the scientific process: an international cross-sectional survey] (Nature commentary: [https://www.nature.com/articles/s41592-024-02369-5 Quest for AI literacy])
 
 +
* 2025-03: [https://www.arxiv.org/abs/2503.16458 Users Favor LLM-Generated Content -- Until They Know It's AI]
  
 
===AI passes Turing Test===
 
Line 155: Line 183:
 
* 2024-05: [https://arxiv.org/abs/2405.08007 People cannot distinguish GPT-4 from a human in a Turing test]
 
 
* 2024-07: [https://arxiv.org/abs/2407.08853 GPT-4 is judged more human than humans in displaced and inverted Turing tests]
 
 +
* 2025-03: [https://arxiv.org/abs/2503.23674 Large Language Models Pass the Turing Test]
 
'''Art'''
 
 
* 2024-11: [https://www.astralcodexten.com/p/how-did-you-do-on-the-ai-art-turing How Did You Do On The AI Art Turing Test?] Differentiation was only slightly above random (60%). AI art was often ranked higher than human-made.
 
 
* 2024-11: [https://doi.org/10.1038/s41598-024-76900-1 AI-generated poetry is indistinguishable from human-written poetry and is rated more favorably]
 
 +
 +
==Psychological Effects of AI Usage==
 +
* 2025-03: [https://cdn.openai.com/papers/15987609-5f71-433c-9972-e91131f399a1/openai-affective-use-study.pdf Investigating Affective Use and Emotional Well-being on ChatGPT]
 +
* 2025-03: [https://dam-prod2.media.mit.edu/x/2025/03/21/Randomized_Control_Study_on_Chatbot_Psychosocial_Effect.pdf How AI and Human Behaviors Shape Psychosocial Effects of Chatbot Use: A Longitudinal Controlled Study]
 +
* 2025-03: [https://www.microsoft.com/en-us/research/publication/the-impact-of-generative-ai-on-critical-thinking-self-reported-reductions-in-cognitive-effort-and-confidence-effects-from-a-survey-of-knowledge-workers/ The Impact of Generative AI on Critical Thinking: Self-Reported Reductions in Cognitive Effort and Confidence Effects From a Survey of Knowledge Workers]
  
 
=Uptake=
 
Line 169: Line 203:
 
** 72% of leaders use genAI at least once a week (cf. 23% in 2023); 90% agree AI enhances skills (cf. 80% in 2023)
 
** Spending on genAI is up 130% (most companies plan to invest going forward)
 
 +
* 2024-12: [https://www.pnas.org/doi/10.1073/pnas.2414972121 The unequal adoption of ChatGPT exacerbates existing inequalities among workers]
 +
** Higher adoption among young and less experienced
 +
** Lower adoption among women and lower-earning workers
 
* 2025-02: [https://arxiv.org/abs/2502.09747 The Widespread Adoption of Large Language Model-Assisted Writing Across Society]: 10-25% adoption across a range of contexts
 
 
* 2025-02: [https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5078805 Local Heterogeneity in Artificial Intelligence Jobs Over Time and Space]
 

Latest revision as of 09:09, 3 April 2025

AI in Education

Survey/study of

AI improves learning/education

AI harms learning

Software/systems

LLMs

Individual tools

Systems

AI for grading

Detection

AI Text Detectors Don't Work

AI/human

Capabilities

Writing

AI out-performs humans

Tests

Creativity

Art

Professions

  • Humanity's Last Exam
    • Effort to build a dataset of challenging (but resolvable) questions in specific domain areas, to act as a benchmark to test whether AIs are improving in these challenging topics.

Medical

Therapy

Financial

AI improves human work

Coding

Forecasting

Finance

Law

Medical

Translation

Creativity

Equity

Counter loneliness

Human Perceptions of AI

AI passes Turing Test

Text Dialog

Art

Psychological Effects of AI Usage

Uptake

Usage For

Persuasion

(AI can update beliefs, change opinions, tackle conspiracy theories, etc.)

See Also