Difference between revisions of "AI in education"

From GISAXS
(AI out-performs humans)
(Various)
 
(19 intermediate revisions by the same user not shown)
Line 29: Line 29:
 
* [https://arxiv.org/abs/2308.02773 EduChat: A Large-Scale Language Model-based Chatbot System for Intelligent Education]
 
* [https://arxiv.org/abs/2308.02773 EduChat: A Large-Scale Language Model-based Chatbot System for Intelligent Education]
 
* [https://eurekalabs.ai/ Eureka Labs] (founded by [https://en.wikipedia.org/wiki/Andrej_Karpathy Andrej Karpathy]) aims to create AI-driven courses (first course is [https://github.com/karpathy/LLM101n Intro to LLMs])
 
* [https://eurekalabs.ai/ Eureka Labs] (founded by [https://en.wikipedia.org/wiki/Andrej_Karpathy Andrej Karpathy]) aims to create AI-driven courses (first course is [https://github.com/karpathy/LLM101n Intro to LLMs])
 +
 +
===LLMs===
 +
* 2024-12: [https://www.arxiv.org/abs/2412.16429 LearnLM: Improving Gemini for Learning]
 +
 
===Individual tools===
 
===Individual tools===
 
* Chatbot (OpenAI [https://chatgpt.com/ ChatGPT], Anthropic [https://www.anthropic.com/claude Claude], Google [https://gemini.google.com/app Gemini])
 
* Chatbot (OpenAI [https://chatgpt.com/ ChatGPT], Anthropic [https://www.anthropic.com/claude Claude], Google [https://gemini.google.com/app Gemini])
Line 66: Line 70:
 
* 2024-09: [https://docs.iza.org/dp17302.pdf Creative and Strategic Capabilities of Generative AI: Evidence from Large-Scale Experiments]
 
* 2024-09: [https://docs.iza.org/dp17302.pdf Creative and Strategic Capabilities of Generative AI: Evidence from Large-Scale Experiments]
  
===Various===
+
===Art===
 
* 2024-11: [https://doi.org/10.1038/s41598-024-76900-1 AI-generated poetry is indistinguishable from human-written poetry and is rated more favorably]
 
* 2024-11: [https://doi.org/10.1038/s41598-024-76900-1 AI-generated poetry is indistinguishable from human-written poetry and is rated more favorably]
 +
* 2024-11: [https://www.astralcodexten.com/p/how-did-you-do-on-the-ai-art-turing How Did You Do On The AI Art Turing Test?]
  
 
===Professions===
 
===Professions===
 +
* [https://agi.safe.ai/submit Humanity's Last Exam]
 +
** [https://x.com/alexandr_wang/status/1835738937719140440 Effort to build] a dataset of challenging (but resolvable) questions in specific domain areas, to act as a benchmark to test whether AIs are improving in these challenging topics.
 +
 +
====Medical====
 
* 2024-03: [https://www.medrxiv.org/content/10.1101/2024.03.12.24303785v1 Influence of a Large Language Model on Diagnostic Reasoning: A Randomized Clinical Vignette Study]
 
* 2024-03: [https://www.medrxiv.org/content/10.1101/2024.03.12.24303785v1 Influence of a Large Language Model on Diagnostic Reasoning: A Randomized Clinical Vignette Study]
 
** GPT-4 improves medical practitioner work; surprisingly, GPT-4 alone scored better than a human with GPT-4 as an aid (on selected tasks).
 
** GPT-4 improves medical practitioner work; surprisingly, GPT-4 alone scored better than a human with GPT-4 as an aid (on selected tasks).
 
* 2024-10: [https://doi.org/10.1001/jamanetworkopen.2024.38535 Perspectives on Artificial Intelligence–Generated Responses to Patient Messages]
 
* 2024-10: [https://doi.org/10.1001/jamanetworkopen.2024.38535 Perspectives on Artificial Intelligence–Generated Responses to Patient Messages]
* [https://agi.safe.ai/submit Humanity's Last Exam]
+
* 2024-10: [https://jamanetwork.com/journals/jamanetworkopen/fullarticle/2825395 Large Language Model Influence on Diagnostic Reasoning: A Randomized Clinical Trial]
** [https://x.com/alexandr_wang/status/1835738937719140440 Effort to build] a dataset of challenging (but resolvable) questions in specific domain areas, to act as a benchmark to test whether AIs are improving in these challenging topics.
+
** Use of ChatGPT does not strongly improve medical expert work, but AI alone out-scores both unaided humans and humans assisted by AI.
 +
* 2024-11: [https://www.nature.com/articles/s41562-024-02046-9 Large language models surpass human experts in predicting neuroscience results] (writeup: [https://medicalxpress.com/news/2024-11-ai-neuroscience-results-human-experts.html AI can predict neuroscience study results better than human experts, study finds])
 +
* 2024-12: [https://www.arxiv.org/abs/2412.10849 Superhuman performance of a large language model on the reasoning tasks of a physician]
 +
* 2024-12: [https://arxiv.org/abs/2412.18925 HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs]
  
 
==AI improves human work==
 
==AI improves human work==
Line 81: Line 93:
 
* 2023-11: [https://www.nber.org/papers/w31161 Generative AI at Work] (National Bureau of Economic Research)
 
* 2023-11: [https://www.nber.org/papers/w31161 Generative AI at Work] (National Bureau of Economic Research)
 
* 2023-12: [https://osf.io/hdjpk The Uneven Impact of Generative AI on Entrepreneurial Performance] ([https://doi.org/10.31219/osf.io/hdjpk doi: 10.31219/osf.io/hdjpk])
 
* 2023-12: [https://osf.io/hdjpk The Uneven Impact of Generative AI on Entrepreneurial Performance] ([https://doi.org/10.31219/osf.io/hdjpk doi: 10.31219/osf.io/hdjpk])
 +
* 2023-12: [https://arxiv.org/abs/2312.05481 Artificial Intelligence in the Knowledge Economy]: Non-autonomous AI (chatbots) benefits the least knowledgeable workers; autonomous agents benefit the most knowledgeable workers.
 
* 2024-07: [https://www.microsoft.com/en-us/research/publication/generative-ai-in-real-world-workplaces/ Generative AI in Real-World Workplaces: The Second Microsoft Report on AI and Productivity Research]
 
* 2024-07: [https://www.microsoft.com/en-us/research/publication/generative-ai-in-real-world-workplaces/ Generative AI in Real-World Workplaces: The Second Microsoft Report on AI and Productivity Research]
  
Line 90: Line 103:
 
===Forecasting===
 
===Forecasting===
 
* 2024-02: [https://arxiv.org/abs/2402.07862 AI-Augmented Predictions: LLM Assistants Improve Human Forecasting Accuracy]
 
* 2024-02: [https://arxiv.org/abs/2402.07862 AI-Augmented Predictions: LLM Assistants Improve Human Forecasting Accuracy]
 +
 +
===Finance===
 +
* 2024-12: [https://dx.doi.org/10.2139/ssrn.5075727 AI, Investment Decisions, and Inequality]: Novices see improvements in investment performance, sophisticated investors see even greater improvements.
 +
 
===Creativity===
 
===Creativity===
 
* 2024-07: [https://www.science.org/doi/10.1126/sciadv.adn5290 Generative AI enhances individual creativity but reduces the collective diversity of novel content]
 
* 2024-07: [https://www.science.org/doi/10.1126/sciadv.adn5290 Generative AI enhances individual creativity but reduces the collective diversity of novel content]
 
* 2024-08: [https://www.nature.com/articles/s41562-024-01953-1 An empirical investigation of the impact of ChatGPT on creativity]
 
* 2024-08: [https://www.nature.com/articles/s41562-024-01953-1 An empirical investigation of the impact of ChatGPT on creativity]
 +
* 2024-10: [https://arxiv.org/abs/2410.03703 Human Creativity in the Age of LLMs]
 
* 2024-11: [https://conference.nber.org/conf_papers/f210475.pdf Artificial Intelligence, Scientific Discovery, and Product Innovation]: diffusion model increases "innovation" (patents), boosts the best performers, but also removes some enjoyable tasks.
 
* 2024-11: [https://conference.nber.org/conf_papers/f210475.pdf Artificial Intelligence, Scientific Discovery, and Product Innovation]: diffusion model increases "innovation" (patents), boosts the best performers, but also removes some enjoyable tasks.
 +
* 2024-12: [https://doi.org/10.1080/10400419.2024.2440691 Using AI to Generate Visual Art: Do Individual Differences in Creativity Predict AI-Assisted Art Quality?] ([https://osf.io/preprints/psyarxiv/ygzw6 preprint]): shows that more creative humans produce more creative genAI outputs
  
 
===Counter loneliness===
 
===Counter loneliness===
 
* 2024-07: [https://arxiv.org/abs/2407.19096 AI Companions Reduce Loneliness]
 
* 2024-07: [https://arxiv.org/abs/2407.19096 AI Companions Reduce Loneliness]
  
==AI passes Turing Test==
+
==Human Perceptions of AI==
 +
* 2023-09: [https://www.nature.com/articles/d41586-023-02980-0 AI and science: what 1,600 researchers think. A Nature survey finds that scientists are concerned, as well as excited, by the increasing use of artificial-intelligence tools in research.]
 +
* 2024-11: [https://doi.org/10.1016/S2589-7500(24)00202-4 Attitudes and perceptions of medical researchers towards the use of artificial intelligence chatbots in the scientific process: an international cross-sectional survey] (Nature commentary: [https://www.nature.com/articles/s41592-024-02369-5 Quest for AI literacy])
 +
 
 +
===AI passes Turing Test===
 +
'''Text Dialog'''
 
* 2023-05: [https://arxiv.org/abs/2305.20010 Human or Not? A Gamified Approach to the Turing Test]
 
* 2023-05: [https://arxiv.org/abs/2305.20010 Human or Not? A Gamified Approach to the Turing Test]
 
* 2023-10: [https://arxiv.org/abs/2310.20216 Does GPT-4 pass the Turing test?]
 
* 2023-10: [https://arxiv.org/abs/2310.20216 Does GPT-4 pass the Turing test?]
 
* 2024-05: [https://arxiv.org/abs/2405.08007 People cannot distinguish GPT-4 from a human in a Turing test]
 
* 2024-05: [https://arxiv.org/abs/2405.08007 People cannot distinguish GPT-4 from a human in a Turing test]
 
* 2024-07: [https://arxiv.org/abs/2407.08853 GPT-4 is judged more human than humans in displaced and inverted Turing tests]
 
* 2024-07: [https://arxiv.org/abs/2407.08853 GPT-4 is judged more human than humans in displaced and inverted Turing tests]
 +
'''Art'''
 +
* 2024-11: [https://www.astralcodexten.com/p/how-did-you-do-on-the-ai-art-turing How Did You Do On The AI Art Turing Test?] Participants told AI art from human art only slightly better than chance (60%). AI art was often ranked higher than human-made art.
 +
* 2024-11: [https://doi.org/10.1038/s41598-024-76900-1 AI-generated poetry is indistinguishable from human-written poetry and is rated more favorably]
  
 
=Uptake=
 
=Uptake=
 +
* 2023-07: [https://doi.org/10.9734/ajrcos/2023/v16i4392 ChatGPT: Early Adopters, Teething Issues and the Way Forward]
 
* 2024-03: [https://arxiv.org/abs/2403.07183 Monitoring AI-Modified Content at Scale: A Case Study on the Impact of ChatGPT on AI Conference Peer Reviews]
 
* 2024-03: [https://arxiv.org/abs/2403.07183 Monitoring AI-Modified Content at Scale: A Case Study on the Impact of ChatGPT on AI Conference Peer Reviews]
 
* 2024-05: Humlum, Anders and Vestergaard, Emilie, [https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4827166 The Adoption of ChatGPT]. IZA Discussion Paper No. 16992 [http://dx.doi.org/10.2139/ssrn.4827166 doi: 10.2139/ssrn.4827166]
 
* 2024-05: Humlum, Anders and Vestergaard, Emilie, [https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4827166 The Adoption of ChatGPT]. IZA Discussion Paper No. 16992 [http://dx.doi.org/10.2139/ssrn.4827166 doi: 10.2139/ssrn.4827166]
Line 113: Line 141:
 
** 72% of leaders use genAI at least once a week (cf. 23% in 2023); 90% agree AI enhances skills (cf. 80% in 2023)
 
** 72% of leaders use genAI at least once a week (cf. 23% in 2023); 90% agree AI enhances skills (cf. 80% in 2023)
 
** Spending on genAI is up 130% (most companies plan to invest going forward)
 
** Spending on genAI is up 130% (most companies plan to invest going forward)
 +
 +
==Usage For==
 +
* 2024-12: [https://assets.anthropic.com/m/7e1ab885d1b24176/original/Clio-Privacy-Preserving-Insights-into-Real-World-AI-Use.pdf Clio: A system for privacy-preserving insights into real-world AI use] (Anthropic [https://www.anthropic.com/research/clio Clio])
  
 
=See Also=
 
=See Also=
 
* [https://www.google.com/books/edition/_/cKnYEAAAQBAJ?hl=en&gbpv=1&pg=PA2 UNESCO. Guidance for Generative AI in Education and Research]
 
* [https://www.google.com/books/edition/_/cKnYEAAAQBAJ?hl=en&gbpv=1&pg=PA2 UNESCO. Guidance for Generative AI in Education and Research]
 +
* [[AI]]

Latest revision as of 12:40, 7 January 2025

AI in Education

Survey/study of

AI improves learning/education

AI harms learning

Software/systems

LLMs

Individual tools

AI for grading

Detection

AI Text Detectors Don't Work

AI/human

AI out-performs humans

Tests

Creativity

Art

Professions

  • Humanity's Last Exam
    • Effort to build a dataset of challenging (but resolvable) questions in specific domain areas, to act as a benchmark to test whether AIs are improving in these challenging topics.

Medical

AI improves human work

Coding

Forecasting

Finance

Creativity

Counter loneliness

Human Perceptions of AI

AI passes Turing Test

Text Dialog

Art

Uptake

Usage For

See Also