Difference between revisions of "AI safety"
KevinYager (talk | contribs) (Created page with "=Learning Resources= * [https://deepmindsafetyresearch.medium.com/introducing-our-short-course-on-agi-safety-1072adb7912c DeepMind short course on AGI safety]") |
KevinYager (talk | contribs) |
||
Line 1: | Line 1: | ||
+ | |||
+ | =Description of Safety Concerns= | ||
+ | ==Medium-term Risks== | ||
+ | * 2023-04: [https://www.youtube.com/watch?v=xoVJKj8lcNQ A.I. Dilemma – Tristan Harris and Aza Raskin (video)] ([https://assets-global.website-files.com/5f0e1294f002b1bb26e1f304/64224a9051a6637c1b60162a_65-your-undivided-attention-The-AI-Dilemma-transcript.pdf podcast transcript]): raises concerns about humanity's ability to handle these transformations | ||
+ | * 2023-04: [https://www.youtube.com/watch?v=KCSsKV5F4xc Daniel Schmachtenberger and Liv Boeree (video)]: AI could accelerate perverse social dynamics | ||
+ | |||
+ | ==Long-term (x-risk)== | ||
+ | * [https://en.wikipedia.org/wiki/Instrumental_convergence Instrumental Convergence] | ||
+ | |||
=Learning Resources= | =Learning Resources= | ||
* [https://deepmindsafetyresearch.medium.com/introducing-our-short-course-on-agi-safety-1072adb7912c DeepMind short course on AGI safety] | * [https://deepmindsafetyresearch.medium.com/introducing-our-short-course-on-agi-safety-1072adb7912c DeepMind short course on AGI safety] | ||
+ | |||
+ | =Research= | ||
+ | * 2023-04: [https://arxiv.org/abs/2304.03279 Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the MACHIAVELLI Benchmark] |
Revision as of 13:38, 14 February 2025
Description of Safety Concerns
Medium-term Risks
- 2023-04: A.I. Dilemma – Tristan Harris and Aza Raskin (video) (podcast transcript): raises concerns about humanity's ability to handle these transformations
- 2023-04: Daniel Schmachtenberger and Liv Boeree (video): AI could accelerate perverse social dynamics