AI safety

From GISAXS


Revision as of 14:42, 14 February 2025

Contents

1 Description of Safety Concerns
- 1.1 Medium-term Risks
- 1.2 Long-term (x-risk)
2 Learning Resources
3 Research

Description of Safety Concerns

Medium-term Risks

2023-04: "A.I. Dilemma – Tristan Harris and Aza Raskin" (video) (podcast transcript: .website-files.com/5f0e1294f002b1bb26e1f304/64224a9051a6637c1b60162a_65-your-undivided-attention-The-AI-Dilemma-transcript.pdf): raises concerns about humanity's ability to handle AI-driven transformations
2023-04: Daniel Schmachtenberger and Liv Boeree (video): AI could accelerate perverse social dynamics

Long-term (x-risk)

Instrumental Convergence

Learning Resources

DeepMind short course on AGI safety

Research

2022-12: Discovering Latent Knowledge in Language Models Without Supervision
2023-04: Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the MACHIAVELLI Benchmark
2023-05: Model evaluation for extreme risks (DeepMind)
