Difference between revisions of "AI safety"

From GISAXS
Jump to: navigation, search
(Medium-term Risks)
(Research)
 
(One intermediate revision by the same user not shown)
Line 35: Line 35:
 
* 2025-04: [https://www.forethought.org/research/ai-enabled-coups-how-a-small-group-could-use-ai-to-seize-power AI-Enabled Coups: How a Small Group Could Use AI to Seize Power]
 
* 2025-04: [https://www.forethought.org/research/ai-enabled-coups-how-a-small-group-could-use-ai-to-seize-power AI-Enabled Coups: How a Small Group Could Use AI to Seize Power]
 
* 2025-06: [https://arxiv.org/abs/2506.20702 The Singapore Consensus on Global AI Safety Research Priorities]
 
* 2025-06: [https://arxiv.org/abs/2506.20702 The Singapore Consensus on Global AI Safety Research Priorities]
* 2026-01: [https://www.science.org/doi/10.1126/science.adz1697 How malicious AI swarms can threaten democracy] (Science Magazine, [https://arxiv.org/abs/2506.06299 preprint])
+
* 2026-01: [https://www.science.org/doi/10.1126/science.adz1697 How malicious AI swarms can threaten democracy: The fusion of agentic AI and LLMs marks a new frontier in information warfare] (Science Magazine, [https://arxiv.org/abs/2506.06299 preprint])
  
 
==Long-term  (x-risk)==
 
==Long-term  (x-risk)==
Line 65: Line 65:
  
 
=Research=
 
=Research=
 +
* 2008: [https://selfawaresystems.com/wp-content/uploads/2008/01/ai_drives_final.pdf The Basic AI Drives]
 
* 2022-09: [https://arxiv.org/abs/2209.00626v1 The alignment problem from a deep learning perspective]
 
* 2022-09: [https://arxiv.org/abs/2209.00626v1 The alignment problem from a deep learning perspective]
 
* 2022-12: [https://arxiv.org/abs/2212.03827 Discovering Latent Knowledge in Language Models Without Supervision]
 
* 2022-12: [https://arxiv.org/abs/2212.03827 Discovering Latent Knowledge in Language Models Without Supervision]

Latest revision as of 14:40, 26 January 2026

Learning Resources

Light

Deep

Description of Safety Concerns

Key Concepts

Medium-term Risks

Long-term (x-risk)

Status

Assessmment

Policy

Proposals

Research

Demonstrations of Negative Use Capabilities

Threat Vectors

See Also