Difference between revisions of "AI safety"

From GISAXS

==Long-term (x-risk)==
 
* 2024-11: Marcus Arvan: [https://link.springer.com/article/10.1007/s00146-024-02113-9 ‘Interpretability’ and ‘alignment’ are fool’s errands: a proof that controlling misaligned large language models is the best anyone can hope for]
 
* 2025-04: [https://michaelnotebook.com/xriskbrief/index.html ASI existential risk: reconsidering alignment as a goal]
 
* 2025-12: Philip Trammell and Leopold Aschenbrenner: [https://philiptrammell.com/static/Existential_Risk_and_Growth.pdf Existential Risk and Growth]
  
 
=Status=


Contents

* Learning Resources
** Light
** Deep
* Description of Safety Concerns
** Key Concepts
** Medium-term Risks
** Long-term (x-risk)
* Status
** Assessment
** Policy
** Proposals
* Research
** Demonstrations of Negative Use Capabilities
** Threat Vectors
* See Also