AI safety

From GISAXS (revision as of 14:41, 10 April 2025)
 
==Long-term (x-risk)==

* [https://www.lesswrong.com/posts/uMQ3cqWDPHhjtiesc/agi-ruin-a-list-of-lethalities AGI Ruin: A List of Lethalities] (Eliezer Yudkowsky)
* [https://link.springer.com/article/10.1007/s00146-024-02113-9 ‘Interpretability’ and ‘alignment’ are fool’s errands: a proof that controlling misaligned large language models is the best anyone can hope for] (Marcus Arvan)
=Status=

* 2025-02: [https://arxiv.org/abs/2502.18359 Responsible AI Agents]
* 2025-03: [https://controlai.com/ Control AI]: [https://controlai.com/dip The Direct Institutional Plan]
* 2025-04: Google DeepMind: [https://deepmind.google/discover/blog/taking-a-responsible-path-to-agi/ Taking a responsible path to AGI]
** Paper: [https://storage.googleapis.com/deepmind-media/DeepMind.com/Blog/evaluating-potential-cybersecurity-threats-of-advanced-ai/An_Approach_to_Technical_AGI_Safety_Apr_2025.pdf An Approach to Technical AGI Safety and Security]

=Research=

Contents:
* Learning Resources
* Light
* Deep
* Description of Safety Concerns
* Key Concepts
* Medium-term Risks
* Long-term (x-risk)
* Status
* Policy
* Proposals
* Research
* Demonstrations of Negative Use Capabilities
* See Also