Difference between revisions of "AI safety"

From GISAXS
(Description of Safety Concerns)
(Proposals)
 
(One intermediate revision by the same user not shown)
Line 55: Line 55:
 
* 2025-01: [https://assets.publishing.service.gov.uk/media/679a0c48a77d250007d313ee/International_AI_Safety_Report_2025_accessible_f.pdf International AI Safety Report: The International Scientific Report on the Safety of Advanced AI (January 2025)]
 
* [https://ailabwatch.org/ AI Lab Watch] (safety scorecard)
 
 +
* 2026-03: [https://windowsontheory.org/2026/03/30/the-state-of-ai-safety-in-four-fake-graphs/ The state of AI safety in four fake graphs]
  
 
==Assessment==
Line 70: Line 71:
 
* 2025-04: Google DeepMind: [https://deepmind.google/discover/blog/taking-a-responsible-path-to-agi/ Taking a responsible path to AGI]
 
 
** Paper: [https://storage.googleapis.com/deepmind-media/DeepMind.com/Blog/evaluating-potential-cybersecurity-threats-of-advanced-ai/An_Approach_to_Technical_AGI_Safety_Apr_2025.pdf An Approach to Technical AGI Safety and Security]
 
 +
* 2026-04: Joe Carlsmith: [https://joecarlsmith.substack.com/p/video-and-transcript-of-talk-on-writing Writing AI constitutions]
  
 
=Research=
 

Latest revision as of 15:52, 14 April 2026

Learning Resources

Light

Deep

Description of Safety Concerns

Key Concepts

Medium-term Risks

Long-term (x-risk)

Status

Assessment

Policy

Proposals

Research

Demonstrations of Negative Use Capabilities

Threat Vectors

See Also