Difference between revisions of "AI benchmarks"

From GISAXS
(Task Length)
  
 
==Task Length==
 
* 2020-09: Ajeya Cotra: [https://www.lesswrong.com/posts/KrJfoZzpSDpnrv9va/draft-report-on-ai-timelines Draft report on AI timelines]
 
* 2025-03: [https://arxiv.org/abs/2503.14499 Measuring AI Ability to Complete Long Tasks]
 
 
[[Image:GmZHL8xWQAAtFlF.jpeg|450px]]
 

Revision as of 10:43, 20 March 2025