Difference between revisions of "AI benchmarks"

From GISAXS
  
 
==Task Length==
 
* 2025-03: [https://arxiv.org/abs/2503.14499 Measuring AI Ability to Complete Long Tasks]
 
[[Image:GmZHL8xWQAAtFlF.jpeg|450px]]
 
  

Revision as of 18:02, 19 March 2025