Flutterby™! : accuracy collapse


accuracy collapse

2025-06-08 06:38:11.596415+02 by Dan Lyke 1 comment

The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity (pdf)

Through extensive experimentation across diverse puzzles, we show that frontier LRMs face a complete accuracy collapse beyond certain complexities. Moreover, they exhibit a counter-intuitive scaling limit: their reasoning effort increases with problem complexity up to a point, then declines despite having an adequate token budget.

Via

[ related topics: Apple Computer Macintosh ]

comments in ascending chronological order:

#Comment Re: accuracy collapse made: 2025-06-09 05:38:15.497581+02 by: Dan Lyke

The Apple html page for this paper