The bigger the model, the less reliable
2024-09-26 18:14:32.405895+02 by Dan Lyke
Nature: Larger and more instructable language models become less reliable.
We also find that early models often avoid user questions but scaled-up, shaped-up models tend to give an apparently sensible yet wrong answer much more often, including errors on difficult questions that human supervisors frequently overlook. Moreover, we observe that stability to different natural phrasings of the same question is improved by scaling-up and shaping-up interventions, but pockets of variability persist across difficulty levels. These findings highlight the need for a fundamental shift in the design and development of general-purpose artificial intelligence, particularly in high-stakes areas for which a predictable distribution of errors is paramount.
https://doi.org/10.1038/s41586-024-07930-y
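The "stability to different natural phrasings" finding is the measurable part: ask a model the same question worded several ways and check how often the answers agree. Here's a minimal sketch of that kind of consistency probe, assuming a hypothetical ask_model callable standing in for a real LLM API (the toy model and question are illustrative, not the paper's actual evaluation harness):

    from collections import Counter

    def consistency(ask_model, paraphrases):
        """Fraction of paraphrases whose answer matches the modal answer.

        ask_model is a hypothetical callable (prompt -> answer string);
        this illustrates a prompt-sensitivity probe, not the paper's
        methodology verbatim.
        """
        answers = [ask_model(p).strip().lower() for p in paraphrases]
        modal_answer, modal_count = Counter(answers).most_common(1)[0]
        return modal_count / len(answers)

    # Toy stand-in for a real model: it answers correctly unless the
    # question is hedged with "approximately", to show agreement dropping.
    def toy_model(prompt):
        return "unsure" if "approximately" in prompt.lower() else "1969"

    paraphrases = [
        "What year did Apollo 11 land on the Moon?",
        "In which year did the first crewed Moon landing occur?",
        "Approximately when did humans first land on the Moon?",
    ]

    print(f"agreement: {consistency(toy_model, paraphrases):.2f}")  # 0.67

With the toy stand-in, one rewording flips the answer and agreement drops to 0.67; the paper's point is that pockets of exactly this kind of variability persist even in the largest, most shaped-up models.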
Via.