arXiv clamps down on slop papers
2026-05-18 18:32:11.401235+02 by Dan Lyke
Large language models (LLMs) are known to generate plausible but false information across a wide range of contexts, yet the real-world magnitude and consequences of this hallucination problem remain poorly understood. Here we leverage a uniquely verifiable object - scientific citations - to audit 111 million references across 2.5 million papers in arXiv, bioRxiv, SSRN, and PubMed Central. We find a sharp rise in non-existent references following widespread LLM adoption, with a conservative estimate of 146,932 hallucinated citations in 2025 alone. These errors are diffusely embedded across many papers but especially pronounced in fields with rapid AI uptake, in manuscripts with linguistic signatures of AI-assisted writing, and among small and early-career author teams. At the same time, hallucinated references disproportionately assign credit to already prominent and male scholars, suggesting that LLM-generated errors may reinforce existing inequities in scientific recognition. Preprint moderation and journal publication processes capture only a fraction of these errors, suggesting that the spread of hallucinated content has outpaced existing safeguards. Together, these findings demonstrate that LLM hallucinations are infiltrating knowledge production at scale, threatening both the reliability and equity of future scientific discovery as human and AI systems draw on the existing literature.
Which brings us to: Fuck yeah! TechCrunch: Research repository ArXiv will ban authors for a year if they let AI do all the work.
404 Media: ArXiv to Ban Researchers for a Year if They Submit AI Slop
One of the amazing things about this is the number of people who are whining that it's unfair to expect that they've actually read the work they're citing, or are creating other hypotheticals. This doofus on the Fediverse is, for instance, willing to lay the blame on his co-authors in order to take the credit.
It gets worse if you head over to X/Twitter, which... I'm not gonna link to them individually, you can find your own list off of Thomas G. Dietterich @tdietterich's announcement of the policy there, but honestly, people, if these are the arguments y'all are making in good faith, academia is irretrievably broken.
Which I've long contended anyway, but... damn...