An October 2025 study showed that as few as 250 poisoned documents are enough to backdoor large language models: models from 600 million to 13 billion parameters were vulnerable regardless of training-set size. Attackers plant a specific trigger phrase in the poisoned documents so that the trained model misbehaves on command whenever that trigger appears. Crucially, the researchers found that the absolute number of poisoned documents matters more than their percentage of the training data, which lowers the barrier to attack because producing 250 documents is trivial. The result raises urgent supply-chain and data-governance risks for model developers and users: assume that public web sources can be weaponized, and require provenance tracking and auditing of training data.
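
To make the auditing recommendation concrete, here is a minimal sketch, not the study's methodology, of a pre-training data filter that flags documents containing a suspected trigger string and reports both the absolute count and the corpus percentage. The trigger list, document format, and 250-document threshold are illustrative assumptions drawn from the figures above.

```python
# Minimal corpus-audit sketch: flag documents that contain suspected
# backdoor trigger strings. The trigger list and threshold below are
# illustrative assumptions, not values taken from the study.
from typing import Iterable

SUSPECTED_TRIGGERS = ["<TRIGGER>"]  # hypothetical trigger phrase


def audit_corpus(documents: Iterable[str], max_flagged: int = 250) -> dict:
    """Count documents containing any suspected trigger string.

    Reports the absolute number of flagged documents, since the study
    found absolute count (not percentage of the corpus) is what matters.
    """
    total = 0
    flagged = 0
    for doc in documents:
        total += 1
        if any(trigger in doc for trigger in SUSPECTED_TRIGGERS):
            flagged += 1
    return {
        "total_documents": total,
        "flagged_documents": flagged,
        "flagged_percent": 100.0 * flagged / total if total else 0.0,
        "exceeds_threshold": flagged >= max_flagged,
    }


if __name__ == "__main__":
    # A synthetic 1M-document corpus with 250 poisoned entries:
    # only 0.025% of the data, yet enough under the study's finding.
    corpus = ["clean text"] * 999_750 + ["payload <TRIGGER> gibberish"] * 250
    print(audit_corpus(corpus))
```

Note the limitation: string matching only catches triggers you already suspect, so a filter like this complements rather than replaces source-level provenance tracking.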

Recent news