Day 89

On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?

May 03, 2026

Research questions. This paper asks whether the race toward ever-larger language models creates unacceptable social, environmental, and epistemic risks. It focuses on harms tied to scale: reliance on web-scraped data, encoded bias, poorly documented training sets, and the false impression that LMs understand language.

Methodology. This is a critical analysis and position paper, not an experiment. The authors synthesize prior NLP, ethics, environmental, and dataset documentation research to assess risks and propose mitigation strategies.

Findings. The paper argues that bigger models can amplify bias, encode harmful language, consume substantial resources, and mislead users, because fluent text can be mistaken for meaning or intent. It recommends weighing environmental and financial costs, carefully curating and documenting datasets, conducting stakeholder-centered review before development begins, and pursuing alternatives to scale-first NLP.

Why it matters. It became a foundational AI ethics paper because it challenged the assumption that larger LMs are automatically better or socially beneficial. It remains important for work on sycophancy because it highlights how users can over-attribute understanding, authority, and intention to fluent model outputs.
