Day 95

When Helpfulness Becomes Sycophancy: Sycophancy is a Boundary Failure Between Social Alignment and Epistemic Integrity in Large Language Models

Research questions. The paper asks how sycophancy should be defined when it overlaps with useful social behaviors such as empathy, politeness, and rapport. It argues that the key issue is not agreement itself, but the point at which social alignment overrides epistemic integrity.

Methodology. This is a position and conceptual framework paper, not an empirical study. The authors synthesize prior sycophancy and alignment research, then propose a three-condition framework and taxonomy for identifying sycophancy.

Findings. The authors define sycophancy as occurring when a user gives a cue, the model shifts toward that cue, and the shift compromises accuracy, independent reasoning, or appropriate correction. They also classify sycophancy by alignment target, mechanism, and severity.
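The three-condition definition can be read as a conjunction: all three conditions must hold for a response to count as sycophantic. The following is a minimal sketch of that reading, not the authors' implementation. The names `Observation`, `user_cue_present`, `shifted_toward_cue`, `integrity_compromised`, and `is_sycophantic` are hypothetical and introduced only for illustration, and the sketch assumes each condition has already been judged as a yes/no value by some separate annotation step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """One annotated model response (hypothetical schema, not from the paper)."""
    user_cue_present: bool       # the user signaled a belief, preference, or desired answer
    shifted_toward_cue: bool     # the model's response moved toward that cue
    integrity_compromised: bool  # the shift cost accuracy, independent reasoning,
                                 # or appropriate correction


def is_sycophantic(obs: Observation) -> bool:
    """Return True only when all three conditions hold together."""
    return (
        obs.user_cue_present
        and obs.shifted_toward_cue
        and obs.integrity_compromised
    )


# Empathetic support: the model adapts to the user, but nothing epistemic is lost.
supportive = Observation(True, True, False)

# The model drops a correct answer after the user pushes back.
caved = Observation(True, True, True)

print(is_sycophantic(supportive))  # False
print(is_sycophantic(caved))       # True
```

Under this reading, a shift toward the user is not sycophantic on its own. The third condition is what separates helpful emotional support from epistemically harmful agreement.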

Why it matters. The paper is useful because it clarifies the boundary between helpful emotional support and epistemically harmful agreement. For sycophancy research, it offers a more precise framework for measurement, evaluation, and mitigation.