How Large Language Models Balance Internal Knowledge with User and Document Assertions
Research questions. The paper asks how LLMs balance three competing sources of information: their own parametric knowledge, user assertions, and document assertions. It also asks whether models can distinguish helpful external information from misleading information, and how post-training changes these source preferences.
Methodology. The authors propose a three-source interaction framework and evaluate 27 models from the GPT-4o, Llama 3/3.1, and Qwen3 families on CommonsenseQA and GSM8K. They build prompt variants that include user claims, document claims, or both, then measure source reliance, discrimination ability, answer distributions, and the effect of supervised fine-tuning.
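To make the setup concrete, here is a minimal sketch of how a prompt variant and a source-reliance rate might be computed. It is an illustration under stated assumptions, not the authors' code: the prompt wording, the `Item` type, and the functions `build_prompt` and `reliance_rates` are all hypothetical, and "reliance" is simplified here to the share of conflicting cases in which the model's answer matches a given source.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """A multiple-choice question (hypothetical container, not from the paper)."""
    question: str
    choices: dict[str, str]  # label -> answer text
    gold: str                # label of the correct answer


def build_prompt(item, user_claim=None, doc_claim=None):
    """Assemble a prompt with optional document and user assertions.

    The phrasing is an assumption; the paper's actual templates may differ.
    """
    parts = []
    if doc_claim is not None:
        parts.append(f"Document: The answer is {doc_claim}.")
    if user_claim is not None:
        parts.append(f"User: I think the answer is {user_claim}.")
    options = "\n".join(f"{label}. {text}" for label, text in item.choices.items())
    parts.append(f"Question: {item.question}\n{options}\nAnswer:")
    return "\n".join(parts)


def reliance_rates(answers, user_claims, doc_claims):
    """Share of conflicting cases where the answer matches each source.

    Cases where the user and the document assert the same label are skipped,
    because they cannot reveal which source the model followed.
    """
    total = user_hits = doc_hits = 0
    for answer, user, doc in zip(answers, user_claims, doc_claims):
        if user == doc:
            continue
        total += 1
        user_hits += answer == user
        doc_hits += answer == doc
    if total == 0:
        return {"user": 0.0, "document": 0.0}
    return {"user": user_hits / total, "document": doc_hits / total}
```

In this sketch, a model with a document preference would show a higher `"document"` rate than `"user"` rate on the conflicting cases.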
Findings. Most models rely more on document assertions than user assertions, and post-training strengthens that document preference. The authors also find that many models are “impressionable,” meaning they often fail to separate helpful external information from harmful misinformation, though fine-tuning on diverse source-interaction data improves discrimination.
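One way to picture what "impressionable" means is to compare accuracy under three conditions: no external claim, a correct (helpful) claim, and an incorrect (harmful) claim. The sketch below is an illustrative measure only, not the metric used in the paper; the function names and the two reported quantities are assumptions.

```python
def accuracy(answers, gold):
    """Fraction of answers that match the gold labels."""
    return sum(a == g for a, g in zip(answers, gold)) / len(gold)


def impressionability(base, helpful, harmful, gold):
    """Compare accuracy with helpful and harmful claims against a baseline.

    Hypothetical measure: a model that discriminates well would show a large
    helpful_gain and a small harmful_loss; an impressionable model would show
    both, because it follows external claims whether or not they are correct.
    """
    acc_base = accuracy(base, gold)
    return {
        "helpful_gain": accuracy(helpful, gold) - acc_base,
        "harmful_loss": acc_base - accuracy(harmful, gold),
    }
```

Here `base`, `helpful`, and `harmful` are lists of the model's answer labels for the same questions under each condition.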
Why it matters. The paper is relevant to RAG, chat, and sycophancy research because real systems often combine model knowledge, retrieved documents, and user beliefs at the same time. It shows that reliability depends not just on whether a model follows context, but on whether it can judge which source deserves trust.