Abstract
As Large Language Models (LLMs) are increasingly applied to complex reasoning tasks, achieving both accurate task performance and faithful explanations becomes crucial. However, LLMs often generate unfaithful explanations, partly because they do not consistently adhere to the provided context. Existing approaches to this problem either rely on superficial calibration methods, such as decomposed Chain-of-Thought prompting, or require costly retraining to improve model faithfulness. In this work, we propose a probabilistic inference paradigm that leverages task-specific and lookahead rewards to make LLM-generated rationales more faithful to model decisions and better aligned with the input context. These rewards are derived from a domain-specific proposal distribution, enabling optimized sequential Monte Carlo approximations. Our evaluations across three different reasoning tasks show that this method, which allows for controllable generation at inference time, improves both the accuracy and faithfulness of LLMs. It thus offers a promising path towards making LLMs more reliable for reasoning tasks without sacrificing performance.
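The abstract describes reward-guided sequential Monte Carlo (SMC) decoding only at a high level. The sketch below is a generic, simplified illustration of that family of techniques, not the paper's implementation: the toy vocabulary, the `proposal_sample` and `lookahead_reward` functions, and the resampling scheme are all hypothetical stand-ins for the domain-specific proposal distribution and task-specific/lookahead rewards the abstract refers to.

```python
# Illustrative sketch only: a generic SMC decoding loop with reward-weighted
# resampling. All names and distributions here are hypothetical placeholders.
import math
import random

random.seed(0)

VOCAB = ["yes", "no", "because", "the", "context", "says", "."]

def proposal_sample(prefix):
    """Hypothetical domain-specific proposal: sample the next token and return
    (token, log_probability). A real system would query an LLM here."""
    token = random.choice(VOCAB)
    return token, math.log(1.0 / len(VOCAB))

def lookahead_reward(prefix):
    """Hypothetical lookahead reward: score how promising a partial rationale
    looks, e.g. its overlap with the input context."""
    return sum(1.0 for t in prefix if t in {"context", "says"})

def smc_decode(num_particles=8, max_len=12):
    """Maintain several partial generations (particles); extend each with the
    proposal, weight by reward, and resample so high-reward rationales survive."""
    particles = [[] for _ in range(num_particles)]
    for _ in range(max_len):
        weights = []
        for p in particles:
            token, _ = proposal_sample(p)
            p.append(token)
            weights.append(math.exp(lookahead_reward(p)))
        total = sum(weights)
        probs = [w / total for w in weights]
        # Multinomial resampling: keep promising partial generations.
        particles = [list(random.choices(particles, probs)[0])
                     for _ in range(num_particles)]
    return max(particles, key=lookahead_reward)

print(" ".join(smc_decode()))
```

In a real system the proposal would be an LLM's next-token distribution and the particle weights would also account for the target-to-proposal ratio; the point of the sketch is only the resample-by-reward loop that makes generation controllable at inference time.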
Original language | English |
---|---|
Publication status | Published - 2025 |
Event | The 63rd Annual Meeting of the Association for Computational Linguistics: ACL 2025 - Vienna, Austria. Duration: 27 Jul 2025 → 1 Aug 2025. https://2025.aclweb.org/ |
Conference
Conference | The 63rd Annual Meeting of the Association for Computational Linguistics |
---|---|
Country/Territory | Austria |
City | Vienna |
Period | 27/07/2025 → 1/08/2025 |
Internet address | https://2025.aclweb.org/ |