Abstract
Large Language Models (LLMs) struggle with complex reasoning due to limited sampling diversity and inefficient search over candidate generations. We propose Soft Reasoning, an embedding-based search framework that optimises the embedding of the first token to guide generation. It combines (1) embedding perturbation for controlled exploration and (2) Bayesian optimisation, which refines embeddings via a verifier-guided objective, balancing exploration and exploitation. This approach improves reasoning accuracy and coherence while avoiding reliance on heuristic search. Experiments demonstrate superior correctness with minimal computation, making it a scalable, model-agnostic solution.
Original language | English
---|---
Title of host publication | Forty-second International Conference on Machine Learning. ICML 2025.
Publisher | PMLR
Publication status | Published - 2025
Event | 2025 International Conference on Machine Learning: ICML25
Duration | 13 Jul 2025 → …
Conference
Conference | 2025 International Conference on Machine Learning
---|---
Period | 13/07/2025 → …