TY - JOUR
T1 - Adaptive strategy templates using deep reinforcement learning for multi-issue bilateral negotiation
AU - Bagga, Pallavi
AU - Paoletti, Nicola
AU - Stathis, Kostas
N1 - Publisher Copyright:
© 2025
PY - 2025/3/28
Y1 - 2025/3/28
N2 - Negotiating in uncertain environments, where user preferences are only partially known, poses a challenge for traditional negotiation models that rely on rigid, pre-defined strategies. These models struggle to adapt to changing conditions or transfer knowledge across different negotiation contexts, making them ineffective in dynamic environments. To address this research gap, we propose a novel negotiation model that uses deep reinforcement learning (DRL) to enable agents to learn adaptable, generalizable strategies through the notion of “strategy templates”. These templates include (a) choice parameters to select tactics, (b) time parameters to control when tactics are activated, and (c) attribute-value parameters to guide acceptance and inform bidding decisions. As a result, we enable negotiation agents to dynamically adapt their strategies through pre-training on teacher strategies and refining them via online learning in diverse environments. Our agents also derive a user model to approximate partially specified user preferences, thus handling preference uncertainty more effectively. We developed a proof-of-concept prototype using a DRL-based actor-critic architecture, supplemented by stochastic search techniques for user model estimation and multi-objective optimization for making mutually beneficial offers. Experimental evaluations show that our model outperforms state-of-the-art approaches in terms of both individual and social-welfare utilities, demonstrating its ability to transfer experience across domains and excel in previously unseen scenarios. This work provides a robust framework for dynamic, adaptable strategy formation, bridging the gap in current negotiation models by addressing uncertainty in user preferences and strategy flexibility.
AB - Negotiating in uncertain environments, where user preferences are only partially known, poses a challenge for traditional negotiation models that rely on rigid, pre-defined strategies. These models struggle to adapt to changing conditions or transfer knowledge across different negotiation contexts, making them ineffective in dynamic environments. To address this research gap, we propose a novel negotiation model that uses deep reinforcement learning (DRL) to enable agents to learn adaptable, generalizable strategies through the notion of “strategy templates”. These templates include (a) choice parameters to select tactics, (b) time parameters to control when tactics are activated, and (c) attribute-value parameters to guide acceptance and inform bidding decisions. As a result, we enable negotiation agents to dynamically adapt their strategies through pre-training on teacher strategies and refining them via online learning in diverse environments. Our agents also derive a user model to approximate partially specified user preferences, thus handling preference uncertainty more effectively. We developed a proof-of-concept prototype using a DRL-based actor-critic architecture, supplemented by stochastic search techniques for user model estimation and multi-objective optimization for making mutually beneficial offers. Experimental evaluations show that our model outperforms state-of-the-art approaches in terms of both individual and social-welfare utilities, demonstrating its ability to transfer experience across domains and excel in previously unseen scenarios. This work provides a robust framework for dynamic, adaptable strategy formation, bridging the gap in current negotiation models by addressing uncertainty in user preferences and strategy flexibility.
KW - Deep reinforcement learning
KW - Interpretable strategy templates
KW - Multi-issue bilateral negotiation
KW - User preference uncertainty
UR - http://www.scopus.com/inward/record.url?scp=85215208256&partnerID=8YFLogxK
U2 - 10.1016/j.neucom.2025.129381
DO - 10.1016/j.neucom.2025.129381
M3 - Article
AN - SCOPUS:85215208256
SN - 0925-2312
VL - 623
JO - NEUROCOMPUTING
JF - NEUROCOMPUTING
M1 - 129381
ER -