United States | Computer Science and Information Technology | Volume 14 Issue 9, September 2025 | Pages: 1418 - 1424
Multi-Agent Reinforcement Learning for Dynamic Pricing: Balancing Profitability, Stability and Fairness
Abstract: Dynamic pricing in competitive retail markets requires strategies that adapt to fluctuating demand and competitor behavior. In this work, we present a systematic empirical evaluation of multi-agent reinforcement learning (MARL) approaches, specifically MAPPO and MADDPG, for dynamic price optimization under competition. Using a simulated marketplace environment derived from real-world retail data, we benchmark these algorithms against an Independent DDPG (IDDPG) baseline, a widely used independent learner in the MARL literature. We evaluate profit performance, stability across random seeds, fairness, and training efficiency. Our results show that MAPPO consistently achieves the highest average returns with low variance, offering a stable and reproducible approach to competitive price optimization, while MADDPG achieves slightly lower profit but the fairest profit distribution among agents. These findings demonstrate that MARL methods, particularly MAPPO, provide a scalable and stable alternative to independent learning approaches for dynamic retail pricing.
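The kind of evaluation the abstract describes can be illustrated with a minimal sketch: competing agents post prices, a demand model allocates sales, and per-agent profit plus a fairness score are computed. The demand model, cost parameters, and the use of Jain's index for fairness below are illustrative assumptions for exposition, not the paper's actual environment or metrics.

```python
import numpy as np

# Hypothetical one-step competitive pricing market (illustrative only).
# Demand is split via a softmax over negative prices, so cheaper agents
# capture a larger share; parameters are assumptions, not from the paper.

def market_step(prices, cost=1.0, base_demand=100.0, sensitivity=2.0):
    """Return per-agent profit for one pricing round."""
    prices = np.asarray(prices, dtype=float)
    weights = np.exp(-sensitivity * prices)
    shares = weights / weights.sum()          # demand share per agent
    quantities = base_demand * shares
    return (prices - cost) * quantities       # profit = margin * quantity

def jains_fairness(profits):
    """Jain's index: 1.0 when profits are equal, approaching 1/n as they concentrate."""
    profits = np.asarray(profits, dtype=float)
    return profits.sum() ** 2 / (len(profits) * (profits ** 2).sum())

profits = market_step([1.5, 1.6, 2.0])
print("profits:", profits)
print("fairness:", jains_fairness(profits))
```

In a MARL training loop, each agent's policy would output its price, and metrics like these would be logged per episode to compare MAPPO, MADDPG, and IDDPG on profit and fairness.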
Keywords: Centralized Training Decentralized Execution (CTDE), Competitive Pricing, Dynamic Pricing, E-Commerce Optimization, IDDPG Baseline, MADDPG, MAPPO, MASAC, Multi-Agent Reinforcement Learning (MARL), Performance Evaluation, Retail Market Simulation, Stability Analysis
How to Cite: Krishna Kumar Neelakanta Pillai Santha Kumari Amma, "Multi-Agent Reinforcement Learning for Dynamic Pricing: Balancing Profitability, Stability and Fairness", Volume 14 Issue 9, September 2025, International Journal of Science and Research (IJSR), Pages: 1418-1424, https://www.ijsr.net/getabstract.php?paperid=SR25927034247, DOI: https://dx.doi.org/10.21275/SR25927034247