Multi-UAV Cooperative Path Planning Based on the Improved MADDPG
Abstract
To address real-time path planning requirements for multi-unmanned aerial vehicle (multi-UAV) collaboration in complex environments, this study proposes an improved multi-agent deep deterministic policy gradient algorithm with prioritized experience replay (PER-MADDPG). We design a multi-dimensional state representation incorporating relative positions, velocity vectors, and obstacle distance fields, and construct a composite reward function that integrates safe obstacle avoidance, formation maintenance, and energy efficiency, enabling environment perception and multi-objective collaborative optimization. The prioritized experience replay mechanism dynamically adjusts sampling weights based on temporal difference (TD) errors, improving learning efficiency on high-value samples. Simulation experiments demonstrate that our method generates real-time collaborative paths in 3D complex obstacle environments, reducing training time by 25.3% and 16.8% compared with the traditional MADDPG and multi-agent twin delayed deep deterministic policy gradient (MATD3) algorithms, respectively, while achieving smaller path length variance among UAVs. These results validate the effectiveness of prioritized experience replay in multi-agent collaborative decision-making.
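To make the prioritized-replay mechanism concrete, the following is a minimal sketch of a proportional PER buffer in Python, where sampling probability is proportional to (|TD error| + ε)^α and importance-sampling weights correct the resulting bias. This is an illustrative implementation of standard PER, not the authors' code; the class name, hyperparameters (α = 0.6, β = 0.4), and buffer capacity are assumptions for the example.

```python
import numpy as np

class PrioritizedReplayBuffer:
    """Proportional PER: P(i) ~ (|TD error_i| + eps)^alpha."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha      # how strongly priorities skew sampling (0 = uniform)
        self.eps = eps          # keeps every transition sampleable
        self.data = []
        self.priorities = np.zeros(capacity, dtype=np.float64)
        self.pos = 0

    def add(self, transition):
        # New transitions get the current max priority so each is sampled at least once.
        max_p = self.priorities[:len(self.data)].max() if self.data else 1.0
        if len(self.data) < self.capacity:
            self.data.append(transition)
        else:
            self.data[self.pos] = transition
        self.priorities[self.pos] = max_p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        p = self.priorities[:len(self.data)] ** self.alpha
        probs = p / p.sum()
        idx = np.random.choice(len(self.data), batch_size, p=probs)
        # Importance-sampling weights correct the bias of non-uniform sampling.
        weights = (len(self.data) * probs[idx]) ** (-beta)
        weights /= weights.max()  # normalize so the largest weight is 1
        return idx, [self.data[i] for i in idx], weights

    def update_priorities(self, idx, td_errors):
        # Called after each critic update with the freshly computed TD errors.
        self.priorities[idx] = np.abs(td_errors) + self.eps
```

In a MADDPG training loop, each centralized critic update would call `sample`, scale the critic loss by `weights`, and then call `update_priorities` with the new TD errors, so high-error transitions are revisited more often.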