TY - GEN
T1 - Temporally-Structured PID Reward Shaping in Reinforcement Learning for Robust Path Planning of Mobile Robots in Mining Environments
AU - Camacho, Christian
AU - Alcayaga, José
AU - Herrera, Marco
AU - Camacho, Oscar
AU - Prado-Romo, Alvaro Javier
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - Achieving reliable and adaptable motion controllers in dynamic environments remains a key challenge for robot navigation, particularly due to unpredictable changes and obstacle-rich scenarios. This paper investigates the application of Reinforcement Learning (RL) techniques for adaptable path planning with obstacle avoidance of Skid-Steer Mobile Robots (SSMRs), incorporating reward shaping based on feedback control techniques. In particular, a key component of the reward function is a weighted PID controller that penalizes deviations of the robot with respect to the positional goal and discourages abrupt motion changes, while the other component promotes obstacle avoidance using range-based safety criteria. The RL approaches are implemented using Deep Deterministic Policy Gradient (DDPG), Twin Delayed DDPG (TD3), Soft Actor-Critic (SAC) and Proximal Policy Optimization (PPO) algorithms. The proposed strategies were evaluated through simulations and real-mapped mining scenarios involving both static and dynamic obstacles. Experimental results demonstrate improvement of motion performance, with reductions in cumulative goal tracking error and control input effort of up to 49.35% for DDPG, 11.29% for TD3, 2.22% for SAC, and 2.41% for PPO compared to baseline RL techniques that use reward functions based solely on proportional terms. These findings offer a practical foundation for deploying more efficient motion systems in robotic settings where adaptability and precision are critical.
KW - PID reward shaping
KW - mining environment
KW - path planning
KW - reinforcement learning
KW - skid steer mobile robots
UR - https://www.scopus.com/pages/publications/105037994861
U2 - 10.1109/CHILECON66915.2025.11476595
DO - 10.1109/CHILECON66915.2025.11476595
M3 - Conference contribution
AN - SCOPUS:105037994861
T3 - Proceedings - IEEE CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies, ChileCon
BT - 2025 IEEE CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies, CHILECON 2025
A2 - Lefranc, Gaston
A2 - Cubillos, Claudio
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2025 IEEE CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies, CHILECON 2025
Y2 - 28 October 2025 through 30 October 2025
ER -