
Temporally-Structured PID Reward Shaping in Reinforcement Learning for Robust Path Planning of Mobile Robots in Mining Environments

  • Christian Camacho
  • José Alcayaga
  • Marco Herrera
  • Oscar Camacho
  • Alvaro Javier Prado-Romo
  • Universidad Católica del Norte

Research output: Chapter in book/report/conference proceeding › Conference contribution › Peer-reviewed

Abstract

Achieving reliable and adaptable motion controllers in dynamic environments remains a key challenge for robot navigation, particularly due to unpredictable changes and obstacle-rich scenarios. This paper investigates the application of Reinforcement Learning (RL) techniques for adaptable path planning with obstacle avoidance of Skid-Steer Mobile Robots (SSMRs), incorporating reward shaping based on feedback control techniques. In particular, a key component of the reward function is a weighted PID controller that penalizes deviations of the robot with respect to the positional goal and discourages abrupt motion changes, while the other component promotes obstacle avoidance using range-based safety criteria. The RL approaches are implemented using Deep Deterministic Policy Gradient (DDPG), Twin Delayed DDPG (TD3), Soft Actor-Critic (SAC) and Proximal Policy Optimization (PPO) algorithms. The proposed strategies were evaluated through simulations and real-mapped mining scenarios involving both static and dynamic obstacles. Experimental results demonstrate improvement of motion performance, with reductions in cumulative goal tracking error and control input effort of up to 49.35% for DDPG, 11.29% for TD3, 2.22% for SAC, and 2.41% for PPO compared to baseline RL techniques that use reward functions based solely on proportional terms. These findings offer a practical foundation for deploying more efficient motion systems in robotic settings where adaptability and precision are critical.
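The abstract describes the reward only in words, so the following is a minimal sketch of how a PID-shaped reward with a range-based safety term could look, not the authors' formulation. Every name, gain, weight, and threshold here (`PIDShapedReward`, `kp`, `ki`, `kd`, `w_smooth`, `w_obstacle`, `safe_range`, `dt`) is a hypothetical choice made for illustration, as is the use of Euclidean distance to the goal as the error signal.

```python
import math
from dataclasses import dataclass, field
from typing import Optional, Sequence


@dataclass
class PIDShapedReward:
    """Illustrative PID-style reward; all defaults are assumptions, not paper values."""

    kp: float = 1.0          # weight on the current goal-tracking error
    ki: float = 0.1          # weight on the accumulated error
    kd: float = 0.05         # weight on the rate of change of the error
    w_smooth: float = 0.1    # weight on abrupt changes between consecutive actions
    w_obstacle: float = 1.0  # weight on the range-based safety penalty
    safe_range: float = 0.5  # assumed minimum safe distance to an obstacle [m]
    dt: float = 0.1          # assumed control period [s]
    _integral: float = field(default=0.0, init=False)
    _prev_error: Optional[float] = field(default=None, init=False)
    _prev_action: Optional[Sequence[float]] = field(default=None, init=False)

    def reset(self) -> None:
        """Clear the temporal state at the start of each episode."""
        self._integral = 0.0
        self._prev_error = None
        self._prev_action = None

    def __call__(self, position, goal, action, min_range) -> float:
        # Proportional term: Euclidean distance from the robot to the goal.
        error = math.dist(position, goal)

        # Integral term: error accumulated over the episode.
        self._integral += error * self.dt

        # Derivative term (signed): a shrinking error reduces the penalty.
        if self._prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self._prev_error) / self.dt

        pid_penalty = self.kp * error + self.ki * self._integral + self.kd * derivative

        # Smoothness term: squared change between consecutive actions.
        if self._prev_action is None:
            smooth_penalty = 0.0
        else:
            smooth_penalty = sum((a - b) ** 2 for a, b in zip(action, self._prev_action))

        # Safety term: grows linearly once the closest range reading
        # falls below the assumed safe distance.
        obstacle_penalty = max(0.0, (self.safe_range - min_range) / self.safe_range)

        self._prev_error = error
        self._prev_action = tuple(action)

        return -(pid_penalty
                 + self.w_smooth * smooth_penalty
                 + self.w_obstacle * obstacle_penalty)
```

In a training loop, `reset()` would be called at the start of each episode and the object called once per step with the robot position, the goal, the commanded action, and the smallest range reading. Setting `ki` and `kd` to zero leaves only the proportional term, which corresponds in spirit to the proportional-only baseline rewards the abstract compares against.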

Original language: English
Title of host publication: 2025 IEEE CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies, CHILECON 2025
Editors: Gaston Lefranc, Claudio Cubillos
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (electronic): 9798350357363
DOI
Status: Published - 2025
Event: 2025 IEEE CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies, CHILECON 2025 - Valparaiso, Chile
Duration: 28 Oct 2025 to 30 Oct 2025

Publication series

Name: Proceedings - IEEE CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies, ChileCon
ISSN (print): 2832-1529
ISSN (electronic): 2832-1537

Conference

Conference: 2025 IEEE CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies, CHILECON 2025
Country/Territory: Chile
City: Valparaiso
Period: 28/10/25 to 30/10/25
