Temporally-Structured PID Reward Shaping in Reinforcement Learning for Robust Path Planning of Mobile Robots in Mining Environments

  • Christian Camacho
  • José Alcayaga
  • Marco Herrera
  • Oscar Camacho
  • Alvaro Javier Prado-Romo
  • Universidad Católica del Norte

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Achieving reliable and adaptable motion controllers in dynamic environments remains a key challenge for robot navigation, particularly due to unpredictable changes and obstacle-rich scenarios. This paper investigates the application of Reinforcement Learning (RL) techniques for adaptable path planning with obstacle avoidance of Skid-Steer Mobile Robots (SSMRs), incorporating reward shaping based on feedback control techniques. In particular, a key component of the reward function is a weighted PID controller that penalizes deviations of the robot with respect to the positional goal and discourages abrupt motion changes, while the other component promotes obstacle avoidance using range-based safety criteria. The RL approaches are implemented using Deep Deterministic Policy Gradient (DDPG), Twin Delayed DDPG (TD3), Soft Actor-Critic (SAC) and Proximal Policy Optimization (PPO) algorithms. The proposed strategies were evaluated through simulations and real-mapped mining scenarios involving both static and dynamic obstacles. Experimental results demonstrate improvement of motion performance, with reductions in cumulative goal tracking error and control input effort of up to 49.35% for DDPG, 11.29% for TD3, 2.22% for SAC, and 2.41% for PPO compared to baseline RL techniques that use reward functions based solely on proportional terms. These findings offer a practical foundation for deploying more efficient motion systems in robotic settings where adaptability and precision are critical.
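
The two reward components described in the abstract can be illustrated with a minimal sketch. It is not the authors' formulation: the class name `PIDShapedReward`, the gains `kp`, `ki`, `kd`, the weight `w_obstacle`, the threshold `safe_range`, the time step `dt`, and the hinge-style obstacle penalty are all assumptions chosen for illustration. The sketch assumes the tracking error is the distance to the goal and that the range sensor reports the distance to the nearest obstacle.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PIDShapedReward:
    """Illustrative PID-shaped reward (all gains and thresholds are assumed)."""

    kp: float = 1.0          # weight on the current goal error (proportional)
    ki: float = 0.1          # weight on the accumulated error (integral)
    kd: float = 0.05         # weight on the error rate of change (derivative)
    w_obstacle: float = 1.0  # weight on the range-based safety penalty
    safe_range: float = 0.5  # assumed safety distance, in metres
    dt: float = 0.1          # assumed control period, in seconds
    _integral: float = field(default=0.0, init=False)
    _prev_error: Optional[float] = field(default=None, init=False)

    def reset(self) -> None:
        """Clear the temporal state at the start of each episode."""
        self._integral = 0.0
        self._prev_error = None

    def __call__(self, goal_distance: float, min_range: float) -> float:
        """Return the shaped reward for one step (always <= 0)."""
        error = goal_distance
        self._integral += error * self.dt
        if self._prev_error is None:
            derivative = 0.0  # no history on the first step
        else:
            derivative = (error - self._prev_error) / self.dt
        self._prev_error = error

        # PID component: penalize goal deviation, accumulated error,
        # and abrupt changes in the error.
        pid_penalty = (
            self.kp * abs(error)
            + self.ki * abs(self._integral)
            + self.kd * abs(derivative)
        )
        # Obstacle component: penalize only when closer than safe_range.
        obstacle_penalty = self.w_obstacle * max(0.0, self.safe_range - min_range)
        return -(pid_penalty + obstacle_penalty)
```

Setting `ki=0.0` and `kd=0.0` reduces the sketch to a purely proportional reward, which corresponds in spirit to the baseline the abstract compares against. Because the integral and derivative terms carry state between steps, `reset()` would need to be called whenever the environment starts a new episode.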

Original language: English
Title of host publication: 2025 IEEE CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies, CHILECON 2025
Editors: Gaston Lefranc, Claudio Cubillos
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350357363
DOIs
State: Published - 2025
Event: 2025 IEEE CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies, CHILECON 2025 - Valparaiso, Chile
Duration: 28 Oct 2025 - 30 Oct 2025

Publication series

Name: Proceedings - IEEE CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies, ChileCon
ISSN (Print): 2832-1529
ISSN (Electronic): 2832-1537

Conference

Conference: 2025 IEEE CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies, CHILECON 2025
Country/Territory: Chile
City: Valparaiso
Period: 28/10/25 - 30/10/25

Keywords

  • mining environment
  • path planning
  • PID reward shaping
  • reinforcement learning
  • skid steer mobile robots
