Optimization of shovel and truck allocation and scheduling in open pit mines using simulation based on reinforcement learning algorithms

Document Type: Research Note

Authors

1 Tarbiat Modares University

2 Prof. of Industrial and Systems Engineering / Tarbiat Modares University

10.24200/j65.2025.66883.2436

Abstract

Material transportation systems in large open-pit mines are complex enough to challenge conventional planning methods. Traditional approaches often fail to manage the dynamic interaction between large fleets of trucks and shovels, complex routes, and changing operating conditions, which leads to suboptimal resource allocation, lower productivity, and higher operating costs. To overcome these limitations, a simulation system based on reinforcement learning (RL) algorithms was developed to optimize shovel and truck allocation and scheduling in open-pit mines. An RL agent is trained to choose efficient vehicle movement paths, schedule loading and unloading operations, and allocate available resources across the mine site. The agent learns by interacting continuously with a simulated mine environment and uses the feedback it receives on its actions to progressively refine its decision-making policy.
The results show that the implemented RL model, which uses the Q-learning algorithm as its core learning mechanism, is effective. The model learned the assignment of trucks to shovels and consistently outperformed simpler, more conventional models in efficiency and productivity. Q-learning works by iteratively updating Q-values, which represent the expected future reward for taking a specific action in a particular state, and by learning from the experience accumulated during simulated operations. Through this continuous learning and adaptation, the RL model gradually converges toward an optimal operating policy. This policy yields more frequent successful material unloading events and reduces idle times and bottlenecks in the transportation network, substantially improving the overall efficiency of the mine's transportation system. These results highlight the potential of integrating reinforcement learning into the planning and management of mining operations and support the development of more intelligent, adaptive, and productive autonomous and semi-autonomous systems in the mining sector.
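
The Q-value update described in the abstract can be illustrated with a minimal tabular sketch. This is not the authors' simulation model: the two-shovel environment, the queue-length state, the waiting-time reward, and every constant and function name (`N_SHOVELS`, `MAX_QUEUE`, `SERVICE_PROB`, `step`, `train`, `greedy_action`) are hypothetical choices made only to show how an agent could learn a truck-to-shovel dispatch rule from simulated experience.

```python
import random
from collections import defaultdict

# All constants below are illustrative assumptions, not values from the paper.
N_SHOVELS = 2        # number of shovels a truck can be sent to
MAX_QUEUE = 3        # cap on the number of trucks waiting at one shovel
SERVICE_PROB = 0.8   # chance that a busy shovel finishes one truck per step


def step(state, action, rng):
    """Send one arriving truck to shovel `action`; return (next_state, reward)."""
    queues = list(state)
    # Reward is a waiting-time proxy: the number of trucks already in line.
    reward = -queues[action]
    # Simplification: a full queue stays at MAX_QUEUE instead of growing.
    queues[action] = min(queues[action] + 1, MAX_QUEUE)
    # Each shovel with trucks waiting may finish loading one of them.
    for i in range(N_SHOVELS):
        if queues[i] > 0 and rng.random() < SERVICE_PROB:
            queues[i] -= 1
    return tuple(queues), reward


def greedy_action(q, state):
    """Return the shovel index with the highest Q-value in `state`."""
    values = q[state]
    return max(range(N_SHOVELS), key=lambda a: values[a])


def train(episodes=3000, steps=30, alpha=0.2, gamma=0.8, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = defaultdict(lambda: [0.0] * N_SHOVELS)
    for _ in range(episodes):
        # Start each episode from a random queue configuration.
        state = tuple(rng.randint(0, MAX_QUEUE) for _ in range(N_SHOVELS))
        for _ in range(steps):
            if rng.random() < epsilon:
                action = rng.randrange(N_SHOVELS)   # explore
            else:
                action = greedy_action(q, state)    # exploit
            next_state, reward = step(state, action, rng)
            # Q-learning update: move Q(s, a) toward r + gamma * max_a' Q(s', a').
            td_target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (td_target - q[state][action])
            state = next_state
    return q


if __name__ == "__main__":
    q_table = train()
    for s in [(3, 0), (0, 3), (1, 1)]:
        print(s, "-> send truck to shovel", greedy_action(q_table, s))
```

In this toy setting the learned policy should send an arriving truck to the shovel with the shorter queue, which is the kind of idle-time reduction the abstract attributes to the trained agent. A realistic model would also need haul routes, travel times, and dump sites, none of which are represented here.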
