TY - GEN
T1 - A Machine Learning Approach for Intervention Recommendation and Production Forecasting in Mature Ecuadorian Oil Fields
AU - Batallas-Riofrio, Estefi
AU - Flores-Moyano, Ricardo
AU - Baldeon-Calisto, Maria
AU - Grijalva, Felipe
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - A machine learning (ML) approach is presented to optimize decision-making in well intervention planning for mature oil fields in Ecuador. Supervised classification and regression are used to automate two critical tasks: (1) recommending the most appropriate intervention type, and (2) forecasting the expected fluid production increment (ΔBFPD) post-intervention. Drawing on four integrated historical databases encompassing over 76,000 records of production, pressure, and reservoir attributes, a structured pipeline was implemented, including pre-processing, feature selection, and class balancing using SMOTE + TomekLinks. For activity classification, Random Forest paired with SelectKBest yielded the best performance, with an F1-macro score of 0.79. For production forecasting, XGBoost achieved an R² of 0.728 and MAE of 219.11 for REPERF interventions. Because of the poor predictive performance of MATRIX STIMULATION (R² < 0), clustering (K-means + PCA) was explored, revealing internal heterogeneity, but insufficient to improve accuracy, highlighting the need for enhanced data granularity. This dual-model approach reduces the manual candidate screening time by over 2,000 hours annually, introduces data-driven prioritization in campaign planning, and lays the foundation for scalable, real-time recommendation systems in high-density oil field operations.
AB - A machine learning (ML) approach is presented to optimize decision-making in well intervention planning for mature oil fields in Ecuador. Supervised classification and regression are used to automate two critical tasks: (1) recommending the most appropriate intervention type, and (2) forecasting the expected fluid production increment (ΔBFPD) post-intervention. Drawing on four integrated historical databases encompassing over 76,000 records of production, pressure, and reservoir attributes, a structured pipeline was implemented, including pre-processing, feature selection, and class balancing using SMOTE + TomekLinks. For activity classification, Random Forest paired with SelectKBest yielded the best performance, with an F1-macro score of 0.79. For production forecasting, XGBoost achieved an R² of 0.728 and MAE of 219.11 for REPERF interventions. Because of the poor predictive performance of MATRIX STIMULATION (R² < 0), clustering (K-means + PCA) was explored, revealing internal heterogeneity, but insufficient to improve accuracy, highlighting the need for enhanced data granularity. This dual-model approach reduces the manual candidate screening time by over 2,000 hours annually, introduces data-driven prioritization in campaign planning, and lays the foundation for scalable, real-time recommendation systems in high-density oil field operations.
KW - Oil well intervention
KW - candidate selection
KW - machine learning
KW - production forecasting
UR - https://www.scopus.com/pages/publications/105035993978
U2 - 10.1109/CAIT68620.2025.11424823
DO - 10.1109/CAIT68620.2025.11424823
M3 - Conference contribution
AN - SCOPUS:105035993978
T3 - 2025 6th International Conference on Computers and Artificial Intelligence Technology, CAIT 2025
SP - 63
EP - 68
BT - 2025 6th International Conference on Computers and Artificial Intelligence Technology, CAIT 2025
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2025 6th International Conference on Computers and Artificial Intelligence Technology, CAIT 2025
Y2 - 12 December 2025 through 14 December 2025
ER -