Scoring with Data: An Ensemble of Machine Learning Models for Premier League Match Predictions

  • Universidad San Francisco de Quito
  • Wingate University

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

Predicting the outcomes of football matches remains a complex task due to the numerous dynamic and unpredictable factors that can influence the results. This study proposes a data-driven approach to classify English Premier League matches as wins, draws, or losses using machine learning and deep learning techniques applied to data from the last eight seasons. A web scraping tool was developed to systematically collect relevant match statistics and team information. The performance of Random Forest, XGBoost, and TabNet models was evaluated, along with an ensemble model that combines their complementary strengths. Results show that the ensemble model achieves higher predictive accuracy, especially when recent team performance metrics are included. A feature importance analysis highlights variables such as recent form, expected goals, and possession as critical for accurate prediction. Lastly, the ensemble model is benchmarked against external sources, including an AI-based predictor and a professional betting house, providing a comparative assessment of its practical applicability.
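The abstract does not say how the three models are combined or how recent form is computed, so the following is only a minimal sketch under stated assumptions: the ensemble is taken to be a (optionally weighted) soft vote over per-model class probabilities, and recent form is taken to be average points per match over a rolling window of earlier fixtures. The names `recent_form`, `ensemble_probabilities`, `soft_vote`, the class order in `CLASSES`, and the window size are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Optional, Sequence

# Assumed class order for every probability vector (not specified in the abstract).
CLASSES = ("win", "draw", "loss")


def recent_form(points: Sequence[int], window: int = 5) -> List[float]:
    """Average points per match over the previous `window` matches.

    Entry i uses only matches played before match i, so the feature does
    not leak the result it is meant to help predict. A match with no
    history gets 0.0.
    """
    form: List[float] = []
    for i in range(len(points)):
        past = points[max(0, i - window):i]
        form.append(sum(past) / len(past) if past else 0.0)
    return form


def ensemble_probabilities(
    per_model: Dict[str, Sequence[float]],
    weights: Optional[Dict[str, float]] = None,
) -> List[float]:
    """Weighted average of each model's class probabilities (soft voting)."""
    if weights is None:
        # Equal weights are an assumption; the paper may weight models differently.
        weights = {name: 1.0 for name in per_model}
    total = sum(weights[name] for name in per_model)
    return [
        sum(weights[name] * probs[k] for name, probs in per_model.items()) / total
        for k in range(len(CLASSES))
    ]


def soft_vote(
    per_model: Dict[str, Sequence[float]],
    weights: Optional[Dict[str, float]] = None,
) -> str:
    """Return the outcome label with the highest averaged probability."""
    averaged = ensemble_probabilities(per_model, weights)
    return CLASSES[averaged.index(max(averaged))]


if __name__ == "__main__":
    # Illustrative probabilities only; these are not results from the study.
    example = {
        "random_forest": [0.50, 0.30, 0.20],
        "xgboost": [0.40, 0.35, 0.25],
        "tabnet": [0.20, 0.30, 0.50],
    }
    print(recent_form([3, 1, 0, 3], window=2))
    print(soft_vote(example))
```

In practice the probability vectors would come from each fitted model's probability output for a given fixture; averaging them rather than taking a majority of hard labels lets a confident model outweigh two uncertain ones, which is one plausible way to exploit complementary strengths.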

Original language: English
Title of host publication: 2025 6th International Conference on Computers and Artificial Intelligence Technology, CAIT 2025
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 69-74
Number of pages: 6
ISBN (Electronic): 9798331558826
DOIs
State: Published - 2025
Event: 2025 6th International Conference on Computers and Artificial Intelligence Technology, CAIT 2025 - Huizhou, China
Duration: 12 Dec 2025 to 14 Dec 2025

Publication series

Name: 2025 6th International Conference on Computers and Artificial Intelligence Technology, CAIT 2025

Conference

Conference: 2025 6th International Conference on Computers and Artificial Intelligence Technology, CAIT 2025
Country/Territory: China
City: Huizhou
Period: 12/12/25 to 14/12/25

Keywords

  • Machine learning
  • Random Forest
  • TabNet
  • XGBoost
  • deep learning
  • football analytics
  • match results
