This notebook predicts the outcome of football matches using historical match data. The test season is the 2019/2020 English Premier League (the prior 10 seasons across the top 5 European leagues are used for training).
The front end of this project is demo_nb.ipynb. It explains the approach in detail and displays my latest round of results on the test set. The underlying models and data processing functions are in the common folder.
To install the dependencies:
`pip install -r requirements.txt`
I have not yet packaged the project for installation via pip. Please clone the repo and explore the demo notebook, which explains thoroughly what is going on.
To get the match result data, go to this site. There you will find multiple seasons of football data covering pretty much all the leagues in the world. The data mainly consists of match results, total shots and betting odds. Download the data and store it in a folder within the project. I have not included the raw data in the repo, as it's bad etiquette to do so! It doesn't matter how you store the data (i.e. the folder structure); just use the load_all_matches function in the data_methods.py module and pass your data folder location as the argument (it will search all subfolders by default).
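For readers who want a feel for what "search all subfolders" means before opening data_methods.py, here is a minimal sketch of a recursive CSV loader. It is not the repo's load_all_matches implementation. The name load_all_matches_sketch, the recursive flag, the source_file column and the assumption that the downloads are UTF-8 CSV files are all hypothetical and used only for illustration.

```python
import csv
from pathlib import Path


def load_all_matches_sketch(data_dir, recursive=True):
    """Hypothetical stand-in: read every CSV file found under data_dir.

    Returns a list of dicts, one per match row, each tagged with the
    file it came from so that seasons and leagues stay distinguishable.
    """
    root = Path(data_dir)
    # "**/*.csv" also matches files sitting directly in data_dir
    pattern = "**/*.csv" if recursive else "*.csv"
    rows = []
    for csv_path in sorted(root.glob(pattern)):
        # Assumption: files are UTF-8; utf-8-sig drops a leading BOM if present
        with csv_path.open(newline="", encoding="utf-8-sig") as handle:
            for row in csv.DictReader(handle):
                row["source_file"] = str(csv_path.relative_to(root))
                rows.append(row)
    return rows
```

Calling load_all_matches_sketch("data") would collect rows from every CSV under a data folder, however the subfolders are arranged. For actual use, stick to the real load_all_matches function, since the notebook depends on its output format.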
Then go ahead and run the notebook and make predictions!
Contact Fraser Ewing
See License file