Sports Analytics – Using Python to create Spain’s ideal starting XI for the World Cup

Header Image

Objective:

Model Spain’s optimal starting XI based on data from the video game FIFA 22.

FIFA 22 Player's Dataset Python Code – Spain's XI

Study:

We will download the FIFA 22 video game dataset for this study. The dataset provides numerous data points and statistics about each individual player; fortunately, this analysis uses only a few of its columns. All the transformations and mining of the dataset are already built into the code, so you don’t have to do them yourself. This includes filtering for the nations that will participate in the tournament and filtering out players who have been injured recently.
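
As a rough illustration of that cleaning step, here is a minimal sketch using pandas. The column names (`short_name`, `nationality_name`, `player_positions`, `overall`), the file name in the comment, the nation list, and the injured list are all assumptions made for this example, and the players and ratings are made-up placeholders rather than values from the real dataset.

```python
import pandas as pd

# Tiny stand-in for the real file. In practice the data would be loaded with
# something like pd.read_csv("players_22.csv") (file name assumed).
players = pd.DataFrame(
    {
        "short_name": ["Player A", "Player B", "Player C", "Player D", "Player E"],
        "nationality_name": ["Spain", "Spain", "Germany", "Spain", "Italy"],
        "player_positions": ["CM", "GK", "CAM, RM", "CDM", "LW, ST"],
        "overall": [85, 82, 87, 86, 84],
        "age": [19, 25, 32, 26, 24],
    }
)

# Hypothetical inputs: the columns we care about, the nations to keep,
# and the players to drop because of injury.
KEEP_COLUMNS = ["short_name", "nationality_name", "player_positions", "overall"]
TOURNAMENT_NATIONS = ["Spain", "Germany"]
INJURED = ["Player D"]


def clean_players(df, nations, injured):
    """Keep the needed columns, the listed nations, and the available players."""
    is_in_tournament = df["nationality_name"].isin(nations)
    is_available = ~df["short_name"].isin(injured)
    return df.loc[is_in_tournament & is_available, KEEP_COLUMNS].reset_index(drop=True)


cleaned = clean_players(players, TOURNAMENT_NATIONS, INJURED)
spain = cleaned[cleaned["nationality_name"] == "Spain"]
print(spain)
```

Keeping the nation list and the injured list as plain inputs means the same function can be pointed at any other team without touching the filtering logic.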

Image 1

Once the data is cleansed, we will build the teams and create the optimal 26-man squad. After that, we will determine the best formations, or the ones that fit Spain best.
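
One way the 26-man squad could be assembled is by giving each line of the team a quota and taking the highest-rated players until the quota is filled. The sketch below assumes that approach; the quotas (3 goalkeepers, 9 defenders, 8 midfielders, 6 forwards), the position-to-line mapping, and the generated player pool are all hypothetical and not taken from the original code.

```python
import pandas as pd

# Assumed mapping from a player's primary position to a line of the team.
LINE_OF = {
    "GK": "GK",
    "CB": "DEF", "LB": "DEF", "RB": "DEF", "LWB": "DEF", "RWB": "DEF",
    "CDM": "MID", "CM": "MID", "CAM": "MID", "LM": "MID", "RM": "MID",
    "LW": "FWD", "RW": "FWD", "CF": "FWD", "ST": "FWD",
}

# Assumed quotas; they add up to 26.
QUOTAS = {"GK": 3, "DEF": 9, "MID": 8, "FWD": 6}


def pick_squad(df, quotas=QUOTAS):
    """Take the top-rated players per line until each quota is filled."""
    df = df.copy()
    # The first position listed is treated as the player's primary position.
    df["primary"] = df["player_positions"].str.split(",").str[0].str.strip()
    df["line"] = df["primary"].map(LINE_OF)
    ranked = df.sort_values("overall", ascending=False, kind="mergesort")
    parts = [ranked[ranked["line"] == line].head(n) for line, n in quotas.items()]
    return pd.concat(parts).reset_index(drop=True)


# Made-up pool of candidates: a few players per position with descending ratings.
rows = []
for pos, count in [("GK", 5), ("CB", 6), ("LB", 3), ("RB", 3),
                   ("CM", 6), ("CDM", 4), ("ST", 5), ("LW", 4)]:
    for i in range(count):
        rows.append({"short_name": f"{pos} {i + 1}",
                     "player_positions": pos,
                     "overall": 85 - i})
pool = pd.DataFrame(rows)

squad = pick_squad(pool)
print(squad[["short_name", "line", "overall"]])
```

Changing `QUOTAS` is enough to try a different balance between the lines, as long as the numbers still add up to the squad size you want.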



With the criteria defined, we will use a series of functions that retrieve the best players at each position. These functions rank Spain’s players by their overall rating. (Unfortunately, the dataset is too extensive, and defining more criteria would just make things messy, hence the decision to use each player’s overall rating as the main metric.)
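
To make the idea concrete, here is a minimal sketch of picking the best player per position by overall rating and comparing formations by their average rating. It is a simple greedy pass, not necessarily the algorithm used in the original code, and it is not guaranteed to find the best possible assignment. The formation slot lists, the column names, and the sample players are assumptions for illustration.

```python
import pandas as pd

# Assumed slot lists for two formations.
FORMATIONS = {
    "4-3-3": ["GK", "RB", "CB", "CB", "LB", "CDM", "CM", "CM", "RW", "ST", "LW"],
    "4-4-2": ["GK", "RB", "CB", "CB", "LB", "RM", "CM", "CM", "LM", "ST", "ST"],
}


def best_xi(df, slots):
    """Greedy pick: for each slot, take the highest-rated unused player who lists it."""
    ranked = df.sort_values("overall", ascending=False, kind="mergesort")
    used = set()
    lineup = []
    for slot in slots:
        for row in ranked.itertuples(index=False):
            positions = [p.strip() for p in row.player_positions.split(",")]
            if row.short_name not in used and slot in positions:
                used.add(row.short_name)
                lineup.append((slot, row.short_name, int(row.overall)))
                break
        else:
            lineup.append((slot, None, None))  # nobody eligible is left for this slot
    return lineup


def average_overall(lineup):
    """Mean rating of a lineup; an unfilled slot counts as 0."""
    return sum(rating or 0 for _, _, rating in lineup) / len(lineup)


# Made-up squad; names and ratings are placeholders.
squad = pd.DataFrame(
    [
        ("Keeper A", "GK", 84), ("Keeper B", "GK", 80),
        ("Back A", "RB", 82), ("Back B", "CB", 85), ("Back C", "CB", 83),
        ("Back D", "CB, RB", 81), ("Back E", "LB", 84),
        ("Mid A", "CDM", 87), ("Mid B", "CM", 86), ("Mid C", "CM, CAM", 83),
        ("Mid D", "CM", 80),
        ("Wing A", "RW, RM", 82), ("Wing B", "LW, LM", 83),
        ("Striker A", "ST", 84), ("Striker B", "ST, LW", 81),
    ],
    columns=["short_name", "player_positions", "overall"],
)

lineups = {name: best_xi(squad, slots) for name, slots in FORMATIONS.items()}
best_formation = max(lineups, key=lambda name: average_overall(lineups[name]))

print(best_formation)
for slot, player, rating in lineups[best_formation]:
    print(slot, player, rating)
```

Because the squad is just a DataFrame passed in as an argument, the same two functions can be reused for any other nation by swapping the input.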

Image 2

The final lines that determine the team are pretty basic: we are just selecting the team whose lineup we want pictured, along with the chosen formation. I invite you to reuse the code, as it works for every team in the World Cup.



Once the hard coding is done and your ideal XI appears in the console, there is a tool I found on the internet that uses the turtle package (a Python package that lets you “paint” on a drawing canvas). I found this piece of code in a YouTube tutorial, in case you want to use it for your future projects. I’m also linking the documentation of the package used to draw the soccer pitch in a separate window.

Link to YouTube Video Documentation of mplsoccer package

I haven’t been able to automate the last part of the code yet, so you must manually input the names of the players that the model selected. Because this is a “painting” over the pitch and it is drawn with vectors, you must write down each player’s name by position. Then you will be done with your ideal XI for the World Cup!
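
If you want to experiment with automating this step, one possible approach is to store a coordinate for each slot of the formation and pair the selected names with those coordinates in order. The sketch below only builds the labels and their positions; the 120 x 80 coordinate grid, the slot coordinates, and the player names are all hypothetical, and the drawing itself is left to whichever plotting tool you use.

```python
# Hypothetical slot coordinates for a 4-3-3 on an assumed 120 x 80 grid
# (x runs along the length of the pitch, y across its width).
SLOT_XY_433 = [
    ("GK", 8, 40),
    ("RB", 35, 10), ("CB", 28, 30), ("CB", 28, 50), ("LB", 35, 70),
    ("CDM", 50, 40), ("CM", 65, 25), ("CM", 65, 55),
    ("RW", 95, 12), ("ST", 105, 40), ("LW", 95, 68),
]


def label_positions(lineup_names, slot_xy):
    """Pair each selected player with the coordinate of their slot, in order."""
    if len(lineup_names) != len(slot_xy):
        raise ValueError("the lineup and the formation must have the same number of slots")
    return [
        (x, y, f"{slot}: {name}")
        for (slot, x, y), name in zip(slot_xy, lineup_names)
    ]


# Placeholder names, listed in the same order as the slots above.
selected = [
    "Keeper A", "Back A", "Back B", "Back C", "Back E",
    "Mid A", "Mid B", "Mid C", "Wing A", "Striker A", "Wing B",
]

labels = label_positions(selected, SLOT_XY_433)
for x, y, text in labels:
    print(x, y, text)
```

Each `(x, y, text)` tuple could then be handed to a text-drawing call, for example matplotlib’s `ax.text(x, y, text)`, instead of typing every name by hand.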

Image 3

This is what the final product should look like; it will vary depending on the formation and players. But this is it! As a fun fact, I finished this project on November 27th, 2022, while Spain was playing against Germany. The model selected 6 of the 11 players who started that game (though I included 3 players that the national coach decided not to call up for the World Cup). I encourage you to try it for other nations and see the results for yourself!

Image 4

*This code was inspired by the internet and various other sources; the dataset was obtained from FIFA (the video game) and can be found online. If you intend to use the code for your personal projects or learning, I encourage you to do so. If you have any questions, send me an email at aserranofigueroa@su.suffolk.edu

-Alfredo S.