100 Days of ML — Day 17 — Feature Selection: What does the University of Maryland have to do with…



A few years ago, the University of Maryland did something that made me and a number of other hardcore fans of the campus and its sports cry. Like ugly cry. The University of Maryland elected to leave the ACC to join the Big Ten. My sports viewing has never been the same. Now I have to click the dropdown twice to get all the information that’s relevant to me.

I remember a college writer had an awesome article in the Testudo Times about this move, and the writing always stuck with me. Ben Broman said, “The University itself is its state’s premier institution, a large land-grant public in an urban environment with a very large enrollment.” I was amazed by his ability to deconstruct and compare things and connect them to the larger picture.

So what does this have to do with Germany and the World Cup?

Artificial Intelligence and Machine Learning are going to forever change the face of sports in a constant cat-and-mouse chess match over who’s using the most relevant data at the right time (or someone will wise up and hire Siraj and always win championships).

AI was used to predict that Germany would steamroll the competition in the World Cup. They didn’t even make it out of the group stage. The big issue with this prediction and other bracketology predictions is the steady-state, Markov chain nature of it. A neural network is great for data that is series agnostic. Playoffs and tournaments are entirely dependent on outcomes: a team only plays the next round if it survived the previous one. Germany doesn’t get a chance to win the tournament if it doesn’t pass that first round.
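To put rough numbers on that dependence, here is a minimal sketch. The per-round win probabilities are made up for illustration (they are not from any real prediction), and the sketch assumes each round can be treated as independent once a team has survived the previous one.

```python
from math import prod

def title_probability(round_win_probs):
    """Chance of winning every round in sequence.

    Each entry is the (assumed) probability of getting through that round,
    given the team survived all earlier rounds.
    """
    return prod(round_win_probs)

# Hypothetical favorite: strong in every single round, from the first
# round through the final. These numbers are invented for illustration.
favorite = [0.80, 0.75, 0.70, 0.65, 0.60]

p_title = title_probability(favorite)
print(f"Chance of winning it all: {p_title:.1%}")
```

Even a team favored in every individual round ends up with a fairly small chance of lifting the trophy, because one slip anywhere along the path ends the run.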

Second, and this is the most important part of the article, you have to be really careful about feature selection. Feature selection is the process of choosing which pieces of data are important enough to feed the neural network so that it consistently learns good weights and produces good outputs. (And there’s probably a better way to say that, I’m just on a time crunch.)
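For a concrete feel, here is a small sketch of one simple approach to feature selection: score each candidate feature by how strongly it correlates with the thing you want to predict, then keep the top few. This is just one basic filter-style technique, not necessarily what any World Cup model used, and the feature names and numbers below are invented toy data.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant column carries no signal
    return cov / (spread_x * spread_y)

def select_top_k(features, target, k):
    """Return the k feature names with the largest absolute correlation to target."""
    ranked = sorted(
        features,
        key=lambda name: abs(pearson(features[name], target)),
        reverse=True,
    )
    return ranked[:k]

# Toy per-match data (hypothetical): two plausible features and one junk feature.
features = {
    "shots_on_target": [3, 5, 2, 8, 6, 4],
    "possession_pct": [48, 55, 50, 62, 58, 51],
    "jersey_number_sum": [100, 120, 110, 105, 95, 115],
}
goals_scored = [1, 2, 0, 3, 2, 1]

selected = select_top_k(features, goals_scored, k=2)
print(selected)
```

On this toy data the junk feature gets dropped. Real feature selection is messier: correlation only catches linear relationships, and features that look useless alone can matter in combination.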

There are many ways to approach this in sports, and the wonderful thing about sports is that there’s always new data to play with and use to optimize the neural networks. I’d point them out here, but I’m trying to pitch them to team owners. (Fingers crossed.)

My guess is that the AI that predicted Germany as the World Cup champions did not (or, at the time, could not) account for changes in coaching strategies and individual player decisions. I imagine they had a huge spreadsheet duct-taped together from SQL and Excel and threw darts at the features. I imagine the 2020 Olympics and 2022 World Cup will make fantastic use of Reinforcement Learning and whatever new tech is discovered or proven in the meantime.


Source: Deep Learning on Medium