Original article can be found here (source): Deep Learning on Medium
Predicting the Winning Side in Esports Using Tensorflow (2)
We all know the unreasonable effectiveness of neural networks in classification when there is big data on hand. Once a significant amount of data has been gathered, labelled, and pre-processed, the simple process of importing the TensorFlow library and fiddling around with the most basic neural network structure can give surprising results.
This series of articles walks through how I built a neural network model that consistently reaches about 70% accuracy in predicting the winning side of a “League of Legends” game, using only information available before the match starts. It will then continue with my research into increasing that prediction accuracy.
This is by no means a refined article, and at times I’ll blabber on and you’ll get distracted.
The features I have used for my model are win-rates.
An obvious fact about a League of Legends game is that a team of good players controlling strong champions would win. Here being a “good player” means more specifically that the player is good at controlling that specific champion, as the player has to be able to match the champion’s play-style to maximize the champion’s potential.
So the problem becomes, what data shows the following two features in numbers?
- How well a player controls a certain champion
- How strong a champion is in the current meta
The first feature is captured by the player’s win-rate with that specific champion. For example, a player who only plays a champion named Nidalee will most likely have a high win-rate when playing that champion.
The second feature is captured by the win-rate of that specific champion in the current meta/patch. A strong champion is known to have a high win-rate (usually 53% or higher when all tiers are combined).
We collect this data for every player on each team, so that each input vector to the neural net contains 20 data points, like the following:
[0.5114, 0.52, 0.5275, 0.619, 0.5074, 0.727, 0.4999, 0.517, 0.5187, 0.659, 0.5034, 0.0, 0.5005, 0.5, 0.4448, 0.257, 0.5065, 0.286, 0.5199, 0.544]
Every element at index 2*i (0≤i≤9) is the champion’s win-rate in the current meta/patch, and every element at index 2*i+1 (0≤i≤9) is the player’s win-rate while controlling that specific champion.
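The interleaved layout can be sketched as below. This is only a minimal illustration, assuming each player is represented as a (champion win-rate, player win-rate on that champion) pair. The name `build_input_vector` and the pair grouping are my own assumptions for the sketch, not the original pre-processing code.

```python
from typing import List, Sequence, Tuple

# One (champion_win_rate, player_win_rate_on_champion) pair per player.
PlayerStats = Tuple[float, float]


def build_input_vector(players: Sequence[PlayerStats]) -> List[float]:
    """Flatten ten players into a 20-element vector.

    Even indices hold the champion's win-rate in the current patch,
    odd indices hold the player's win-rate on that champion.
    """
    if len(players) != 10:
        raise ValueError("expected 10 players (5 per team)")
    vector: List[float] = []
    for champion_wr, player_wr in players:
        vector.extend([champion_wr, player_wr])
    return vector


# The example vector from the article, regrouped as pairs.
players = [
    (0.5114, 0.52), (0.5275, 0.619), (0.5074, 0.727), (0.4999, 0.517),
    (0.5187, 0.659), (0.5034, 0.0), (0.5005, 0.5), (0.4448, 0.257),
    (0.5065, 0.286), (0.5199, 0.544),
]
x = build_input_vector(players)
```

Here `x[::2]` gives the ten champion win-rates and `x[1::2]` gives the ten player win-rates.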
Model and Result
I won’t go too deeply into the model, as it is the most basic deep neural network that you could imagine. I used the tflearn library to build a fully-connected neural network with four hidden layers. In retrospect, I probably did not need as many as four hidden layers; one or two might have sufficed. Batch training was used. I’m not going to disclose the parameters and hyper-parameters, as they should be pretty easy to adjust when trying to reproduce the results. Various parameters and hyper-parameters were tried, but the performance of the network remained consistent throughout.
The exact numbers vary somewhat with the parameters/hyper-parameters, but the most representative ones were as follows:
Training accuracy: 0.7255
Validation accuracy: 0.6905
Test accuracy: 0.7033
Also, as softmax activation was used at the output layer, the output is a vector of size two showing the “confidence” the network has in the match being a victory or a loss (for example, [0.317, 0.683] would show that the model believes the match to be a loss). If we gather all the matches where the model had 0.8 or more “confidence” that the match will result in a victory, then we get 76.76% accuracy.
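The confidence filter can be sketched roughly like this. It is a minimal sketch and not the original evaluation code: I am assuming “confidence” means the larger of the two softmax outputs (so confident loss predictions are kept too) and that index 0 is victory and index 1 is loss, in line with the example vector. The function name and the sample numbers are made up for illustration.

```python
from typing import Sequence, Tuple


def confident_accuracy(
    probs: Sequence[Sequence[float]],
    labels: Sequence[int],
    threshold: float = 0.8,
) -> Tuple[float, int]:
    """Accuracy over predictions whose top softmax probability >= threshold.

    probs[i] is a two-element softmax output and labels[i] is the index
    of the true class. Returns (accuracy, number_of_confident_matches).
    """
    correct = 0
    kept = 0
    for p, y in zip(probs, labels):
        predicted = max(range(len(p)), key=lambda k: p[k])
        if p[predicted] < threshold:
            continue  # the model is not confident enough; skip this match
        kept += 1
        correct += int(predicted == y)
    if kept == 0:
        return float("nan"), 0
    return correct / kept, kept


# Made-up outputs for illustration only (assumed: 0 = victory, 1 = loss).
probs = [[0.9, 0.1], [0.317, 0.683], [0.15, 0.85], [0.82, 0.18]]
labels = [0, 1, 0, 0]
acc, n = confident_accuracy(probs, labels)
```

Raising the threshold trades coverage for accuracy: fewer matches pass the filter, but the ones that do are the ones the model is most sure about.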
Big Room for Improvement
What remains so promising about this model is that there is a lot of room for improvement, and therefore a possibility of higher accuracy.
- Red side / Blue side
When placed into a League of Legends match, each player is assigned to either the red side or the blue side. Traditionally, there has been a difference in win-rate depending on which side you’re positioned on, and while in the current patch there is no big difference (50.8% win-rate for the blue side), it is a number that changes from patch to patch and can be taken into consideration.
- Champion Synergy
Synergy is one of the features included in previous studies, and is solely responsible for the 74% prediction accuracy recorded by “To win or not to win.” Coming up with a way to accurately represent champion synergy and feed it in as an input feature could enhance the model’s accuracy.
- Refinement of Data
In the current data used for the train/dev/test sets, some data were impossible to find and therefore were left to chance. This is because I relied heavily on op.gg and other League of Legends analytics websites to gather champion statistics specific to players.
For example, if the Python crawling script could not find a champion-specific win-rate for a player while pre-processing the data, that win-rate was automatically assigned a value of 0.5. This was a reasonable choice because, in most cases, the champion-specific win-rate is missing because the player has rarely used that champion, which means that the player is just as likely to be bad at using the champion as to be good at it.
By devoting more time to crafting my own League of Legends database, I could gather data with fewer holes in it, and therefore improve the performance of the NN model.
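The 0.5 fallback amounts to something like the following. This is a minimal sketch, assuming the crawl results are stored in a dictionary keyed by (player, champion); the names `lookup_win_rate` and `scraped`, and the sample entry, are hypothetical and not from my actual script.

```python
from typing import Dict, Tuple

DEFAULT_WIN_RATE = 0.5  # neutral value used when nothing was found


def lookup_win_rate(
    scraped: Dict[Tuple[str, str], float],
    player: str,
    champion: str,
) -> float:
    """Player's win-rate on a champion, or 0.5 if the crawl had no entry."""
    return scraped.get((player, champion), DEFAULT_WIN_RATE)


# Hypothetical crawl results keyed by (player, champion).
scraped = {("player_a", "Nidalee"): 0.619}
known = lookup_win_rate(scraped, "player_a", "Nidalee")
missing = lookup_win_rate(scraped, "player_a", "Ahri")
```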
- Form of a Player
A common phenomenon among players is that they sometimes get on a “roll.” This means that when a player has won several matches in a row, they feel likely to win the next match as well, and vice versa. In professional tournament settings, it is nearly a given that a team can perform very differently depending on the form of its players. KT Rolster in LCK Spring 2020, for example, performed extremely poorly for the majority of the season, but showed remarkable prowess in the last two weeks of the tournament, owing to the improved “form” of the players and the team. This could also be an additional feature that impacts match prediction.
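I have not settled on how to quantify form yet, but two simple candidates would be a recent win-rate and a signed streak length. The sketch below is only one possible way to do it; the function names, the window of 5 matches, and the 0.5 default for players with no history are all assumptions for illustration.

```python
from typing import Sequence


def recent_form(results: Sequence[bool], window: int = 5) -> float:
    """Win-rate over the last `window` matches (True = win).

    Returns 0.5 when there is no match history at all.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    recent = list(results)[-window:]
    if not recent:
        return 0.5
    return sum(recent) / len(recent)


def current_streak(results: Sequence[bool]) -> int:
    """Signed streak: +3 for three straight wins, -2 for two straight losses."""
    streak = 0
    for won in reversed(results):
        if streak == 0:
            streak = 1 if won else -1
        elif (streak > 0) == won:
            streak += 1 if won else -1
        else:
            break  # the most recent run of identical results has ended
    return streak
```

Either value could be appended to the input vector for each player, with `results` ordered from oldest to most recent match.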
- Champion Composition
Very similar to champion synergy, yet a bit different, champion composition refers to how a team is structured before a match. For example, the blue side team might have one healer champion, two tank champions, and two dealer champions, which, depending on the patch/meta, could influence that team’s chance of winning.
Another distinction could be the “early-game” champions and “late-game-carry” champions. These labels differentiate champions based on when that champion has the most influence on the game, and therefore could be another way to represent team composition.
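One simple way to turn a composition into numbers would be to count how many champions on a team carry each label. The sketch below uses the healer/tank/dealer labels from the example; the `role_of` mapping is a hypothetical lookup table filled in by hand for illustration, not data I have collected, and the same idea would apply to “early-game” and “late-game-carry” labels.

```python
from collections import Counter
from typing import Dict, List, Sequence

ROLES = ["healer", "tank", "dealer"]  # labels from the example in the text


def composition_vector(team: Sequence[str], role_of: Dict[str, str]) -> List[int]:
    """Count how many champions on a team fall under each role label."""
    counts = Counter(role_of[champion] for champion in team)
    return [counts[role] for role in ROLES]


# Hypothetical role assignments, for illustration only.
role_of = {
    "Soraka": "healer",
    "Malphite": "tank",
    "Ornn": "tank",
    "Jinx": "dealer",
    "Zed": "dealer",
}
blue = ["Soraka", "Malphite", "Ornn", "Jinx", "Zed"]
blue_vec = composition_vector(blue, role_of)  # [healers, tanks, dealers]
```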
As I am serving in the military right now, there are limits to how fast and efficiently I can pursue this research. But I’ll do it nonetheless and keep posting what I find.
Steps that I will take next include:
- Learning and setting up DynamoDB on AWS.
- Creating script files to set up my own database on DynamoDB.
- Signing up for a development API key for the Riot API.
- Learning how to represent champion synergies and compositions.