Is this calculation based on 10 genres? i.e. 40000 / 10 * 2 = 8000

Source: Deep Learning on Medium

Even if we have an ideal movie-genre dataset, where all genres are equally frequent and each movie has an average of 2 genres: if the data size is 40000, then each genre will occur around 8000 times. We still have an imbalanced dataset, because the network sees any given genre only 20% of the time. In …
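
To check the numbers in the quoted passage, here is a minimal sketch of the arithmetic. The genre count of 10 is an assumption taken from the question above, not something the quoted text states, and `genre_stats` is a hypothetical helper name used only for illustration.

```python
def genre_stats(n_movies, n_genres, avg_genres_per_movie):
    """Per-genre count and positive rate for a perfectly balanced multi-label dataset."""
    # Total number of (movie, genre) labels in the dataset.
    total_labels = n_movies * avg_genres_per_movie
    # With all genres equally frequent, each genre gets an equal share of the labels.
    per_genre = total_labels / n_genres
    # Fraction of movies that carry any one given genre.
    positive_rate = per_genre / n_movies
    return per_genre, positive_rate


# Assumed values: 40000 movies, 2 genres per movie on average, 10 genres (assumption).
per_genre, positive_rate = genre_stats(40000, 10, 2)
print(per_genre, positive_rate)  # 8000.0 0.2
```

Under these assumptions the positive rate reduces to avg_genres_per_movie / n_genres and does not depend on the dataset size, so 2 / 10 gives 20% positives and 80% negatives for every genre.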