this explanation is so ambiguous.

Original article was published on Deep Learning on Medium

…ot of the distribution for values ranging from 0–1 for the default threshold. As shown in the plot, the probability that a random draw exceeds P increases with a word’s frequency, so the more frequent a word is, the more likely it is to be discarded. This applies only to unsupervised models; words are never discarded in a supervised model.
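
To make the relationship concrete, here is a minimal Python sketch. It assumes that P is the per-word value fastText computes as sqrt(t/f) + t/f, where f is the word's relative frequency and t is the sampling threshold (default 1e-4), and that a word is dropped when a uniform random draw exceeds P. The function names are hypothetical and chosen for illustration; they are not part of the fastText API.

```python
import math
import random


def keep_threshold(freq, t=1e-4):
    """P for a word with relative frequency `freq` (0 < freq <= 1).

    Assumed formula: sqrt(t / f) + t / f, with t the sampling threshold.
    """
    ratio = t / freq
    return math.sqrt(ratio) + ratio


def discard_probability(freq, t=1e-4):
    """Chance that a uniform draw in [0, 1) is greater than P."""
    # P can exceed 1 for rare words, in which case nothing is discarded.
    return max(0.0, 1.0 - keep_threshold(freq, t))


def should_discard(freq, rng, t=1e-4, supervised=False):
    """Decide for one occurrence of a word whether to drop it."""
    if supervised:
        # Supervised models keep every word.
        return False
    return rng.random() > keep_threshold(freq, t)


if __name__ == "__main__":
    rng = random.Random(0)
    for f in (1e-4, 1e-3, 1e-2, 1e-1, 1.0):
        print(f"f={f:g}  P={keep_threshold(f):.4f}  "
              f"discard_prob={discard_probability(f):.4f}")
    print(should_discard(0.05, rng))
```

Under these assumptions, a word whose relative frequency equals the threshold (f = 1e-4) has P = 2 and is never discarded, while a word making up 1% of the corpus has P = 0.11 and is dropped about 89% of the time.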