Deep Learning — Cross Entropy Loss Derivative

In this article, I will explain the concept of the Cross-Entropy Loss, which is usually paired with the softmax function in what is commonly called the “Softmax Classifier”. I’ll go through its use in Deep Learning classification tasks and the mathematics of the derivatives required for the Gradient Descent algorithm.
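As a rough preview of where the derivation ends up, here is a minimal sketch in plain Python. It assumes a single training example with an integer class label, and it uses the well-known result that the gradient of the loss with respect to the logits is the softmax output minus the one-hot target. The function names and the example logits are hypothetical, chosen only for illustration.

```python
import math

def softmax(logits):
    # Subtract the max logit for numerical stability; the result is unchanged.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    # Loss for one example: negative log-probability of the true class.
    probs = softmax(logits)
    return -math.log(probs[target])

def cross_entropy_grad(logits, target):
    # Gradient with respect to the logits: softmax(z) - one_hot(target).
    probs = softmax(logits)
    return [p - (1.0 if i == target else 0.0) for i, p in enumerate(probs)]

def numerical_grad(logits, target, eps=1e-6):
    # Central finite differences, used only to sanity-check the analytic gradient.
    grads = []
    for i in range(len(logits)):
        up = list(logits)
        up[i] += eps
        down = list(logits)
        down[i] -= eps
        grads.append((cross_entropy(up, target) - cross_entropy(down, target)) / (2 * eps))
    return grads

logits = [2.0, 1.0, 0.1]  # hypothetical scores for three classes
target = 0                # hypothetical index of the true class
analytic = cross_entropy_grad(logits, target)
numeric = numerical_grad(logits, target)
```

Comparing `analytic` with `numeric` is a quick way to check the derivative: the two lists should agree to several decimal places, and the analytic gradient should sum to roughly zero because the softmax probabilities sum to one.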


This blog post contains many LaTeX equations, and unfortunately Medium does not support them yet. The clever workaround I found still does not look good and makes for a bad reading experience.

Until LaTeX support is available, you can find the full article on my personal blog.