Neural Network training explained simply.

Source: Deep Learning on Medium

An untrained model would have the exact same infrastructure of tracks and junctions through each layer (there are many reusable architectures, such as ResNet34, ResNet50, Inception, AHA, etc.). But imagine that everyone who knew anything about trains had gone on strike and, after an unfortunate impasse, eventually found work as Uber drivers and Amazon fulfillment drones. Now you have to literally train new employees to learn all the switches.

Training a model is like this: for an entire day, every train that left Stockholm would simply end up at a random destination. At the end, an employee with a mustache and clipboard would come up to you and say “Is this where you wanted to go?” and you’d say “No, my ticket says Madrid. This is Kiev!” But, not to be bothered, you’d go about your business and order some borscht and potato pancakes and listen to some Ukrainian techno.

The employee would then take your feedback and send it back up the chain, saying “For this particular ticket, he was supposed to end up in Madrid. Everyone take note!” And he’d do the same for all the other trains that missed their marks. All the train switch operators would then take note and update their Ticket/Destination tables. In other words, they’d adjust their weights.

The following day, every train leaving Stockholm would try again. Because there are so many countries and so many junctions and destinations, maybe this time some of the trains would still end up at random places, but maybe some would end up closer than before, and maybe some would end up on target.

The companies would continue this training until every ticket ended up at its respective destination. And finally they’d consider it a trained network.
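
For readers who want to see the loop in code, here is a minimal sketch of the same idea. It is a deliberately tiny stand-in, not a real multi-layer network: each ticket gets a single row of scores (the "switch settings"), one per destination, and the feedback step is a plain gradient-descent update on a softmax. The ticket names, the destination list, and the `train` and `route` helpers are all made up for illustration.

```python
import math
import random

DESTINATIONS = ["Madrid", "Kiev", "Paris", "Rome"]
# Hypothetical tickets: ticket name -> destination printed on the ticket.
TICKETS = {"t1": "Madrid", "t2": "Paris", "t3": "Rome", "t4": "Kiev"}


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(weights, ticket):
    """Send one train: it ends up at the destination with the highest score."""
    scores = weights[ticket]
    return DESTINATIONS[scores.index(max(scores))]


def train(tickets, learning_rate=0.5, max_days=1000, seed=0):
    rng = random.Random(seed)
    # Untrained network: random switch settings (weights) for every ticket.
    weights = {t: [rng.uniform(-1, 1) for _ in DESTINATIONS] for t in tickets}
    for day in range(1, max_days + 1):
        misses = 0
        for ticket, target in tickets.items():
            # Where did today's train actually end up?
            if route(weights, ticket) != target:
                misses += 1
            # Feedback: nudge scores toward the intended destination.
            probs = softmax(weights[ticket])
            for i, dest in enumerate(DESTINATIONS):
                wanted = 1.0 if dest == target else 0.0
                # (probs[i] - wanted) is the cross-entropy gradient per score.
                weights[ticket][i] -= learning_rate * (probs[i] - wanted)
        if misses == 0:
            return weights, day  # every train arrived on target today
    return weights, max_days
```

Calling `train(TICKETS)` returns the learned weights and the number of "days" it took before every train arrived where its ticket said. A real network shares weights across many junctions and layers rather than keeping one private row per ticket, which is why real training takes far more days than this toy does.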

Using a trained network is like this: if your network is trained, you can feed data into it, and your network will know what output to give you. You can continually provide feedback that it was right, or that it made a mistake, and it will gradually adjust the weights with the goal of reducing the amount of error between what was intended and what was actually output. Alternatively, you can just leave it as is and consider it your finished product. Not all networks undergo continual updating and training once the training phase is completed.
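
The two options can be sketched side by side. This is again only a toy under stated assumptions: `trained_weights` is a hypothetical table of scores left over from training, `predict` only reads it, and `give_feedback` is an optional extra step that applies one small correction (a softmax gradient step, chosen here for simplicity).

```python
import math

DESTINATIONS = ["Madrid", "Kiev", "Paris"]
# Hypothetical weights left over from training: one score per destination.
trained_weights = {"t1": [2.0, -1.0, 0.5], "t2": [0.2, 0.1, 1.5]}


def predict(weights, ticket):
    """Inference only: read the weights, never change them."""
    scores = weights[ticket]
    return DESTINATIONS[scores.index(max(scores))]


def give_feedback(weights, ticket, correct_destination, learning_rate=0.5):
    """Optional continued training: one small corrective step in place."""
    scores = weights[ticket]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    for i, dest in enumerate(DESTINATIONS):
        wanted = 1.0 if dest == correct_destination else 0.0
        # Shrink the gap between the predicted and the intended output.
        scores[i] -= learning_rate * (exps[i] / total - wanted)
```

If you only ever call `predict`, the network is a finished product and its weights stay frozen. If the world changes (say, ticket `t2` should now go to Kiev instead of Paris), calling `give_feedback(trained_weights, "t2", "Kiev")` a few times gradually shifts the scores until `predict` agrees.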