An Intuitive Introduction to Boltzmann Machines

Source: Deep Learning on Medium


Boltzmann Machines were introduced in 1985 by David Ackley, Geoffrey Hinton, and Terrence Sejnowski. Hinton, a professor at the University of Toronto, is a leading figure in the deep learning community and is referred to by some as the “Godfather of Deep Learning”.

  • A Boltzmann Machine is a generative, unsupervised model: it learns a probability distribution from an original dataset and uses it to make inferences about never-before-seen data.
  • A Boltzmann Machine has an input layer (also referred to as the visible layer) and one or several hidden layers.
  • Its neurons are connected not only to neurons in other layers but also to neurons within the same layer.
  • Everything is connected to everything. Connections are bidirectional: visible neurons are connected to each other, and hidden neurons are also connected to each other.
  • A Boltzmann Machine does not map inputs to outputs; it generates data. Every neuron generates information, regardless of whether it is hidden or visible.
  • To a Boltzmann Machine all neurons are the same; it does not discriminate between hidden and visible neurons. The whole network is one system, and the machine generates states of that system.
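The "everything is connected to everything" picture can be sketched as a single symmetric weight matrix over all units. The snippet below is a minimal illustration, not the original formulation verbatim: the helper names, the unit counts, and the random initial weights are assumptions made for the example. It uses the standard energy of a binary state, where lower energy corresponds to a more probable state.

```python
import numpy as np

def make_symmetric_weights(n_units, rng):
    """Random symmetric weights with a zero diagonal (no self-connections)."""
    w = rng.normal(0.0, 0.1, size=(n_units, n_units))
    w = (w + w.T) / 2.0          # bidirectional: w[i, j] == w[j, i]
    np.fill_diagonal(w, 0.0)
    return w

def energy(state, weights, biases):
    """Energy of one binary state vector; lower energy means more probable."""
    return -0.5 * state @ weights @ state - biases @ state

rng = np.random.default_rng(0)
n_visible, n_hidden = 3, 2           # illustrative sizes (assumption)
n_units = n_visible + n_hidden       # the network treats all units alike
weights = make_symmetric_weights(n_units, rng)
biases = np.zeros(n_units)

state = rng.integers(0, 2, size=n_units).astype(float)
print("state:", state, "energy:", energy(state, weights, biases))
```

Note that the split into visible and hidden units appears only in how we count them; the weight matrix and the energy make no distinction between the two.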

The best way to think about it is through an example: a nuclear power plant.

  • Suppose we have a nuclear power station. There are certain things we can measure in the plant, such as the temperature of the containment building, how quickly the turbine is spinning, and the pressure inside the pump. These play the role of the visible neurons.
  • There are also many things we are not measuring, such as the wind speed, the moisture of the soil at this specific location, or whether it is a sunny or a rainy day. These play the role of the hidden neurons.
  • All these parameters together form a system; they all work together. In this simplified picture every parameter is binary, so we get a whole bunch of binary numbers that tell us something about the state of the power station.
  • What we want is to notice when the plant is going into an unusual state, one that is not like the normal states we have seen before. We do not want to use supervised learning for that, because we do not want to have any examples of states that cause the plant to blow up.
  • We would rather detect that the plant is going into such a state without ever having seen such a state before. We can do that by building a model of the normal states and noticing when the current state differs from them.
  • That is what a Boltzmann Machine represents.
  • The way this works is that we feed our training data into the Boltzmann Machine as input to help the system adjust its weights. The resulting model resembles our own plant, not any nuclear power station in the world.
  • From the input it learns what the possible connections between all these parameters are and how they influence each other, and so it becomes a machine that represents our system.
  • We can use this Boltzmann Machine to monitor our system.
  • The Boltzmann Machine learns how the system works in its normal states from good examples.
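The monitoring idea above can be sketched on a toy scale. The example below makes several simplifying assumptions that are not in the text: it uses only three hypothetical binary sensors, it has no hidden units at all, and it computes probabilities exactly by enumerating every state, which is only feasible for a handful of units. It fits the model to normal readings and then checks that a never-seen, inconsistent reading gets a lower log-probability.

```python
import itertools
import numpy as np

def all_states(n_units):
    """Every binary state of n_units units, one per row (2**n_units rows)."""
    return np.array(list(itertools.product([0.0, 1.0], repeat=n_units)))

def neg_energy(states, weights, biases):
    """Negative energy of each row of `states`; higher means more probable."""
    pairwise = 0.5 * np.einsum("si,ij,sj->s", states, weights, states)
    return pairwise + states @ biases

def log_prob(states, weights, biases):
    """Exact log-probability of each row, normalised over all 2**n states."""
    everything = all_states(weights.shape[0])
    log_z = np.log(np.sum(np.exp(neg_energy(everything, weights, biases))))
    return neg_energy(states, weights, biases) - log_z

def fit(data, n_steps=500, learning_rate=0.1):
    """Gradient ascent on the likelihood: match data statistics to model statistics."""
    n_units = data.shape[1]
    everything = all_states(n_units)
    weights = np.zeros((n_units, n_units))
    biases = np.zeros(n_units)
    data_corr = data.T @ data / len(data)   # how often pairs are on together
    data_mean = data.mean(axis=0)           # how often each unit is on
    for _ in range(n_steps):
        p = np.exp(log_prob(everything, weights, biases))
        model_corr = everything.T @ (everything * p[:, None])
        model_mean = p @ everything
        grad_w = data_corr - model_corr
        np.fill_diagonal(grad_w, 0.0)       # keep self-connections at zero
        weights += learning_rate * grad_w
        biases += learning_rate * (data_mean - model_mean)
    return weights, biases

# Hypothetical sensors: [turbine_fast, pump_pressure_high, temperature_high].
# In the normal readings the first two always agree.
normal_states = np.array([
    [0, 0, 0], [1, 1, 0], [0, 0, 0], [1, 1, 0],
    [1, 1, 1], [0, 0, 0], [1, 1, 0], [0, 0, 0],
], dtype=float)
weights, biases = fit(normal_states)

unusual = np.array([[1.0, 0.0, 1.0]])       # never seen during training
normal_scores = log_prob(normal_states, weights, biases)
unusual_score = log_prob(unusual, weights, biases)[0]
print("lowest normal log-probability:", normal_scores.min())
print("unusual log-probability:      ", unusual_score)
```

In a monitoring setting one would flag any reading whose log-probability falls below a chosen threshold. A real Boltzmann Machine would also include hidden units and would have to estimate the model statistics by sampling instead of enumeration.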

Boltzmann Machines consist of a neural network with an input layer and one or several hidden layers. The neurons make stochastic decisions about whether to turn on or off, based on the data we feed in during training and the cost function the Boltzmann Machine is trying to minimize.
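A "stochastic decision" here can be sketched as follows: each neuron looks at the current state of every other neuron, turns that total input into a probability with a sigmoid, and then switches on with that probability. This is a minimal sketch of the usual sampling rule, with the function names, network size, and random weights chosen only for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_sweep(state, weights, biases, rng):
    """Update every unit once, in random order, given the state of all others."""
    state = state.copy()
    for i in rng.permutation(len(state)):
        # Total input to unit i (weights has a zero diagonal, so no self-input).
        total_input = weights[i] @ state + biases[i]
        p_on = sigmoid(total_input)
        # The stochastic decision: on with probability p_on, otherwise off.
        state[i] = 1.0 if rng.random() < p_on else 0.0
    return state

rng = np.random.default_rng(0)
n_units = 5                                   # illustrative size (assumption)
weights = rng.normal(0.0, 0.5, size=(n_units, n_units))
weights = (weights + weights.T) / 2.0         # symmetric, bidirectional
np.fill_diagonal(weights, 0.0)
biases = np.zeros(n_units)

state = rng.integers(0, 2, size=n_units).astype(float)
for sweep in range(3):
    state = gibbs_sweep(state, weights, biases, rng)
    print("sweep", sweep, "state", state)
```

Repeating such sweeps many times lets the network wander through states in proportion to how probable the model considers them, which is how it "generates" states of the system.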

By doing so, the Boltzmann Machine discovers interesting features about the data, which help model the complex underlying relationships and patterns present in the data.

Boltzmann Machines use neural networks whose neurons are connected not only to neurons in other layers but also to neurons within the same layer. That makes training an unrestricted Boltzmann Machine very inefficient, and as a result Boltzmann Machines have had very little commercial success.
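To get a rough feel for the cost, the sketch below counts connections and joint states for a given network size. The function names are made up for this example, and the "between layers only" count is included purely as a point of comparison for a network without within-layer connections.

```python
def count_connections(n_visible, n_hidden):
    """Return (fully connected count, between-layers-only count)."""
    n_units = n_visible + n_hidden
    fully_connected = n_units * (n_units - 1) // 2   # every pair of units
    between_layers_only = n_visible * n_hidden       # no within-layer links
    return fully_connected, between_layers_only

def count_joint_states(n_visible, n_hidden):
    """Number of distinct binary states the whole network can be in."""
    return 2 ** (n_visible + n_hidden)

for n_visible, n_hidden in [(3, 2), (100, 50)]:
    full, between = count_connections(n_visible, n_hidden)
    print(n_visible, n_hidden, "connections:", full, "vs", between)
    print("  joint states:", count_joint_states(n_visible, n_hidden))
```

Even the tiny 3 + 2 network has 10 connections and 32 joint states; the number of states doubles with every added neuron, so exact computation over all states quickly becomes impossible.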

Conclusion

The Boltzmann Machine is a very generic bidirectional network of connected neurons. For instance, neurons within a given layer are interconnected, which adds within-layer weights to the mathematical representation of the network. Consequently, the learning process for such a network architecture is computationally intensive and difficult to interpret.

I hope this article helped you get an intuitive understanding of Boltzmann Machines. I think it at least provides a good explanation and a high-level view of the architecture.