Catapulting Artificial Intelligence Ahead with Hierarchical Temporal Memory Models

Source: Deep Learning on Medium

The prime examples of Artificial Intelligence today — image recognition, voice, and speech recognition — are powered by neural networks.

However, neural networks face two major challenges:

  1. They require massive amounts of annotated training data, which can slow deployments.
  2. They are of limited use on rapidly changing data.

AI models leveraging Hierarchical Temporal Memory (HTM) are an emerging solution to this problem. HTM models don’t require large training data quantities, effectively mimic human learning, and are ideal for constantly changing real-time data.

Moreover, they can solve a range of business problems including fraud detection, cybersecurity, and speech and image recognition, increasing AI’s overall enterprise utility.

Closer to Human Learning

Neural networks don’t learn quite the same way humans do. People don’t need vast amounts of training data to learn. For instance, a child can look at a fruit they’ve never seen before, such as a mango, and recognize other mangos from that single example by associating it with what they already know.

HTM models replicate this functionality, which mirrors how the brain’s neocortex works, through the concept of Sparse Distributed Representation (SDR). SDRs are how data are represented in HTM models, in a way similar to how neurons function in the brain.

For example, when people think about something, certain neurons fire. Neurons are connected by synapses, and the more interconnected neurons fire together, the more we learn about a subject. SDRs emulate this behavior.

An SDR is essentially a large binary matrix consisting predominantly of zeros. When building an HTM model to recognize the word ‘dog’, for example, a small number of matrix cells flip from 0 to 1. Only about two percent of the cells turn on, hence the name Sparse Distributed Representation. In this manner, SDRs can represent almost anything, making them useful for text analytics, image recognition, and speech recognition.
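To make this concrete, an SDR can be sketched as a set of active bit indices in a large, mostly-zero binary vector. The vector size, the two-percent sparsity, and the word-to-bits mapping below are illustrative assumptions, not a real encoder:

```python
import random

SDR_SIZE = 2048      # total number of bits (illustrative)
ACTIVE_BITS = 40     # roughly 2% of 2048 are on, hence "sparse"

def random_sdr(seed):
    """Toy SDR: a set of active bit indices in an otherwise all-zero vector."""
    rng = random.Random(seed)
    return set(rng.sample(range(SDR_SIZE), ACTIVE_BITS))

def overlap(a, b):
    """Similarity between two SDRs = number of shared active bits."""
    return len(a & b)

dog = random_sdr("dog")
cat = random_sdr("cat")
print(len(dog), overlap(dog, dog), overlap(dog, cat))
```

A real HTM encoder maps similar inputs to overlapping bit sets, so the overlap count acts as a similarity measure; here the bits are random, so two words share few, if any, bits.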

Sparse Distributed Representation — Properties

The most difficult aspect of SDRs (and by extension, HTM models) is encoding. Encoders convert whatever entity the model is focused on (such as words or images) into an SDR.

Building encoders is challenging because of the properties of SDRs. For instance, SDRs represent the fruit apple, the corporation Apple, and the word ‘company’ with distinct combinations of ones and zeros. By subtracting the combination for the term ‘company’ from that of Apple the corporation, an SDR can be made to represent the fruit apple.

Although these properties enable a host of dynamic, flexible combinations, which makes them ideal for rapidly changing streaming data, they also make encoding SDRs cumbersome.
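The subtraction property above can be sketched with plain set operations, treating each SDR as a set of active bits. The specific bit indices are made up purely for illustration:

```python
# Toy bit assignments (purely illustrative; a real encoder learns these)
apple_corp = {3, 17, 42, 88, 101, 250}   # "Apple" the corporation
company    = {42, 88, 250}               # the generic concept "company"

# Removing the "company" bits leaves bits tied to the fruit meaning
apple_fruit = apple_corp - company
print(sorted(apple_fruit))   # → [3, 17, 101]
```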

HTM vs. Neural Networks

Let’s see what gives HTM models an edge over neural networks:

  1. While neural networks use sophisticated math for supervised learning based on large training datasets, HTM models use much smaller datasets for unsupervised learning based on biological principles.
  2. Neural networks need large datasets because they learn via backpropagation, a mechanism in which each layer of a neural network adjusts its weights and biases based on labeled input data.

For example, a neural network devised to differentiate dog images from cat images will have two neurons in its output layer, one for recognizing dogs and one for recognizing cats. If the cat neuron fires when there’s an image of a dog, the network continues to analyze training data and adjust those weights and biases based on the animals’ features until the system consistently recognizes dogs. Those weights and biases are adjusted from the network’s output layer back toward its input layer, hence the term backpropagation.
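The weight-and-bias adjustment described above can be sketched with a single sigmoid neuron trained by gradient descent. The data and learning rate are made-up values, and a real network repeats this update layer by layer from output back to input:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Toy labeled data: feature -> label (1 = dog, 0 = cat); values are made up
data = [(2.0, 1), (1.5, 1), (-1.0, 0), (-2.5, 0)]

w, b, lr = 0.0, 0.0, 0.5    # weight, bias, learning rate
for _ in range(200):        # repeated passes over the training data
    for x, y in data:
        pred = sigmoid(w * x + b)
        err = pred - y      # how wrong the output neuron was
        w -= lr * err * x   # nudge the weight against the error...
        b -= lr * err       # ...and the bias

print(round(sigmoid(w * 2.0 + b)))  # → 1 (classified as "dog")
```

Note how many passes over the labeled data are needed before the predictions settle; this appetite for labeled examples is exactly the data requirement the article describes.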

HTM models, by contrast, rely on Hebbian learning, a rule that strengthens the synapse between two neurons when their activity is strongly correlated. As a result, they require far less data.
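A minimal sketch of a Hebbian-style update follows. The increment and decrement values are hypothetical, and HTM’s actual synapse-permanence rules are more involved:

```python
# Hebbian rule: strengthen a synapse when its input and output neurons
# fire together; weaken it slightly when only the input fires.
def hebbian_update(permanence, pre_active, post_active,
                   inc=0.1, dec=0.02):
    if pre_active and post_active:
        permanence += inc   # "neurons that fire together wire together"
    elif pre_active:
        permanence -= dec
    return max(0.0, min(1.0, permanence))   # clamp to [0, 1]

p = 0.5
p = hebbian_update(p, True, True)    # co-activation strengthens: 0.6
p = hebbian_update(p, True, False)   # lone input weakens: 0.58
print(round(p, 2))                   # → 0.58
```

Because each update uses only locally observed activity, no labeled dataset or global error signal is needed.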

Practical (Business) Value

Because Hebbian learning enables rapid learning from fast-moving, unlabeled data from different sources, HTM models are well suited to anomaly detection use cases in addition to conventional image and speech recognition ones.
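One common way HTM systems score anomalies is by comparing what the model predicted against what actually arrived; a sketch with toy SDR sets (the bit values are made up for illustration):

```python
def anomaly_score(predicted, actual):
    """Fraction of currently active bits that the model did NOT
    predict: 0.0 = fully expected input, 1.0 = completely novel."""
    if not actual:
        return 0.0
    return len(actual - predicted) / len(actual)

predicted = {1, 2, 3, 4}
normal    = {1, 2, 3, 4}      # matches the prediction exactly
novel     = {1, 2, 9, 10}     # half the active bits were unexpected

print(anomaly_score(predicted, normal))  # → 0.0
print(anomaly_score(predicted, novel))   # → 0.5
```

A sustained spike in this score over a stream of inputs is what gets flagged as a potential fraud event or intrusion.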

Network anomaly detection can encompass facets of fraud detection and cybersecurity. It also supports physical network security by analyzing video data frame by frame.

Additional use cases abound in the Internet of Things related to product monitoring of manufacturing lines, equipment asset management, and predictive maintenance. Although HTM models are relatively new, adoption rates are increasing.