Source: Deep Learning on Medium

(Originally published on my blog in 2015)

Huge sums are being invested in quantum computing (QC) by IBM, Intel, Microsoft, Google, China, and others. However, I propose that QC will not arrive (or, at least, will not arrive first) by the path these giants are pursuing. All the “mainstream” efforts to build QC are based on exotic physical mechanisms, e.g., extremely low-temperature entities that can remain in superposition (i.e., that do not decohere). But this is not needed to realize QC. As I’ve argued elsewhere, QC is physically realizable by a standard, single-processor von Neumann computer, i.e., your desktop or your smartphone. It’s all a matter of how information is represented in the system, not of the physical nature of the individual units (memory bits) that represent the information. Specifically, the key to implementing QC lies simply in changing from representing information *localistically* (see below) to representing information using *sparse distributed representations* (SDR), a.k.a. *sparse distributed coding* (SDC). We’ll use “SDC” throughout. To avoid confusion, let me emphasize immediately that SDC is a completely different concept from the far better-known and (correctly) widely embraced concept of “sparse coding” (Olshausen & Field, 1996), though the two are completely compatible.

A *localist* representation is one in which each item of information (“concept”) stored in the system, e.g., the concept ‘my car’, is represented by a *single* (or atomic) unit, and that physical unit is disjoint from the representations of all other concepts in the system. We can consider that atomic representational unit to be a *word* of memory, say 32 or 64 bits. No other concept, of any scale, represented in the database can use that physical word (representational unit). Consequently, that single representational unit can be considered *the* physical representation of my car (since all of the information stored in the database, which together constitutes the full concept of ‘my car’, is reachable via that single unit). This meets the definition of a localist representation: the representations of distinct concepts are physically disjoint.

In contrast to *localism*, we could devise a scheme in which each concept is represented by a subset of the full set of physical units comprising the system, or more specifically, comprising the system’s memory. For example, if the memory consisted of 1 billion physical bits, we could devise a scheme in which the concept, ‘my car’, might be represented by a particular subset of, say, 10,000 of those 1 billion bits. In this case, if the concept ‘my car’ was active in that memory, that set of 10,000 bits, and only that particular subset, would be active.

What if some other concept, say, ‘my motorcycle’, needs to become active? Would some other subset of 10,000 bits, completely disjoint from the 10,000 bits representing my car, become active? No. If our system were designed this way, it would again be a localist representation (since we’d be requiring the representations of distinct concepts to be physically disjoint). Instead, we could allow the 10,000 bits that represent my motorcycle to share perhaps 5,000 bits in common with my car’s representation. The two representations are still unique. After all, they each have 5,000 bits, half their overall representations, not in common with each other. But the atomic representational units, bits, can now be shared by multiple concepts, i.e., representations can physically overlap. Such a representation, in which a) each concept is represented by a small subset of the total pool of representational units and b) those subsets can intersect, is called a *sparse distributed code* (SDC).
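As a toy sketch of this scheme (the pool and code sizes mirror the example above; the construction itself is hypothetical, not any particular system’s algorithm), an SDC can be modeled as a set of active bit indices drawn from a large pool, with codes allowed to intersect:

```python
import random

POOL_SIZE = 1_000_000_000  # total physical bits in the memory (illustrative)
CODE_SIZE = 10_000         # bits active per concept (illustrative)

random.seed(0)

# A hypothetical code for 'my car': a 10,000-bit subset of the pool.
my_car = set(random.sample(range(POOL_SIZE), CODE_SIZE))

# 'my motorcycle' shares 5,000 bits with 'my car' and has 5,000 of its own.
shared = set(random.sample(sorted(my_car), CODE_SIZE // 2))
unique = set()
while len(unique) < CODE_SIZE // 2:
    b = random.randrange(POOL_SIZE)
    if b not in my_car:       # keep the other half disjoint from 'my car'
        unique.add(b)
my_motorcycle = shared | unique

print(len(my_car & my_motorcycle))  # 5000 bits in common
print(len(my_motorcycle))           # still a full 10,000-bit code
```

Both codes remain unique, yet their physical overlap is explicit: it is just the set intersection.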

With these definitions in mind, it is crucial (for the computer industry) to realize that, to date, virtually all information stored electronically on earth, e.g., all information stored in the fields of records of databases, is represented *localistically*. Equivalently, to date there has been virtually no commercial use of SDC on earth. Moreover, only a handful of scientists have thus far understood the importance of SDC: Kanerva (~1988), Rachkovskij & Kussul (late 90’s), myself (early 90’s, Thesis 1996), Hecht-Nielsen (~2000), Numenta (~2009), and a few others. Only in the past few years have the first attempts at commercialization begun to appear, e.g., Numenta. Thus, two things:

- The computer industry may want to at least consider (due diligence) that SDC may be the next major, i.e., once-in-a-century, paradigm shift.
- It could be that SDC = QC.

With SDC, it becomes possible for those 5,000 bits that the two representations (‘my car’ and ‘my motorcycle’) have in common to represent features (sub-concepts) that are common to both my car and my motorcycle. In other words, *similarity of objects in the world can be represented by physical overlap of the representations of those objects*. This is something that cannot be achieved with a localist representation (because localist representations don’t overlap). And from one vantage point, it is the reason why SDC is so superior to localist coding, in fact, exponentially superior to localist coding.
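Under this scheme, a natural similarity measure between two concepts is the fraction of their bits in common (a minimal sketch with tiny hypothetical codes; real codes would be far larger and sparser):

```python
def code_similarity(code_a, code_b):
    """Similarity of two equal-size SDCs (sets of bit indices) as fractional overlap."""
    assert len(code_a) == len(code_b)
    return len(code_a & code_b) / len(code_a)

# Toy 8-bit codes: more shared bits = more shared features = more similar concepts.
car        = {1, 2, 3, 4, 5, 6, 7, 8}
motorcycle = {5, 6, 7, 8, 9, 10, 11, 12}
bicycle    = {7, 8, 9, 10, 13, 14, 15, 16}

print(code_similarity(car, motorcycle))  # 0.5  (half their bits shared)
print(code_similarity(car, bicycle))     # 0.25
```

A localist memory cannot express this graded structure at all: any two localist codes have overlap exactly zero.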

But, the deep (in fact, identity) connection of SDC and QC is not that more similar concepts will have larger intersections. Rather, it is that if all concepts representable (by a particular memory/system) are represented by subsets of an overall pool of units, and if those subsets can overlap, then any single concept, i.e., any single subset, can be viewed as, and can function as, a *probability (or likelihood) distribution over ALL representable concepts*. We’ll just use “probability”. That is, any *single* active representation represents *all* representable hypotheses in *superposition*. And if the model has enforced that similar concepts are assigned to more highly overlapping codes, then the *probability* of any particular concept at a given moment is the *fraction* of that concept’s bits that are active in the currently (fully) active code (making the reasonable assumption that for *natural* worlds, the probabilities of two concepts should correlate with their similarities).
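The readout just described can be illustrated directly (hypothetical toy codes, not Sparsey’s actual machinery): given the currently active code, the likelihood assigned to every stored concept is simply the fraction of that concept’s bits that are active:

```python
def read_distribution(active_code, stored_codes):
    """Likelihood of each stored concept = fraction of its bits in the active code."""
    return {name: len(code & active_code) / len(code)
            for name, code in stored_codes.items()}

stored = {
    "car":        {1, 2, 3, 4, 5, 6, 7, 8},
    "motorcycle": {5, 6, 7, 8, 9, 10, 11, 12},
    "truck":      {1, 2, 3, 4, 13, 14, 15, 16},
}

# Activate the code for 'car': every stored hypothesis gets a graded likelihood.
print(read_distribution(stored["car"], stored))
# {'car': 1.0, 'motorcycle': 0.5, 'truck': 0.5}
```

Explicitly enumerating the distribution, as this helper does, costs time proportional to the number of stored concepts; the point of the argument, however, is that the system never needs to enumerate it: the single active code carries the whole distribution implicitly.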

This has the following hugely important consequence. If there exists an algorithm that updates the probability of the currently active *single* concept in *fixed* time, i.e., in computational time that remains constant over the life of the system (more specifically, remains constant as more and more concepts are stored in the memory), then that algorithm can also be viewed as updating the probabilities of *all representable concepts* in fixed time. If the number of representable concepts is of exponential order (i.e., exponential in the number of representational units), then we have a system which **updates an exponential number of concepts, more specifically, an exponential number of probabilities of concepts (hypotheses), in fixed time**. Expressed at this level of generality, this meets the definition of QC.
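To make the fixed-time claim concrete, here is a deliberately simplified sketch (my own illustration, not Sparsey’s algorithm): a single transition of the active code touches only the fixed-size code itself, so its cost is independent of how many concepts are stored, yet it implicitly shifts the probability of every stored concept:

```python
# 100 stored toy concepts, each a 4-bit code; neighbors overlap by 2 bits.
stored = {f"concept_{i}": {i, i + 1, i + 2, i + 3} for i in range(100)}

active = set(stored["concept_0"])   # current active code: {0, 1, 2, 3}

# Fixed-time transition (toy rule): shift the active code by operating only
# on its 4 bits. No stored concept is visited, and the cost would be the
# same if 10,000 concepts were stored instead of 100.
active = {b + 2 for b in active}    # now equals concept_2's code

# Yet the implicit probability of EVERY stored concept has changed:
prob = lambda code: len(code & active) / len(code)
print(prob(stored["concept_0"]))  # 0.5  (was 1.0 before the step)
print(prob(stored["concept_2"]))  # 1.0  (was 0.5 before the step)
```

The `prob` readout is shown only to verify the effect; the update step itself never loops over the stored concepts, which is the crux of the fixed-time argument.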

All that remains, in order to demonstrate QC, is to show that the aforementioned fixed-time operation that maps one active SDC into the next (or, equivalently, maps one active probability distribution into the next) changes the probabilities of all representable concepts in a sensible way, i.e., in a way that accurately models the space of representable concepts (i.e., accurately models the semantics, or the dynamics, or the statistics, of that space). In fact, such a fixed-time operation has existed for some time (since about 1996): the Sparsey® model (formerly TEMECOR; see the thesis at pubs). And, in fact, the reason the updates to the probability distribution (i.e., to the superposition) can be sensible is, as suggested above, that similarity of concepts can be represented by degree of intersection (overlap) of their SDCs. My arXiv paper, “A Radically New Theory of how the Brain Represents and Computes with Probabilities”, presents results showing this fixed-time update of the entire probability distribution.

I realize that this view, that SDC is identically QC, flies in the face of dogma, where the dogma can be boiled down to the phrase “there is no classical analog of quantum superposition”. But I’m quite sure that the mental block underlying this dogma for so long has simply been that quantum scientists have been thinking in terms of localist representations. I predict that it will become quite clear in the near future that SDC constitutes a completely plausible classical analog of quantum superposition.

…more to come, e.g., entanglement is easily and clearly explained in terms of SDC…