What’s wrong with a majority of research in A.I., Deep Learning and AGI?
First, let’s explore the latest research on “The social and cultural roots of whale and dolphin brains” published in Nature Ecology & Evolution.
Encephalization, or brain expansion, underpins humans’ sophisticated social cognition, including language, joint attention, shared goals, teaching, consensus decision-making and empathy. These abilities promote and stabilize cooperative social interactions, and have allowed us to create a ‘cognitive’ or ‘cultural’ niche and colonize almost every terrestrial ecosystem. Cetaceans (whales and dolphins) also have exceptionally large and anatomically sophisticated brains. Here, by evaluating a comprehensive database of brain size, social structures and cultural behaviours across cetacean species, we ask whether cetacean brains are similarly associated with a marine cultural niche.
In summary, cetacean social and brain evolution represent a rare parallel to those in humans and other primates. We suggest that brain evolution in these orders has been driven largely by the challenges of managing and coordinating an information-rich social world. Although these challenges may increase with group size, it is not group size itself that imposes the challenges. In both primates and marine mammals, structured social organization is associated with higher levels of cooperation and a greater breadth of social behaviours. Thus, we propose reframing the evolutionary pressures that have led to encephalization and behavioural sophistication to focus on the challenges of coordination, cooperation, and ‘cultural’ or behavioural richness.
Cetaceans found in mid-sized social groups had the largest brains (in both absolute and relative terms), followed by those that form large communities (mega-pods); those predominantly found alone or in small groups had the smallest brains.
One of the unsolved problems of AGI research is our lack of understanding of the definition of “Generalization”. I’ve written about it several times. The problem is that most definitions of “Generalization” are incomplete. Said differently, the goal posts are defined not only too narrowly but, even worse, incorrectly.
What I am going to suggest in this article is that our measure of intelligence be tied to our measure of social interaction.
As I write this, I am thinking of several thought-provoking titles for this:
- Conversation Cognition in Intelligence Design
- Deep Conversational Learning
- Contextual Mechanics
- The Conversational Principle of Intelligence
- A New AGI Inspired Generalization
Any one of these is equally good, but I will settle on the current title.
Ultimately, to build improved artificial intelligence we need more automation that can generalize extremely well. Unfortunately, you can query researchers as to what generalization means and they will give you a litany of overly simplistic definitions. How then can we possibly achieve AGI when we have a very shallow understanding of what generalization means? In this article, I will propose an entirely new definition. I call this Conversational Cognition (I will have to find a more unique and striking name later!). From this perspective, we can draft an opinionated roadmap as to how to achieve AGI.
To summarize, many ways have been proposed to characterize generalization:
- Errors Response
- Sparsity of Model (Alternatively Simplicity of Model or Sherlock Holmes’ method)
- Fidelity in Generating Models
- Effectiveness in ignoring nuisance variables (Invariance based)
- Risk Minimization
I’m proposing Conversational Cognition be added to this list.
An ecological approach to cognition is based on an autonomous system that learns by interacting with its environment. Generalization in this regard is related to how effectively an automaton is able to anticipate contextual changes in the environment and perform the required context switches to ensure high predictability. The focus is not just on recognizing chunks of ideas, but also on recognizing the relationship of these chunks with other chunks.
This notion is one level of complexity higher than risk minimization. Risk minimization demands a predictive model that can function effectively with imperfect information about the world. This implies that an automaton’s internal models of reality must be able to accommodate vagueness and unknown information. However, that definition only implicitly alludes to the need to support context switching.
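To make the idea of a context switch concrete, here is a minimal toy sketch. It is only an illustration under stated assumptions, not a proposed algorithm: the names `RunningMean`, `ContextSwitcher`, `total_error` and the `threshold` value are all hypothetical. It compares a single model that averages everything it sees with one that keeps a separate model per context and switches when it is surprised.

```python
from dataclasses import dataclass, field


@dataclass
class RunningMean:
    """Incrementally tracked mean of the observations seen in one context."""
    total: float = 0.0
    count: int = 0

    def predict(self) -> float:
        return self.total / self.count if self.count else 0.0

    def update(self, x: float) -> None:
        self.total += x
        self.count += 1


@dataclass
class ContextSwitcher:
    """Keeps one RunningMean per context and switches when surprised."""
    threshold: float = 3.0  # hypothetical surprise threshold (assumption)
    contexts: list = field(default_factory=lambda: [RunningMean()])
    active: int = 0

    def predict(self) -> float:
        return self.contexts[self.active].predict()

    def update(self, x: float) -> None:
        if abs(x - self.predict()) > self.threshold:
            # Surprise: reuse a known context that fits, else open a new one.
            for i, ctx in enumerate(self.contexts):
                if ctx.count and abs(x - ctx.predict()) <= self.threshold:
                    self.active = i
                    break
            else:
                self.contexts.append(RunningMean())
                self.active = len(self.contexts) - 1
        self.contexts[self.active].update(x)


def total_error(model, signal) -> float:
    """Sum of absolute one-step-ahead prediction errors."""
    error = 0.0
    for x in signal:
        error += abs(x - model.predict())
        model.update(x)
    return error


# Toy signal (made up for illustration): the context flips 0 -> 10 -> 0.
signal = [0.0] * 50 + [10.0] * 50 + [0.0] * 50

single = RunningMean()
switcher = ContextSwitcher()
single_error = total_error(single, signal)
switch_error = total_error(switcher, signal)
```

In this toy setup the switching model is surprised only at the two points where the context changes, and it returns to its original context rather than relearning it. The single model keeps paying for the change long after it happened.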
In realistic environments, a system must be able to adjust appropriately when a context changes. These environments are sometimes referred to as “non-stationary”. It is not enough to have models that can describe the world in a single context. The most sophisticated form of generalization demands the ability to hold conversations.
This conversation, however, is not confined to an inanimate environment with deterministic behavior. The conversation must also be available for the three dimensions of cognition. Specifically, we need to explore conversation in the computational, autonomous and social dimensions. In the computational dimension, AlphaGo Zero’s self-play is an effective demonstration of adversarial agent play in a deterministic environment. In the autonomous dimension, we require models that can keep performing across different but similar domains and, if necessary, adapt to context changes in their environments. An example of this would be biological systems adjusting their behavior in response to cyclic changes in the seasons. The social dimension will likely be the most sophisticated, in that a system may need to understand the nuances of human behavior. This may include complex behavior such as deception, sarcasm and negotiation.
The needs of survival will also require the development of cooperative behavior. Achieving this requires recognizing that skilled conversation is a necessary ingredient. Effective prediction of an environment is insufficient to achieve cooperative behavior; the development of language, however, is a necessary skill.
Effective conversation is a two-way street. It requires not only understanding an external party but also communicating an automaton’s inner model. This communication requires more than decompression; it requires appropriately contextualized communication that anticipates the cognitive capabilities of the other conversing entities. Good conversation requires good listening skills as well as the ability to assess the current knowledge of a participant and make the necessary adjustments to convey information that the participant can relate to. This is indeed an extremely complex requirement. However, if we are seeking Artificial General Intelligence, then it only makes sense that we should begin accepting a more complex measure of intelligence.
Conversation can also be considered a kind of game play. That is, an agent may have goals that demand that it devise approaches to recruit other participants to aid it in achieving those goals. This, however, may seem to demand almost human-like capabilities. The question therefore is whether we can build primitives of cooperation that step us closer to this kind of generalization.
Conversational Cognition is related to Cognitive Synergy. In cognitive synergy different agents may pick up from where a conversation left off.
A focus on conversation is actually not a new thing in the study of intelligent systems. However, this approach has recently taken a back seat with the development of effective machine learning techniques. I would, however, like to see a more serious exploration of this approach in the context of deep learning.
Primitives for cooperative agent systems have been studied extensively in the past. Winograd’s Speech Acts and FIPA are excellent starting points for identifying composable language elements with which to build more complex forms of cooperation. Furthermore, we can leverage our understanding of the Loose Coupled Principle (LCP). LCP has at its core the assumption that the method nature is most likely to select is the one that requires the fewest assumptions (or preconditions).
Societies and large scale organizations also require effective coordination to scale.
What sort of measure can we use to evaluate the effectiveness of a conversation? I will leverage Lawrence’s embodiment metric; however, it should measure at a minimum three legs of a conversation: the initial communication, the response to that communication, and a response to that response. Conversations can of course go on forever, but the measure is based on the ability to:
(1) Articulate an internal mental model of the external participant.
(2) Receive a response to that communication and reassess the internal mental model.
(3) Finally, articulate a second response based on the newly changed mental model.
This measurement is applicable regardless of the intelligence of the external participant.
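As a rough illustration of the three legs, here is a minimal sketch. It is not Lawrence’s metric and not a real benchmark; every name in it (`Agent`, `StubbornAgent`, `participant_reply`, `three_leg_check`) and the two message levels are hypothetical choices made only to show the shape of such a test. The check passes when the agent’s second message fits the participant after the agent has reassessed its mental model.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Agent holding a belief about what the other participant knows."""
    # Hypothetical mental model: topic -> believed familiarity in [0, 1].
    model_of_other: dict = field(default_factory=dict)

    def articulate(self, topic: str) -> str:
        """Legs 1 and 3: phrase a message to fit the current mental model."""
        familiarity = self.model_of_other.get(topic, 0.5)
        return "detailed" if familiarity >= 0.5 else "introductory"

    def reassess(self, topic: str, response: str) -> None:
        """Leg 2: revise the mental model from the participant's response."""
        if response == "too_hard":
            self.model_of_other[topic] = 0.0
        elif response == "too_easy":
            self.model_of_other[topic] = 1.0


class StubbornAgent(Agent):
    """Baseline that ignores responses and never updates its model."""

    def reassess(self, topic: str, response: str) -> None:
        pass


def participant_reply(knows_topic: bool, message: str) -> str:
    """Toy external participant: reports whether the message fit it."""
    if message == "detailed" and not knows_topic:
        return "too_hard"
    if message == "introductory" and knows_topic:
        return "too_easy"
    return "ok"


def three_leg_check(agent: Agent, topic: str, knows_topic: bool) -> bool:
    """True if the agent's second message fits the participant."""
    first = agent.articulate(topic)                 # leg 1: initial message
    reply = participant_reply(knows_topic, first)
    agent.reassess(topic, reply)                    # leg 2: update the model
    second = agent.articulate(topic)                # leg 3: adjusted message
    return participant_reply(knows_topic, second) == "ok"


adaptive_ok = three_leg_check(Agent(), "whale brains", knows_topic=False)
stubborn_ok = three_leg_check(StubbornAgent(), "whale brains", knows_topic=False)
```

With a novice participant, the adaptive agent starts too detailed, hears that the message was too hard, and simplifies, so it passes the check. The stubborn agent repeats itself and fails. The same check runs unchanged whether the participant is a novice or an expert.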
Conversational Cognition is the most abstract form of generalization. Let’s compare this new measure of generalization with alternative frameworks for cognition.
Karl Friston’s Free Energy: This idea makes use of the ubiquitous concept in Physics known as the Principle of Least Action. All dynamics, even cognitive dynamics, must take this idea into account. This is fundamental to how nature works.
Wissner-Gross’s Maximizing Options: This is a unique idea that explains intelligence as the maximizing of options. This, like the principle of least action, is a universal one. The idea is that rather than just blindly choosing the most efficient action, intelligence requires the action that maximizes survival.
Compression: There is an overly simplified notion that compression implies intelligence. Perhaps it is because compression is a measurable attribute of language. However, are all notions of compression the same? Do all compressions allow for composition and grounding? Language is a kind of compression; however, it has many required attributes that go beyond compression.
Kolmogorov Complexity: This is a favorite of many AGI researchers. AIXI is the ultimate theory. However, it is based on a belief set that aligns computational ability with general intelligence. There is no evidence that increased computational ability automatically leads to intelligence. Rather, certain cognitive skills such as language need to be effectively used. One other problem with this approach is that it demands methods that are incomputable and leads to broad conclusions that certain situations are not possible.
Integrated Information Theory: IIT defines a measure of consciousness that relates to how non-decomposable thought processes are within an entity. Integration may be an advanced cognitive ability; however, there are many animals with higher integration capabilities than humans. Pigeons, for example, are much better at multi-tasking than humans. Integration leads only to greater awareness of the environment; it does not demand the development of strategies to cooperate with other entities. Babies, by contrast, learn very quickly how to manipulate their parents in pursuit of their own goals.
For many people involved in developing complex products that require many agents to cooperate, this definition of generalization is entirely obvious. Surprisingly though, the AI community has mostly adopted overly simplistic notions of generalization. These notions have researchers focusing on entirely the wrong problem, and likely have nothing to do with achieving AGI. The ability to effectively perform a conversation with the environment is the essence of AGI.
Source: Deep Learning on Medium