I was recently posed the question, “how do we define standards for AI?” I primarily focus on the space of Deep Learning Artificial Intelligence (AI). Deep Learning is a specific family of approaches under the much wider umbrella of what is known as AI.
The term Artificial Intelligence is quite ancient and was proposed over half a century ago:
In fact, the idea of understanding human thought goes back even further in history:
“The design of the following treatise is to investigate the fundamental laws of those operations of the mind by which reasoning is performed; to give expression to them in the symbolical language of a Calculus, and upon this foundation to establish the science of Logic … and, finally, to collect … some probable intimations concerning the nature and constitution of the human mind.”
– George Boole, An Investigation of the Laws of Thought (1854)
In fact, we can go even further back, to René Descartes in the 17th century and all the way back to Aristotle in the 4th century BC.
Western Civilization has built up a ton of intellectual baggage in its understanding of how the human brain works. This accounts for the decades-long failure of GOFAI (Good Old-Fashioned AI), whose approach is essentially to work top-down from formal logic toward deriving intuition and instinct.
Alan Turing, the father of modern computing, had in fact anticipated the correct kind of computation for the brain. Unpublished papers, discovered 14 years after his unfortunate death, anticipated the development of connectionist architectures. These architectures are better known today as Deep Learning:
The question I am therefore seeking to answer regarding the standardization of AI is not the standardization of every method labeled under the massive AI umbrella. Rather, I seek to understand the challenges of standardization for Deep Learning.
The first question to ask is “Why do we need standardization?” Standardization is associated with interoperability. So in the context of Deep Learning, what does interoperability mean, and how can we achieve greater interoperability? Ever since 2012, the technology stack for Deep Learning has become significantly more complex.
Here’s a rough sketch for a DL stack in 2018:
This does not include all the other orchestration requirements that come from the data engineering or Big Data world. It also does not include application-specific layers, such as visualization and active learning, that may also be required for a comprehensive solution. In other words, it is a vast landscape that is evolving at a rapid pace.
The insight from looking at the above stack, however, is that there are many existing standards already being adopted that can be leveraged. So one already has a considerable launching pad for exploring DL standardization independently of the standardization of other AI fields.
We can also speak about standardization at a level above the technology stack, that is, from the perspective of industrial processes in the development of these new DL-based systems. A 2017 paper, “BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain” by Tianyu Gu, Brendan Dolan-Gavitt and Siddharth Garg, provides a good starting point for discussing the need to focus on the quality of the data used to train DL systems. Perhaps ideas from the world of biotechnology manufacturing controls may provide better insight into what needs to be considered here.
A focus on process controls also raises the question of what standardizations we need to put in place with regard to safety, performance, latency, correctness, bias and even privacy. In fact, there is a lot to discuss about how we handle data and data provenance, which is extremely important. It is even more important with machine learning methods like Deep Learning, which derive their behavior directly from the data they are trained on.
It is always instructive to look at current standardization in the automotive field. For this, we can learn from the Society of Automotive Engineers (SAE). SAE has an international standard, SAE J3016, which defines six levels of driving automation. This can be useful in classifying the levels of automation in domains other than self-driving cars. A broader prescription is as follows:
Level 0 (Manual Process)
The absence of any automation.
Level 1 (Attended Process)
Users are aware of the initiation and completion of the performance of each automated task. The user may undo a task in the event of incorrect execution. Users, however, are responsible for the correct sequencing of tasks.
Level 2 (Attended Multiple Processes)
Users are aware of the initiation and completion of a composite of tasks. The user, however, is not responsible for the correct sequencing of tasks. An example would be the booking of a hotel, car and flight, where the exact ordering of the bookings may not be a concern of the user. However, failure of the composite task may require more extensive manual remedial action. An unfortunate example of a failed remedial action is the “re-accommodation” of a paying United Airlines customer.
Level 3 (Unattended Process)
Users are only notified in exceptional situations and are required to do the work in these conditions. An example of this is in systems that continuously monitor security of a network. Practitioners take action depending on the severity of the event.
Level 4 (Intelligent Process)
Users are responsible for defining the end goals of automation; however, all aspects of process execution, as well as the handling of in-flight exceptional conditions, are handled by the automation. The automation is capable of performing appropriate compensating actions in the event of in-flight failures. The user, however, is still responsible for identifying the specific contexts in which automation can be safely applied.
Level 5 (Fully Automated Process)
This is a final and future state where human involvement is no longer required in the process. Of course, this may not truly be the final level, because it does not assume that the process is capable of optimizing itself to make improvements.
Level 6 (Self Optimizing Process)
This is an automation that requires no human involvement and is also capable of improving itself over time. This level goes beyond the SAE requirements but may be required in certain high performance competitive environments such as Robocar races and stock trading.
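As a rough illustration, the levels above can be encoded as a simple enumeration. This is only a sketch: the names, short descriptions, and the `requires_human_supervision` helper are my own paraphrase of the taxonomy, not part of SAE J3016.

```python
from enum import IntEnum


class AutomationLevel(IntEnum):
    """Levels of process automation, loosely generalized from SAE J3016."""
    MANUAL = 0               # no automation at all
    ATTENDED = 1             # user supervises and sequences each task
    ATTENDED_MULTIPLE = 2    # user supervises a composite of tasks
    UNATTENDED = 3           # user intervenes only on exceptions
    INTELLIGENT = 4          # user sets end goals; automation handles execution
    FULLY_AUTOMATED = 5      # no human involvement required
    SELF_OPTIMIZING = 6      # no human involvement; improves itself over time


def requires_human_supervision(level: AutomationLevel) -> bool:
    """Below Level 3, a human must actively attend to the process."""
    return level < AutomationLevel.UNATTENDED


# Example: a travel-booking pipeline at Level 2 still needs an attending user,
# while a Level 4 process handles its own exceptions.
print(requires_human_supervision(AutomationLevel.ATTENDED_MULTIPLE))  # True
print(requires_human_supervision(AutomationLevel.INTELLIGENT))        # False
```

Using an `IntEnum` keeps the ordering explicit, so comparisons such as “at or above Level 3” read naturally in code.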
Ethics and Benefit to Humanity
Ultimately, however, any form of AI standardization should be framed around how we can best steer AI (or AGI) development for the maximum benefit of humanity. It does not help us if our standardization leads to more advanced autonomous weaponry, or to more advanced ways to predict, and thus manipulate, human behavior.
The challenges of AI standardization in fact span many levels of concern. Ultimately, however, the effort should be guided by the need to accelerate the development of human-beneficial AI, and not the other kind.
Source: Deep Learning on Medium