Original article was published by Anisha Musti on Artificial Intelligence on Medium
What is Artificial Intelligence? Here’s What You Need to Know.
The fundamentals of artificial intelligence and how it’s going to change the world.
I’m a pretty big Marvel fan. I’ve watched all of their movies, but one of my all-time favorites is Iron Man. If you’ve ever watched it, you’ll know that Iron Man has a suit with a voice assistant inside named J.A.R.V.I.S.
J.A.R.V.I.S. is an example of a sophisticated A.I. assistant. It has the capability to control a home, manage a suit, talk to Iron Man… you know, the typical superhero’s assistant stuff.
Although J.A.R.V.I.S. is pretty cool, it doesn’t give us the full picture of what AI is. Because popular science fiction movies keep telling us these stories about AI, most of us associate the term “artificial intelligence” with a futuristic robot that has the potential to take over Earth. However, there’s more to AI than most of us may imagine.
AI is science fiction. AI is also part of our everyday lives. It depends on how you look at it.
Today, AI is defined as machines that respond to stimulation in a similar manner to humans and make decisions that normally require a human level of expertise.
While this could eventually include machines like J.A.R.V.I.S., for right now, AI is used in a much more discreet way. For example, Google uses AI to predict your search query as you’re typing. Similarly, Netflix uses it to recommend movies and TV shows based on your viewing history.
AI has incredible capabilities, most of which we haven’t even begun to tap into. It can be used in almost every walk of life: from your local coffee shop to nuclear weapons. That might be scary, but not if we understand what it is and how to use it.
The Three Faces of AI
Around the time the first general-purpose computers were being created, Alan Turing, an English mathematician and computer scientist, was grappling with the question “Can machines think?”
This was an early indication of a future where computers could think the way a human does. In his 1950 paper “Computing Machinery and Intelligence,” Turing proposed the “Turing Test,” a way to empirically determine whether machines can think.
Although the Turing test has many issues that render it irrelevant for intelligence testing today, it acted as the foundation of the philosophy of artificial intelligence.
In fact, less than a decade after the Turing test was proposed, in 1956, the term “artificial intelligence” was officially coined at a conference at Dartmouth College hosted by John McCarthy (Dartmouth College) and Marvin Minsky (Harvard University).
Beginning with these early questions about the potential of computation to surpass human intelligence, researchers have contemplated the different capabilities, ultimately settling on three faces of AI:
Weak AI, also called narrow AI, is the part of artificial intelligence that most people don’t realize exists; however, it’s also the category that all of our current algorithms fall into.
Weak AI is an algorithm designed for completing a specific task. It simply acts upon, and is bound by, the rules imposed on it, and it cannot go beyond those rules.
A famous example of narrow AI is AlphaGo. The program competed at Go against Lee Sedol, an 18-time world champion, and won 4 out of 5 games, making history. However, although it is enormously sophisticated, the program cannot do another task (like play chess or Monopoly) because it was created to play Go.
Even though narrow AI can only do its specific task, there is still an inconceivable number of impactful applications. There are many examples of narrow AI around us every day, including:
- Self-driving cars
- Facial recognition tools
- Voice assistants (Siri, Alexa, etc.)
- Cancer detection
- Spam filters
Artificial general intelligence (AGI) is when machines start to become human. J.A.R.V.I.S. is a sneak peek into what AGI would look like.
AGI is defined as “a machine that can apply knowledge and skills in different contexts.” This more closely mirrors human intelligence by providing opportunities for autonomous learning and problem-solving.
One of the main obstacles to achieving AGI is computational capacity: current computer hardware would need to perform far more calculations per second. Quantum computers, which use the laws of quantum mechanics to tackle certain problems faster than classical computers, are often proposed as a way to help close that gap.
Further, an AGI needs to be able to transfer learnings, enlist common sense, work collaboratively with other machines and humans, and attain consciousness. Each of these is a barrier to its creation.
However, on the bright side, of over 350 surveyed AI experts, 50% estimate AGI to occur by 2060.
Elon Musk has tweeted about AI’s ability to be more powerful than nuclear weapons. Artificial superintelligence (ASI) is what he was talking about.
Superintelligence is defined by University of Oxford professor, Nick Bostrom, as “an intellect that is much smarter than the best human brains in practically every field, including scientific creativity, general wisdom and social skills.”
Today, we have nothing that even comes close to reaching artificial superintelligence. In fact, many scientists believe superintelligence is unlikely to happen. Some believe that computers won’t be able to replicate fundamental human emotion and others believe that humans will evolve or modify their biology to achieve greater intelligence.
However, philosopher David Chalmers argues that artificial general intelligence is a very likely path to superhuman intelligence. He makes the argument that if AI can achieve equivalence to human intelligence, then it can be extended to surpass human intelligence, and then it can be further amplified to completely dominate humans across arbitrary tasks.
The possibility of superintelligence is still being contemplated by researchers worldwide; yet, if it does come to fruition, this could be the Terminator.
The Umbrella of Artificial Intelligence
Even though AI is broken down into three general categories, there are more fields and studies at play. It’s important to recognize that AI isn’t one specific algorithm or coding language. The term AI is an umbrella term for the numerous fields involved and included in it.
Contrary to popular belief, most people don’t become a complete “AI expert.” Instead, researchers dedicate their careers to mastering a few specific aspects of artificial intelligence like natural language processing or machine learning.
From a business perspective, you’re never going to buy an “AI” solution. Rather, you’ll buy a statistical analysis package or implement a neural network through deep learning.
However, of the many routes you can take with AI, the core three underlying principles are artificial intelligence, machine learning, and deep learning.
Artificial Intelligence vs. Machine Learning vs. Deep Learning
Let’s clear things up: artificial intelligence (AI), machine learning (ML), and deep learning (DL) are three different things. Many people use the terms interchangeably, when in reality, they perform very disparate functions.
Picture three concentric boxes: DL is a subset of ML, which is in turn a subset of AI. AI is the initial concept that emerged, followed by ML, which flourished later, and lastly DL, whose breakthroughs drove AI to another level.
As the name implies, machine learning allows computers to take data and ‘learn’ for themselves.
From driving cars to translating speech, machine learning is responsible for the vast majority of the AI advancements and applications you hear about.
For example, machine learning can make predictions based on factors like fur, tail length, and size, about whether a given animal is a cat or dog.
The key distinction between traditional computer software and machine learning is that a human developer hasn’t written code that instructs the system how to tell the difference between the cat and the dog.
Machine Learning Methods
Machine learning, at its core, is divided into two main methods, supervised and unsupervised learning, along with two further approaches: semi-supervised learning and reinforcement learning.
Supervised Learning refers to teaching machines by example.
In a supervised learning algorithm, systems are exposed to large amounts of labelled data. For a toy example, picture 10 images of cats and 10 images of dogs, each labelled with which animal it shows. Given sufficient examples, a supervised-learning system would learn to recognize the clusters of pixels and shapes associated with each animal and eventually be able to recognize them in new pictures.
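To make “teaching by example” concrete, here is a minimal sketch of a supervised learner. It uses a nearest-centroid classifier, which is just one simple choice and not a method this article prescribes. The features (weight and height) and every number in it are made up for illustration.

```python
from math import dist

# Hypothetical labelled examples: (weight in kg, height in cm) -> label.
# The numbers are invented for illustration only.
training_data = [
    ((4.0, 25.0), "cat"),
    ((5.0, 23.0), "cat"),
    ((3.5, 24.0), "cat"),
    ((20.0, 55.0), "dog"),
    ((30.0, 60.0), "dog"),
    ((25.0, 58.0), "dog"),
]

def train(examples):
    """Learn one centroid (mean feature vector) per label."""
    sums, counts = {}, {}
    for features, label in examples:
        total = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            total[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(v / counts[label] for v in total)
            for label, total in sums.items()}

def predict(centroids, features):
    """Return the label whose centroid is closest to the new example."""
    return min(centroids, key=lambda label: dist(centroids[label], features))

centroids = train(training_data)
print(predict(centroids, (4.5, 26.0)))   # an unseen, cat-sized animal
print(predict(centroids, (28.0, 57.0)))  # an unseen, dog-sized animal
```

Notice that nobody wrote a rule like “under 10 kg means cat.” The boundary between the two animals comes entirely from the labelled examples.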
On the other hand, unsupervised learning uses algorithms to identify patterns in data, trying to spot similarities and differences to split data into separate categories.
Unsupervised learning algorithms often use clustering to find natural groups or clusters in a feature space and interpret the input data.
The algorithm isn’t designed to single out specific types of data; it simply looks for data that can be grouped by its similarities, or for anomalies that stand out.
An example might be Zillow clustering together houses available to rent by neighborhood, or Google Search grouping together stories on similar topics each day.
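As a rough sketch of clustering, the snippet below runs a bare-bones k-means loop, one common clustering algorithm (the article does not say which algorithm Zillow or Google actually use). The listings are hypothetical, and the initialisation is deliberately naive.

```python
from math import dist

def k_means(points, k, iterations=10):
    """Group points into k clusters with a basic k-means loop."""
    centroids = list(points[:k])  # naive initialisation: the first k points
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for point in points:
            nearest = min(range(k), key=lambda i: dist(point, centroids[i]))
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its cluster.
        for i, cluster in enumerate(clusters):
            if cluster:  # keep the old centroid if a cluster ends up empty
                centroids[i] = tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
    return clusters

# Hypothetical listings: (monthly rent in $1000s, distance from the centre in km).
listings = [(1.0, 9.0), (3.0, 1.0), (1.2, 8.5), (3.2, 1.5), (0.9, 9.5), (2.9, 0.8)]
groups = k_means(listings, k=2)
for group in groups:
    print(group)
```

No labels are supplied anywhere. The algorithm only sees the numbers and ends up separating the cheap, far-out listings from the expensive, central ones.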
As the name suggests, semi-supervised learning mixes supervised and unsupervised learning. These algorithms operate on data that has a few labels, but is mostly unlabeled.
The algorithm partially trains based on the labelled data. Then, with its partial training, it begins to label the remaining data, a process called pseudo-labelling. Finally, the labelled and pseudo-labelled data are used together to train the final model.
Semi-supervised learning algorithms have been gaining increased traction lately since the introduction of Generative Adversarial Networks (GANs), algorithms that learn from existing data to generate completely new data.
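The three steps of pseudo-labelling can be sketched in a few lines. This is a toy version under stated assumptions: a single made-up feature (weight in kg), a simple nearest-centroid model, and helper names (`fit_centroids`, `predict`) that are purely illustrative.

```python
def fit_centroids(examples):
    """Average the feature value for each label (a 1-D nearest-centroid model)."""
    grouped = {}
    for value, label in examples:
        grouped.setdefault(label, []).append(value)
    return {label: sum(vals) / len(vals) for label, vals in grouped.items()}

def predict(centroids, value):
    """Pick the label whose average is closest to the given value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))

# Hypothetical data: animal weight in kg. Only two examples carry labels.
labelled = [(4.0, "cat"), (30.0, "dog")]
unlabelled = [3.5, 5.0, 6.0, 22.0, 26.0, 35.0]

# Step 1: partially train on the small labelled set.
model = fit_centroids(labelled)
# Step 2: pseudo-label the unlabelled data with that partial model.
pseudo_labelled = [(value, predict(model, value)) for value in unlabelled]
# Step 3: retrain on the labelled and pseudo-labelled data together.
model = fit_centroids(labelled + pseudo_labelled)
print(pseudo_labelled)
print(model)
```

Real systems usually keep only the pseudo-labels the model is confident about, since a wrong pseudo-label gets baked into the final training round.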
Reinforcement learning is a departure from the other three machine learning methods, utilizing a completely different way of learning.
The best way to understand reinforcement learning is to picture a child learning to walk. They stand up on their legs for the first time, unsure about how to hold their weight or move their legs. While at first they won’t understand anything, eventually, by learning the relationship between their body movements and walking, their performance will get better and better. This can also be thought of as learning through experimentation, or failure.
Reinforcement Learning has four core components:
- Agent. The program you’re training to complete the specified task.
- Environment. The world in which the actions are performed.
- Action. A move made by the agent that changes the status of the environment.
- Rewards. The evaluation of an action, whether it be positive or negative towards the task.
Over many cycles of attempting a task, the system builds a model of which actions will maximize its reward and what it needs to do to achieve the task.
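Here is a small sketch that puts the agent, environment, action, and reward together. It uses tabular Q-learning, which is one common reinforcement learning algorithm and my choice for the example, not something this article specifies. The environment is a hypothetical five-cell corridor where the agent starts at the left end and is rewarded for reaching the right end.

```python
import random

N_STATES = 5             # corridor cells 0..4; the goal is the last cell
ACTIONS = (-1, +1)       # step left or step right
ALPHA, GAMMA = 0.5, 0.9  # learning rate and discount factor (arbitrary choices)

def step(state, action):
    """Environment: apply the action and return (next_state, reward)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

rng = random.Random(0)   # fixed seed so runs are repeatable
q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

for _ in range(200):                  # 200 practice episodes
    state = 0
    while state != N_STATES - 1:
        action = rng.choice(ACTIONS)  # the agent explores with random actions
        next_state, reward = step(state, action)
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        # Q-learning update: nudge the estimate toward
        # the reward plus the discounted value of the next state.
        q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
        state = next_state

# The learned policy: the highest-valued action in each non-goal cell.
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
print(policy)
```

After enough episodes the table of values should favor stepping right (+1) in every cell, even though the agent was never told where the goal is.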
In order to create any of these four machine learning models, there are three key components that you will need to know:
- The Algorithm: Each task can be solved with multiple different algorithms. There are various types of algorithms; however, some achieve better results than others, depending on their intended purpose. For example, to solve a classification task (differentiating between a cat and a dog), you can use a support vector machine or a logistic regression algorithm. One of them will be faster or more accurate, depending on your exact scenario. It’s important to understand the task, then test various options or algorithms.
- The Dataset: To learn how to perform a task, the algorithm needs training data (an initial set of data used to help a program understand how to apply a function) and then testing data (data held back to check that the model interprets new examples accurately). Both of these types of data come from the dataset. For example, before your algorithm can differentiate between cats and dogs, you need to show it many labelled examples of cats and of dogs. Then, give it a new picture of a cat or dog to see if it learned.
- The Features: A feature is an individual measurable property or characteristic of a phenomenon being observed. From our example about the cat and dog, the features would be the fur length, size, weight, etc. Features are the basic building block of your dataset. The choice and quality of your features have a huge impact on the accuracy of your algorithm and the insights you will gain. For example, color would not be a good feature for classifying whether an animal is a cat or a dog, since both come in many of the same colors.
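The sketch below shows how the three pieces fit together on a toy scale: a dataset split into training and testing data, one feature (weight), and a very simple algorithm (a weight threshold). All of the data is invented, and the 70/30 split is just a common convention I am assuming here.

```python
import random

# Hypothetical dataset: one feature (weight in kg) plus a label per animal.
dataset = [(3.0, "cat"), (4.0, "cat"), (4.5, "cat"), (5.0, "cat"), (5.5, "cat"),
           (20.0, "dog"), (22.0, "dog"), (25.0, "dog"), (30.0, "dog"), (35.0, "dog")]

# Shuffle, then hold back 30% of the examples as testing data.
rng = random.Random(42)
shuffled = dataset[:]
rng.shuffle(shuffled)
split = len(shuffled) * 7 // 10
train_set, test_set = shuffled[:split], shuffled[split:]

def mean_weight(examples, label):
    """Average weight of the training examples with the given label."""
    weights = [w for w, l in examples if l == label]
    return sum(weights) / len(weights)

# The "algorithm": a threshold halfway between the average cat and average dog.
threshold = (mean_weight(train_set, "cat") + mean_weight(train_set, "dog")) / 2

def classify(weight):
    return "dog" if weight > threshold else "cat"

# Evaluate only on examples the model never saw during training.
correct = sum(classify(w) == label for w, label in test_set)
accuracy = correct / len(test_set)
print(f"threshold={threshold:.1f} kg, test accuracy={accuracy:.0%}")
```

Weight works here only because the made-up cats and dogs barely overlap. Swap in a feature like color and the same algorithm would do little better than guessing, which is why feature choice matters so much.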
Deep learning is machine learning on steroids. It is a subfield of machine learning concerned with algorithms inspired by the structure and function of the brain called artificial neural networks.
We refer to the field as “deep learning” because the neural networks have many (deep) layers that enable them to learn. In principle, many problems that require “thought” to solve could eventually be tackled with a deep learning algorithm.
Deep learning differs from traditional machine learning in how little human intervention it needs. For example, with machine learning, to differentiate a cat and dog, you’d have to tell the algorithm which features the animals can be separated by. However, with deep learning, the features are chosen by the neural network.
A neural network uses a network of functions to understand and translate the data input into a desired output, usually in a different form.
Each neural network consists of elements called neurons. They are the core processing unit of the network, connecting to other neurons through links called channels. The neural network feeds data into neurons in each layer until it reaches the output layer as a prediction.
Structure of A Neural Network
For example, let’s say you were attempting to classify a 28×28 image of the number 7. Its 784 pixels feed into the neurons of the first layer, the input layer. This layer can be of any size; however, there needs to be one input neuron per relevant feature in your dataset, so here there would be 784.
Next, the values must pass through the hidden layers. The number of hidden layers can vary, depending on the problem and the architecture of your neural network. The hidden layers progressively transform and condense the input data on the way to a final output.
Finally, the signal reaches the output layer, in which the network predicts that the input pixels made up the number 7. Each neuron in this layer represents one possible prediction, so recognizing the digits 0 through 9 takes ten output neurons.
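To give a feel for the shapes involved, here is a rough sketch of a single pass through such a network. The weights are random and untrained, so the prediction is meaningless. The point is only how 784 inputs flow through a hidden layer to a handful of outputs. The hidden layer size (16), the ten output neurons (one per digit), and the ReLU and softmax activations are all assumptions made for this example.

```python
import math
import random

rng = random.Random(0)

def make_layer(n_inputs, n_neurons):
    """One weight per incoming channel plus one bias per neuron (random, untrained)."""
    weights = [[rng.uniform(-0.05, 0.05) for _ in range(n_inputs)]
               for _ in range(n_neurons)]
    biases = [0.0] * n_neurons
    return weights, biases

def forward(layer, inputs, activation):
    """Compute each neuron's weighted sum plus bias, then apply the activation."""
    weights, biases = layer
    sums = [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]
    return activation(sums)

def relu(values):
    return [max(0.0, v) for v in values]

def softmax(values):
    """Turn raw scores into probabilities that add up to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

hidden_layer = make_layer(784, 16)  # 16 hidden neurons is an arbitrary choice
output_layer = make_layer(16, 10)   # one output neuron per digit, 0 through 9

image = [rng.random() for _ in range(784)]  # stand-in for a flattened 28x28 image
hidden = forward(hidden_layer, image, relu)
scores = forward(output_layer, hidden, softmax)
prediction = scores.index(max(scores))
print(len(image), len(hidden), len(scores), prediction)
```

A trained network has exactly this structure. The only difference is that its weights and biases have been tuned so that the largest score lands on the right digit.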
How Do We Eliminate The Redundant Information?
To start with a large input and end with a few predictions requires the neural network to narrow the available data. This elimination of redundant information happens between the connecting channels.
Each channel has a value attached to it, hence the name “weighted channel.” Further, each neuron has a number associated with it called a “bias.”
This bias is added to the weighted sum of the inputs, and the total is then passed through the activation function. The result of the activation function determines whether the neuron gets activated. If it does, its information passes on to the next layer. If it doesn’t, that information is treated as irrelevant and doesn’t contribute to the output.
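For a single neuron, that calculation is short enough to write out in full. The sketch below assumes a sigmoid activation and a cutoff of 0.5 for “activated”; both are illustrative choices, as are the input values and weights.

```python
import math

def sigmoid(z):
    """Squash any number into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs, plus the bias, through the activation function."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return sigmoid(weighted_sum + bias)

inputs = [0.5, 0.8, 0.2]   # values arriving from the previous layer
weights = [0.4, 0.6, 0.9]  # one weight per incoming channel

# Same inputs and weights, two different biases.
strong = neuron_output(inputs, weights, bias=-0.5)
weak = neuron_output(inputs, weights, bias=-1.5)

# Treat the neuron as activated when its output is at least 0.5.
print(round(strong, 3), strong >= 0.5)
print(round(weak, 3), weak >= 0.5)
```

With the smaller penalty the neuron activates; with the larger one it stays quiet, even though the inputs never changed. That is the lever the network pulls to decide what counts as relevant.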
As the network continues to learn and develop, the weights and biases are constantly adjusted to produce better results.
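To see weights and a bias being adjusted, here is a minimal sketch that trains one neuron to behave like a logical AND. It uses the classic perceptron learning rule, which is much simpler than the backpropagation used to train real deep networks, so treat it as an illustration of the idea rather than how deep learning is actually trained.

```python
def predict(weights, bias, inputs):
    """Output 1 if the weighted sum plus the bias is positive, otherwise 0."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0

# Training data for logical AND: the output is 1 only when both inputs are 1.
samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = [0.0, 0.0], 0.0
learning_rate = 1.0  # arbitrary choice that keeps the arithmetic simple

for epoch in range(20):
    for inputs, target in samples:
        error = target - predict(weights, bias, inputs)
        # Adjust each weight and the bias in the direction that reduces the error.
        weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
        bias += learning_rate * error

print(weights, bias)
print([predict(weights, bias, inputs) for inputs, _ in samples])
```

Every wrong answer nudges the weights and the bias a little; once all four examples come out right, the error is zero and the values stop changing.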
Applications of Deep Learning
Deep learning has been applied to fields including computer vision, speech recognition, natural language processing, bioinformatics, drug design, and medical image analysis, producing results similar to or exceeding a human’s.
In fact, there are many different types of neural networks that are used for individual tasks. Here are a few of them:
- Convolutional neural networks (CNNs) — image and video recognition, natural language processing, and recommender systems
- Sequence-To-Sequence models — applied mainly in chatbots, machine translation, and question answering systems
- Multilayer perceptron — applied extensively in speech recognition and machine translation technologies
- Radial Basis Function Neural Network — used extensively in power restoration systems in order to restore power in the shortest possible time
- Generative Adversarial Networks (GANs) — learning patterns in input data in such a way that the model can be used to generate or output new data
However, despite all of its applications, deep learning has some shortfalls. Deep learning, and neural networks in general, require massive amounts of data. This creates a myriad of issues. To begin, the data itself isn’t easy to find. Further, most computers do not have enough processing power to handle that data. Even if the computers could process it, deep learning algorithms can take hours, days, or even months to train, and that time grows with the amount of data.
Hopefully, by now you’re convinced that AI has the power to change the world. It’s already changing the world and we’ve barely even scratched the surface of what it can do.
In the coming decades, researchers will continue to bring artificial intelligence closer and closer to human-level performance. AI algorithms and tools will become increasingly accessible and available over the cloud. This trend will be driven by the breakthroughs of high-bandwidth connectivity, neural networks, quantum and cloud computing.
Every industry and individual will be impacted.