Becoming (artificially) More Intelligent



Although artificial intelligence plays a big role in our daily experiences, many of us don’t know what it really is and how it’s related to other buzzwords such as machine learning and deep learning. If you always wanted to know what the relationship and differences between artificial intelligence, machine learning, deep learning and big data are — then this article is for you.

___________________________________________________________________

At DKdL (Die Krieger des Lichts, part of the fischerAppelt group) we have decided to take a novel and holistic approach to developing intelligent data-driven concepts and products. We call this approach “human centered AI”, where AI stands for artificial intelligence. Our approach integrates the research & development (R&D) needed to develop AI systems into a human centered design process. In simple words, we understand that the intelligent products (that is, products that use AI at their core) we develop are designed to be used by humans in a way that adds value to their daily experiences. Therefore, we start by understanding the people who are going to use the product and the needs it is intended to meet. We then keep those needs at the focus of our considerations throughout the entire development process and the design iterations.

We run the process by combining the R&D required to develop an AI-driven product, carried out by the data-science team, with teams of UX designers and service designers. By doing that, we create solutions that hit the sweet spot between business goals, customer experience and technology. The goal is that the end product relies on state-of-the-art and innovative AI solutions and is tailor-made to satisfy the requirements and expectations of the people who are going to use it. Ideally, we want to design intelligent products that are simply fun and intuitive to use.

The integration of AI into everyday experiences is going to grow significantly

The term AI is widely used in today’s technology driven world. The reason is that AI is involved in many aspects of our everyday lives. It is used to improve our web searches, answer short messages for us, sort and tag our photos and recommend to us which content to consume and buy. We also know that the integration of AI into our everyday experiences is going to expand even further in the future, where intelligent cars will take us to our destination, medical systems will predict and prevent diseases and personal assistants will schedule appointments for us over a phone call. The limit is only set by our imagination.

Nevertheless, it is often unclear what AI precisely means and how it relates to machine learning, deep learning and big data. These terms are frequently used interchangeably in the media, adding to the general confusion. Although they are all closely related, they are not quite the same thing. In the following we’re going to explain what each of these terms stands for and make the differences clearer.

So what is AI?

AI (Artificial Intelligence) refers to the ability of a system to perform tasks that can be associated with intelligent behavior. Of course, it is not simple to define exactly what intelligent behavior means. But within the scope of this article, and to get a reasonable idea, we can say that intelligent behavior includes the ability to learn from past experience, the ability to interact openly with humans and the ability to solve problems that were not seen before. In short, AI is a broad and general concept that describes machines that we would consider intelligent.

Because AI systems rely on rather complicated mathematical computations to display intelligence, they require software code and a computer to run it. Therefore, AI is generally considered to be a branch of computer science.

Combining what we described above, a computer running AI software is able to respond independently, that is without a human being involved, to different problems and situations. The solution is not given to the computer; instead, it can come up with a solution on its own in real time.

A major impact of having a computer that can choose the best course of action in a given situation without human instruction is that it can fully automate processes and thereby make them faster, more efficient and less error-prone.

Machine learning is a way to make a computer intelligent

So how do we make a computer intelligent? This is where machine learning comes into play. Machine learning is a set of methods and algorithms that give computers the ability to learn how to perform a task. In principle, the learning is done by figuring out statistical patterns in data. By learning we mean that the computer tries to complete the task many times and after each attempt it improves its performance a little bit. This way, the computer progressively improves its performance on executing the task until it gets it right on every try (at least ideally…).

The tasks could be recommending a movie to watch or a product to buy, understanding the meaning of sentences from text or audio in order to translate them to other languages or to execute a request, or even control a traffic light system to improve the flow of traffic in a certain region. All of these tasks can be improved by using machine learning methods.

Machine learning is therefore a way to make a computer intelligent. AI refers to the intelligence of the machine: it defines how we want it to behave and react. Machine learning is the implementation of the computational methods that enable the AI. Another way to describe this is that machine learning makes computers able to perform tasks they were not explicitly programmed to do.

The last sentence captures the difference between the traditional way of writing software and a machine learning program. Both methods could be applied to execute the same task or solve the same problem. However, the ways in which they are applied are vastly different.

Machine learning vs. traditional programming

To write traditional software, we first find a solution that we believe to be accurate and general enough to satisfy our needs. The solution has to be able to deal with every edge and extreme case of the problem domain as well, otherwise the program will break when encountering an unexpected situation. Then, a software developer translates the solution into software code, that is, a series of instructions and algorithms that the computer executes each time we run the program. In simple words, we have to find a solution to the problem and instruct the computer in a very explicit way how to execute our solution when a user runs the program.
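To make this concrete, here is a minimal sketch of a traditional, rule-based program for one of the tasks mentioned above, recommending a movie. The genres, the titles and the function name recommend are made-up placeholders for illustration. The point is only that the developer has to spell out every case, including the unexpected ones.

```python
# Hand-written rules: each favorite genre maps to a fixed recommendation.
# All genres and titles here are hypothetical placeholders.
RULES = {
    "comedy": "Placeholder Comedy",
    "thriller": "Placeholder Thriller",
}

FALLBACK = "Placeholder Popular Title"


def recommend(favorite_genre):
    """Return a recommendation using only explicitly programmed rules."""
    # Edge case: we know nothing about the viewer.
    if favorite_genre is None or favorite_genre.strip() == "":
        return FALLBACK
    genre = favorite_genre.strip().lower()
    if genre in RULES:
        return RULES[genre]
    # Edge case: a genre the developer did not anticipate.
    return FALLBACK


print(recommend("Comedy"))
print(recommend("western"))
```

Without the two explicit edge-case branches, the program would fail or return nothing for any input the developer did not foresee.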

On the other hand, a data scientist would approach the problem differently. As the name suggests, data science is concerned with data and hence we begin by collecting input data for our algorithms. These data could be images with visible faces for a face recognition problem, or stock prices from the last few years if we want to predict stock prices.

Next, we would clean and explore the data. We clean the data by making sure there are no mistakes, errors or missing values in it. It would be very difficult to learn correctly by observing mistakes or non-existing values. Data exploration is done to gain insights that will help us understand the problem better and the possible ways to solve it using machine learning — which relies on the data. For example, the most straightforward way to explore the data would be by simply visualizing it.
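As a rough illustration of the cleaning and exploration step, the sketch below filters a short list of stock prices and computes a few summary statistics. The numbers are invented, and the rule that a valid price must be present and positive is our assumption for this example. Real projects usually use dedicated data libraries and plots instead of plain lists.

```python
import math

# Hypothetical raw closing prices. None and NaN stand for missing values,
# and the negative number stands for a recording error.
raw_prices = [101.2, None, 102.5, float("nan"), -5.0, 103.1]


def is_valid(price):
    """Return True if the price is present, is a real number and is positive."""
    if price is None:
        return False
    if math.isnan(price):
        return False
    return price > 0


# Cleaning: keep only the valid entries.
clean_prices = [p for p in raw_prices if is_valid(p)]

# Exploration: a few simple summary statistics of the cleaned data.
summary = {
    "count": len(clean_prices),
    "min": min(clean_prices),
    "max": max(clean_prices),
    "mean": sum(clean_prices) / len(clean_prices),
}

print(clean_prices)
print(summary)
```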

Based on the insights we find in the data, we do two things. First, we define a machine learning model that we believe can solve the problem. A model could be a neural network, which we describe later in this text. For now, you could think of the model as a mathematical function. Second, we define a way to evaluate how good the model is. After choosing a model and an evaluation measure, we would run an algorithm that optimizes the model.
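The three ingredients (a model, an evaluation measure and an optimization algorithm) can be sketched with a very small example. Here we assume the simplest possible model, a straight line with two parameters w and b, mean squared error as the evaluation measure, and plain gradient descent as the optimizer. The toy data, the learning rate and the number of steps are choices made for this illustration only.

```python
# Toy training data that follows the hidden rule y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]


def model(x, w, b):
    """The model: a mathematical function with adjustable parameters w and b."""
    return w * x + b


def mean_squared_error(w, b):
    """The evaluation measure: average squared gap between prediction and truth."""
    return sum((model(x, w, b) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(steps=2000, learning_rate=0.05):
    """The optimization: gradient descent, improving w and b a little per step."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (model(x, w, b) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (model(x, w, b) - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


w, b = train()
print("learned parameters:", round(w, 3), round(b, 3))
print("error on training data:", mean_squared_error(w, b))
# Prediction for an input that was not part of the training data.
print("prediction for x = 10:", round(model(10.0, w, b), 3))
```

Note that the rule y = 2x + 1 is never written into the program. The parameters are found from the data alone, and the trained model can then be applied to inputs it has not seen.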

Computers need to be trained — kind of like our brains

The optimization process is where the machine learning algorithm actually learns on its own how to solve the problem. For example, to learn the difference between images of cars and planes, it would figure out that one important difference is that planes have wings while cars don’t. The important point is that we never told the computer anything about the objects in the image, but the algorithm learned on its own which differences are important and which are not (color is probably not a helpful feature in this case).

In machine learning jargon, we refer to the optimization process as “training” the model; the data that the algorithm uses to optimize the model is referred to as “training data”. Training a machine learning model, when done well of course, saves us from the need to explicitly refer to each possible case in the problem domain — which is the case in traditional programming.

An important outcome of using machine learning is the predictive nature of the trained model. A machine learning program that was successfully trained has learned to make predictions based on the training data. For example, such a program can send an alert before a part of the system it monitors fails, or suggest a playlist of songs that it predicts we will enjoy.

A machine learning model predicts results and outcomes based on statistical patterns that it recognized in the training data. When the training is done well enough, the ability to output a result based on patterns rather than explicit cases is a big advantage over traditional programming, where every possible case must be included in the programming of the software in order for it to work well. As we mentioned before, AI systems can also operate in situations that they never observed before.

Another advantage of machine learning over traditional programs is that it can be applied to highly complicated problems that would otherwise demand many resources to solve by traditional methods, particularly when the problem involves a huge number of parameters and variables.

Because most of the recent breakthroughs in AI were developed using machine learning, machine learning is currently the fastest growing part of AI. More and more often, applications of machine learning are replacing the “traditional” computation systems we have been using so far, from automated translations to improved Excel sheets.

The deeper part of machine learning

The fast growth of machine learning applications leads to rapid development of new and more advanced algorithms and hardware that can support the execution of these (usually) more computationally demanding algorithms. This is how deep learning became so popular in recent years.

Deep learning is a generic name that refers to a subset of machine learning algorithms. These algorithms usually rely on “Artificial Neural Networks” (ANN), which are inspired (loosely) by the way our brain cells, called neurons, work.

Like a brain that contains a huge network of neurons that make us think, understand and react, an ANN consists of a huge number of inter-connected artificial neurons. The latter are an artificial mathematical entity on a computer that, to some extent, mimics the way in which brain neurons are activated. The ANN is trained by processing the training data and adjusting the connections between the different parts of the network. Once the ANN is trained it can interpret new data it hasn’t seen before. The principles of the training procedure we described in the machine learning part hold for the training of deep learning models as well.

As seen in the figure below, the neurons in the ANN are arranged in layers. There is an input layer that receives the input from which the computer learns. The output layer is where we receive the result of the ANN calculation. So, when the input is an image the output could be the probability that the network recognizes a dog in the image. Between the input and output layers, we have the so-called hidden layers. The more hidden layers an ANN has, the more complex the things it can “understand”. Therefore, many modern problems — such as autonomous driving — require ANNs with many hidden layers. In other words, we need deep networks. This is where the name deep learning comes from.

An illustration of an artificial neural network with two hidden layers.
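To give a feeling for how the layers fit together, here is a minimal sketch of a network with two hidden layers that passes one input through to the output. The layer sizes, the sigmoid activation and the random starting weights are assumptions made for this example. The network is not trained, so its output carries no meaning yet. Training would adjust the weights and biases as described above.

```python
import math
import random


def sigmoid(z):
    """Squash any number into the range between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def make_layer(n_inputs, n_neurons, rng):
    """Create one layer as a list of (weights, bias) pairs, one per neuron."""
    return [
        ([rng.uniform(-1, 1) for _ in range(n_inputs)], rng.uniform(-1, 1))
        for _ in range(n_neurons)
    ]


def forward_layer(layer, inputs):
    """Each neuron sums its weighted inputs, adds its bias and applies sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in layer
    ]


def forward(network, inputs):
    """Pass the inputs through every layer in turn and return the final output."""
    activations = inputs
    for layer in network:
        activations = forward_layer(layer, activations)
    return activations


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# 3 input values -> hidden layer of 4 -> hidden layer of 4 -> 1 output value.
layer_sizes = [3, 4, 4, 1]
network = [
    make_layer(n_in, n_out, rng)
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
]

output = forward(network, [0.5, 0.1, 0.9])
print(output)  # one number between 0 and 1, for example a probability
```

Making the network deeper in this sketch only requires adding more entries to layer_sizes.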

It might be a bit surprising, but deep learning methods have actually been around for quite some time. However, the deeper a network is, the more computing power it requires. Before the recent advancements in computer hardware, particularly in graphics cards (GPUs, which are able to process many calculations in parallel), running deep learning models was simply not practical, or even not possible.

Many of the AI systems we use every day rely on deep learning models. For example, when Facebook recognizes our faces in photos, it is using a “convolutional neural network”, an ANN that is designed to “see” information in photos. Google uses “recurrent neural networks”, another type of ANN that learns from sequences of inputs (words in this case), to make its assistant understand our voice commands. And a model that combines these two types of ANNs can be trained to automatically generate image captions.

Now, after describing the meaning of AI, machine learning and deep learning, we can summarize the differences between them in the image below. AI is the general term that refers to a computer system that is able to act in a way that would be considered intelligent. Machine learning is a part of AI that includes the algorithms that enable machines to become intelligent. Finally, deep learning is a collection of more complex machine learning methods that are able to deal with highly complicated tasks.

How is big data related to AI then?

The last thing we shall discuss in this post, albeit very briefly, is big data. As we mentioned before, in order to adjust the model and be able to make accurate predictions, the machine learning algorithm needs to complete the task many times until it gets it right. But for that to work, we also need a large number of examples in our training data.

If we trained the model with a dataset that contains only one example, the algorithm would adjust itself very quickly to give the exact solution for that example. But it would not actually have learned anything. In a sense, it would simply have memorized that example and would be able to reproduce the solution to it very accurately.

To maximize its potential, machine learning requires a lot of data. And the data need to be rich and not biased, meaning that they include many different examples and avoid showing merely examples of a single type. To see this, think of teaching a young child how to multiply whole numbers. You can show them over and over again that 2*2 = 4 until they simply memorize this example. But based only on that, they are probably not able to get 3*4 right. They need more training examples to figure out the actual pattern, which is the relation between multiplication and repeated addition.
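The multiplication example can be written down as a small sketch. The first function only memorizes the single training example, while the second one captures the actual pattern of repeated addition. Both function names are ours, and the second function is written by hand here rather than learned. It stands for what a well-trained model would ideally arrive at.

```python
# A "model" that has only memorized the one example it was shown.
memorized_examples = {(2, 2): 4}


def memorizing_model(a, b):
    """Return the memorized answer, or None if this case was never seen."""
    return memorized_examples.get((a, b))


def multiply_by_repeated_addition(a, b):
    """The underlying pattern: a * b means adding a to itself b times.

    Intended for non-negative whole numbers b.
    """
    total = 0
    for _ in range(b):
        total += a
    return total


print(memorizing_model(2, 2))               # the memorized example works
print(memorizing_model(3, 4))               # an unseen example does not
print(multiply_by_repeated_addition(3, 4))  # the pattern generalizes
```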

The image below [following Hilbert & López, 2011] depicts the growth of the worldwide capacity to store data. Since the data we collect is ultimately processed by a computer, the share of data that is stored in digital form keeps growing as well. In 2002, the beginning of the digital age, half of the worldwide stored data was in digital form. Today, almost all of the stored data is in digital form.

Growth rate and digitization ratio of global information-storage capacity, following Hilbert & López, 2011.

We should also note that it is not only the amount of data that is challenging to handle, but also the fact that a data management system has to handle many different types of data. These could be text messages or emails, locations, images and so on. The problem is how to store and manage all these different types of data in a way that is easy and quick to read and write.

Big data is a general term for strategies and technologies that are needed to gather, organize, process and gain insights from very large datasets. Basically, when we need to solve issues with data that are too diverse and massive for conventional infrastructures (e.g. have to be stored on several different servers) to handle efficiently, we are starting to touch big data concepts.
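One of the basic ideas behind storing data on several servers can be sketched in a few lines. In the example below, each record is assigned to a server based on a hash of its key. The server names, the records and the function server_for are hypothetical, and real big data systems are considerably more involved (they also handle replication, failures and adding servers).

```python
import hashlib

# Hypothetical server names and records of different types.
servers = ["server-a", "server-b", "server-c"]
records = {
    "user-1/email": "Hello, see you tomorrow!",
    "user-1/location": (53.55, 9.99),
    "user-2/photo": b"\x89PNG...",
    "user-3/message": "On my way",
}


def server_for(key):
    """Pick a server from a stable hash of the key (same key, same server)."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return servers[int(digest, 16) % len(servers)]


# Distribute the records across the servers.
placement = {}
for key, value in records.items():
    placement.setdefault(server_for(key), {})[key] = value

for server, stored in sorted(placement.items()):
    print(server, "stores", sorted(stored))
```

Because the server is derived from the key alone, any machine can work out where a record lives without consulting a central index.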

The details of big data technologies are kept for a future publication. There is quite a lot of information in this article as it is and we imagine it is not easy to digest all of this at once. But we do hope that after reading this, you have a better understanding of the meaning of each of the terms we described and feel more comfortable using them correctly. In the future, we plan to write a follow-up article to describe how we make use of these exciting technologies within the fischerAppelt group.

Written by Idan Shilon. 
Many thanks to Manuel Bug for preparing the graphics, Nina Pietropoli for editing the manuscript and all other Krieger for their feedback that made this article better.

Source: Deep Learning on Medium