Original article was published by Abrar Ahmed on Artificial Intelligence on Medium
What are the main types of Machine learning?
To put it simply, we train an algorithm and at the end pick the model that best predicts some well-defined output based on the input data.
Supervised techniques adapt the model to reproduce outputs known from a training set (e.g. recognizing car types in photos). At the start, the system receives input data together with the corresponding output data. Its task is to create rules that map the input to the output. Training should continue until performance is high enough. After training, the system should be able to assign an output to objects it has not seen during the training phase. Once trained, this prediction step is usually fast, although its accuracy depends on the quality of the training data.
There are two types of supervised learning techniques: regression and classification. Classification separates the data into categories; regression fits a continuous function to the data.
Reinforcement Machine Learning Algorithms
Reinforcement learning is close to what many people picture when they think of machine learning as artificial intelligence.
In essence, reinforcement learning is about developing a self-sustained system that, through repeated sequences of trial and error, improves itself based on its interactions with the incoming data and the feedback those interactions produce.
Reinforcement learning uses a technique called exploration/exploitation. The mechanics are simple: an action takes place, the consequences are observed, and the next action takes the results of the previous one into account.
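To make the exploration/exploitation idea concrete, here is a minimal sketch of an epsilon-greedy strategy on a toy problem with three possible actions. The function name, the `epsilon` parameter and the reward values are assumptions chosen for illustration; they do not come from the original article or from any particular library.

```python
import random

def run_epsilon_greedy(true_means, steps=5000, epsilon=0.1, seed=0):
    """Try actions repeatedly and learn which one pays best.

    true_means: hypothetical average reward of each action (hidden from the learner).
    Returns the learned reward estimates and how often each action was chosen.
    """
    rng = random.Random(seed)
    n_actions = len(true_means)
    estimates = [0.0] * n_actions  # running average reward per action
    counts = [0] * n_actions       # number of times each action was taken

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick a random action to gather new information.
            action = rng.randrange(n_actions)
        else:
            # Exploit: pick the action that currently looks best.
            action = max(range(n_actions), key=lambda a: estimates[a])

        # Observe the consequence of the action (a noisy reward).
        reward = rng.gauss(true_means[action], 1.0)

        # Fold the result into the estimate so the next choice can use it.
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]

    return estimates, counts

estimates, counts = run_epsilon_greedy([0.0, 0.5, 2.0])
most_chosen = max(range(len(counts)), key=lambda a: counts[a])
print("estimated rewards:", [round(e, 2) for e in estimates])
print("most chosen action:", most_chosen)
```

With a small `epsilon` the learner spends most of its time on the action it believes is best, while still occasionally trying the others in case its estimates are wrong.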
At the center of reinforcement learning algorithms are reward signals, which are issued when specific tasks are performed. Reward signals serve as a navigation tool for the algorithm: they tell it which courses of action are right and which are wrong.
The two main types of reward signals are:
A positive reward signal encourages the algorithm to keep performing a particular sequence of actions.
A negative reward signal penalizes certain actions and pushes the algorithm to correct its behavior so that it stops incurring penalties.
However, the function of the reward signal may vary depending on the nature of the information. Thus reward signals may be further classified depending on the requirements of the operation. Overall, the system tries to maximize positive rewards and minimize the negatives.
The most common reinforcement learning algorithms include:
Temporal Difference (TD);
Monte-Carlo Tree Search (MCTS);
Asynchronous Advantage Actor-Critic (A3C).
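As an illustration of the first item in the list, the sketch below applies the basic temporal difference update, often written TD(0), to a tiny chain of three states where only the final step pays a reward. The chain, the learning rate `alpha` and the discount factor `gamma` are assumptions made up for this example.

```python
def td0_chain(n_states=3, episodes=500, alpha=0.1, gamma=0.9):
    """Estimate state values on a chain 0 -> 1 -> ... -> terminal.

    Reaching the terminal state yields a reward of 1; every other step yields 0.
    """
    # One extra slot for the terminal state, whose value stays 0.
    values = [0.0] * (n_states + 1)

    for _ in range(episodes):
        state = 0
        while state < n_states:
            next_state = state + 1
            reward = 1.0 if next_state == n_states else 0.0

            # TD(0): move the estimate toward reward + discounted next value.
            td_target = reward + gamma * values[next_state]
            values[state] += alpha * (td_target - values[state])

            state = next_state

    return values[:n_states]

values = td0_chain()
print([round(v, 3) for v in values])  # approaches [0.81, 0.9, 1.0]
```

States closer to the reward end up with higher values, which is the information a learner would use to decide where to go next.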
Use Cases for Reinforced Machine Learning Algorithms
Reinforcement learning is a good fit when the available information is limited or inconsistent. In such cases, an algorithm can form its operating procedures from its interactions with data and relevant processes.
NPCs in modern video games use this type of machine learning model a lot. Reinforcement learning makes the AI’s reactions to the player’s actions more flexible, which provides a viable challenge. For example, the Grand Theft Auto series reportedly uses this type of ML algorithm in collision detection for moving vehicles and people.
Self-driving cars also rely on reinforcement learning algorithms. For example, if a self-driving car (Waymo, for instance) detects that the road turns to the left, it may activate the “turn left” scenario, and so on.
The most famous example of this kind of reinforcement learning is AlphaGo, which went head to head with the second-best Go player in the world and outplayed him by calculating sequences of actions from the current board position.
Marketing and ad tech operations also use reinforcement learning. This type of machine learning algorithm can make retargeting much more flexible and more efficient at delivering conversions by closely adapting to the user’s behavior and the surrounding context.
Reinforcement learning is also used to improve and tune natural language processing (NLP) and dialogue generation for chatbots, so that they can:
mimic the style of an input message
develop more engaging, informative kinds of responses
find relevant responses according to the user reaction.
With the emergence of Google Dialogflow, building such a bot became more of a UX challenge than a technical feat.
I like to think of supervised learning in terms of function approximation: we train an algorithm and, at the end of the process, pick the function that best describes the input data, the one that for a given X makes the best estimate of y (X -> y). Most of the time we cannot find the true function that always makes correct predictions. One further reason is that the algorithm relies on assumptions made by humans about how the computer should learn, and these assumptions introduce a bias. Bias is a topic I’ll explain in another post.
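A small sketch of the function approximation view: given a handful of (X, y) pairs, we assume the function is a straight line and pick the slope and intercept that fit the data best in the least-squares sense. The data points are made up, and the straight-line assumption is exactly the kind of human choice that introduces bias.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(x, slope, intercept):
    """Estimate y for a new x using the learned function."""
    return slope * x + intercept

# Hypothetical training data that roughly follows y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)
print(round(slope, 2), round(intercept, 2))      # 1.99 0.05
print(round(predict(6.0, slope, intercept), 2))  # 11.99
```

The learned line does not reproduce every training point exactly, but it gives a usable estimate for inputs it has never seen, such as X = 6.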
Here the human expert acts as the teacher: we feed the computer training data containing the inputs (predictors), show it the correct answers (outputs), and from this data the computer should be able to learn the patterns.
Supervised learning algorithms try to model the relationships and dependencies between the target prediction output and the input features, so that we can predict the output values for new data based on the relationships learned from previous data sets.
Classification for predicting class labels
Classification is a subcategory of supervised learning where the goal is to predict the categorical class labels of new instances, based on past observations. Those class labels are discrete, unordered values that can be understood as the group memberships of the instances. The previously mentioned example of email spam detection represents a typical example of a binary classification task, where the machine learning algorithm learns a set of rules in order to distinguish between two possible classes: spam and non-spam emails.
However, the set of class labels does not have to be of a binary nature. The predictive model learned by a supervised learning algorithm can assign any class label that was presented in the training dataset to a new, unlabeled instance. A typical example of a multiclass classification task is handwritten character recognition. Here, we could collect a training dataset that consists of multiple handwritten examples of each letter in the alphabet. Now, if a user provides a new handwritten character via an input device, our predictive model will be able to predict the correct letter in the alphabet with certain accuracy. However, our machine learning system would be unable to correctly recognize any of the digits zero to nine, for example, if they were not part of our training dataset.
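The point about class labels can be sketched with a very simple multiclass classifier. The example below uses a nearest-centroid rule, which is a stand-in chosen for brevity rather than the method the article has in mind, and the two-number feature vectors for the letters "A", "B" and "C" are invented for illustration.

```python
def fit_centroids(samples, labels):
    """Compute the average feature vector (centroid) of each class."""
    sums = {}
    counts = {}
    for features, label in zip(samples, labels):
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}

def predict(centroids, features):
    """Return the label whose centroid is closest to the given features."""
    def squared_distance(centroid):
        return sum((c - f) ** 2 for c, f in zip(centroid, features))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))

# Hypothetical features extracted from handwritten letters.
samples = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (4.8, 5.2), (9.0, 1.0), (8.8, 1.2)]
labels = ["A", "A", "B", "B", "C", "C"]

centroids = fit_centroids(samples, labels)
print(predict(centroids, (5.1, 4.7)))  # B

# Features of a handwritten digit: the model still answers with a letter,
# because digits were never part of the training labels.
digit_guess = predict(centroids, (5.0, 9.0))
print(digit_guess in {"A", "B", "C"})  # True
```

Whatever input it receives, the model can only ever return one of the labels it saw during training.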
The following figure illustrates the concept of a binary classification task given 30 training samples; 15 training samples are labeled as the negative class (minus signs) and 15 training samples are labeled as the positive class (plus signs). In this scenario, our dataset is two-dimensional, which means that each sample has two values associated with it: X1 and X2. Now, we can use a supervised machine learning algorithm to learn a rule (the decision boundary, represented as a dashed line) that can separate those two classes and classify new data into each of those two categories given its X1 and X2 values.
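Learning such a decision boundary can be sketched with a perceptron, one of the simplest algorithms that finds a separating line when the two classes are linearly separable. The eight sample points below are hypothetical and much fewer than the 30 in the figure, and the choice of the perceptron is an assumption made for this example.

```python
def train_perceptron(samples, labels, epochs=1000, learning_rate=1.0):
    """Learn weights (w1, w2, b) so that w1*x1 + w2*x2 + b separates the classes.

    Labels must be +1 (positive class) or -1 (negative class).
    """
    w1, w2, b = 0.0, 0.0, 0.0
    for _ in range(epochs):
        mistakes = 0
        for (x1, x2), y in zip(samples, labels):
            activation = w1 * x1 + w2 * x2 + b
            if y * activation <= 0:
                # Misclassified: nudge the boundary toward the correct side.
                w1 += learning_rate * y * x1
                w2 += learning_rate * y * x2
                b += learning_rate * y
                mistakes += 1
        if mistakes == 0:
            break  # every training sample is on the correct side
    return w1, w2, b

def classify(weights, x1, x2):
    """Return +1 or -1 depending on the side of the decision boundary."""
    w1, w2, b = weights
    return 1 if w1 * x1 + w2 * x2 + b > 0 else -1

# Hypothetical two-dimensional training data.
negatives = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (2.0, 2.0)]
positives = [(4.0, 4.0), (5.0, 4.0), (4.0, 5.0), (5.0, 5.0)]
samples = negatives + positives
labels = [-1] * len(negatives) + [1] * len(positives)

weights = train_perceptron(samples, labels)
print(classify(weights, 1.5, 1.5))  # -1
print(classify(weights, 4.5, 4.5))  # 1
```

The learned weights define the line w1*X1 + w2*X2 + b = 0, which plays the role of the dashed decision boundary: new points are labeled according to which side of it they fall on.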
In 2012, Vinod Khosla, co-founder of Sun Microsystems, predicted that 80% of medical doctors’ jobs would be lost in the next two decades to automated machine learning medical diagnostic software. (en.wikipedia.org)
OpenAI estimated the hardware compute used in the largest deep learning projects from AlexNet (2012) to AlphaZero (2017), and found a 300,000-fold increase in the amount of compute required, with a doubling-time trendline of 3.4 months. (en.wikipedia.org)