Source: Deep Learning on Medium
Most Used Jargon in Machine Learning
Machine learning and data science are among today's hottest topics, and both fields come with a lot of terminology. Let's learn a few of the most popular terms.
Supervised learning typically begins with an established set of data and a certain understanding of how that data is classified. It is intended to find patterns in the data that can then be applied to new, unseen data. The training data is labeled: each example comes with features that describe it and a label that gives the correct answer.
For example, you can create a machine-learning application that distinguishes between fruits based on their colours.
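To make the fruit example concrete, here is a minimal sketch of a supervised learner. Everything in it is an assumption made for illustration: the RGB colour values are invented, the names `fit` and `predict` are hypothetical, and the method (nearest centroid) is just one simple way to learn from labeled data.

```python
from math import dist

# Hypothetical labeled training data: (red, green, blue) colour -> fruit name.
training_data = [
    ((200, 30, 40), "apple"),
    ((220, 20, 30), "apple"),
    ((250, 220, 50), "banana"),
    ((240, 230, 70), "banana"),
]


def fit(samples):
    """Learn one average colour (centroid) per label."""
    grouped = {}
    for colour, label in samples:
        grouped.setdefault(label, []).append(colour)
    return {
        label: tuple(sum(channel) / len(colours) for channel in zip(*colours))
        for label, colours in grouped.items()
    }


def predict(centroids, colour):
    """Return the label whose learned centroid is closest to the colour."""
    return min(centroids, key=lambda label: dist(centroids[label], colour))


centroids = fit(training_data)
print(predict(centroids, (210, 40, 35)))   # apple
print(predict(centroids, (245, 225, 60)))  # banana
```

The labels are what make this supervised: the program is told which colours belong to which fruit before it is asked to predict anything.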
Unsupervised learning is the training of a machine using information that is neither classified nor labeled, allowing the algorithm to act on that information without guidance. Here the task of the machine is to group unsorted information according to similarities, patterns and differences without any prior training on labeled data.
For example, given an image of many fruits it has never seen before, the machine cannot name them, but it can still group them by similarities such as colour or shape.
Bias is the difference between the average prediction of our model and the correct value which we are trying to predict. A model with high bias pays very little attention to the training data and oversimplifies the problem. This leads to high error on both training and test data, which is known as underfitting.
Variance is the variability of the model's prediction for a given data point, which tells us how much the prediction would change if the model were trained on a different sample of data. A model with high variance pays a lot of attention to the training data and does not generalise to data it hasn't seen before. As a result, such models perform very well on training data but have high error rates on test data. This leads to overfitting.
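The two definitions can be checked with a small simulation. This is only a sketch under invented assumptions: the "correct" relationship `true_f`, the noise level, the query point `x0` and both toy models are made up for illustration. The idea is to train each model on many different random training sets, then measure how far the average prediction is from the truth (bias) and how much the predictions scatter (variance).

```python
import random
from statistics import mean, pvariance

random.seed(0)  # fixed seed so the run is repeatable


def true_f(x):
    return x ** 2  # assumed "correct" relationship, chosen only for illustration


def sample_training_set(n=20, noise=0.3):
    """Draw n noisy (x, y) pairs with x uniform in [0, 1]."""
    xs = [random.random() for _ in range(n)]
    return [(x, true_f(x) + random.gauss(0, noise)) for x in xs]


def constant_model(train, x0):
    """Ignores x entirely and always predicts the average training label."""
    return mean(y for _, y in train)


def nearest_neighbour_model(train, x0):
    """Memorises the training set and copies the label of the closest point."""
    return min(train, key=lambda point: abs(point[0] - x0))[1]


def bias_and_variance(model, x0, trials=500):
    """Retrain on fresh data many times and summarise the predictions at x0."""
    predictions = [model(sample_training_set(), x0) for _ in range(trials)]
    bias = mean(predictions) - true_f(x0)
    variance = pvariance(predictions)
    return bias, variance


x0 = 0.9
bias_const, var_const = bias_and_variance(constant_model, x0)
bias_nn, var_nn = bias_and_variance(nearest_neighbour_model, x0)
print(f"constant model:    bias={bias_const:+.3f}  variance={var_const:.4f}")
print(f"nearest neighbour: bias={bias_nn:+.3f}  variance={var_nn:.4f}")
```

In this setup the constant model should show the larger bias (it oversimplifies), while the nearest-neighbour model should show the larger variance (it copies the noise in whatever data it happened to see).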
Classification assigns data to one of several predefined classes. It is the process of learning a model that describes and distinguishes those predetermined classes. It is a two-step process, comprising a learning step and a classification step. In the learning step, a classification model is constructed from labeled training data; in the classification step, the constructed model is used to predict the class labels for given data. Some common classification algorithms are decision trees, neural networks and logistic regression.
Regression is closely related to classification, but it is used when the target is a continuous numeric value rather than a class. Instead of mapping a tuple of data to one of a fixed set of classes, a regression model predicts the value of a variable from the tuple, for example a price rather than a category.
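As a small illustration of predicting a value instead of a class, here is a sketch of simple linear regression using ordinary least squares. The data (hours studied versus exam score) is invented, and `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / spread
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: hours studied -> exam score.
hours = [1, 2, 3, 4, 5]
scores = [52, 54, 56, 58, 60]

slope, intercept = fit_line(hours, scores)
print(slope, intercept)       # 2.0 50.0
print(slope * 6 + intercept)  # predicted score for 6 hours: 62.0
```

The output is a number on a continuous scale, not one of a fixed set of labels, which is the key difference from classification.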
Clustering is a technique for organising data into groups, called clusters, such that objects inside a cluster are highly similar to each other and objects in different clusters are dissimilar. Some common clustering algorithms are k-means and k-medoids.
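Here is a bare-bones sketch of k-means on a handful of invented 2-D points. To keep the run repeatable, it uses the first k points as starting centroids, which is a simplification for illustration; practical implementations usually pick starting points more carefully.

```python
from math import dist


def k_means(points, k, iterations=10):
    """Minimal k-means; the first k points serve as starting centroids."""
    centroids = list(points[:k])
    assignments = []
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda i: dist(point, centroids[i]))
            for point in points
        ]
        # Update step: move each centroid to the mean of its points.
        for i in range(k):
            members = [p for p, a in zip(points, assignments) if a == i]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return assignments, centroids


# Hypothetical unlabeled data: two visibly separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]

assignments, centroids = k_means(points, k=2)
print(assignments)  # [0, 0, 0, 1, 1, 1]
```

No labels are given anywhere; the grouping comes only from how close the points are to each other.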
Differences Between Classification and Clustering
- Classification is the process of assigning data to classes with the help of labels, whereas clustering is similar to classification but has no predefined class labels.
- Classification is a form of supervised learning, whereas clustering is a form of unsupervised learning.
- Labeled training samples are provided in classification, while in clustering no labeled training data is provided.
Decision trees are among the most popular tools for classification and prediction problems. A decision tree is a flowchart-like tree structure, where each internal node denotes a test on an attribute, each branch represents an outcome of the test, and each leaf node holds a class label.
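The structure described above maps naturally onto nested dictionaries. The tree below is written by hand for illustration, not learned from data, and the attribute names and fruit labels are invented.

```python
# A hand-written (not learned) tree: internal nodes test one attribute,
# branches are the possible outcomes, and leaves are class labels.
tree = {
    "attribute": "colour",
    "branches": {
        "red": "apple",
        "yellow": {
            "attribute": "shape",
            "branches": {"long": "banana", "round": "lemon"},
        },
    },
}


def classify(node, sample):
    """Follow the branch matching each test until a leaf (a string) is reached."""
    while isinstance(node, dict):
        outcome = sample[node["attribute"]]
        node = node["branches"][outcome]
    return node


print(classify(tree, {"colour": "yellow", "shape": "long"}))  # banana
print(classify(tree, {"colour": "red", "shape": "round"}))    # apple
```

A real decision tree algorithm would choose the tests automatically from labeled training data; this sketch only shows how a finished tree is used to classify.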
A neural network is a series of algorithms that endeavours to recognise underlying relationships in a set of data through a process loosely inspired by the way the human brain operates. Neural networks can adapt to changing input, so the network can produce good results without the output criteria having to be redesigned.
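The smallest building block of a neural network is a single artificial neuron. The sketch below trains one such neuron (a perceptron) to reproduce the logical OR function; the task, the learning rate and the function names are assumptions chosen to keep the example tiny.

```python
def step(value):
    """Fire (1) when the weighted sum is positive, otherwise stay off (0)."""
    return 1 if value > 0 else 0


def train_perceptron(samples, epochs=10, learning_rate=1.0):
    """Adjust weights and bias a little every time a prediction is wrong."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            prediction = step(sum(w * x for w, x in zip(weights, inputs)) + bias)
            error = target - prediction
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Truth table for logical OR: inputs -> expected output.
or_samples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

weights, bias = train_perceptron(or_samples)
predictions = [
    step(sum(w * x for w, x in zip(weights, inputs)) + bias)
    for inputs, _ in or_samples
]
print(predictions)  # [0, 1, 1, 1]
```

Nobody writes the rule for OR into the program; the neuron adapts its weights from the examples, which is the sense in which a network adapts to its input.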
Deep learning is a subset of machine learning, within artificial intelligence (AI), that uses neural networks with many layers to process data and find patterns for use in decision making. These networks are capable of learning from data that is unstructured or unlabelled. It is also known as deep neural learning or deep neural networks.
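The "deep" in deep learning refers to stacking several layers of neurons, so that the output of one layer becomes the input of the next. The sketch below is a toy with only two layers and hand-picked weights (real deep networks have many more layers and learn their weights from data). It computes XOR, a function that a single neuron with a step activation cannot represent on its own.

```python
def step(value):
    return 1 if value > 0 else 0


def layer(inputs, weights, biases):
    """One layer: each neuron takes a weighted sum of all inputs, then applies step."""
    return [
        step(sum(w * x for w, x in zip(neuron_weights, inputs)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]


def tiny_network(x1, x2):
    # Hidden layer with hand-picked weights: first neuron acts as OR, second as AND.
    hidden = layer([x1, x2], weights=[[1, 1], [1, 1]], biases=[-0.5, -1.5])
    # Output layer: fires when OR is on but AND is off, which is XOR.
    output = layer(hidden, weights=[[1, -1]], biases=[-0.5])
    return output[0]


results = [tiny_network(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]]
print(results)  # [0, 1, 1, 0]
```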
Is This Enough?
No, there are still many more terms and concepts in machine learning, but the ones above are a few that will help us learn more about machine learning and data science.