
Original article was published on Artificial Intelligence on Medium

What is the difference between Data Analytics, Data Analysis, Data Mining, Data Science, Machine Learning, and Big Data?

It’s quite normal to confuse these terms with each other, but I will try to give a clear explanation.


These terms are closely related, in the same way that “Web”, “Internet” and “HTML” are. Yet, just like those three, they actually refer to quite different fields or practices.

Data Analytics

Data analytics is the automated collection of data. A typical example is the “Google Analytics” service, which provides a JavaScript library for automatically collecting user activity on a website.


In general, the collection is done at an industrial scale, and the cleaning of the data is automatic, across all channels (mobile, web, server, etc.).
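To make the idea of automated collection and cleaning more concrete, here is a minimal sketch in Python. It is not how Google Analytics works internally; the `Event` type, the `clean` and `collect` functions, the field names and the list of channels are all hypothetical names I chose for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed set of channels, taken from the examples in the text.
ALLOWED_CHANNELS = {"mobile", "web", "server"}


@dataclass(frozen=True)
class Event:
    """One cleaned user-activity record (hypothetical schema)."""
    user_id: str
    channel: str
    action: str
    timestamp: float


def clean(raw: dict) -> Optional[Event]:
    """Return a normalized Event, or None if the raw record is unusable."""
    user_id = str(raw.get("user_id", "")).strip()
    channel = str(raw.get("channel", "")).strip().lower()
    action = str(raw.get("action", "")).strip().lower()
    timestamp = raw.get("timestamp")
    if not user_id or not action or channel not in ALLOWED_CHANNELS:
        return None
    if not isinstance(timestamp, (int, float)):
        return None
    return Event(user_id, channel, action, float(timestamp))


def collect(raw_records):
    """Clean every record and drop exact duplicates, keeping arrival order."""
    seen = set()
    events = []
    for raw in raw_records:
        event = clean(raw)
        if event is not None and event not in seen:
            seen.add(event)
            events.append(event)
    return events


raw_records = [
    {"user_id": "u1", "channel": "Web", "action": "page_view", "timestamp": 1.0},
    {"user_id": "u1", "channel": "Web", "action": "page_view", "timestamp": 1.0},  # duplicate
    {"user_id": "u2", "channel": "MOBILE", "action": "click", "timestamp": 2.5},
    {"user_id": "", "channel": "web", "action": "click", "timestamp": 3.0},  # missing user
    {"user_id": "u3", "channel": "fax", "action": "click", "timestamp": 4.0},  # unknown channel
]
events = collect(raw_records)
for event in events:
    print(event)
```

The point is only that nobody touches the records by hand: every channel sends raw events, and the same cleaning rules are applied to all of them.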

Data Analysis

This is the analysis of the data that has been collected. The collected data is presented in some form, digestible or not. Google Analytics, for example, comes with analyses already built in: “Average time spent on the site”, “Average number of pages visited”, etc.
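As a small illustration of this kind of ready-made analysis, the sketch below computes the two averages mentioned above from a handful of sessions. The session records and the `summarize` function are hypothetical; a real tool would read them from the collected data.

```python
from statistics import mean

# Hypothetical sessions: (session_id, seconds_on_site, pages_visited)
sessions = [
    ("s1", 120, 3),
    ("s2", 45, 1),
    ("s3", 300, 7),
    ("s4", 95, 2),
]


def summarize(sessions):
    """Compute the average time on site and average pages per visit."""
    durations = [seconds for _, seconds, _ in sessions]
    pages = [page_count for _, _, page_count in sessions]
    return {
        "average_time_on_site_s": mean(durations),
        "average_pages_per_visit": mean(pages),
    }


summary = summarize(sessions)
print(summary)
```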


Data Mining

This is the exploration of the data in search of something new. We take the data that comes from the automated collection, cross-reference it with other data, and look for a pattern or a correlation in it using standard methods such as regression.

For example: what is the correlation factor for 18–25 year olds between Pull & Bear and Zara?


A typical finding would be: there is a trend showing that type X products sell strongly when type Y events take place, especially during period N.
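Here is a minimal sketch of what looking for such a correlation can mean in practice, using the Pearson correlation coefficient and a least-squares line. The weekly figures are made up, and `event_counts`, `units_sold`, `pearson` and `linear_fit` are hypothetical names for this example only.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def linear_fit(xs, ys):
    """Least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var_x
    return slope, mean_y - slope * mean_x


# Made-up weekly data: number of type Y events, units of type X products sold.
event_counts = [0, 1, 2, 3, 4, 5]
units_sold = [10, 14, 19, 22, 27, 31]

r = pearson(event_counts, units_sold)
slope, intercept = linear_fit(event_counts, units_sold)
print(f"correlation: {r:.3f}")
print(f"units_sold = {slope:.2f} * event_counts + {intercept:.2f}")
```

A coefficient close to 1 says the two series move together; it does not by itself say that the events cause the sales.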

Data Science

Data Science is the field that brings together all the data disciplines, including Data Analysis, Data Analytics, and Data Mining, among others.


By comparison, addition and subtraction are part of a larger set called “Arithmetic”. This set is itself included in a larger set called “Mathematics”.

Machine Learning

It is a technique that involves giving data to a model, for example a neural network, so that it can “learn” patterns in the data automatically and produce a response accordingly.


For example, I give 1 million rows in Excel format to my neural network: 500,000 rows that describe black mice and 500,000 that describe white mice. Each row contains the weight, the size, the number of whiskers, etc.

The engine will automatically detect the factors that distinguish white mice from black ones (weight, size, etc.).

I can now ask the machine to guess whether a mouse is white or black by entering the same kind of parameters as those used to train it.
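The mouse example can be sketched in a few lines of Python. To keep it self-contained, I use a single artificial neuron (which is just logistic regression) instead of a full neural network, 500 generated rows instead of 1 million, and only two features. The numbers used to generate the mice are invented, and `make_mice` and `guess_colour` are hypothetical helper names.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible


def make_mice(n, colour):
    """Generate n invented (weight_g, length_cm) rows for one colour."""
    mean_w, mean_l = (20.0, 8.0) if colour == "white" else (26.0, 9.5)
    return [((rng.gauss(mean_w, 2.0), rng.gauss(mean_l, 0.5)), colour)
            for _ in range(n)]


rows = make_mice(250, "white") + make_mice(250, "black")
rng.shuffle(rows)
train, held_out = rows[:400], rows[400:]


def column_stats(data, i):
    """Mean and standard deviation of feature i over the given rows."""
    values = [features[i] for features, _ in data]
    mu = sum(values) / len(values)
    sd = math.sqrt(sum((v - mu) ** 2 for v in values) / len(values))
    return mu, sd


# Standardize features using training statistics only.
stats = [column_stats(train, i) for i in range(2)]


def scale(features):
    return [(v - mu) / sd for v, (mu, sd) in zip(features, stats)]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(weights, bias, features):
    """Probability that the mouse is black, according to the neuron."""
    xs = scale(features)
    return sigmoid(bias + sum(w * x for w, x in zip(weights, xs)))


# Training: plain batch gradient descent on the cross-entropy loss.
weights, bias, learning_rate = [0.0, 0.0], 0.0, 0.5
for _ in range(200):
    grad_w, grad_b = [0.0, 0.0], 0.0
    for features, colour in train:
        target = 1.0 if colour == "black" else 0.0
        error = predict_proba(weights, bias, features) - target
        for i, x in enumerate(scale(features)):
            grad_w[i] += error * x
        grad_b += error
    weights = [w - learning_rate * g / len(train) for w, g in zip(weights, grad_w)]
    bias -= learning_rate * grad_b / len(train)


def guess_colour(weight_g, length_cm):
    """Ask the trained neuron whether a mouse is white or black."""
    p_black = predict_proba(weights, bias, (weight_g, length_cm))
    return "black" if p_black >= 0.5 else "white"


accuracy = sum(guess_colour(*f) == c for f, c in held_out) / len(held_out)
print(f"held-out accuracy: {accuracy:.2f}")
print(guess_colour(19.0, 7.8))
```

Notice that the code never states a rule such as “black mice are heavier”. The weights of the neuron are adjusted from the rows alone, and the accuracy is measured on 100 mice that were not used for training.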

Big Data

This is the name given to the issues surrounding the processing of large volumes of data. It largely revolves around the Hadoop ecosystem, which is built on a distributed file system (HDFS).


When we talk about Big Data, we mean really huge volumes: at least 30 GB per day, which adds up to roughly 11 TB per year.

Having a database of 120 million rows is not big data!

All of these areas are extremely interrelated.
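For the volumes quoted in the Big Data section, the arithmetic can be checked quickly. The 100 bytes per row used for the 120 million row database is purely my assumption for illustration; real rows can be much smaller or larger.

```python
GB_PER_DAY = 30
DAYS_PER_YEAR = 365
GB_PER_TB = 1000  # decimal units; use 1024 for binary units

gb_per_year = GB_PER_DAY * DAYS_PER_YEAR
tb_per_year = gb_per_year / GB_PER_TB
print(f"{GB_PER_DAY} GB/day is about {tb_per_year:.2f} TB/year")

# Assumption for illustration only: 100 bytes per row.
ROWS = 120_000_000
ASSUMED_BYTES_PER_ROW = 100
table_gb = ROWS * ASSUMED_BYTES_PER_ROW / 1e9
print(f"{ROWS:,} rows at {ASSUMED_BYTES_PER_ROW} bytes/row is about {table_gb:.0f} GB")
```

Under that assumption, the whole 120 million row database weighs less than what a Big Data system at the 30 GB threshold receives in a single day.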


This is based on my personal research. If you have any comments, please reach out to me.

Welcome to my Medium page

Github, LinkedIn, Zahra Elhamraoui, Upwork