Original article was published by SHAIK SAMEERUDDIN on Artificial Intelligence on Medium
Supervised learning has driven tremendous progress in AI over the past decade, from autonomous vehicles to voice assistants, but it has significant limitations.
The process of manually labeling thousands or millions of data points can be enormously expensive and cumbersome. The fact that humans must label data by hand before machine learning models can consume it has become a major bottleneck in AI.
At a deeper level, supervised learning represents a narrow and circumscribed form of learning. Rather than exploring and absorbing all the information, relationships, and implications latent in a given dataset, supervised algorithms orient themselves only toward the concepts and categories that researchers have identified in advance.
By contrast, unsupervised learning is an approach to AI in which algorithms learn from data without human-provided labels or guidance.
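To make the idea concrete, here is a minimal sketch of one classic unsupervised method, k-means clustering, in one dimension. The data, the cluster count, and the initialization heuristic are all illustrative choices, not anything from the article; the point is that the algorithm discovers structure in unlabeled numbers without a human ever marking which point belongs to which group.

```python
# A toy illustration of unsupervised learning: 1-D k-means clustering.
# No labels are provided -- the algorithm finds the groups on its own.

def kmeans_1d(points, k=2, iters=20):
    # Initialize centroids with evenly spaced sorted points (a simple heuristic).
    centroids = sorted(points)[:: max(1, len(points) // k)][:k]
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its cluster.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return sorted(centroids)

data = [1.0, 1.2, 0.8, 9.9, 10.1, 10.0]  # two obvious groups, no labels given
centroids = kmeans_1d(data)
print(centroids)  # two centroids, near 1.0 and 10.0
```

Nothing in the input says "group A" or "group B"; the grouping emerges purely from the data's own structure, which is the essence of the paradigm.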
Many AI leaders see unsupervised learning as the next great frontier in artificial intelligence. In the words of AI legend Yann LeCun: “The next AI revolution will not be supervised.” UC Berkeley professor Jitendra Malik put it even more colorfully: “Labels are the opium of the machine learning researcher.”
Unsupervised learning more closely mirrors how humans learn about the world: through open-ended exploration and inference, without the “training wheels” of supervised learning. One of its fundamental advantages is that there will always be far more unlabeled data than labeled data in the world (and the former is much easier to come by).
In the words of LeCun, who prefers the closely related term “self-supervised learning”: “In self-supervised learning, a portion of the input is used as a supervisory signal to predict the remaining portion of the input…. More knowledge of the structure of the world can be acquired through self-supervised learning than from [other AI paradigms], because the data is unlimited and the amount of feedback provided by each example is huge.”
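LeCun's description can be sketched in a few lines of code. The toy model below is a hypothetical illustration, not any real system: it treats the next character of raw text as the "remaining portion of the input" to be predicted, so the text supplies both the inputs and the supervisory signal, with no human labeling at any point.

```python
# A toy sketch of self-supervision: the data labels itself.
# For every character, the character that follows it serves as the
# training target -- raw text provides both input and "label".
from collections import Counter, defaultdict

def train_next_char(text):
    # Count, for each character, which characters follow it and how often.
    counts = defaultdict(Counter)
    for cur, nxt in zip(text, text[1:]):
        counts[cur][nxt] += 1
    return counts

def predict(counts, ch):
    # Predict the most frequent successor observed during training.
    return counts[ch].most_common(1)[0][0]

model = train_next_char("the cat sat on the mat, the cat sat")
print(predict(model, "h"))  # 'e' -- learned from the raw text alone
```

Modern self-supervised language models do essentially this at vastly greater scale, predicting masked or upcoming tokens from billions of sentences.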
Unsupervised learning has also had a transformative effect on natural language processing. NLP has seen tremendous progress recently thanks to a new unsupervised learning architecture known as the Transformer, which originated at Google about three years ago. (For more on Transformers, see #3 below.)
Efforts to extend unsupervised learning to other areas of AI remain at an earlier stage, but rapid progress is being made. To take one example, a startup named Helm.ai is seeking to use unsupervised learning to leapfrog the leaders in the autonomous vehicle industry.
Many researchers see unsupervised learning as the key to developing human-level AI. According to LeCun, “the biggest challenge of the next few years in ML and AI is mastering unsupervised learning.”
2. Federated Learning
Data privacy is one of the overarching problems of the modern age. Since data is the lifeblood of modern artificial intelligence, problems with data privacy play an essential (and sometimes limiting) role in the trajectory of AI.
Privacy-preserving artificial intelligence, meaning methods that enable AI models to learn from datasets without compromising their privacy, is thus becoming an increasingly important pursuit. Perhaps the most promising path to privacy-preserving AI is federated learning.
In early 2017, researchers at Google first proposed the notion of federated learning. Interest in federated learning has exploded over the past year: in the first six months of 2020, over 1,000 research papers on federated learning were released, compared to just 180 in all of 2018.
The traditional approach to building machine learning models today is to gather all the training data in one place, often in the cloud, and then to train the model on that data. But this approach is not workable for much of the world’s data, which for privacy and security reasons cannot be moved to a central repository. That data remains off-limits to traditional AI techniques.
Federated learning solves this problem by flipping the conventional approach to AI on its head.
Rather than requiring one unified dataset to train a model, federated learning leaves the data where it is, distributed across numerous devices and servers on the edge. Instead, many versions of the model are sent out, one to each device with training data, and trained locally on each subset of data.
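The round-trip described above can be sketched as a toy federated-averaging loop. Everything here is a hypothetical simplification (a one-parameter model fitting y = w·x, plain gradient descent, a simple average of returned weights); the point is the shape of the protocol: the server ships the model out, each device trains on its own private data, and only weights, never raw data, come back.

```python
# A toy sketch of the federated learning round described above.
# Hypothetical one-parameter model: fit the slope w in y = w * x.

def local_train(w, data, lr=0.01, epochs=200):
    # Plain gradient descent on this device's private (x, y) pairs.
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

def federated_round(global_w, device_datasets):
    # Server sends the current model to every device; each trains
    # locally; the server averages the weights that come back.
    local_ws = [local_train(global_w, d) for d in device_datasets]
    return sum(local_ws) / len(local_ws)

# Three devices, each holding private data generated from y = 3x.
# The raw (x, y) pairs never leave their device in this loop.
devices = [[(1.0, 3.0), (2.0, 6.0)], [(0.5, 1.5)], [(4.0, 12.0)]]
w = 0.0
for _ in range(5):
    w = federated_round(w, devices)
print(round(w, 2))  # converges to the true slope, 3.0
```

Real systems (such as Google's original proposal) add weighted averaging by dataset size, secure aggregation, and compression, but the data-stays-local structure is the same.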