Google Brain can Sum up a Compendium of Information to Perfection

Source: Deep Learning on Medium

Google Brain’s AI achieves state-of-the-art text summarization performance

With Pegasus, a powerful new AI model, Google Brain surpasses the results of all previous approaches.

Google Brain, the deep learning research project led by Google, has developed an artificial intelligence model capable of summarizing a body of text with remarkable precision and fluency. Google promises that automatic summarization systems will save us significant time on some of our business tasks.

Google Brain surpasses all text summarization systems

Google Brain, collaborating with researchers at Imperial College London, built an artificial intelligence system called “Pre-training with Extracted Gap-sentences for Abstractive Summarization Sequence-to-sequence”, or Pegasus for short. The model can summarize texts from very different domains such as science, news stories, e-mails, patents, and even legislative bills. According to Google, Pegasus shows “surprising” performance on low-resource summarization, surpassing the results of previous experiments.

The researchers say Pegasus can generate precise, concise summaries of exceptional quality. Most current techniques are extractive: they select fragments of the source texts and stitch them together into a summary. Google Brain goes much further, because its model is abstractive: it can generate new words and connect different parts of a document to produce a linguistically fluent summary, going well beyond copying and pasting from the source.
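To make the contrast concrete, the extractive approach can be sketched in a few lines of Python. This is a deliberately naive illustration: sentences are scored by the frequency of their words across the document, which is not the method of any particular production system.

```python
import re
from collections import Counter

def extractive_summary(text, n_sentences=1):
    """Toy extractive summarizer: score each sentence by how often
    its words appear across the whole document, then return the
    top-scoring sentences in their original order."""
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    freq = Counter(re.findall(r'\w+', text.lower()))
    # rank sentence indices by total word frequency, highest first
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: -sum(freq[w] for w in re.findall(r'\w+', sentences[i].lower())),
    )
    keep = sorted(ranked[:n_sentences])  # restore document order
    return ' '.join(sentences[i] for i in keep)

doc = ("Pegasus is a summarization model. "
       "The model was trained on web text. "
       "Cats are unrelated to summarization.")
print(extractive_summary(doc))  # → The model was trained on web text.
```

An abstractive system like Pegasus, by contrast, is free to write sentences that appear nowhere in the source, which is what makes its summaries read fluently.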

Pegasus: a complex model

The Google Brain team trained their model on a deliberately difficult task to make it develop new skills: in the source texts, whole sentences, chosen as the most important ones, were hidden. To solve this problem, Pegasus had to fill in the gaps by regenerating the missing sentences from the information left in the rest of the document.
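This gap-sentence pretraining objective can be sketched as follows. In this toy version, a sentence's “importance” is approximated by its length, whereas the actual Pegasus work selects sentences with a ROUGE-based importance score; the mask token name is also illustrative.

```python
import re

MASK = "<mask_1>"

def gap_sentence_mask(document, ratio=0.3):
    """Sketch of gap-sentence masking: hide a fraction of the
    sentences in the document (the model's input) and return the
    hidden sentences as the target the model must regenerate.
    Importance is crudely approximated by sentence length here."""
    sentences = re.split(r'(?<=[.!?])\s+', document.strip())
    n_gaps = max(1, int(len(sentences) * ratio))
    # pick the longest sentences as the "important" ones to hide
    hidden = set(sorted(range(len(sentences)),
                        key=lambda i: -len(sentences[i]))[:n_gaps])
    inputs = ' '.join(MASK if i in hidden else s
                      for i, s in enumerate(sentences))
    targets = ' '.join(sentences[i] for i in sorted(hidden))
    return inputs, targets

inp, tgt = gap_sentence_mask(
    "Short one. This is the much longer important sentence here. End.")
print(inp)  # → Short one. <mask_1> End.
print(tgt)  # → This is the much longer important sentence here.
```

Training a sequence-to-sequence model to map `inp` back to `tgt` forces it to practice exactly the skill summarization needs: producing whole, fluent sentences that capture what the surrounding text implies.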

Pegasus is a particularly complex model: it has 568 million parameters. Its training drew on 750 GB of text extracted from 350 million web pages, plus a further 3.8 TB of news articles collected from the web. Today Pegasus has reached a high level of summarization, both in fluency and consistency, and the researchers know it: the prospects for development are enormous.

In this way, the artificial intelligence learned to complete the missing sentences and make the necessary connections between parts of a text. The task is reminiscent of another Google Brain experiment: in 2017, researchers presented an AI system capable of automatically completing a drawing, with amazing skill, after “digesting” millions of user-generated sketches.