Deep Learning-Driven Text Summarization & Explainability with Reuters News Data

Original article was published by ODSC – Open Data Science on Artificial Intelligence on Medium

Image credit: REUTERS/Dominic Ebenbichler

Editor’s note: At ODSC West 2020, Nadja Herger, Nina Hristozova, and Viktoriia Samatova will hold a workshop focused on text summarization that will show you how to automatically generate news headlines powered by Reuters News, and introduce the power of transfer learning and explainable AI.

Natural Language Processing (NLP) is one of the fastest-moving fields within AI and it encompasses a wide range of tasks, such as text classification, question-answering, translation, topic modeling, sentiment analysis, and summarization. Here, we focus on text summarization, which is a powerful and challenging application of NLP.

Summarization & Transfer Learning

When discussing summarization, an important distinction to make is between extractive and abstractive summarization. Extractive summarization refers to the process of extracting words and phrases from the text itself to create a summary. Abstractive summarization more closely resembles the way humans write summaries [link]. The key information of the original text is preserved using semantically consistent words and phrases that need not appear verbatim in the source. Due to its complexity, abstractive summarization relies on advances in Deep Learning to be successful [source].
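To make the extractive approach concrete, here is a toy frequency-based sentence scorer (a generic illustration, not the model discussed in this article): each sentence is scored by the average corpus frequency of its words, and the top-scoring sentences are returned in their original order.

```python
import re
from collections import Counter

def extractive_summary(text: str, n_sentences: int = 1) -> str:
    """Toy extractive summarizer: score each sentence by the average
    frequency of its words across the whole text, then return the
    top-scoring sentences in their original order."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence: str) -> float:
        tokens = re.findall(r"[a-z']+", sentence.lower())
        return sum(freq[t] for t in tokens) / (len(tokens) or 1)

    top = set(sorted(sentences, key=score, reverse=True)[:n_sentences])
    # Preserve the original sentence order in the output.
    return " ".join(s for s in sentences if s in top)

text = ("The cat sat on the mat. Dogs bark. "
        "The cat likes the mat and the sun.")
summary = extractive_summary(text, n_sentences=1)
```

Because the output is copied verbatim from the input, an extractive summary can never contain a novel phrase, which is exactly what distinguishes it from the abstractive setting described above.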

Here, we investigate the automatic generation of headlines from English news articles across all content categories based on the Reuters News Archive, which is professionally produced by journalists and strictly follows rules of integrity, independence and freedom from bias [source]. The headlines themselves are considered fairly abstractive, with over 70% of bigrams and over 90% of trigrams being novel.

We see a trend towards pre-training Deep Learning models on a large text corpus and fine-tuning them for a specific downstream task (also known as transfer learning) [source]. This has the advantage of reduced training time, as well as needing less training data to achieve satisfactory results. Due to the democratization of AI, we observe a leveling of the playing field where everyone can get hold of these models and adapt them for their use cases. We fine-tuned a state-of-the-art summarization model on Reuters news data; the fine-tuned model significantly outperformed the base model. An example of a tokenized, unformatted article text and associated machine-generated headline is shown below. The original article text was published by Reuters in October 2019 [link].


Do you trust this automatically generated news headline? Researchers commonly rely on the ROUGE score to evaluate the model’s performance for such a task [source]. In its most basic form, it essentially measures the overlap of n-grams between the machine-generated and human-written summaries. If I told you that the model has a ROUGE score of around 45 on the hold-out set, is that sufficient for you to trust the prediction on a previously unseen article text?
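In its simplest form, ROUGE-N recall is the fraction of the reference summary's n-grams that also appear in the candidate, with counts clipped so a repeated n-gram cannot be credited more often than it occurs. A minimal sketch (evaluation toolkits such as the `rouge-score` package add stemming and F-measure variants on top of this):

```python
from collections import Counter

def rouge_n(reference: str, candidate: str, n: int = 1) -> float:
    """ROUGE-N recall: fraction of the reference's n-grams that also
    appear in the candidate, with clipped counts."""
    def ngrams(text: str) -> Counter:
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n])
                       for i in range(len(tokens) - n + 1))

    ref, cand = ngrams(reference), ngrams(candidate)
    if not ref:
        return 0.0
    # Clip each overlap count at the candidate's count for that n-gram.
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / sum(ref.values())

reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"
r1 = rouge_n(reference, candidate, n=1)  # 5 of 6 unigrams overlap
r2 = rouge_n(reference, candidate, n=2)  # 3 of 5 bigrams overlap
```

Note that a single substituted word costs one unigram but two bigrams here, which is why ROUGE-2 is usually the stricter of the two metrics.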

How can we increase trust in what the model generated? The move towards more complex models for NLP tasks makes the need for explainable AI more apparent. Explainable AI is an umbrella term for a range of techniques, algorithms, and methods, which accompany outputs from AI systems with explanations [source]. As such, it addresses the often undesired black-box nature of many AI systems, and subsequently allows users to understand, trust, and manage AI solutions. The desired level of explainability depends on the end user [source]. Here, we are interested in making the model output explainable to a potential reviewer rather than, for example, an AI system builder, who would have different expectations in terms of technical details.

Let us take a look at how adding an explainability feature can support us in our task of verifying whether the machine-generated headline is factually accurate. In addition to just generating the headline, we can gain insights into the most relevant parts of the news article. The illustration below builds upon the example shared earlier by adding highlights to the Reuters article text.

The darker the highlights, the more important a given word is for the resulting headline text. This makes it significantly easier to verify the headline itself. The first sentence in particular seems to have the largest impact on the generated headline. Interestingly, the article text refers to “until early next year” rather than “until early 2019”. The year “2019” never occurs in the article text. For this particular example, we actually have access to the human-written headline; see the screenshot of the original article from the Reuters site below.

Knowing that the article was published in 2019, it is evident that “early next year” refers to the year 2020 rather than 2019, and thus renders the machine-generated headline partially inaccurate. We believe that verifying machine-generated headlines with an extra layer of explainability leads to increased trust and easier detection of biases or mistakes.
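One simple, model-agnostic way to produce word-level highlights like these is occlusion: remove each word in turn and measure how much the model's output score drops. This is a generic sketch of the idea, not necessarily the attribution method used in the workshop; the headline set and `toy_score` function below are stand-ins for a real summarization model.

```python
def occlusion_importance(words, score_fn):
    """Model-agnostic occlusion: remove each word in turn and record
    how much the score drops. Larger drops mean more important words."""
    base = score_fn(words)
    importances = []
    for i in range(len(words)):
        perturbed = words[:i] + words[i + 1:]
        importances.append(base - score_fn(perturbed))
    return importances

# Hypothetical headline content words (a stand-in for a real model's output).
headline_words = {"oil", "deal", "delayed"}

def toy_score(words):
    # Stand-in scoring function: fraction of headline content words
    # still recoverable from the (possibly occluded) article text.
    return len(headline_words & set(words)) / len(headline_words)

article = "the oil pipeline deal was delayed again".split()
scores = occlusion_importance(article, toy_score)
```

Mapping each score back onto its word gives exactly the kind of heat map shown above: content words whose removal changes the output get dark highlights, while filler words score near zero.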

For more details on text summarization and Reuters news data, the power of transfer learning, as well as adding explainability for increased trust, join us for our hands-on workshop at ODSC West in October. You will walk away with an interactive notebook to get a head start in applying these concepts to your own challenges!