Source: Deep Learning on Medium
This week, Deepnews.ai is releasing its first public demonstrator, a newsletter called Deepnews Digest. (This is our Progress Report #4)
Deepnews Digest is a weekly newsletter aimed at showing the capabilities of the scoring/ranking algorithm we have been developing for 18 months.
The principle is straightforward: we pick a topic that is in the news and scan a selection of 50 sources ranging from large publications to small, specialized outlets. We collect a few hundred articles that are run through our home-brewed scoring algorithm. The model returns a spreadsheet that contains the URL of the story, the headline, the source, the word count of the article, and the score. Then, our editor, Christopher Brennan, manually removes any “noise,” typically false positives that are off-topic. He will also check for oddities, like a 3,000-word piece scoring only 1.8 (it is usually a large multi-topic news wrap-up), or a 500-word article scoring a 4.1 (it could be a well-angled short piece from Quartz or Axios; it took us months to remove the misleading correlation between length and quality…). Finally, he writes a short text introducing the topic of the week and, after a few checks from the team, we hit the “send” button.
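To make the editor's "oddity" check more concrete, here is a minimal sketch in Python. It is not our production code: the `ScoredArticle` type, the `flag_oddities` function and every threshold are hypothetical names and values chosen for illustration, loosely based on the two examples above (a long piece with a low score, a short piece with a high score).

```python
from dataclasses import dataclass


@dataclass
class ScoredArticle:
    """One row of the spreadsheet described above (field names are assumptions)."""
    url: str
    headline: str
    source: str
    word_count: int
    score: float


def flag_oddities(rows, long_words=2500, low_score=2.0,
                  short_words=600, high_score=4.0):
    """Return (article, reason) pairs that deserve a manual look.

    All four thresholds are illustrative assumptions, not values
    used by the Deepnews Scoring Model.
    """
    flagged = []
    for row in rows:
        if row.word_count >= long_words and row.score <= low_score:
            flagged.append((row, "long but low score"))
        elif row.word_count <= short_words and row.score >= high_score:
            flagged.append((row, "short but high score"))
    return flagged
```

With these assumed thresholds, a 3,000-word piece scoring 1.8 and a 500-word piece scoring 4.1 would both be flagged for review, while a 1,200-word piece scoring 3.2 would pass through untouched.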
The product looks like this:
The newsletter is purposely stripped of anything superfluous. It boils down to a list of clickable headlines beneath a mention of the news topic that inspired them.
We do not provide the score and do not intend to do so. The reason is twofold:
- We don’t want to appear as the arbiter of journalistic quality.
- The actual score is for internal purposes only, either for us or our clients. It is a measure that is used for classification and analysis.
At first, we wanted to produce an editorialized newsletter, with a small list of links and our comments. But while doing some tests, we were amazed by the ability of our system to spotlight a large number of good stories. Then we wondered: why not lean towards exhaustiveness and provide a large selection of stories?
We set up the following rules:
- Maximum relevance of the articles with respect to the selected subject (that’s the purpose of the algorithm).
- Specific stories/unique angle. By this, I mean no endless duplicates of articles saying the same thing — which is the biggest hassle of most aggregators.
The first versions of the newsletter (archived here) had 100 links. That turned out to be too many, according to a survey of a few dozen alpha testers. We are now down to fifty.
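As a rough illustration of the "no duplicates, fifty links" rules, the sketch below ranks articles by score, skips headlines that are near-copies of one already kept, and stops at the cap. It is only a sketch under stated assumptions: comparing headlines with Python's standard `difflib.SequenceMatcher` and the 0.8 similarity threshold are stand-ins picked for the example, not a description of how Deepnews actually detects duplicates. The `select_links` and `normalize` names are hypothetical.

```python
from difflib import SequenceMatcher


def normalize(headline):
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(headline.lower().split())


def select_links(articles, max_links=50, similarity_threshold=0.8):
    """Pick the best-scored, non-duplicate headlines.

    articles: list of (headline, score) tuples.
    The similarity threshold is an illustrative assumption.
    """
    ranked = sorted(articles, key=lambda a: a[1], reverse=True)
    kept = []
    for headline, score in ranked:
        if len(kept) >= max_links:
            break
        norm = normalize(headline)
        is_duplicate = any(
            SequenceMatcher(None, norm, normalize(kept_headline)).ratio()
            >= similarity_threshold
            for kept_headline, _ in kept
        )
        if not is_duplicate:
            kept.append((headline, score))
    return kept
```

Because the list is sorted by score first, the highest-scored version of a repeated story is the one that survives. Headline similarity is a crude proxy; two articles can share a headline and differ in substance, which is one reason a human editor still reviews the list.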
What we measure
The Deepnews Scoring Model (DSM, as we call it internally) is built on the detection of syntax and structure patterns that are associated with quality journalism. We fed the algorithm hundreds of thousands of articles, which the model uses as a reference to assess incoming stories.
The Technology behind the DSM
We built about 55 versions of the deep learning model underlying the DSM. It is based on a convolutional neural network. ConvNets are mostly used for image recognition, but we repurposed the architecture to fit our goal. In our Deepnews Progress Report #3 from February 25, I gave some details about the structure of our deep learning model. In the coming month, with our lead engineer, Victor d’Herbemont, we will release the methodology, but not the code. Right now, the model is nearly impossible to reverse engineer (even for us), and we intend to keep it as tamper-proof as possible.
What’s next with Deepnews Digest
First, we want to further refine our ability to retrieve and process stories on a wide variety of topics and to provide reliable scoring on a consistent basis. The Deepnews Scoring Model yields satisfactory results under certain circumstances. For instance, due to the way the model was trained, it works well on business, societal and political stories, but not so well on sports articles.
We will fine-tune the newsletter by polling our beta testers to see how the concept can be improved and scaled, for example, to produce a series of bespoke newsletters on any topic of interest.
A dedicated version of the Deepnews-powered listing will also be included in a revamped version of this Monday Note scheduled for the Fall.
Stay tuned. This is just the beginning.
➜ In the meantime go to Deepnews.ai
AND SUBSCRIBE TO THE DEEPNEWS DIGEST
PS: next week at the GEN Summit in Athens, I’ll be speaking about Deepnews.ai. Contact me if you want to meet up.