NLP News Cypher | 05.31.20

Original article was published on Artificial Intelligence on Medium

GPT-3, the Sureshot

OpenAI has swag. Why? Because they have yet to publicly announce GPT-3 and it’s already on 🔥🔥. Even without a formal announcement, the AI community took notice on Thursday night when the planet’s largest language model was introduced to the world.

Well, only the paper was released; the model is still parked in Greg Brockman’s garage.

At the extreme end, the model is modest in size 😁: it’s ONLY 175 billion parameters 🤪, 10X more than any previous non-sparse language model. If you are thinking of using this model, if it ever comes out, think again. Here’s a thread from GPT-3’s repo on the challenges developers will face with the largest-sized model:

How does it perform?

On some tasks like COPA and ReCoRD, GPT-3 achieves nearly SOTA performance in the zero-shot setting (meaning it wasn’t fine-tuned for downstream tasks; it’s how the model performs out of the box).
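Zero-, one-, and few-shot are just different prompt formats fed to the same frozen model: a task description, k solved examples (k = 0 for zero-shot), and the unsolved query. A minimal sketch of what those prompts look like (the translation task and examples here are made up for illustration):

```python
# Build zero-/one-/few-shot prompts for a frozen language model.
# The task ("English -> French") and the examples are illustrative only.

def make_prompt(description, examples, query):
    """Format a GPT-3-style prompt: task description, k solved
    examples (k=0 for zero-shot), then the unsolved query."""
    lines = [description]
    for source, target in examples:
        lines.append(f"{source} => {target}")
    lines.append(f"{query} =>")
    return "\n".join(lines)

examples = [("sea otter", "loutre de mer"), ("cheese", "fromage")]

zero_shot = make_prompt("Translate English to French:", [], "mint")
few_shot = make_prompt("Translate English to French:", examples, "mint")

print(zero_shot)
print(few_shot)
```

No gradient updates anywhere: the “learning” happens entirely in the prompt the model conditions on.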

On its limitations:

“GPT-3 appears to be weak in the few-shot or one-shot setting at some tasks that involve comparing two sentences or snippets, for example, whether a word is used the same way in two sentences (WiC), whether one sentence is a paraphrase of another, or whether one sentence implies another,”



Zero-Shot from 🤗

Hugging Face didn’t waste any time: the following day, they released their zero-shot demo for you to play with:

Hey, remember when a 1.5B param model was “too dangerous” for release? 😂
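The trick behind NLI-based zero-shot classification: turn each candidate label into a hypothesis (“this text is about &lt;label&gt;”) and rank labels by how strongly the input text entails it. Here’s a toy sketch of that ranking scheme; the word-overlap scorer below is a deliberately crude stand-in for the real NLI model the demo uses:

```python
# Toy sketch of NLI-based zero-shot classification: each candidate label
# becomes a hypothesis scored against the input text. A crude word-overlap
# function stands in for the actual NLI model's entailment probability.

def entailment_score(premise, hypothesis):
    """Stand-in for an NLI model's entailment score (word overlap)."""
    p = set(premise.lower().split())
    h = set(hypothesis.lower().split())
    return len(p & h) / len(h)

def zero_shot_classify(text, labels):
    """Rank candidate labels by how well the text 'entails' them."""
    scored = {
        label: entailment_score(text, f"this text is about {label}")
        for label in labels
    }
    return max(scored, key=scored.get), scored

text = "I love cooking pasta and baking bread."
best, scores = zero_shot_classify(text, ["cooking", "sports", "politics"])
print(best)
```

The label set is supplied at inference time, so no fine-tuning is needed — that’s what makes it “zero-shot.”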

Dev Overflow

Stack Overflow came out with their annual developer survey. Here are a few highlights:

Top 3 ‘Loved’ Languages: Rust, TypeScript, & Python

Most Wanted Platform: Docker

Most Loved Platform: Linux (duh)

What Do Developers Do When They Are Stuck? 😂 11% panic 😂:

Full Survey Results:

DeepPavlov Update

Framework update from the great developers at DeepPavlov:

They have a new knowledge base QA model that can answer these question types:

Complex questions with numerical values:

  • “What position did Angela Merkel hold on November 10, 1994?”

Complex questions where the answer is a number or date:

  • “When did Jean-Paul Sartre move to Le Havre?”

Questions that count answer entities:

  • “How many sponsors are for Juventus F.C.?”

Questions that order answer entities by some parameter, ascending or descending:

  • “Which country has highest individual tax rate?”

Simple questions:

  • “What is crew member Yuri Gagarin’s Vostok?”
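Under the hood, KBQA systems typically translate questions like these into knowledge-graph queries. As a rough illustration (not DeepPavlov’s actual output — the Wikidata entity/property IDs below are assumptions), the counting question might be rendered as a SPARQL query like this:

```python
# Illustrative only: how a KBQA system might render
# "How many sponsors are for Juventus F.C.?" as SPARQL over Wikidata.
# Q1422 (Juventus F.C.) and P859 (sponsor) are assumed IDs.

def count_query(entity_id, property_id):
    """Build a SPARQL COUNT query over (entity, property, ?x) triples."""
    return (
        "SELECT (COUNT(?x) AS ?n) WHERE { "
        f"wd:{entity_id} wdt:{property_id} ?x . "
        "}"
    )

query = count_query("Q1422", "P859")
print(query)
```

The hard part — which DeepPavlov’s model handles — is mapping the free-form question to the right entity, property, and query template in the first place.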

NLP Resources

Thanks to Mr. Vollet, a new educational NLP site is up with various resources for you to grapple with! It covers everything from beginner to advanced difficulty across various mediums: articles, books, tutorials, notebooks, and even YouTube videos. You may find a few familiar resources that I’ve previously discussed 🧐😁.

Emoji Automata

In this matplotlib blog, they show how you can create emoji-art from images using Python. Yay! (Code included)

What is emoji art? 👇 Her face is a bunch of emojis.
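The idea is simple: downscale the image to a grid and swap each cell for an emoji that matches its brightness or color. A stdlib-only sketch of that mapping step (the 4×4 grayscale grid below stands in for a real downscaled image; the blog post does the full image-loading version with matplotlib):

```python
# Map a grid of grayscale values (0-255) to emoji "pixels".
# The hand-written grid stands in for a downscaled real image.

PALETTE = [(64, "⬛"), (128, "🟫"), (192, "🟨"), (256, "⬜")]  # dark -> light

def pixel_to_emoji(value):
    """Pick the first palette emoji whose threshold exceeds the value."""
    for threshold, emoji in PALETTE:
        if value < threshold:
            return emoji
    return PALETTE[-1][1]

def emoji_art(grid):
    """Render a 2-D grid of grayscale values as rows of emojis."""
    return "\n".join("".join(pixel_to_emoji(v) for v in row) for row in grid)

grid = [
    [10, 10, 240, 240],
    [10, 100, 170, 240],
    [240, 170, 100, 10],
    [240, 240, 10, 10],
]
print(emoji_art(grid))
```

Swap the grayscale palette for a nearest-color lookup over the full emoji set and you get the face-made-of-emojis effect above.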

From RAGs to Answers

Just when we thought T5 & BART were awesome for QA, Facebook AI released their RAG model that “achieves state-of-the-art results on open Natural Questions, WebQuestions, and CuratedTrec.”

The takeaway: instead of storing knowledge purely in pre-trained parameters (like T5 & BART), a hybrid model that combines parametric memory with a non-parametric one (a Wikipedia index) yields better results.

Open domain question answering is on fire in 2020. As far as I know, code is not out yet.
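Since the code isn’t out, here’s a toy sketch of the hybrid idea: a non-parametric retriever pulls supporting passages from an external corpus, and the parametric generator conditions on both the question and the retrieved text. The overlap retriever and string-templating “generator” below are deliberately simplistic stand-ins, not RAG’s actual dense retriever or seq2seq model:

```python
# Toy retrieve-then-generate loop illustrating RAG's hybrid design.
# A real system uses dense retrieval over Wikipedia and a seq2seq
# generator; word overlap and string templating stand in here.

DOCS = [
    "Paris is the capital of France.",
    "The Eiffel Tower was completed in 1889.",
    "Mount Everest is the highest mountain on Earth.",
]

def retrieve(question, docs, k=1):
    """Non-parametric memory: rank documents by word overlap."""
    def score(doc):
        return len(set(question.lower().split()) & set(doc.lower().split()))
    return sorted(docs, key=score, reverse=True)[:k]

def generate(question, passages):
    """Stub for the parametric generator, conditioned on retrieval."""
    return f"Q: {question} | context: {' '.join(passages)}"

question = "What is the capital of France?"
print(generate(question, retrieve(question, DOCS)))
```

The point of the hybrid split: facts live in a swappable index rather than only in frozen weights, so knowledge can be updated without retraining.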

CMU’s Low-Resource NLP Repo

For those interested in low-resource NLP, check out CMU’s bootcamp held earlier this month. The bootcamp shared slides on several topics you don’t see every day for low-resource languages, like speech synthesis and speech recognition! Videos and more materials are coming soon!


NLP Viewer

Visualize the structure of your favorite dataset with Hugging Face’s NLP viewer. Their demo displays a healthy number of NLP datasets, which you can browse by split and other filters. This is awesome!



If this question interests you: “When do the benefits of using large pre-trained models outweigh the increase in training time and compute resources?” then please read this article from RASA showing how they used BERT and other model pipelines for NLU purposes.

Dataset of the Week: Quda

What is it?

The dataset contains 14,035 diverse user queries annotated with 10 low-level analytic tasks, which assist in deploying state-of-the-art machine/deep learning techniques for parsing complex human language.


Where is it?