NLP News Cypher | 06.07.20

Original article was published on Artificial Intelligence on Medium

GPT-2 Lyrics Aftermath

I’m not sure there has ever been a survey on the quality of text generated by GPT-3’s little bro, GPT-2. TickPick surveyed 1,003 respondents to find out how much people enjoyed AI-generated lyrics and how those lyrics stacked up against human-written ones.

Top AI lyric:

“I got my rig in the back of my Beemer. Professional when I graze, I’m professional when I argue. 40 glass, I’m laughing at that s***, I’ma be roaring at that s***.” 🤟🤟

They provide more AI-generated lyrics by genre as well:

Bulletpoints Demo

HAIMKE is an awesome text generation model that lets you generate text from bullet points. What’s even cooler is that it can interweave these statements throughout the generated text. You can take it for a test drive here:

This Word Does Not Exist

Continuing with the GPT-2 theme, check out this repo, where a text generation model creates words that don’t exist along with their definitions (in a structure similar to what you’d find in Merriam-Webster). They provide saved weights for inference and the ability to train your own model if you wish.

GitHub:

Here’s their Twitter bot.

Altair

When you are not using matplotlib for your visualizations, try out Altair. The API has a clean and simple syntax (it’s a declarative library 🙈).

GitHub:

Gallery of Visuals:

TensorFlow TTS

Hey, now speech synthesis is at your fingertips. Dathudeptrai released an awesome library, and it seems to be built for production:

“we can speed-up training/inference progress, optimizer further by using fake-quantize aware and pruning, … and be able to deploy on mobile devices or embedded systems.”

The library allows you to use several different models:

  1. MelGAN released with the paper MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis
  2. Tacotron-2 released with the paper Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
  3. FastSpeech released with the paper FastSpeech: Fast, Robust and Controllable Text to Speech
  4. Multi-band MelGAN released with the paper Multi-band MelGAN: Faster Waveform Generation for High-Quality Text-to-Speech

GitHub:

DeepTube

DeepMind released new lectures on YouTube, and as of today, six of them are up with at least six more on the way! For NLP folks, lecture 6 on recurrent networks is the one for you:

AI Training Costs

An ARK Invest analyst says that by December 2020, the cost to train a neural network such as ResNet-50 will be less than $1. Apparently the “cost to train an artificial intelligence (AI) system is improving at 50x the pace of Moore’s Law.”

My bank account says otherwise. 😁

Something to chew on 👇

Dataset of the Week: InfoTabs

What is it?

The dataset contains human-written textual hypotheses based on premises that are tables extracted from Wikipedia infoboxes.
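To make the "table as premise" idea concrete, here is a rough sketch of how one such example could be represented in Python. Everything in it is hypothetical: the class name, the field names, the `linearize` helper, the made-up record, and the three-way label set (borrowed from standard natural language inference tasks) are assumptions for illustration, not the dataset’s actual schema or contents.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TableNLIExample:
    """Hypothetical container for one table-premise / hypothesis pair."""
    title: str                     # infobox title
    premise: Dict[str, List[str]]  # infobox rows: key -> list of values
    hypothesis: str                # human-written sentence about the table
    label: str                     # assumed: "entailment", "contradiction" or "neutral"


def linearize(example: TableNLIExample) -> str:
    """Flatten the table into one string so a text-only model can read it."""
    rows = [f"{key}: {', '.join(values)}" for key, values in example.premise.items()]
    return f"{example.title}. " + "; ".join(rows)


# Invented record, NOT taken from the dataset.
example = TableNLIExample(
    title="Example Band",
    premise={"Origin": ["Springfield"], "Genres": ["rock", "pop"]},
    hypothesis="Example Band plays more than one genre.",
    label="entailment",
)

print(linearize(example))
# Example Band. Origin: Springfield; Genres: rock, pop
```

Flattening the table like this is just one simple way to feed a semi-structured premise to a sentence-pair classifier; other representations are possible.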

Sample:

Where is it?