Original article was published by Synced on Artificial Intelligence on Medium
OpenAI’s 175 billion parameter language model GPT-3 has gone viral once again, with a flurry of tech tweets celebrating the many innovative new applications, ranging from automatic code and short story generators to fully functioning search engines, that have leveraged the GPT-3 API OpenAI released in June 2020. But not everyone in the ML community is impressed.
OpenAI’s first GPT (Generative Pre-Training) model was introduced in June 2018. The then-novel idea was to take advantage of the huge supply of unlabelled text corpora and the generative Transformer deep learning architecture to train a powerful general language model. In February 2019, the San Francisco-based AI company rolled out a much larger GPT-2 model with key technical updates such as pre-activation, zero-shot domain transfer, and zero-shot task transfer. With 1.5 billion parameters, GPT-2 was 12 times larger than the initial GPT. OpenAI then unveiled the third version, GPT-3, which scaled up the model architecture, data and compute, in its May 2020 research paper Language Models are Few-Shot Learners.
GPT-3 delivered SOTA performance across a variety of NLP tasks and benchmarks in zero-shot, one-shot, and few-shot settings. For example, when fed the prompt: “Close your eyes and, with detail, describe the sounds and smells around you right now. Create a picture that I can clearly see in my mind,” a GPT-3-powered writing assistant developed by ShortlyRead generated the following dark tale, which reads like the product of a creative writing class:
(This place seemed much smaller when she’d first walked in. It was probably the concrete walls — too bare, too harsh, like a cell. They always made her want to burrow into the corners.)
It was getting cold. She shivered. The sounds continued. More rapid now. She coughed.
Raphaël Millière, a philosopher of mind and cognitive science at Columbia University’s Center for Science and Society, asked GPT-3 to compose a response to the philosophical essays written about it. The generated text includes an advanced argument and even a bit of self-reflection: “Human philosophers often make the error of assuming that all intelligent behavior is a form of reasoning. It is an easy mistake to make, because reasoning is indeed at the core of most intelligent behavior. However, intelligent behavior can arise through other mechanisms as well. […] I lack long-term memory. Every time our conversation starts anew, I forget everything that came before.”
Millière used AI Dungeon’s GPT-3-based “Dragon” model rather than the GPT-3 API itself, along with some custom prompts, explaining, “there’s cherry-picking at two levels: within each complete response, some sentences were not GPT-3’s first output (although they were still written by GPT-3!); and I shared only the two most interesting complete responses I obtained through this process.” Millière tweeted that even taking the cherry-picking process into account, the results were “quite remarkable!”