Source: Deep Learning on Medium
Google Released a Conversational Agent They Claim Is Better than Others
Conversational agents, software programs designed to converse with humans, have surged in popularity. Many of them are now highly specialized and perform well as long as users don’t deviate too far from the anticipated usage. Open-domain dialog research explores a complementary approach: developing chatbots that are not specialized but can still chat about virtually anything.
It’s an interesting research problem that could lead to more general conversational agents with interesting applications. State-of-the-art chatbots often give responses that don’t make sense, and they are sometimes inconsistent because they lack common sense and real-world knowledge. Google has released a chatbot that is designed to converse in a more general context.
Meet Meena, Google’s AI Chatbot that Can Chat About Anything
The Google Research Brain Team has presented Meena, an AI-based chatbot that can conduct conversations that are more sensible and specific than those of existing state-of-the-art chatbots.
Meena has 2.6 billion parameters and is trained on 341 GB of text, filtered from public domain social media conversations. Compared to an existing state-of-the-art generative model, OpenAI GPT-2, Meena has 1.7x greater model capacity and was trained on 8.5x more data. To measure the improvements, the researchers propose a new human evaluation metric for open-domain chatbots, called Sensibleness and Specificity Average (SSA), which captures basic but important attributes of human conversation.
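To make the metric concrete, here is a minimal Python sketch of how an SSA-style score could be computed from human labels. It assumes that each chatbot response receives two yes/no labels (sensible, specific), that a response judged not sensible is never counted as specific, and that SSA is the plain average of the two resulting rates. The `Label` class and the `ssa` function are hypothetical names used for illustration, not code from the Meena researchers.

```python
from dataclasses import dataclass


@dataclass
class Label:
    """One human judgment of a single chatbot response (hypothetical schema)."""
    sensible: bool
    specific: bool


def ssa(labels):
    """Return the average of the sensibleness rate and the specificity rate."""
    if not labels:
        raise ValueError("need at least one labeled response")
    n = len(labels)
    # Fraction of responses judged to make sense in context.
    sensibleness = sum(1 for label in labels if label.sensible) / n
    # Assumption: a response that is not sensible cannot count as specific.
    specificity = sum(1 for label in labels if label.sensible and label.specific) / n
    return (sensibleness + specificity) / 2


labels = [
    Label(sensible=True, specific=True),
    Label(sensible=True, specific=False),   # e.g. a vague reply such as "I don't know"
    Label(sensible=False, specific=False),
    Label(sensible=True, specific=True),
]
print(ssa(labels))
```

In this toy example, three of four responses are sensible and two of four are specific, so the score is the average of 0.75 and 0.5. A bot that always gives safe but generic replies would score well on sensibleness and poorly on specificity, which is why the two are averaged.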
Researchers have long sought an automatic evaluation metric that correlates with the more accurate human evaluation, since such a metric would enable faster development of dialogue models. This has proved challenging. Surprisingly, with the Meena model, they discovered that perplexity, an automatic metric that is readily available to any neural seq2seq model, exhibits a strong correlation with human evaluation scores such as SSA.
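For readers unfamiliar with perplexity, it is conventionally defined as the exponential of the average negative log-likelihood that a model assigns to each token of held-out text; lower is better. The following is a minimal sketch of that standard definition, assuming we already have the probability the model gave to each correct next token. The `perplexity` function and its input list are illustrative and not taken from the Meena codebase.

```python
import math


def perplexity(token_probs):
    """Perplexity from the probabilities a model assigned to the true tokens."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    # Average negative log-likelihood per token (natural log).
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    # Exponentiate to get back to an "effective number of choices" scale.
    return math.exp(avg_nll)


# A model that spreads probability evenly over 4 candidate tokens at every step
# has a perplexity of about 4.
print(perplexity([0.25, 0.25, 0.25, 0.25]))
```

Under this definition, a perplexity of around 10 means the model is, on average, about as uncertain as if it were choosing uniformly among roughly ten candidate tokens at each step.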
Potential Uses and Effects
Meena achieves a perplexity of 10.2, which translates to an SSA score of 72%. Compared to the SSA scores achieved by other chatbots, 72% is not far from the 86% SSA achieved by the average person. The full version of Meena, which adds a filtering mechanism and tuned decoding, further advances the SSA score to 79%.
These results suggest that we are getting closer to holding human-like conversations with contemporary bots. Meena is a captivating piece of research that could lead to many interesting applications in human-computer interaction.