Google Released a Conversational Agent It Claims Is Better than Others

Source: Deep Learning on Medium

Conversational agents, software programs designed to converse with humans, have skyrocketed in popularity. Many are now highly specialized and perform well as long as users don’t deviate too far from the anticipated usage. Open-domain dialog research explores a complementary approach: developing chatbots that are not specialized but can still handle a wide variety of conversational topics and chat about virtually anything.

It’s an interesting research problem that could lead to more general conversational agents with interesting applications. State-of-the-art chatbots often fail to make sense and are sometimes inconsistent, owing to a lack of common sense and real-world knowledge. Google has presented a chatbot designed to converse in a more general context.

Meet Meena, Google’s AI Chatbot that Can Chat About Anything

The Google Research Brain Team has presented Meena, an AI-based chatbot that can conduct conversations that are more sensible and specific than those of existing state-of-the-art chatbots.

Meena has 2.6 billion parameters and is trained on 341 GB of text filtered from public-domain social media conversations. Compared to an existing state-of-the-art generative model, OpenAI GPT-2, Meena has 1.7x greater model capacity and was trained on 8.5x more data. To measure the improvements, the researchers propose a new human evaluation metric for open-domain chatbots, called Sensibleness and Specificity Average (SSA), which captures basic but important attributes of human conversation.
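To make the metric concrete, here is a minimal sketch of how an SSA-style score could be computed from human ratings. It assumes each response gets two yes/no labels (sensible, specific), that a response judged not sensible cannot count as specific, and that SSA is the simple mean of the two rates. The `Label` class, the `ssa` function, and the toy ratings are made up for illustration and are not the researchers' code or data.

```python
from dataclasses import dataclass


@dataclass
class Label:
    """One human judgment of a single chatbot response (hypothetical schema)."""
    sensible: bool
    specific: bool


def ssa(labels):
    """Return (sensibleness, specificity, SSA) for a list of labels."""
    if not labels:
        raise ValueError("need at least one labeled response")
    n = len(labels)
    # Fraction of responses judged to make sense in context.
    sensibleness = sum(l.sensible for l in labels) / n
    # Assumption: a response only counts as specific if it is also sensible.
    specificity = sum(l.sensible and l.specific for l in labels) / n
    # SSA taken here as the plain average of the two rates.
    return sensibleness, specificity, (sensibleness + specificity) / 2


# Toy ratings, invented purely for illustration.
toy = [
    Label(sensible=True, specific=True),
    Label(sensible=True, specific=False),   # e.g. a vague "I don't know"
    Label(sensible=False, specific=False),
    Label(sensible=True, specific=True),
]
sens, spec, avg = ssa(toy)
print(f"sensibleness={sens:.2f} specificity={spec:.2f} SSA={avg:.3f}")
```

A generic reply such as "I don't know" may be sensible in almost any context, which is why a specificity rate is averaged in alongside sensibleness.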

Figure: Interactive SSA vs. perplexity. Each blue dot is a different version of the Meena model. A regression line shows the strong correlation between SSA and perplexity. Dotted lines correspond to the SSA performance of humans, other bots, Meena (base), the end-to-end trained model, and finally full Meena with the filtering mechanism and tuned decoding.

Researchers have long sought an automatic evaluation metric that correlates with the more accurate human evaluation, since it would enable faster development of dialogue models, but finding one has been challenging. Surprisingly, with the Meena model, they discovered that perplexity, an automatic metric readily available for any neural seq2seq model, correlates strongly with human evaluation such as the SSA value.
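For readers unfamiliar with perplexity, the sketch below shows the standard definition: the exponential of the average negative log-likelihood the model assigns to each token. The `perplexity` function and the toy log-probabilities are illustrative assumptions; in practice the log-probabilities would come from the model being evaluated.

```python
import math


def perplexity(token_log_probs):
    """Perplexity from per-token natural-log probabilities."""
    if not token_log_probs:
        raise ValueError("need at least one token")
    # Average negative log-likelihood per token.
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_nll)


# Toy case: a model that gives every token probability 1/10.
uniform_guess = [math.log(1 / 10)] * 5
print(perplexity(uniform_guess))  # approximately 10
```

Lower is better: a perplexity near 10 means the model is, on average, about as uncertain as if it were choosing uniformly among 10 candidate tokens at each step.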

Potential Uses and Effects

The end-to-end trained Meena model achieves a perplexity of 10.2, which translates to an SSA score of 72%. Set against the SSA scores achieved by other chatbots, 72% is not far from the 86% achieved by the average person. The full version of Meena, which adds a filtering mechanism and tuned decoding, raises the SSA score further to 79%.

Results like these suggest we are getting closer to holding human-like conversations with today’s bots. Meena is a captivating piece of research that could lead to many interesting applications in human-computer interaction.

Read more: Towards a Conversational Agent that Can Chat About…Anything
