Conversational AI — The potential for start-ups

Original article was published by YNOS Venture on Artificial Intelligence on Medium

Dr. Vijay Raman, SVP products at [24], alumnus of IITM

Over the years, conversational AI has become a very powerful capability, though we cannot yet fully appreciate what it has become. The nature of conversation in today’s world is a little different from what is traditionally called conversation. The AI element complicates things, but also makes it a lot of fun. We will cover what conversational AI is and how it has grown, as well as the importance of the players and the white space in the industry.


Everyone is familiar with conversation, which is generally associated with words. Humans, unlike other animals, are considered unique in their verbal capability. There is also vocalization, i.e. the use of the voice, and there is much interesting research on the relationship between vocalization and words. Communication has other facets as well: emotion, and the prosody and emphasis conveyed through cues other than the words themselves. Over and above this, there are bodily, i.e. non-verbal, cues. The listener’s perception also affects how the speaker communicates.

Businesses then began to play a role in conversation. In a call center, the agent has to understand many different factors from the caller. The agent does not have all the context, so understanding the customer can be complicated.

The next stage in the evolution of conversation, over the last several years, is digital chat. With it, cues like emotion, prosody, and cadence are all lost, which makes the agent’s job even harder. Emojis are barely sufficient to convey emotion. This is interesting because the nature of the conversation has changed, but something has potentially been gained as well. People can now go back and refer to older communications, and can communicate with images and videos. These mechanisms were not available in purely verbal conversation.

Communication today, between two entities, has been enhanced even without words.

Messaging is what has really emerged as the means of digital conversation. It is a new form of conversation in the sense that it adds asynchronous communication, i.e. communication that continues even with a lag in time. This new element brings a lot of power with it. Messaging apps now offer conversation history, adding more people to a chat, multi-party communication, the luxury of responding after some time, sending media, using QR codes to communicate information, and even making phone calls. You can now also make payments, ask support questions, play games, and more. This is where businesses and startups get interested, because this conversation is about satisfying a consumer and making money in the process.

This is a really powerful arsenal, and businesses are tapping into it. Businesses have always eyed people as consumers, and there are many means of communication, but the holy grail in many ways is a continuous channel through which they can communicate with the consumer and vice versa. With a continuous channel, they can share information proactively or reactively. The interesting part is that the conversation has shifted from verbal to digital. This introduces new complexities, but it also provides a very powerful digital channel that offers a lot of features.

Combining all these channels like video and messaging, we can get a powerful set of signals that every company would like to decipher and exploit to the fullest.

Messaging Wars

There is a war going on between the big companies over who gets to own the pipe between the consumer and the enterprise. They want to make sure that when a consumer contacts an enterprise, it is through their channel. This matters to them because it gives them access to what passes through, the power to support the enterprise, and influence over the consumer. Security is another big white space for startups, because the big players are a part of this conversation.

As they battle with each other, startup founders can step into this space and provide a facility that works across a whole set of different platforms.

But this conversation between a business and a consumer is taking on all kinds of new forms, new form factors, new channels, and new modalities. This creates a complex problem, because a consumer can use many different applications to reach an enterprise. The enterprise has to piece together all these conversations to understand the context.
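To make "piecing together" concrete, here is a minimal sketch in Python, not a description of any particular product. It assumes every message already carries a shared `customer_id`; in practice, establishing that shared identity across channels is itself a hard part of the problem. The `Message` type, its field names, and the sample data are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Message:
    customer_id: str   # hypothetical identifier shared across channels
    channel: str       # e.g. "sms", "web_chat", "voice"
    sent_at: datetime
    text: str


def build_timelines(messages):
    """Group messages by customer and order each group by timestamp."""
    timelines = defaultdict(list)
    for message in messages:
        timelines[message.customer_id].append(message)
    for history in timelines.values():
        history.sort(key=lambda m: m.sent_at)
    return dict(timelines)


# Illustrative data: one customer reaches the business on two channels.
messages = [
    Message("c1", "web_chat", datetime(2021, 3, 1, 10, 5), "Where is my order?"),
    Message("c2", "sms", datetime(2021, 3, 1, 9, 30), "Cancel my plan"),
    Message("c1", "sms", datetime(2021, 3, 1, 9, 0), "I placed an order"),
]
timelines = build_timelines(messages)
```

With a merged timeline per customer, an agent or a bot picking up the web chat can see that the same person sent an SMS about the order an hour earlier.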

Conversational AI

The machine is very powerful, and in the last few years we have been learning how much more powerful it has become in its ability to digest data, know more about consumers, and serve them in different ways, even in ways they do not want. The richness of conversation, which is a very unique means of communication, combined with the power of the data processing engine (AI), has led to an explosion of different types of work related to speech recognition, natural language processing, biometrics, big data processing, context, dialogue, and user profiles. There is a whole proliferation of ways in which AI and machine learning are being used to conduct better conversations. It is interesting that, after all the years of work, we are still at a very early stage.

Not to say that we have not made progress, but we are still in the infancy of this, which is good news especially for startups.

Gartner Hype Cycle

The Gartner Hype Cycle in 2019 said that conversational AI has moved past the hype and is beginning to deliver real value, which I think is accurate, but there is still a long way to go before it becomes mature. Most meaningful transactions take at least 5–6 exchanges for both parties to understand the whole situation.

This is why digital communication is complex.

Natural language processing is one of the big elements and is broader than natural language understanding, as it includes text processing, document processing, etc. So this map should not be interpreted as being purely about conversations. It puts the market at 20 billion in 2025. The general contact market, as estimated in 2021, is around 20 billion, which puts together all the different forms of communication beyond the raw contact center. That figure could be revised upward by a few billion, and it is a growing market as well. The enormous investment from all the players in the market validates this.

There is now a push to make and complete transactions within the conversation itself, which is called conversational commerce. It is supremely attractive to everyone, and there is a lot to be done in this space.

Conversational AI adoption is accelerating and is very use-case driven, which opens up space for opportunities that have not yet been addressed. A single source of truth that connects every point of contact between the consumer and the machine is important, but it is not yet a solved problem. There are three players here: the consumer, the brand, and the provider of the technology platform. AI and HI (human intelligence) both play a role, and the two have to collaborate. Customer satisfaction and experience are king.


The media that devices enable are very powerful and interesting. Static images, videos, clips, etc. can be sent, but this power has not been fully exploited to date. C2C has some traction, but C2B not so much, because a machine cannot understand an image as well as a human might. Many of these powerful artifacts are not yet playing a strong role, so there is room to become an expert in using them to improve the user experience. There are also augmented reality and VR kits, which can be powerful in improving communication for problem-solving and shopping. There are a lot of communication devices and, as the analysts have said, bringing all this information together is difficult and no company has solved it. The challenge is to take all these disparate software entities, each with its own way of tracking history, background, business chats, etc., and bring them together.

For a large company, knitting things together might be an issue, but for a startup it is a huge greenfield.

A lot of startups are building tools around AI, including tools for building flows and natural language models, but there is always room to re-think what makes the best user experience. Combinations of visual and non-visual tools for modeling and for CRM integration are all areas for building conversations, and they are at an early stage with a lot of room for improvement.

Analytics is also very important to a conversational journey. Knitting the information together and presenting it to customers is important. The ability to trace the journey is a problem that companies are trying to solve. The gap here is the lack of data in the area.

In terms of security, there are technologies and regulations for identifying the person and for securing data and communications. For identifying the person, there is now multi-factor authentication and biometrics. All of these have been playing a role, but the challenge persists when a conversation moves between channels. Identity management is still complicated and has yet to find the right audiences. Many of these technologies are held back by their limitations, including the requirement that they be synchronous. There is also room for people to try to bridge the gap between the various channels.

I believe the most important way forward is to use the platforms that others have built and to explore areas that have not been explored so far. Focusing and building on one vertical will naturally lead to a product that extends to more verticals and more layers. Conversational AI is still in its early stages and has a long way to go.