Where are the bots?
“Technology does not automatically improve, it only improves if a lot of people work very hard to make it better.” — Elon Musk
Ol’ Musky is right. We are not going to be a multi-planet, space-faring civilization unless a lot of people work very hard on it. In fact, if they don’t, technology will degrade by itself. In 1969, we went to the moon. Ask anyone from 1969 where we would be on the space frontier in 2018, and they would draw a picture very different from the 2018 we see today.
The case of Conversational AI (aka bots) is no different. In 1966, we had ELIZA, a chat-bot that mimicked a psychotherapist:
Again, ask someone from the late 1960s what talking machines would look like in 2018. They would have never expected it to be like this:
In fact, the AI/ML community often complains about how the common man (and the media) overestimates progress in conversational AI. It is time to consider the possibility that they are not at fault here. Maybe their expectations are not misplaced. Maybe we just failed to deliver. We were good at this, but then we had a shortage of “a lot of people working very hard”. Don’t get me wrong, there are *a lot* of people working on it; it’s the working-hard part that is in short supply. More on that later.
The golden age of bots (Late 1990s — Early 2000s)
Bots used to work. Because they were engineered. Want basic chit-chat? Use a fuzzy lookup table. Want to extract entities? Use regular expressions. Want to handle synonyms and abbreviations? String replace. Need logical inference? Prolog. Want to hold complex conversations? AIML. In fact, AIML (built during 1995–2002) is so good that bots built with it, such as Mitsuku, have won multiple Loebner Prizes, even in 2018. Whether or not the Loebner Prize should be taken seriously — let’s have that conversation later. But my point stands: bots used to work. Surprisingly well. Because they were engineered.
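The first two tools above — a fuzzy lookup table for chit-chat and regular expressions for entity extraction — fit in a few lines of standard-library Python. This is an illustrative sketch (the table entries and the time-of-day pattern are my own toy examples, not from any real bot):

```python
import difflib
import re

# Hypothetical chit-chat table; real bots of the era used far larger ones.
CHIT_CHAT = {
    "hello": "Hi there!",
    "how are you": "I'm doing well, thanks for asking.",
    "what is your name": "I'm a humble rule-based bot.",
}

def chit_chat_reply(utterance, cutoff=0.6):
    """Fuzzy lookup: match the utterance against known table keys."""
    key = utterance.lower().strip("?!. ")
    matches = difflib.get_close_matches(key, CHIT_CHAT, n=1, cutoff=cutoff)
    return CHIT_CHAT[matches[0]] if matches else "Tell me more."

# Entity extraction with a plain regular expression (here: HH:MM times).
TIME_RE = re.compile(r"\b([01]?\d|2[0-3]):([0-5]\d)\b")

def extract_times(utterance):
    return [m.group(0) for m in TIME_RE.finditer(utterance)]

print(chit_chat_reply("Helo"))                  # tolerates the typo -> "Hi there!"
print(extract_times("meet at 09:30 or 14:00"))  # ['09:30', '14:00']
```

Deterministic, debuggable, and trivially extensible — exactly the properties the article argues were thrown away.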
Deep learning — The bot slayer
Even though bots worked, the fact that they were still a bunch of if-else ladders always took a toll on bot developers’ egos. Imposter syndrome en masse. And then along came a shiny new toy that claimed to:
1. work similarly to the way the human brain works,
2. be a universal function approximator (Turing complete),
3. perform better with more data.
All of these were tempting for bot developers, and boy did they fall for it. They forgot all about abstraction, reasoning, maintaining conversation state, and building a world model — or rather, they expected their neural networks to do all of these. They sacrificed determinism, interpretability, and debuggability. They couldn’t apply their software engineering techniques anymore. And they failed, miserably. Let’s inspect each of the claims that pulled bot developers into deep learning:
- Claim 1 is hilariously false. It’s a lie. No, the brain doesn’t do backprop. No, the brain is not an LSTM.
- Claim 2 is misleading. People often read it as “can solve any problem”. That’s not what it means. Sorry.
- Claim 3 is irrelevant. Conversations are stateful. They are not a simple mapping from one space to another — which is the case for machine translation, and that’s why neural networks are good at translating from English to French despite being bad at holding a conversation in English or French. Even if you have long conversations as data, neural networks will still have trouble attending over long sequences (this is not about vanishing gradients).
Not only is deep learning a bad tool for bot building, it also brought a bunch of bad practices along with it:
- Obsession with end-to-end training. Though end-to-end training is not seen in conversational AI products (all the something.ai ones), a lot of people are pursuing it, and I think their time could be spent on better things.
- Thinking of data as a bunch of numbers, ignoring any order and hierarchy, or expecting the neural network to figure them out. While this works for spatially correlated data (images), we throw away a lot of information when we apply it to conversational data.
- Not having meaningful metrics to evaluate models.
- The “just add more layers” mentality.
Or in short, people got lazy. They lost rigour. We no longer have a lot of people working very hard on conversational AI.
What happened to conversational AI is also a good example of throwing away stuff that works for stuff that sounds cooler. So what now? Here are some suggestions:
- Just going back and using older tools will give you much better results. So go back.
- Identify what works and what doesn’t. Most of the pioneers in deep learning work on images. Don’t expect the success of their models to immediately translate to better conversational AI.
- Word embeddings work. Really well. You can do a whole bunch of stuff with them. See if you can build the whole thing entirely on word embeddings (no POS tagging or any language-specific preprocessing other than tokenization + spell check). Want your bot to work in another language? Just swap the embeddings!
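The “build it all on embeddings” suggestion can be sketched as nearest-centroid intent matching. The tiny hand-written vectors below are stand-ins for real pre-trained embeddings (word2vec, GloVe, fastText); swapping in vectors trained on another language is what makes the bot language-portable, as the bullet above claims.

```python
import numpy as np

# Toy 3-d "embeddings" for illustration only; load real pre-trained
# vectors for your language in practice.
EMB = {
    "hi":      np.array([0.90, 0.10, 0.00]),
    "hello":   np.array([0.85, 0.15, 0.00]),
    "bye":     np.array([0.00, 0.90, 0.10]),
    "goodbye": np.array([0.05, 0.85, 0.10]),
    "order":   np.array([0.10, 0.00, 0.90]),
    "buy":     np.array([0.15, 0.05, 0.85]),
}

def sentence_vec(text):
    """Average the word vectors; tokenization is the only preprocessing."""
    vecs = [EMB[w] for w in text.lower().split() if w in EMB]
    return np.mean(vecs, axis=0) if vecs else np.zeros(3)

# Each intent is defined by a handful of example words (hypothetical).
INTENTS = {"greet": "hi hello", "farewell": "bye goodbye", "purchase": "order buy"}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def classify(text):
    """Pick the intent whose centroid is closest to the utterance vector."""
    v = sentence_vec(text)
    return max(INTENTS, key=lambda k: cosine(v, sentence_vec(INTENTS[k])))

print(classify("hello"))  # greet
print(classify("buy"))    # purchase
```

No POS tagging, no parser, no language-specific rules — just vectors and cosine similarity.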
Very recently, there has been a push to fix the issues of interpretability (NNs are black boxes) and flexibility (difficulty modelling complex, non-differentiable problems) within the deep learning space using something called differentiable programming. Think of it as small neural networks tied together by a Python script. The Python script does the basic if-else logic, reasoning, access to external databases, etc. The neural networks themselves would be pre-trained on atomic tasks, but an ever-learning algorithm would “learn” how to use them to do new, complex tasks. Check out this post from Yann LeCun and this talk by François Chollet.
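The shape being described — pre-trained modules glued together by an ordinary script that handles control flow and database access — can be sketched with plain functions standing in for the neural networks. Everything here (the module names, the toy database) is hypothetical:

```python
# Stand-in for an external knowledge base the glue script can query.
FAKE_DB = {"paris": "France", "tokyo": "Japan"}

def detect_intent(utterance):
    """Stand-in for a small pre-trained intent classifier."""
    return "country_lookup" if "country" in utterance.lower() else "chit_chat"

def extract_city(utterance):
    """Stand-in for a small pre-trained entity tagger."""
    for word in utterance.lower().strip("?!. ").split():
        if word in FAKE_DB:
            return word
    return None

def respond(utterance):
    """The glue script: plain if-else, reasoning, and database access."""
    if detect_intent(utterance) == "country_lookup":
        city = extract_city(utterance)
        if city is None:
            return "Which city do you mean?"
        return f"{city.title()} is in {FAKE_DB[city]}."
    return "Nice weather, isn't it?"

print(respond("which country is Paris in"))  # Paris is in France.
```

In the differentiable-programming vision, the two stand-in functions would be trained networks and the composition itself could be learned; the glue logic, notably, is exactly the kind of engineering bots already had.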
While this is certainly a step in the right direction for AI in general, bot builders shouldn’t get too excited. First, even though there is significant overlap in the terms used (interpretability, abstraction, reasoning), we want totally different things. Second, we already had these things; we just stopped using them.
So yeah, don’t throw away stuff that works, but constantly innovate. And we will go to the moon, again.
Source: Deep Learning on Medium