Original article was published by Cobus Greyling on Artificial Intelligence on Medium
How IBM Watson Assistant Is Correcting User Input For Chatbots
And Why Auto Correction & Fuzzy Matching of Entities Don’t Clash
There are various methods and approaches to managing a conversation from a chatbot perspective. These can include:
- Fallback Prompts & Dialogs
- Graphic Conversational Components
- Forms / Slot Filling
These elements have something in common…
All of these elements are employed as part of an active dialog exchange between the user and the chatbot. Hence they all involve the user.
One element which does not require user involvement is detecting anomalies in user input and automatically correcting that input before it is passed to the NLU.
In this story I am looking at autocorrection and fuzzy matching, and why the two do not clash…
Correcting User Input
One way the chatbot can manage and advance the conversation is by correcting user input without consulting the user. Obviously there is a fine line here: input should only be corrected when there is a reasonable level of confidence that the correct intent and entity will be targeted.
Two ways of correcting user input are:
- Auto Correction
- Fuzzy Matching
Autocorrection, as we all know, corrects misspelled words when a user enters an utterance to the chatbot. Hence, when autocorrection is enabled, the misspelled words in the utterance are corrected on the fly, without user intervention or consultation.
Autocorrection needs to be explicitly activated in your skill. It also needs to be noted that this feature is only supported in certain languages.
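Besides the skill-level setting, spelling behaviour can also be requested per message. The sketch below only builds a request body and does not call any service. The `options.spelling.suggestions` and `options.spelling.auto_correct` field names follow my reading of the Watson Assistant v2 message API, so treat them as an assumption and verify them against the current API reference. The helper name `build_message_input` is hypothetical.

```python
import json


def build_message_input(text, auto_correct=True):
    """Build a v2-style message input with per-message spelling options.

    Field names are assumed from the v2 message API; check the API
    reference before relying on them.
    """
    return {
        "message_type": "text",
        "text": text,
        "options": {
            "spelling": {
                # Ask the service to look for spelling corrections...
                "suggestions": True,
                # ...and to apply them to the input without asking the user.
                "auto_correct": auto_correct,
            }
        },
    }


payload = build_message_input("can you plese cancl my order")
print(json.dumps(payload, indent=2))
```

Passing `auto_correct=False` yields the same body with correction switched off for that message, which is useful for the cases where autocorrection is not desirable.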
There might be instances where autocorrection is not desirable.
You will see from the example here, in the test pane, a sentence is entered with two spelling errors.
When the words from the input are misspelled, they are corrected automatically, and an “A” icon is displayed.
The corrected utterance is underlined and can be reviewed.
Words that belong in this skill, meaning words that have implied significance because they occur in entity values, entity synonyms, or intent user examples, are not corrected.
In addition, to avoid overcorrection, your assistant does not correct the spelling of the following types of input:
~ Capitalized words
~ Location entities, such as states and street addresses
~ Numbers and units of measurement or time
~ Proper nouns, such as common first names or company names
~ Text within quotation marks
~ Words containing special characters, such as hyphens (-), asterisks (*), ampersands (&), or at signs (@), including those used in email addresses or URLs.
The following sentence, for instance, is an intent user example in the skill:
"text": "can you please cacnel it"
"description": "Cancel the current request"
Because of this, the misspelled word “cacnel” is not corrected when a user enters it.
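To make the skip rules and the skill-vocabulary protection more concrete, here is a minimal sketch of how such a filter could look. It is not Watson Assistant's implementation. The vocabulary set, the function name and the reason labels are all hypothetical, and the location and proper-noun rules are left out because they would need lookup lists or entity recognition.

```python
import re

# Hypothetical skill vocabulary: words from entity values, synonyms and
# intent user examples. "cacnel" is included because it occurs in the
# intent user example "can you please cacnel it".
SKILL_VOCABULARY = {"can", "you", "please", "cacnel", "it"}

SPECIAL_CHARS = re.compile(r"[-*&@]")  # hyphen, asterisk, ampersand, at sign
HAS_DIGIT = re.compile(r"\d")          # numbers and units such as "5pm"


def protected_tokens(utterance):
    """Return (word, reason) pairs; reason is None if the word may be corrected."""
    results = []
    in_quotes = False
    for raw in utterance.split():
        quote_count = raw.count('"')
        inside_quotes = in_quotes or quote_count > 0
        if quote_count % 2 == 1:
            in_quotes = not in_quotes  # an unpaired quote opens or closes a span
        word = raw.strip('".,!?')
        if inside_quotes:
            reason = "quoted text"
        elif word.lower() in SKILL_VOCABULARY:
            reason = "skill vocabulary"
        elif word[:1].isupper():
            reason = "capitalized"
        elif HAS_DIGIT.search(word):
            reason = "number or unit"
        elif SPECIAL_CHARS.search(word):
            reason = "special character"
        else:
            reason = None
        results.append((word, reason))
    return results


utterance = "can you please cacnel the 5pm meeting with Jhon at jhon@example.com"
for word, reason in protected_tokens(utterance):
    print(f"{word:20} {reason or 'may be corrected'}")
```

In this sketch only "the", "meeting", "with" and "at" remain candidates for correction; every other word is shielded by one of the rules.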
Fuzzy matching needs to be activated within individual entities. This increases the ability of Watson to recognize misspelled entity values.
Fuzzy matching within Watson Assistant uses a dictionary lookup approach to match a word from the user input to an existing entity value or synonym in the skill’s training data.
Fuzzy matching can be deactivated or activated on specific entities based on the likelihood that the chatbot’s performance will be enhanced by it.
In this example, the entity is @holiday, with a few values, among which is christmas day.
So, for example, if the user enters christmass daay, fuzzy matching picks up that the two mean the same thing.
When fuzzy matching is toggled off for this entity, Watson will automatically retrain the model.
After training, if the same input, christmass daay, is given, no entity is detected.
It is advisable to test a sample of data and make small iterative changes. You will notice that, with fuzzy matching enabled, the spelling is not corrected, yet the entity is correctly matched.
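The behaviour described here, matching a misspelled phrase to an entity value without altering the input, can be sketched in a few lines. This is only an illustration: Watson's actual algorithm is not documented beyond the dictionary lookup approach, so the sketch substitutes the string similarity ratio from Python's `difflib`. The entity data, the 0.8 threshold and the function names are assumptions.

```python
from difflib import SequenceMatcher

# Hypothetical entity definition: value -> list of synonyms.
ENTITIES = {
    "holiday": {
        "christmas day": ["xmas"],
        "new year's day": ["new years"],
    }
}


def similarity(a, b):
    """Similarity ratio between two strings, from 0.0 to 1.0."""
    return SequenceMatcher(None, a, b).ratio()


def match_entity(utterance, entity, fuzzy=True, threshold=0.8):
    """Return (entity value, matched text) or None; the utterance is never changed."""
    words = utterance.lower().split()
    best = None
    for value, synonyms in ENTITIES[entity].items():
        for candidate in [value, *synonyms]:
            size = len(candidate.split())
            # Slide a window of the candidate's word count over the utterance.
            for i in range(len(words) - size + 1):
                window = " ".join(words[i:i + size])
                if window == candidate:
                    score = 1.0
                elif fuzzy:
                    score = similarity(window, candidate)
                else:
                    score = 0.0  # exact matches only when fuzzy matching is off
                if score >= threshold and (best is None or score > best[0]):
                    best = (score, value, window)
    return None if best is None else (best[1], best[2])


text = "is the shop open on christmass daay"
print(match_entity(text, "holiday"))               # ('christmas day', 'christmass daay')
print(match_entity(text, "holiday", fuzzy=False))  # None
```

With fuzzy matching on, the entity value is found while the matched text keeps its original spelling. With it off, the same input yields no entity.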
Can Both Be Used In A Skill?
The short answer is yes…you can use both.
But two things need to be noted.
If autocorrection and fuzzy matching are both enabled, priority is given to fuzzy matching and thereafter autocorrection is actioned.
Fuzzy matching will be employed to identify the most appropriate entity, but the word or words are not corrected. I have found this to be true to some extent; if the word’s spelling is too far off, autocorrection kicks in.
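The order of operations can be illustrated with a toy pipeline: fuzzy entity matching runs first and shields the words it claims, and autocorrection is then applied to whatever is left. This is a sketch under stated assumptions rather than Watson's internals. The entity values, the tiny spelling dictionary, the cutoffs and the `process` function are all hypothetical, and `difflib.get_close_matches` stands in for both matching steps.

```python
from difflib import get_close_matches

HOLIDAY_VALUES = ["christmas day", "easter sunday"]   # hypothetical @holiday values
DICTIONARY = ["is", "the", "shop", "open", "on", "store", "closed"]  # toy dictionary


def process(utterance):
    """Return (utterance after autocorrection, list of detected entities)."""
    words = utterance.lower().split()
    protected = set()  # indexes of words claimed by fuzzy entity matching
    entities = []

    # Step 1: fuzzy matching has priority. Check every two-word window
    # against the entity values (both values here are two words long).
    for i in range(len(words) - 1):
        window = " ".join(words[i:i + 2])
        hit = get_close_matches(window, HOLIDAY_VALUES, n=1, cutoff=0.8)
        if hit:
            entities.append(("holiday", hit[0]))
            protected.update({i, i + 1})

    # Step 2: autocorrect the remaining words only.
    corrected = []
    for i, word in enumerate(words):
        if i in protected or word in DICTIONARY:
            corrected.append(word)
        else:
            fix = get_close_matches(word, DICTIONARY, n=1, cutoff=0.7)
            corrected.append(fix[0] if fix else word)
    return " ".join(corrected), entities


print(process("is the shpo opne on christmass daay"))
```

Here "shpo" and "opne" are corrected to "shop" and "open", while "christmass daay" is matched to the entity value and left as typed. A phrase that falls below the fuzzy cutoff is not protected, so its words drop through to the autocorrection step, which mirrors the observation that autocorrection kicks in when the spelling is too far off.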