Bridging the inherent gaps of AI-powered products

How do you design the product and bridge these gaps to avoid frustration and loss of trust?

1. Recall or Precision

Recall and precision are a way to examine the relevancy of the results returned by the selected algorithm. Are we, or more specifically our customers and users, more tolerant of false-positive (type I) or false-negative (type II) errors? Are we going to optimize our model for recall or for precision? Usually a trade-off is required, picking one over the other.
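
To make the trade-off concrete, here is a toy sketch in plain Python (the labels and scores are made up for illustration): the same model scores, cut at a high threshold, give high precision and low recall; cut at a low threshold, the opposite.

```python
# Toy illustration: the same scores, judged at two thresholds,
# trade precision against recall.
y_true = [1, 1, 0, 1, 0, 1, 0, 0]                      # made-up ground truth
scores = [0.95, 0.85, 0.60, 0.55, 0.50, 0.45, 0.20, 0.10]  # made-up model confidence

def precision_recall(threshold):
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(p == 1 and t == 1 for p, t in zip(preds, y_true))  # correct hits
    fp = sum(p == 1 and t == 0 for p, t in zip(preds, y_true))  # type I errors
    fn = sum(p == 0 and t == 1 for p, t in zip(preds, y_true))  # type II errors
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

for threshold in (0.8, 0.3):
    p, r = precision_recall(threshold)
    print(f"threshold={threshold}: precision={p:.2f}, recall={r:.2f}")
# threshold=0.8: precision=1.00, recall=0.50  -> optimized for precision
# threshold=0.3: precision=0.67, recall=1.00  -> optimized for recall
```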

Many factors may influence the decision of what to optimize, including psychology, finances, the cost of being wrong, the cost of missing, reputation, time, and more.

Choosing what to optimize affects the overall user experience of the product, and the decision tells our data teams what to focus on to create that experience.

When you optimize for recall, you are saying that missing an element is much more painful and expensive than surfacing wrong ones. Conversely, when you optimize for precision, you want to be absolutely correct in what you show, and the cost of missing something is negligible. When precision is required, getting results that are not what you expected ruins the product experience.

For example, Google search optimizes for precision: what you see on the first page are the most relevant results. You shouldn't care about results on the second page, unless you are looking to hide a dead body.

Health products usually optimize for recall: such products cannot afford to miss anyone with an acute disease.

For more examples and detailed explanation read my previous article on recall and precision.

2. Set the Right Expectations

The product, the service, and everything around them shape the users' experience. Users are not aware of the algorithms behind the scenes and, to tell the truth, they don't really care. They care about what they see and what they experience.

Let's focus on the model for a minute. If it is new and still learning the data, share that information with the user. Set the right expectations for what they are about to experience.

If there is not enough data and the results are still poor, communicate this. If the model failed to find an answer, be transparent and share what prevented it from finding one. Add the right disclaimer where applicable.

Users are more tolerant of mistakes when you are transparent and accountable for them. The right message can completely change the experience users get from the product, even if they didn't get exactly what they expected.

For example, we built a model that predicts a client's expected behavior over the next 3 months. Due to the nature of our data, in some cases the prediction was based on a relatively small data set. For such cases, we had 3 options:

  • Do not show a prediction at all, saying we don’t have enough data
  • Show the prediction like any other prediction
  • Show the prediction with a user-friendly disclaimer saying it is based on a small data set and should be taken with a grain of salt

You can guess that we chose the third option and set the right expectations with our customers, who valued it.
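
As a rough sketch of that third option, here is what the presentation logic might look like. The threshold, field names, and wording below are hypothetical illustrations, not what we actually shipped.

```python
# A minimal sketch of option 3: attach a plain-language disclaimer
# when a prediction rests on a small sample. The threshold and the
# wording are hypothetical; tune both with your own data and users.
MIN_SAMPLE_SIZE = 50  # below this we consider the data set "small"

def present_prediction(prediction: float, sample_size: int) -> dict:
    """Package a prediction together with the expectation we want to set."""
    result = {"prediction": prediction, "sample_size": sample_size}
    if sample_size < MIN_SAMPLE_SIZE:
        result["disclaimer"] = (
            "This prediction is based on a relatively small amount of data, "
            "so take it with a grain of salt."
        )
    return result

print(present_prediction(0.72, sample_size=23))
```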

3. Explainable AI (xAI)

As mentioned before, users are not interested in understanding machine learning and algorithms. But they do want to understand what led to a decision or a prediction, and what the motivation behind it was. We need to un-box the black-box model. We need the capability to share some information on how the decision was made. We need xAI: explainable AI.

xAI was initially introduced and described by DARPA: “new machine-learning systems will have the ability to explain their rationale, characterize their strengths and weaknesses, and convey an understanding of how they will behave in the future”.

The road to the decision is more important than the decision itself. We won't gain our users' trust if we build the product as a total black box.

You can start by sharing some key indicators, maybe key features, that may help the users understand what led to that decision or recommendation.
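As a minimal sketch of that idea, assuming a scikit-learn model trained on synthetic data with made-up feature names: rank the features by importance and turn the top ones into a single human-readable line. (Global importances are just a first step; per-prediction tools such as SHAP or LIME go deeper.)

```python
# A minimal sketch: surface the top features behind a model as a short,
# human-readable explanation. Feature names here are illustrative.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

feature_names = ["recent_logins", "support_tickets", "plan_tier", "tenure_months"]
X, y = make_classification(n_samples=500, n_features=4, random_state=42)

model = RandomForestClassifier(random_state=42).fit(X, y)

# Rank features by global importance and keep the top two.
ranked = sorted(zip(feature_names, model.feature_importances_),
                key=lambda pair: pair[1], reverse=True)
top = [name for name, _ in ranked[:2]]

print(f"Recommended mainly because of your {top[0]} and {top[1]}.")
```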

[Image: Amazon recommendation engine]

For example, Amazon's recommendation engine is much more complex than the single line that explains it to users (“Customers who viewed this item also viewed”). But this simple sentence encapsulates the main features driving the recommendation. Users understand enough, and they trust it.

4. Flexibility to Explore & Discover

As product managers, we learned how to work with development teams and which skills are essential for doing so. We define the problem, collect the requirements, and then start the negotiation, scoping, grooming, and prioritization. When done professionally, this process is very fruitful and helps spot issues that were not taken into consideration. Can it work the same way for machine learning-based products? It can, but it requires some adaptations.

When approaching a problem with machine learning, there is a thesis we want to check. A lot of data preparation (generation, cleansing, normalization, etc.) needs to happen first. Don't underestimate this part; it is crucial for success and requires resources (time + people). Only then can the thesis be examined.
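
For a flavor of what that preparation involves, here is a minimal pandas sketch; the file name and column names are hypothetical placeholders.

```python
# A minimal sketch of the preparation step, using pandas and scikit-learn.
# The file and columns are hypothetical placeholders.
import pandas as pd
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("client_activity.csv")                 # load raw data

df = df.drop_duplicates()                               # cleansing: exact duplicates
df = df.dropna(subset=["client_id", "monthly_spend"])   # cleansing: missing key fields
df["monthly_spend"] = df["monthly_spend"].clip(lower=0) # cleansing: impossible values

numeric_cols = ["monthly_spend", "sessions_per_week"]
df[numeric_cols] = StandardScaler().fit_transform(df[numeric_cols])  # normalization

print(df.describe())
```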

Don’t assume you have to start from scratch. There are probably similar projects that did some research. Look for them, learn what they did, adapt and use what may fit your project.

The process usually requires more iterations, as the team tests the thesis with different mathematical tools and algorithms to see whether it solves the problem that was defined. This is a lot of trial and error: you know where you start, but you are not sure where or when the finish line is.

I am not saying that ML should be totally artistic and unmanageable, but you have to give the team a bit more room and flexibility to explore and iterate. How can you still manage this?

  • Prioritize (with the team) which algorithms they are going to test this week/sprint and review the results together (retrospective).
  • Define the acceptance criteria and the KPIs that are good enough to roll out a first alpha/beta version. As with a standard application, you release something and improve it later. Get things done, ship things, don't wait for Mrs. Perfect.
  • Define the benchmark you compare your model to. Remember, use a simple algorithm to set the benchmark; then you can really know if and how your model improves on it (see the baseline sketch after this list).
  • Understand the data that is going into the model and brainstorm whether there is more data that is missing and should be obtained.
  • Sit with the data scientists and try to understand the model, or parts of it. Understand what the data looks like: the distribution, the variables, and their correlations. Learn what data each model requires and what its expected output is. As the problem owner, you would not believe how fruitful such encounters can be.
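
Here is the baseline sketch referenced in the list above, using scikit-learn on synthetic data: score a trivial baseline first, then check whether the real model actually beats it. The model choice is illustrative.

```python
# Score a trivial baseline first, then demand that the real model beats it.
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

baseline = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)
model = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

print(f"baseline accuracy: {baseline.score(X_test, y_test):.2f}")
print(f"model accuracy:    {model.score(X_test, y_test):.2f}")
```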

5. Feedback Collection

Another way to mitigate the uncertainties and unknowns is to collect feedback from the users. Many analytical tools help collect a lot of data, but quantitative data alone may not be enough here: it can still hide the real problem and miss the unknown unknowns. Qualitative data is required as well. That means talking to your users after they have used the product, recording their sessions while they use it, and giving them a very easy way to submit feedback, so they can tell you what's not working for them. Setting up the right environment can then help you reproduce the situation, although that is not always straightforward.
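
As a minimal sketch of such an easy feedback channel (the fields and file format below are illustrative, not a prescribed schema): record a thumbs up/down keyed to the prediction it refers to, so the situation can be reproduced later.

```python
# Record a thumbs up/down on each prediction, keyed to a prediction id,
# so the exact situation can be traced and reproduced later.
import json
import time

FEEDBACK_LOG = "feedback.jsonl"

def record_feedback(prediction_id: str, helpful: bool, comment: str = "") -> None:
    """Append one feedback event, keyed to the prediction it refers to."""
    event = {
        "prediction_id": prediction_id,  # ties feedback to the exact prediction
        "helpful": helpful,
        "comment": comment,
        "timestamp": time.time(),
    }
    with open(FEEDBACK_LOG, "a") as f:
        f.write(json.dumps(event) + "\n")

record_feedback("pred-0042", helpful=False, comment="Prediction felt off")
```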