Setting the record straight on explainable AI: (2nd out of N) Are ML models really black boxes?

Original article was published on Deep Learning on Medium

A black box is a system that can be viewed only in terms of its inputs and outputs (or its transfer characteristics/function), without any knowledge of its internal workings. The opposite is a system whose inner components or logic are available for inspection, commonly referred to as a glass box or white box. When I studied electrical engineering, these terms came up routinely in courses such as control theory, system identification, and digital signal processing, to name a few; in all of them, a system was considered a black box if it could be viewed solely in terms of its inputs and outputs.

In this post, a follow-up to an earlier post of mine on explainable AI (XAI), I would like to argue that ML models are not really black boxes. Quite the opposite: they are glass-box systems whose core logic and underlying computational components are fully visible to their owners, regulators, and anyone else who is supposed to understand and interrogate them. Of course, in some cases, due to IP and other considerations, a model can remain a black box to some; this is different from ML models such as neural networks being black boxes by nature or in general. The common association of ML models with black-box systems can be attributed to circumstances where one:

  1. Sees the model as equal to the actual system it attempts to model, or
  2. Lacks the necessary technical skills to benefit from the model’s glass-box transparency, or
  3. Finds the natural-language explanation complicated/hard to understand (regardless of technical ability) due to the model’s complexity.

ML models are usually glass-box models of some black-box system or phenomenon. Because our ultimate curiosity is about the modelled system, many tend to view ML models as black boxes, despite their being glass boxes, unless the models answer our questions about the underlying black box.

I described the first issue in this article. Imagine a system that we attempt to model. Because the system is a black box, we employ ML (a neural network, for instance) to model its transfer characteristics. A good ML model that we developed ourselves, and that can mimic the system’s behaviour, is not itself a black box. While the former (the system) can be assessed only in terms of its transfer characteristics (or I/O logic), for the latter (the model) we know everything, from the assumptions and training data to the mathematical/statistical logic and more. We knew enough to code every detail of the model and to use its software implementation.
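The distinction between the system and its model can be made concrete with a minimal sketch. Everything here is an illustrative assumption rather than something from the original post: `black_box_system` is a hypothetical stand-in for a real system that we can only query, and a least-squares line stands in for the ML model. The point is only that the fitted model's parameters, training data, and fitting logic are all open to inspection, while the system is observed purely through its inputs and outputs.

```python
import random


def black_box_system(x):
    # Hypothetical stand-in for the real system. In practice we could not
    # read this body; we could only feed inputs and observe outputs.
    return 3.0 * x + 2.0 + random.gauss(0.0, 0.1)


def fit_linear_model(xs, ys):
    # Ordinary least squares for y = slope * x + intercept (closed form).
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return {"slope": slope, "intercept": intercept}


random.seed(0)  # fixed seed so the sketch is reproducible
xs = [i / 10 for i in range(100)]        # inputs we chose to probe
ys = [black_box_system(x) for x in xs]   # outputs we observed
model = fit_linear_model(xs, ys)

# Unlike the system, the model is fully inspectable: its parameters,
# its training data (xs, ys), and its fitting logic are all in view.
print(model)
```

A neural network has far more parameters than this two-parameter line, but the same holds for it: every weight and every operation is available to whoever owns the model.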

The second issue depends on the audience: rather than ML models being objectively black boxes, their opacity is subjective. A model that is a glass box to some can be perceived as a black box by others. Let’s consider a simple hypothetical model. As input, it takes daily systolic blood pressure measurements for this week and last week (i.e., [b¹¹, b¹², …, b¹⁷] and [b²¹, b²², …, b²⁷]) and calculates their correlation; if the correlation has a P-value < 0.05, the model recommends a certain action. This is an extremely simple model, and hence a glass box to anyone familiar with basic computations, yet it is not accessible/explainable to many. Therefore, one can argue that any model beyond a small number of if-then rules risks being a “subjective black box” to some, despite being a glass box to others.
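To show how little code the hypothetical blood-pressure model needs, here is a minimal sketch in plain Python. The post does not say which correlation or which significance test is meant, so this sketch assumes a Pearson correlation and an exact two-sided permutation test (all 7! = 5040 orderings of one week); the function names and the sample readings are made up for illustration.

```python
from itertools import permutations
from math import sqrt


def pearson_r(x, y):
    # Pearson correlation coefficient of two equal-length sequences.
    # Raises ZeroDivisionError if either sequence is constant.
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def permutation_p_value(x, y):
    # Exact two-sided permutation test (an assumption; the post names no test):
    # the share of all orderings of y whose |r| is at least the observed |r|.
    observed = abs(pearson_r(x, y))
    total = 0
    at_least_as_extreme = 0
    for shuffled in permutations(y):
        total += 1
        # Small tolerance guards against floating-point ties.
        if abs(pearson_r(x, shuffled)) >= observed - 1e-12:
            at_least_as_extreme += 1
    return at_least_as_extreme / total


def recommend_action(week_a, week_b, alpha=0.05):
    # The whole "model": one correlation, one p-value, one threshold.
    r = pearson_r(week_a, week_b)
    p = permutation_p_value(week_a, week_b)
    return {"r": r, "p_value": p, "recommend": p < alpha}


# Made-up systolic readings (mmHg), seven days each.
last_week = [120, 122, 125, 130, 128, 135, 140]
this_week = [122, 124, 127, 132, 130, 137, 142]
result = recommend_action(this_week, last_week)
print(result["recommend"])
```

Every step is visible, yet a reader who has never met a correlation coefficient or a P-value gains little from that visibility, which is exactly the sense in which opacity is subjective.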

“Simple is better than complex. Complex is better than complicated.”

Related to the third issue: in most XAI discussions, what people really mean by explanations are partial explanations and simple (or, more appropriately, simplified) ones. Regardless of the type of ML model used, when a model deals with many input variables (i.e., a high-dimensional input space), and hence with many more possible combinations of those variables that it needs in order to make accurate predictions, any complete explanation of the model will be too lengthy and too complicated. In other words, in most cases such simple explanations will not qualify as complete explanations. This is why scientists in the domain take a different approach to understanding their models. They first understand and trust the maths behind their models; then they train and validate them on appropriate data, understand their models’ edge cases (e.g., where do they perform well and where poorly?), check whether their models’ assumptions hold, attempt to break their models (being their own models’ biggest critics), interrogate some of their models’ inner workings (à la partial explanations), and more. They do not explain the model to themselves in, and get on board based on, a few simple natural-language sentences.
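A rough back-of-the-envelope sketch shows why a complete explanation becomes unmanageable as inputs grow. It assumes, purely for illustration, that every input variable is reduced to the same small number of levels (two by default); real inputs are usually richer, which only makes the count larger. The function name is hypothetical.

```python
def complete_explanation_size(n_features, levels_per_feature=2):
    # Number of distinct input combinations that a complete, case-by-case
    # explanation would have to cover under the stated simplification.
    return levels_per_feature ** n_features


for n in (3, 10, 30):
    print(n, complete_explanation_size(n))
```

With three yes/no inputs there are 8 cases to explain; with thirty there are over a billion, which is why a few natural-language sentences can only ever be a partial account.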

In summary, ML models are not black boxes; rather, they are likely to be models of black-box systems. When attempting to explain ML models, one can face varying degrees of difficulty communicating them, depending on the audience’s technical skills (and appetite for getting to know a very complex system and/or complicated concept). A simple, yet not complete or perfect, way to bridge this gap might be such simplified/partial explanations. The ultimate solution, however, is trust; it can be built through many approaches, including, but not limited to, model explanation.