Navigating AI with Interpretability vs. Explainability
Both interpretability and explainability share a common goal: to assist users in making sense of AI.
But they use two very distinct pathways – interpretability digs into the nitty-gritty of a model's inner mechanics, while explainability focuses on the core idea behind its predictions.
In ancient Crete, King Minos crossed the gods by keeping a sacred bull for himself instead of offering it in sacrifice. Little did he know, the consequences of his actions would be monstrous.
His wife, cursed by the gods, fell madly in love with the majestic bull and gave birth to the Minotaur; half man, half bull. Ashamed and terrified, King Minos ordered the construction of an enormous, intricate labyrinth to imprison the monstrous offspring. Daedalus crafted an unparalleled labyrinth with endless twisting corridors, dead ends, and stairways to nowhere. It was a maze so convoluted that no mortal could navigate it. The Minotaur was condemned to wander the labyrinth for eternity.
Year after year, Athenian youths were sacrificed to the Minotaur's insatiable appetite until a hero arose: Theseus, son of Aegeus, the king of Athens. Armed with nothing but courage and cunning, he ventured to the labyrinth, determined to slay the beast and end the terror.
Desperate, he looked to Daedalus for help. But Daedalus wouldn't provide Theseus with a map. As the Fates would have it, Theseus found an ally in Ariadne, the daughter of King Minos. Smitten by the young hero, Ariadne (with a little hint from Daedalus) offered Theseus a lifeline: a ball of thread, which he unraveled as he ventured deeper into the labyrinth, creating a path to retrace his steps.
Guided by Ariadne's thread, Theseus navigated the bewildering maze, and in the heart of the labyrinth, he faced the Minotaur. A ferocious battle ensued, with the very walls of the maze seeming to tremble with each powerful blow. But in the end, Theseus emerged victorious, slaying the beast and freeing the people of Athens from the gruesome tribute.
Navigating the complex world of AI can feel like walking through a labyrinth, with each turn revealing new surprises and challenges.
Interpretability and explainability are two different paths through the labyrinth for two different kinds of users (more on that below).
Picture interpretability as an architectural map, detailing every twist and turn of a model's inner workings. Experts can scrutinize every corner, ensuring the structure is sound, while marveling at the intricate design.
Envision explainability as Ariadne's thread: it doesn't give up all the secrets of how the AI works, but it lets users find their way. Users take away easily digestible explanations of AI decisions, making the technology accessible without getting lost in the intricate details.
The Two Axes: Understandability and Fidelity - Unraveling the Labyrinth
Let’s get to the core. Why are we interested in explaining the model in the first place?
In the end, every model interacts with a user. If the model isn't used, it has failed, no matter how accurate it is.
In ML, most of the time we are dealing with predictions. To gain a user's acceptance, we must make the "why" behind a specific prediction clear.
Think of an explanation as an answer to a "why" question. This understanding is the key to building trust and faith in the technology.
When we use interpretable models, we're often simplifying the problem at hand to make it easier for humans to understand. This comes at a cost: a loss in fidelity. A high-fidelity model captures the essence of the underlying patterns and relationships in the data. Think of it like a map: a high-fidelity map represents the terrain accurately, while a low-fidelity map may be oversimplified or miss crucial details.
“ML models are apparently the antithesis; by normal design, they excel at prediction but remain black boxes impervious to interpretation or explanation.”
- Jessy Lin

The fidelity-understandability tradeoff (assuming complex patterns in the data).
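To make the fidelity idea concrete, here is a minimal sketch of my own (assuming scikit-learn and synthetic data, not anything from the chart above): it trains a black-box model, fits a shallow decision tree to mimic it, and measures fidelity as how often the two agree.

```python
# A minimal sketch of the tradeoff: fidelity as agreement between a
# simple surrogate and a black-box model. Assumes scikit-learn and
# synthetic data; numbers are illustrative only.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The "labyrinth": accurate, but hard to inspect.
black_box = RandomForestClassifier(n_estimators=200, random_state=0)
black_box.fit(X_train, y_train)

# The "map": a shallow tree trained to mimic the black box's answers.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X_train, black_box.predict(X_train))

accuracy = accuracy_score(y_test, black_box.predict(X_test))
fidelity = accuracy_score(black_box.predict(X_test), surrogate.predict(X_test))
print(f"black-box accuracy: {accuracy:.2f}")
print(f"surrogate fidelity (agreement with the black box): {fidelity:.2f}")
# Deepening the surrogate raises fidelity but costs understandability.
```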
A Primer on Interpretability
Interpretability is for the folks who know the ins and outs of AI: developers, engineers, and data scientists. They're the ones who can take apart an AI system and really see how it ticks, looking at its structure, logic, and decision-making.
But as algorithms get more complex, fewer people can really get a handle on interpretability.

The inner workings equal how the user thinks the AI works (their mental instruction manual).
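As a small illustration of what that kind of inspection can look like (the dataset and model choice here are my own, purely for demonstration): a shallow decision tree whose entire decision logic can be printed and read end to end.

```python
# A tiny sketch of "taking the model apart": print the exact rules a
# small decision tree uses. Dataset and depth chosen for illustration.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(iris.data, iris.target)

# The complete decision logic, readable line by line -- interpretability.
print(export_text(tree, feature_names=list(iris.feature_names)))
```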
There's a big debate among scientists about whether interpretability is truly necessary. Sure, more interpretable models, like regressions, might be easier to grasp, but some important insights might slip through the cracks.
And interpretable models might just be a half-measure. When you have a regression model with loads of variables and interactions, it's like trying to untangle a gigantic ball of yarn. Things get messy, and it's hard to see the individual threads. Even AI experts can have a hard time figuring out what's happening.
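To put a rough number on that ball of yarn (again a sketch of my own, assuming scikit-learn): even second-order interaction terms alone multiply the coefficients a reader would have to untangle.

```python
# A rough sketch of how interaction terms inflate a "simple" regression.
# Counts the coefficients of a degree-2, interactions-only expansion.
from sklearn.preprocessing import PolynomialFeatures

for n_features in (5, 10, 20, 50):
    expander = PolynomialFeatures(degree=2, interaction_only=True, include_bias=False)
    n_terms = expander.fit_transform([[0.0] * n_features]).shape[1]
    print(f"{n_features:>2} variables -> {n_terms:>4} coefficients to interpret")
# 5 variables stay readable; 50 variables already mean 1,275 coefficients.
```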
In my view, you don't always need to know every little detail about how AI makes decisions to use it well. It's like doctors don't need to know every single chemical reaction when they give you a prescription. Heck, for many drugs, we don't even know how they work. As long as the AI is accurate and helps people, most folks won't think twice about using it.
The trick is finding the sweet spot between trust and results.
And that's when explainability comes into play.
A Primer on Explainability
Explainability acts as a translator, taking the complex language of AI and transforming it into something everyday users can understand and relate to.
Explanations bridge the gap into a space where no AI knowledge is necessary.
The user needs to comprehend the model's outcomes and make informed choices based on its predictions, without necessarily understanding the underlying mechanisms.

The inner workings approximately equal how the user thinks the AI works (their mental instruction manual).
A common way to do this is through examples.
Let's bring back our favorite AI analogy: Understanding AI can be much like trying to figure out a recipe. The whole dish might seem complicated, but if we focus on just one ingredient, we can better understand how it affects the overall taste. For instance, what difference would 2 grams of salt make to the flavor?
Instead of tackling the entire model, we can zero in on a single observation to help the user see how the model works.
Picture a model that decides whether a bank customer should get a loan. We want to examine how Max Mustermann's income influences the model's prediction of his loan eligibility. By tweaking his income up or down, we can observe how the AI's decision changes.
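Here's a minimal sketch of that what-if idea. Everything in it is hypothetical (toy data, a toy scikit-learn model, invented feature names); the point is only the pattern: hold the applicant fixed, vary one input, and watch the prediction move.

```python
# A what-if sketch: vary one applicant's income, keep everything else
# fixed, and watch the approval probability change. All data and the
# model are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Toy training data: columns are [income_eur, existing_debt_eur].
incomes = rng.uniform(20_000, 120_000, size=500)
debts = rng.uniform(0, 60_000, size=500)
X = np.column_stack([incomes, debts])
y = (incomes - 0.8 * debts > 40_000).astype(int)  # 1 = loan approved

model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X, y)

# Max Mustermann's current application: 45k income, 30k debt.
max_mustermann = np.array([45_000.0, 30_000.0])

for income in (35_000, 45_000, 55_000, 65_000):
    applicant = max_mustermann.copy()
    applicant[0] = income  # tweak only the income
    p_approve = model.predict_proba(applicant.reshape(1, -1))[0, 1]
    print(f"income {income:>7,} EUR -> approval probability {p_approve:.2f}")
```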
(Stay tuned for a deeper dive into different methods of explainability)
Tailor it to the Audience: Dependence of Explainability and Interpretability
In the quest to make AI systems understandable, we always have to go back to the user. We encounter a broad set of characters - model developers, business owners, decision-makers, those directly affected, and even regulatory bodies. They each bring their unique perspective, but we can divide them into two main groups: experts and laymen.
The most valuable tools are often those that serve both experts and novices, enabling them to comprehend a system. If given the chance, opt for explainability.
To-Do
Interpretability and explainability are two paths through the labyrinth of AI, one for experts and one for laymen, but the most valuable tools are those that serve both.
Assess your user’s expertise:
Does the user need to understand the model? If the user needs to extend or fine-tune the model, interpretability techniques may be more appropriate.
Does the user have enough knowledge to comprehend the model at all?
Assess the task:
How accurate (fidelity) must the model be to deliver value?
Does fidelity trump understandability for the task? For example, if the decision being made is critical, such as a cancer diagnosis, higher fidelity may be more important than understandability.
Are there any regulations in place that require a level of explanation?
Based on the required performance and understandability, pick an appropriate model from the above graph.
Learning Lab
A. Zheng - Explained: How to tell if artificial intelligence is working the way we want it to
H. Cheng - Explaining Decision-Making Algorithms through UI: Strategies to Help Non-Expert Stakeholders
J. Lin - Rethinking Human-AI Interaction
Google - People + AI Research: Patterns
E. Kamar - Mental Models in Human-AI Team Performance
Q. Liao - Questioning the AI: Informing Design Practices for Explainable AI User Experiences
Q. Liao - Human-Centered Explainable AI (XAI): From Algorithms to User Experiences