Outline
- What is Explainable AI?
- Desiderata of an Explainable AI technique
- Uses of Explainable AI
- Methods for Explainable AI
  - Activation Maximization
  - Shapley Values
  - Taylor Expansions
  - Layer-wise Relevance Propagation
1/24
What is Explainable AI?
Standard machine learning:
- The function f is typically considered to be a “black-box” whose parameters are learned from the data using e.g. gradient descent. The objective to minimize encourages the predictions f(x) to coincide with the ground truth on the training and test data.
[Diagram: inputs x1, x2, ..., xd fed into the black-box function f(x)]
Machine learning + Explainable AI:
- We look not only at the outcome f(x) of the prediction but also at the way the prediction is produced by the ML model, e.g. which features are used, how these features are combined, or to what input pattern the model responds the most.
2/24
What is Explainable AI?
Example 1: Synthesize an input pattern that most strongly activates the output of the ML model associated with a particular class.
Image source: Nguyen et al. (2016) Multifaceted Feature Visualization: Uncovering the Different Types of Features Learned By Each Neuron in Deep Neural Networks
3/24
What is Explainable AI?
Example 2: Highlight the features that have contributed to the ML prediction for a given data point.
Image source: Lapuschkin et al. (2016) Analyzing Classifiers: Fisher Vectors and Deep Neural Networks
[Figure: test images and corresponding heatmaps for the classes bike, person, cat, train, and dining table]
4/24
What is Explainable AI?
Example 3: Concept activation vectors (TCAV). Highlight the mid-level concepts that explain the ML prediction for a given data point.
Source: Google Keynote’19 (URL: https://www.youtube.com/watch?v=lyRPyRKHO8M&t=2279s)
5/24
Desiderata of an Explanation
In practice, we would like the explanation technique to satisfy a number of properties:
1. Fidelity: The explanation should reflect the quantity being explained and not something else.
2. Understandability: The explanation must be easily understandable by its receiver.
3. Sufficiency: The explanation should provide sufficient information on how the model came up with its prediction.
4. Low Overhead: The explanation should not cause the prediction model to become less accurate or less efficient.
5. Runtime Efficiency: Explanations should be computable in reasonable time.
see also Swartout & Moore (1993), Explanation in Second Generation Expert Systems.
[Figure: example image and heatmap for the class ‘train’]
6/24
Uses of an Explanation
Verify (and improve?) a ML model
- Verify that the model is based on features which generalize well to examples outside the current data distribution (this cannot be done with standard validation techniques!).
- Reliance of ML models on the wrong features is often encountered when there are spurious correlations in the data.
- From the explanation, the model’s trustworthiness can be reevaluated, and the flawed ML model can potentially be retrained based on the user feedback.
7/24
Uses of an Explanation
Example: The classifier is right for the wrong reasons
[Figure: test image and heatmap for the class ‘horse’]
Average precision of the Fisher Vector model on the PascalVOC dataset (class abbreviations):

aer 79.08   bic 66.44   bir 45.90   boa 70.88   bot 27.64
bus 69.67   car 80.96   cat 59.92   cha 51.92   cow 47.60
din 58.06   dog 42.28   hor 80.45   mot 69.34   per 85.10
pot 28.62   she 49.58   sof 49.31   tra 82.71   tvm 54.33
- In this example, the classifier accurately predicts the horse class, but it does so based on the wrong features (a copyright tag in the corner of the image).
- This incorrect decision strategy cannot be detected by just looking at the test error.
cf. Lapuschkin et al. (2019) Unmasking Clever Hans Predictors and Assessing What Machines Really Learn. Nature Communications
8/24
Uses of an Explanation
Learn something about the data (or about the system that produced the data)
- Step 1: Train a ML model that predicts the data well.
- Step 2: Apply XAI to the trained ML model to produce explanations of the ML decision strategy.
- Step 3: Based on the XAI explanations, the user can compare their reasoning with that of the ML model, and can potentially refine their own domain knowledge.
Image source: Thomas et al. (2019) Analyzing Neuroimaging Data Through Recurrent Deep Learning Models
9/24
Part II: Methods of XAI
Presented methods
- Activation maximization
- Shapley values
- Taylor expansions
- Layer-wise relevance propagation
Other methods
- Surrogate models (LIME)
- Integrated gradients / expected gradients / SmoothGrad
- Influence functions
- …
10/24
Activation Maximization
Assume a trained ML model (e.g. a neural network), and suppose we would like to understand what concept is associated with some particular output neuron of the ML model, e.g. the output neuron that codes for the class ‘cat’. Activation maximization proceeds in two steps:
- Step 1: Think of the ML model as a function of the input.
- Step 2: Explain the function f by generating a maximally activating input pattern:
  $$x^\star = \arg\max_x f(x, \theta)$$
11/24
Activation Maximization
Problem: In most cases, f(x) does not have a single point corresponding to the maximum.
E.g. in linear models, $f(x) = w^\top x + b$, we can keep moving the point x further along the direction w, and the output continues to grow.
Therefore, we would like to express a preference for ‘regular’ regions of the input domain, i.e.
$$x^\star = \arg\max_x \; f(x) + \Omega(x)$$
In practice, the preference can be for data points with small norm (i.e. we set $\Omega(x) = -\lambda \|x\|^2$ so that points with large norm are penalized).
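A minimal sketch of this regularized activation maximization by gradient ascent, assuming a small PyTorch classifier; the model, target class, and hyperparameters below are illustrative and not taken from the slides:

```python
import torch
import torch.nn as nn

# Illustrative stand-in for a trained classifier f (not a model from the slides).
model = nn.Sequential(nn.Linear(100, 64), nn.ReLU(), nn.Linear(64, 10))
model.eval()

target_class = 3          # output neuron to maximize (hypothetical)
lam = 0.01                # weight of the norm penalty Omega(x) = -lam * ||x||^2

x = torch.zeros(1, 100, requires_grad=True)
optimizer = torch.optim.Adam([x], lr=0.1)

for _ in range(500):
    optimizer.zero_grad()
    score = model(x)[0, target_class]          # f(x) for the chosen class
    objective = score - lam * (x ** 2).sum()   # f(x) + Omega(x)
    (-objective).backward()                    # gradient *ascent* via minimizing the negative
    optimizer.step()

x_star = x.detach()   # prototype input that (locally) maximizes the class score
```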
12/24
Activation Maximization: Examples
- $f(x) = w^\top x + b$ and $\Omega(x) = -\lambda \|x\|^2$
- $f(x) = \max(x_1, x_2)$ and $\Omega(x) = -\lambda \|x\|^2$
13/24
Activation Maximization: Probability View
Assume the model produces a log-probability for class $\omega_c$:
$$f(x) = \log p(\omega_c \mid x)$$
The input $x^\star$ that maximizes this function can be interpreted as the point where the classifier is the most confident about class $\omega_c$.
Choose the regularizer $\Omega(x) = \log p(x)$, i.e. favor points that are likely.
The optimization problem becomes:
$$x^\star = \arg\max_x \; \log p(\omega_c \mid x) + \log p(x) = \arg\max_x \; \log p(x \mid \omega_c)$$
where $x^\star$ can now be interpreted as the most typical input for class $\omega_c$.
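The second equality follows from Bayes' rule, since $\log p(\omega_c)$ does not depend on $x$; a short derivation:

```latex
\begin{aligned}
\arg\max_x \; \log p(\omega_c \mid x) + \log p(x)
  &= \arg\max_x \; \log \big( p(\omega_c \mid x)\, p(x) \big) \\
  &= \arg\max_x \; \log p(x, \omega_c) \\
  &= \arg\max_x \; \log p(x \mid \omega_c) + \log p(\omega_c) \\
  &= \arg\max_x \; \log p(x \mid \omega_c).
\end{aligned}
```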
14/24
Attribution of a Prediction to Input Features
[Diagram: input → ML black-box → prediction, with an attribution of the prediction onto the input features]
1. The data $x \in \mathbb{R}^d$ is fed to the ML black-box and we get a prediction $f(x) \in \mathbb{R}$.
2. We explain the prediction by determining the contribution of each input feature.
Key property of an explanation: conservation ($\sum_{i=1}^d \phi_i = f(x)$).
15/24
Attribution: Shapley Values
- Framework originally proposed in the context of game theory (Shapley 1951) for assigning payoffs in a cooperative game, and recently applied to ML models.
- Each input variable is viewed as a player, and the function output as the profit realized by the cooperating players.
The Shapley values $\phi_1, \dots, \phi_d$ measuring the contribution of each feature are:
$$\phi_i = \sum_{S \,:\, i \notin S} \frac{|S|!\,(d - |S| - 1)!}{d!} \,\big( f(x_{S \cup \{i\}}) - f(x_S) \big)$$
where the sum runs over all subsets $S$ of features not containing feature $i$, and $x_S$ denotes the input $x$ restricted to the subset of features $S$.
16/24
Attribution: Shapley Values
Recall:
$$\phi_i = \sum_{S \,:\, i \notin S} \underbrace{\frac{|S|!\,(d - |S| - 1)!}{d!}}_{\alpha_S} \, \underbrace{\big( f(x_{S \cup \{i\}}) - f(x_S) \big)}_{\Delta_S}$$
Worked-through example: Consider the function $f(x) = x_1 \cdot (x_2 + x_3)$. Calculate the contribution of each feature to the prediction at $x = (1, 1, 1)$, i.e. $f(x) = 1 \cdot (1 + 1) = 2$.
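A brute-force sketch of this computation in Python (a minimal example; it assumes that features absent from S are replaced by a baseline value of 0, a choice not specified on the slide):

```python
from itertools import combinations
from math import factorial

def f(x1, x2, x3):
    return x1 * (x2 + x3)

x = (1, 1, 1)   # point to explain, f(x) = 2
d = 3

def f_masked(S):
    # Evaluate f with features outside S replaced by the baseline value 0.
    masked = [x[i] if i in S else 0 for i in range(d)]
    return f(*masked)

phi = []
for i in range(d):
    others = [j for j in range(d) if j != i]
    contrib = 0.0
    for size in range(d):
        for S in combinations(others, size):
            alpha = factorial(size) * factorial(d - size - 1) / factorial(d)
            delta = f_masked(set(S) | {i}) - f_masked(set(S))
            contrib += alpha * delta
    phi.append(contrib)

print(phi)        # approximately [1.0, 0.5, 0.5]
print(sum(phi))   # approximately 2.0 = f(x), i.e. conservation holds
```

The exact computation enumerates every subset of the remaining features for each feature, which is what makes Shapley values infeasible for high-dimensional inputs (cf. the summary slide).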
17/24
Attribution: Taylor Expansions
- Many ML models f(x) are complex and nonlinear when taken globally but are simple and linear when taken locally.
- The function can be approximated locally by some Taylor expansion:
  $$f(x) = f(\tilde{x}) + \sum_{i=1}^{d} \underbrace{[\nabla f(\tilde{x})]_i \cdot (x_i - \tilde{x}_i)}_{\phi_i} + \dots$$
- First-order terms $\phi_i$ of the expansion can serve as an explanation.
- The explanation $(\phi_i)_i$ depends on the choice of the root point $\tilde{x}$.
18/24
Attribution: Taylor Expansions
Example: Attribute the prediction $f(x) = w^\top x$ with $x \in \mathbb{R}^d$ on the d input features.
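A worked solution, under the assumption (not stated on the slide) that the root point is chosen as $\tilde{x} = 0$:

```latex
\phi_i = [\nabla f(\tilde{x})]_i \cdot (x_i - \tilde{x}_i) = w_i \, x_i ,
\qquad
\sum_{i=1}^{d} \phi_i = w^\top x = f(x) - \underbrace{f(\tilde{x})}_{=\,0} .
```

For this linear model the expansion is exact, the higher-order terms vanish, and the conservation property is satisfied.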
19/24
Attribution: Taylor Expansions
Limitations: Gradient information is too localized.
- Cannot handle saturation effects and discontinuities, e.g. cannot explain the function
  $$f(x) = \sum_{i=1}^{d} \big( x_i - \max(0, x_i - \theta) \big)$$
  at the point x = (2, 2).
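To see why the gradient is uninformative here (assuming $0 < \theta < 2$, so that both inputs are in the saturated regime):

```latex
\frac{\partial f}{\partial x_i}\bigg|_{x=(2,2)} = 1 - \mathbb{1}[x_i > \theta] = 0
\quad \text{for } i = 1, 2,
\qquad \text{while} \qquad
f(2, 2) = 2\theta \neq 0 .
```

The first-order terms therefore attribute nothing to either feature, even though the function value is clearly non-zero.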
This limitation can be overcome by looking at the structure of the model and decomposing the problem of explanation into multiple parts (→ next slide).
20/24
Attribution: Look at the Structure of the Model
[Diagram: small two-layer network with inputs x1, x2, hidden neurons a3, ..., a6 (first-layer weights w_ij), and output y_out (second-layer weights v_j)]
Observation:
- The function implemented by a ML model is typically a composition of simple elementary functions.
- These functions are simpler to analyze than the whole input-output function.
Idea:
- Treat the problem of explanation as propagating the prediction backward in the input-output graph.
- The layer-wise relevance propagation (LRP) method implements this approach and can be used to explain ML models (→ next slide).
21/24
Attribution: The LRP Method
Example: Consider $y_\text{out}$ to be the quantity to explain: $R_\text{out} \leftarrow y_\text{out}$
[Diagram: the same two-layer network, with relevance propagated from y_out back to the inputs]
- Step 1: Propagate on the hidden layer:
  $$\forall_{j=3}^{6}: \quad R_j \leftarrow \frac{a_j v_j}{\sum_{j=3}^{6} a_j v_j} \, R_\text{out}$$
- Step 2: Propagate on the first layer:
  $$\forall_{i=1}^{2}: \quad R_i \leftarrow \sum_{j=3}^{6} \frac{x_i w_{ij}}{\sum_{i=1}^{2} x_i w_{ij}} \, R_j$$
Note: Other propagation rules can be engineered, and choosing appropriate propagation rules is important to ensure LRP works well for practical neural network architectures.
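A minimal NumPy sketch of these two propagation steps on a small two-layer ReLU network; the weights below are illustrative (not those of the slide's diagram), and a small stabilizer is added to the denominators to avoid division by zero:

```python
import numpy as np

# Illustrative two-layer network (weights are made up, not those of the slide's diagram).
W = np.array([[ 1.0, -0.5,  2.0,  0.5],    # first-layer weights w_ij, shape (2 inputs, 4 hidden)
              [-1.0,  1.5,  0.5,  1.0]])
v = np.array([ 1.0,  0.5, -1.0,  2.0])     # second-layer weights v_j

x = np.array([1.0, 2.0])                   # input to explain

# Forward pass: hidden activations and output.
a = np.maximum(0.0, x @ W)                 # a_j = max(0, sum_i x_i w_ij)
y_out = a @ v                              # y_out = sum_j a_j v_j

eps = 1e-9                                 # small stabilizer (numerical safeguard)

# Step 1: propagate the output relevance onto the hidden layer.
R_out = y_out
z = a * v                                  # contributions a_j v_j
R_hidden = z / (z.sum() + eps) * R_out

# Step 2: propagate the hidden relevances onto the input features.
Z = x[:, None] * W                         # contributions x_i w_ij, shape (2, 4)
R_input = (Z / (Z.sum(axis=0) + eps) * R_hidden).sum(axis=1)

print(R_input, R_input.sum(), y_out)       # input relevances approximately sum to y_out
```

Up to the stabilizer, each step redistributes the incoming relevance proportionally to the contributions $a_j v_j$ and $x_i w_{ij}$, so the conservation property $\sum_i R_i \approx y_\text{out}$ holds.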
22/24
Attribution: The LRP Method
Effect of LRP rules on the explanation (e.g. class ‘castle’ predicted by a VGG-16 neural network):
[Figure: input image and heatmaps obtained by applying different propagation rules (LRP-0, LRP-ε, LRP-γ) to the layers of the VGG-16 network]
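For reference, the ε and γ variants are commonly defined in the LRP literature as follows (these definitions are an assumption based on that literature, not spelled out on the slide); $a_j$ denotes the activation of neuron $j$, $w_{jk}$ the weight connecting it to neuron $k$ in the next layer, and $w_{jk}^{+} = \max(0, w_{jk})$:

```latex
\text{LRP-}\epsilon:\quad
R_j = \sum_k \frac{a_j w_{jk}}{\epsilon + \sum_j a_j w_{jk}} \, R_k ,
\qquad
\text{LRP-}\gamma:\quad
R_j = \sum_k \frac{a_j (w_{jk} + \gamma w_{jk}^{+})}{\sum_j a_j (w_{jk} + \gamma w_{jk}^{+})} \, R_k .
```

The stabilizer ε reduces noise in the explanation, while γ puts more emphasis on positive contributions.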
23/24
Summary
- Explainable AI is an important addition to classical ML models (e.g. for validating a ML model or extracting knowledge from it).
- Many XAI methods have been developed, each with its own strengths and limitations:
  - Activation maximization can be used to understand what a ML model has learned, but it is unsuitable for explaining an individual prediction f(x).
  - Shapley values have strong theoretical foundations, but are computationally infeasible for high-dimensional input data.
  - Taylor expansions are simple and theoretically founded for simple models, but the expansion does not extrapolate well to complex nonlinear models.
  - LRP leverages the structure of the ML model to handle nonlinear decision functions, but requires carefully choosing the propagation rules.
24/24