9. Illustrative Example
■ Our Idea
1. Given the model 𝑓 and the point 𝑥 to be explained.
2. Fit the decision boundary from inside using a box.
3. Measure the side length as the irrelevance of each feature.
[Figure: the point 𝑥 and the decision boundary of 𝑓 in the (𝑥₁, 𝑥₂) plane, with a box fitted inside the boundary. The box side along 𝑥₁ is short (= 𝑥₁ is highly relevant); the side along 𝑥₂ is long (= 𝑥₂ is less relevant).]
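A minimal numerical sketch of steps 2–3 (my own illustrative code, not the paper's implementation): given a candidate box, check by sampling that the model's decision is invariant everywhere inside it.

```python
import numpy as np

def is_invariant(f, x, cls, u, v, n_samples=1000, seed=0):
    """Empirically check that class `cls` stays the argmax of f(x + r)
    for perturbations r inside the box [-u_1, v_1] x ... x [-u_d, v_d].
    A sampled check, not a certificate; `f` maps a vector to class scores."""
    rng = np.random.default_rng(seed)
    # draw each r_i uniformly from [-u_i, v_i]
    r = rng.uniform(-u, v, size=(n_samples, x.size))
    preds = np.argmax(np.array([f(x + ri) for ri in r]), axis=1)
    return bool(np.all(preds == cls))
```

Growing 𝑢 and 𝑣 as far as this check allows approximates "fit the boundary from inside"; the side lengths 𝑢ᵢ + 𝑣ᵢ then give each feature's irrelevance.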
10. [Mathematical Definition]
■ Consider the box $R(u, v)$.
• $R(u, v) := [-u_1, v_1] \times [-u_2, v_2] \times \cdots \times [-u_d, v_d]$
■ Def. Maximal Invariant Perturbation
$(u^*, v^*) = \arg\max_{u, v \ge 0} \sum_i (u_i + v_i)$
$\text{s.t. } c = \arg\max_j f_j(x + r), \; \forall r \in R(u, v)$
• Find the largest box that fits inside the decision boundary.
■ Def. Feature Relevance
Measure the relevance of feature $x_i$ by $-(u^*_i + v^*_i)$.
[Figure: the box $R(u, v)$ around 𝑥, extending $u_1, v_1$ along 𝑥₁ and $u_2, v_2$ along 𝑥₂.]
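For intuition, consider the special case of a linear model (an illustrative assumption, not the general setting): there the invariance constraint over the whole box collapses to a single corner condition, because a linear function over a box is minimized at a corner.

```latex
% Illustrative special case: linear scores f_j(z) = w_j^\top z + b_j.
% Let w = w_c - w_j and m_j = f_c(x) - f_j(x) (the margin at x).
% Then c = \arg\max_j f_j(x + r) for all r \in R(u, v) iff, for every j \ne c,
m_j + \min_{r \in R(u,v)} w^\top r \ge 0,
\quad\text{where}\quad
\min_{r \in R(u,v)} w^\top r
  = -\sum_i \bigl( \max(w_i, 0)\, u_i + \max(-w_i, 0)\, v_i \bigr).
```

The resulting condition is linear in $(u, v)$, which is the structure the LP on the next slide exploits after linearizing $f$.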
11. [Algorithm] LP with linear approximation.
■ Difficulty
• The constraint $c = \arg\max_j f_j(x + r)$ is highly complex.
■ Idea: Use linear approximation.
• $f_j(x + r) \approx f_j(x) + \nabla f_j(x)^\top r$
■ Approximate Problem
$(u^*, v^*) = \arg\max_{u, v \ge 0} \sum_i (u_i + v_i)$
$\text{s.t. } f_c(x) + \nabla f_c(x)^\top r \ge f_j(x) + \nabla f_j(x)^\top r,$
$\quad \forall j \ne c, \; \forall r \in R(u, v) \cap \{r : \|r\| \le \delta\}$
cf. $f_c(x + r) \ge f_j(x + r) \; \forall j \iff c = \arg\max_j f_j(x + r)$
• Several extensions of the LP formulation appear in the paper.
(Linear programming: linear objective + linear constraints.)
(The set $\{r : \|r\| \le \delta\}$ limits the perturbation size to at most $\delta$.)
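The approximate problem becomes an ordinary LP once each box constraint is reduced to its worst corner (a linear function over a box is minimized at a corner). A minimal sketch with `scipy.optimize.linprog`, assuming logits and gradients are precomputed, and approximating the $\delta$-ball by per-coordinate bounds (an $\ell_\infty$ ball); the function and variable names are my own, not the paper's:

```python
import numpy as np
from scipy.optimize import linprog

def maximal_box(logits, grads, cls, delta):
    """Sketch of the approximate LP.  For each competitor class j, the
    constraint  f_cls(x) + g_cls^T r >= f_j(x) + g_j^T r  over the box
    R(u, v) is tightest at one corner, which makes it linear in (u, v):
        sum_i max(w_i, 0)*u_i + max(-w_i, 0)*v_i <= f_cls(x) - f_j(x),
    where w = g_cls - g_j.  Variables are stacked as [u; v]."""
    d = grads.shape[1]
    A_ub, b_ub = [], []
    for j in range(len(logits)):
        if j == cls:
            continue
        w = grads[cls] - grads[j]
        A_ub.append(np.concatenate([np.maximum(w, 0.0), np.maximum(-w, 0.0)]))
        b_ub.append(logits[cls] - logits[j])
    # maximize sum(u) + sum(v)  <=>  minimize -(sum(u) + sum(v));
    # the delta-ball is approximated by bounds 0 <= u_i, v_i <= delta.
    res = linprog(-np.ones(2 * d), A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  bounds=[(0.0, delta)] * (2 * d))
    u, v = res.x[:d], res.x[d:]
    return u, v, -(u + v)   # relevance: short side = highly relevant
```

Following the definition above, the returned score $-(u^*_i + v^*_i)$ is near zero (high) for a feature the box can barely extend along, and strongly negative (low) for a feature with a long side.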
12. [Experiment] Evaluation on VGG16
■ Model: VGG16 [Simonyan & Zisserman, ICLR'15]
■ Dataset: COCO-animals (cs231n.stanford.edu/coco-animals.zip)
• Images of eight species of animals.
• Use 200 images in the validation set.
■ Baseline methods
• saliency (https://github.com/PAIR-code/saliency)
- Gradient [Simonyan et al., arXiv’14]
- GuidedBP [Springenberg et al., arXiv’14]
- SmoothGrad [Smilkov et al., arXiv’17]
• DeepExplain (https://github.com/marcoancona/DeepExplain)
- LRP [Bach et al., PloS ONE’15]
- IntegratedGrad [Sundararajan et al., arXiv’17]
- DeepLIFT [Shrikumar et al., ICML’17]
- Occlusion
13. [Experiment] Evaluation on VGG16
■ Evaluation: identifying relevant image patches.
1. Compute the sum of scores within each 8 × 8 patch.
2. Flip the top-K most important patches to gray.
3. Observe whether the decision changes to another class.
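Steps 1–2 above can be sketched as follows (a hedged sketch: the array shapes, the gray value, and reading "8 × 8" as 8-pixel-square patches are my assumptions, not details from the slides):

```python
import numpy as np

def flip_top_k_patches(image, scores, k, patch=8, gray=0.5):
    """Evaluation sketch: (1) sum attribution scores within each
    patch x patch block, (2) flip the k highest-scoring blocks to a
    flat gray.  `image` and `scores` are (H, W) arrays; a real run
    would use RGB images and re-query the classifier afterwards."""
    H, W = scores.shape
    ph, pw = H // patch, W // patch
    # step 1: per-block score via a (ph, patch, pw, patch) reshape
    blocks = scores[:ph * patch, :pw * patch]
    patch_scores = blocks.reshape(ph, patch, pw, patch).sum(axis=(1, 3))
    # step 2: gray out the k highest-scoring blocks
    flipped = image.copy().astype(float)
    for idx in np.argsort(patch_scores, axis=None)[::-1][:k]:
        i, j = np.unravel_index(idx, patch_scores.shape)
        flipped[i * patch:(i + 1) * patch, j * patch:(j + 1) * patch] = gray
    return flipped
```

Step 3 then re-runs the model on the flipped image and checks whether the predicted class has changed.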
■ Result
• The proposed methods were the most effective at identifying relevant image patches.
• More than half of the images changed class with only a few flips.
16. Summary
■ Let's define "the solution."
• Setting a goal is the driving force.
■ Proposed a new definition of feature scoring/attribution.
• Measure the robustness of the model's decision against input perturbations.
■ Proposed an LP-based algorithm.
• Uses linear approximation.
■ The algorithm is still primitive.
• It cannot fully handle the non-linearity of $f$.
• Better algorithms are in preparation.
Sample code on GitHub: sato9hara/PertMap