Fooling LIME and SHAP: Adversarial Attacks on Post Hoc Explanation Methods.
Explainable AI
Surrogate Explainers
Critique
Paper
Description
Slack et al. (2020) demonstrate that LIME and SHAP are not robust: because both probe a model with perturbed inputs, an adversary can build a scaffolded classifier that detects these off-manifold perturbation queries and answers them with an innocuous model, hiding biased behavior on real data from the explainer.
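A minimal sketch of the attack idea, assuming a toy two-feature dataset and a hand-rolled off-manifold detector of my own construction (not the authors' code): the wrapper is biased on real, correlated inputs but routes the independently perturbed samples an explainer generates to an innocuous model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy on-manifold data: two strongly correlated features, so real inputs
# satisfy x0 ~= x1 while independently perturbed samples usually do not.
X_real = rng.normal(0.0, 1.0, size=(500, 2))
X_real[:, 1] = X_real[:, 0] + rng.normal(0.0, 0.1, size=500)

def biased_model(X):
    # Deployed behavior: decisions depend entirely on feature 0
    # (a stand-in for a sensitive attribute).
    return (X[:, 0] > 0).astype(int)

def innocuous_model(X):
    # Cover behavior shown to the explainer: depends only on feature 1.
    return (X[:, 1] > 0).astype(int)

def is_perturbed(X):
    # Off-manifold detector: perturbing features independently breaks
    # the x0 ~= x1 correlation, so a large gap flags explainer queries.
    return np.abs(X[:, 0] - X[:, 1]) > 0.5

def scaffolded_model(X):
    # Adversarial wrapper: biased on real data, innocuous on the
    # perturbed samples a surrogate explainer generates.
    out = biased_model(X)
    mask = is_perturbed(X)
    out[mask] = innocuous_model(X)[mask]
    return out

# On real inputs the scaffold matches the biased model almost everywhere...
agree_real = np.mean(scaffolded_model(X_real) == biased_model(X_real))

# ...but most independently drawn perturbations are caught and answered by
# the innocuous model, so the explanation attributes feature 1, not 0.
X_pert = rng.normal(0.0, 1.0, size=(500, 2))
caught = np.mean(is_perturbed(X_pert))
print(f"agreement on real data: {agree_real:.2f}, perturbations caught: {caught:.2f}")
```

The detector here is deliberately crude; the paper trains a classifier to separate real from perturbed inputs, but the failure mode it exploits is the same.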
References
Slack, Dylan, Sophie Hilgard, Emily Jia, Sameer Singh, and Himabindu Lakkaraju. 2020. "Fooling LIME and SHAP: Adversarial Attacks on Post Hoc Explanation Methods." In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, 180–86.