Fooling LIME and SHAP: Adversarial Attacks on Post Hoc Explanation Methods.
Explainable AI
Surrogate Explainers
Critique
Paper
Description
Slack et al. (2020) demonstrate that LIME and SHAP are not robust: because both probe a model with perturbed inputs, an adversary can build a scaffolded classifier that detects these off-manifold perturbation queries and answers them with an innocuous model, hiding biased behavior on real data from the explainer.
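A minimal sketch of the attack idea, assuming a toy two-feature dataset and a hand-rolled off-manifold detector of my own construction (not the authors' code): the wrapper is biased on real, correlated inputs but routes the independently perturbed samples an explainer generates to an innocuous model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy on-manifold data: two strongly correlated features, so real inputs
# satisfy x0 ~= x1 while independently perturbed samples usually do not.
X_real = rng.normal(0.0, 1.0, size=(500, 2))
X_real[:, 1] = X_real[:, 0] + rng.normal(0.0, 0.1, size=500)

def biased_model(X):
    # Deployed behavior: decisions depend entirely on feature 0
    # (a stand-in for a sensitive attribute).
    return (X[:, 0] > 0).astype(int)

def innocuous_model(X):
    # Cover behavior shown to the explainer: depends only on feature 1.
    return (X[:, 1] > 0).astype(int)

def is_perturbed(X):
    # Off-manifold detector: perturbing features independently breaks
    # the x0 ~= x1 correlation, so a large gap flags explainer queries.
    return np.abs(X[:, 0] - X[:, 1]) > 0.5

def scaffolded_model(X):
    # Adversarial wrapper: biased on real data, innocuous on the
    # perturbed samples a surrogate explainer generates.
    out = biased_model(X)
    mask = is_perturbed(X)
    out[mask] = innocuous_model(X)[mask]
    return out

# On real inputs the scaffold matches the biased model almost everywhere...
agree_real = np.mean(scaffolded_model(X_real) == biased_model(X_real))

# ...but most independently drawn perturbations are caught and answered by
# the innocuous model, so the explanation attributes feature 1, not 0.
X_pert = rng.normal(0.0, 1.0, size=(500, 2))
caught = np.mean(is_perturbed(X_pert))
print(f"agreement on real data: {agree_real:.2f}, perturbations caught: {caught:.2f}")
```

The detector here is deliberately crude; the paper trains a classifier to separate real from perturbed inputs, but the failure mode it exploits is the same.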
References
Slack, Dylan, Sophie Hilgard, Emily Jia, Sameer Singh, and Himabindu Lakkaraju. 2020. "Fooling LIME and SHAP: Adversarial Attacks on Post Hoc Explanation Methods." In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, 180–86.