‘Why Should I Trust You?’ Explaining the Predictions of Any Classifier
Explainable AI
Surrogate Explainers
Paper
Description
Ribeiro, Singh, and Guestrin (2016) propose Local Interpretable Model-Agnostic Explanations (LIME). The approach generates local perturbations of the instance in the input space, queries the original classifier for predictions on these perturbed samples, and then fits a white-box model (e.g. a linear regression), weighted by each sample's proximity to the instance, on this synthetic data set; the white-box model then serves as a locally faithful explanation of the classifier's behaviour.
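The sketch below illustrates the idea for tabular data, assuming a generic black_box_predict function that maps a batch of inputs to class probabilities; the Gaussian perturbation scale, kernel width, and sample count are illustrative choices, not the paper's exact settings.

import numpy as np
from sklearn.linear_model import Ridge


def explain_instance(x, black_box_predict, n_samples=5000, kernel_width=0.75):
    """Fit a weighted linear surrogate around instance `x` (1-D array)."""
    rng = np.random.default_rng(0)
    # 1. Perturb: sample synthetic points in a Gaussian neighbourhood of x.
    Z = x + rng.normal(scale=1.0, size=(n_samples, x.shape[0]))
    # 2. Query the original (black-box) classifier on the synthetic points.
    y = black_box_predict(Z)
    # 3. Weight samples by proximity to x (exponential kernel on L2 distance).
    d = np.linalg.norm(Z - x, axis=1)
    w = np.exp(-(d ** 2) / kernel_width ** 2)
    # 4. Fit an interpretable white-box model on the weighted synthetic data.
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(Z, y, sample_weight=w)
    return surrogate.coef_  # per-feature local importance


if __name__ == "__main__":
    # Toy black box: a random forest on synthetic data.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier

    X, y = make_classification(n_samples=500, n_features=4, random_state=0)
    clf = RandomForestClassifier(random_state=0).fit(X, y)
    coefs = explain_instance(X[0], lambda Z: clf.predict_proba(Z)[:, 1])
    print(coefs)

The coefficients of the local surrogate indicate how much each feature drives the black-box prediction in the neighbourhood of the explained instance.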
References
Ribeiro, Marco Tulio, Sameer Singh, and Carlos Guestrin. 2016. “‘Why Should I Trust You?’ Explaining the Predictions of Any Classifier.” In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1135–44.