Literature overview
This is a collection of interesting papers and thoughts on Trustworthy AI that I am compiling gradually over time; it also serves as my own interactive notebook. Descriptions of the papers vary in length, and some are very brief. A full list of the references linked here can be found at the bottom.
| Date | Title | Author |
|------|-------|--------|
| 2020 | ‘How Do I Fool You?’ Manipulating User Trust via Misleading Black Box Explanations. | Lakkaraju and Bastani (2020) |
| 2020 | Fooling LIME and SHAP: Adversarial Attacks on Post Hoc Explanation Methods. | Slack et al. (2020) |
| 2019 | Actionable Recourse in Linear Classification. | Ustun, Spangher, and Liu (2019) |
| 2019 | Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead. | Rudin (2019) |
| 2019 | Explaining Explanations in AI. | Mittelstadt, Russell, and Wachter (2019) |
| 2017 | Counterfactual Explanations Without Opening the Black Box: Automated Decisions and the GDPR. | Wachter, Mittelstadt, and Russell (2017) |
| 2017 | A Unified Approach to Interpreting Model Predictions. | Lundberg and Lee (2017) |
| 2016 | ‘Why Should I Trust You?’ Explaining the Predictions of Any Classifier. | Ribeiro, Singh, and Guestrin (2016) |
References
Lakkaraju, Himabindu, and Osbert Bastani. 2020. “‘How Do I Fool You?’ Manipulating User Trust via Misleading Black Box Explanations.” In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, 79–85.
Lundberg, Scott M, and Su-In Lee. 2017. “A Unified Approach to Interpreting Model Predictions.” In Proceedings of the 31st International Conference on Neural Information Processing Systems, 4768–77.
Mittelstadt, Brent, Chris Russell, and Sandra Wachter. 2019. “Explaining Explanations in AI.” In Proceedings of the Conference on Fairness, Accountability, and Transparency, 279–88.
Ribeiro, Marco Tulio, Sameer Singh, and Carlos Guestrin. 2016. “‘Why Should I Trust You?’ Explaining the Predictions of Any Classifier.” In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1135–44.
Rudin, Cynthia. 2019. “Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead.” Nature Machine Intelligence 1 (5): 206–15.
Slack, Dylan, Sophie Hilgard, Emily Jia, Sameer Singh, and Himabindu Lakkaraju. 2020. “Fooling LIME and SHAP: Adversarial Attacks on Post Hoc Explanation Methods.” In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, 180–86.
Ustun, Berk, Alexander Spangher, and Yang Liu. 2019. “Actionable Recourse in Linear Classification.” In Proceedings of the Conference on Fairness, Accountability, and Transparency, 10–19.
Wachter, Sandra, Brent Mittelstadt, and Chris Russell. 2017. “Counterfactual Explanations Without Opening the Black Box: Automated Decisions and the GDPR.” Harvard Journal of Law & Technology 31: 841.