Keywords

Artificial Intelligence, Trustworthy AI, Counterfactual Explanations, Algorithmic Recourse

References

Abbasnejad, Ehsan, Damien Teney, Amin Parvaneh, Javen Shi, and Anton van den Hengel. 2020. “Counterfactual Vision and Language Learning.” In 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 10041–51. https://doi.org/10.1109/CVPR42600.2020.01006.
Ackerman, Samuel, Parijat Dube, Eitan Farchi, Orna Raz, and Marcel Zalmanovici. 2021. “Machine Learning Model Drift Detection Via Weak Data Slices.” In 2021 IEEE/ACM Third International Workshop on Deep Learning for Testing and Testing for Deep Learning (DeepTest), 1–8. IEEE. https://doi.org/10.1109/deeptest52559.2021.00007.
Agustí, Marc, Ignacio Vidal-Quadras Costa, and Patrick Altmeyer. 2023. “Deep Vector Autoregression for Macroeconomic Data.” IFC Bulletins Chapters 59. https://www.bis.org/ifc/publ/ifcb59_39.pdf.
Alain, Guillaume, and Yoshua Bengio. 2016. “Understanding intermediate layers using linear classifier probes.” ArXiv. https://api.semanticscholar.org/CorpusID:9794990.
Alperin, Juan P, Carol Muñoz Nieves, Lesley A Schimanski, Gustavo E Fischman, Meredith T Niles, and Erin C McKiernan. 2019. “How significant are the public dimensions of faculty work in review, promotion and tenure documents?” eLife 8: e42254.
Altman, Sam. 2025. “Reflections.” January 6, 2025. https://blog.samaltman.com/reflections.
Altmeyer, Patrick, Giovan Angela, Aleksander Buszydlik, Karol Dobiczek, Arie van Deursen, and Cynthia C. S. Liem. 2023. “Endogenous Macrodynamics in Algorithmic Recourse.” IEEE. https://doi.org/10.1109/satml54575.2023.00036.
Altmeyer, Patrick, Leva Boneva, Rafael Kinston, Shreyosi Saha, and Evarist Stoja. 2023. “Yield Curve Sensitivity to Investor Positioning Around Economic Shocks.” Bank of England Working Papers 1029. Bank of England.
Altmeyer, Patrick, Aleksander Buszydlik, Arie van Deursen, and Cynthia C. S. Liem. 2026. “Counterfactual Training: Teaching Models Plausible and Actionable Explanations.” https://arxiv.org/abs/2601.16205.
Altmeyer, Patrick, Andrew M Demetriou, Antony Bartlett, and Cynthia C. S. Liem. 2024. “Position: Stop Making Unscientific AGI Performance Claims.” In International Conference on Machine Learning, 1222–42. PMLR. https://proceedings.mlr.press/v235/altmeyer24a.html.
Altmeyer, Patrick, Arie van Deursen, and Cynthia C. S. Liem. 2023. “Explaining Black-Box Models through Counterfactuals.” In Proceedings of the JuliaCon Conferences, 1:130. https://doi.org/10.21105/jcon.00130.
Altmeyer, Patrick, Mojtaba Farmanbar, Arie van Deursen, and Cynthia C. S. Liem. 2024. “Faithful Model Explanations through Energy-Constrained Conformal Counterfactuals.” In Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 38:10829–37. 10. https://doi.org/10.1609/aaai.v38i10.28956.
Altmeyer, Patrick, Pedro Gurrola-Perez, Rafael Kinston, and Jessica Redmond. 2019. “Modelling the Demand for Central Bank Reserves.” https://www.ecb.europa.eu/press/conferences/html/20191111_ecb_money_market_workshop_conference.en.html.
Angelopoulos, Anastasios N., and Stephen Bates. 2022. “A Gentle Introduction to Conformal Prediction and Distribution-Free Uncertainty Quantification.” https://arxiv.org/abs/2107.07511.
Antorán, Javier, Umang Bhatt, Tameem Adel, Adrian Weller, and José Miguel Hernández-Lobato. 2020. “Getting a Clue: A Method for Explaining Uncertainty Estimates.” https://arxiv.org/abs/2006.06848.
Arcones, Miguel A, and Evarist Giné. 1992. “On the Bootstrap of U and V Statistics.” The Annals of Statistics, 655–74.
Arrieta, Alejandro Barredo, Natalia Diaz-Rodriguez, Javier Del Ser, Adrien Bennetot, Siham Tabik, Alberto Barbado, Salvador Garcia, et al. 2020. “Explainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges Toward Responsible AI.” Information Fusion 58: 82–115. https://doi.org/10.1016/j.inffus.2019.12.012.
Artelt, André, Valerie Vaquet, Riza Velioglu, Fabian Hinder, Johannes Brinkrolf, Malte Schilling, and Barbara Hammer. 2021. “Evaluating Robustness of Counterfactual Explanations.” In 2021 IEEE Symposium Series on Computational Intelligence (SSCI), 01–09. IEEE.
Augustin, Maximilian, Alexander Meinke, and Matthias Hein. 2020. “Adversarial Robustness on In- and Out-Distribution Improves Explainability.” In Computer Vision – ECCV 2020, edited by Andrea Vedaldi, Horst Bischof, Thomas Brox, and Jan-Michael Frahm, 228–45. Cham: Springer.
Balashankar, Ananth, Xuezhi Wang, Yao Qin, Ben Packer, Nithum Thain, Ed Chi, Jilin Chen, and Alex Beutel. 2023. “Improving Classifier Robustness through Active Generative Counterfactual Data Augmentation.” In Findings of the Association for Computational Linguistics: EMNLP 2023, 127–39. ACL. https://doi.org/10.18653/v1/2023.findings-emnlp.10.
Barocas, Solon, Moritz Hardt, and Arvind Narayanan. 2022. “Fairness and Machine Learning.” December 2022. https://fairmlbook.org/index.html.
Becker, Barry, and Ronny Kohavi. 1996. “Adult.” UCI Machine Learning Repository. https://doi.org/10.24432/C5XW20.
Belinkov, Yonatan. 2021. “Probing Classifiers: Promises, Shortcomings, and Advances.” https://arxiv.org/abs/2102.12452.
Bell, Andrew, Joao Fonseca, Carlo Abrate, Francesco Bonchi, and Julia Stoyanovich. 2024. “Fairness in Algorithmic Recourse Through the Lens of Substantive Equality of Opportunity.” https://arxiv.org/abs/2401.16088.
Bender, Emily M, Timnit Gebru, Angelina McMillan-Major, and Shmargaret Shmitchell. 2021. “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?” In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 610–23.
Berardi, Andrea, and Alberto Plazzi. 2022. “Dissecting the yield curve: The international evidence.” Journal of Banking & Finance 134: 106286.
Bereska, Leonard, and Efstratios Gavves. 2024. “Mechanistic Interpretability for AI Safety – a Review.” https://arxiv.org/abs/2404.14082.
Berlinet, Alain, and Christine Thomas-Agnan. 2011. Reproducing Kernel Hilbert Spaces in Probability and Statistics. Springer Science & Business Media. https://doi.org/10.1007/978-1-4419-9096-9.
Bezanson, Jeff, Alan Edelman, Stefan Karpinski, and Viral B Shah. 2017. “Julia: A Fresh Approach to Numerical Computing.” SIAM Review 59 (1): 65–98. https://doi.org/10.1137/141000671.
Birhane, Abeba, Pratyusha Kalluri, Dallas Card, William Agnew, Ravit Dotan, and Michelle Bao. 2022. “The Values Encoded in Machine Learning Research.” In Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency (FAccT ’22).
Bischl, Bernd, Martin Binder, Michel Lang, Tobias Pielok, Jakob Richter, Stefan Coors, Janek Thomas, et al. 2023. “Hyperparameter optimization: Foundations, algorithms, best practices, and open challenges.” WIREs Data Mining and Knowledge Discovery 13 (2): e1484. https://doi.org/10.1002/widm.1484.
Blaom, Anthony D., Franz Kiraly, Thibaut Lienart, Yiannis Simillides, Diego Arenas, and Sebastian J. Vollmer. 2020. “MLJ: A Julia Package for Composable Machine Learning.” Journal of Open Source Software 5 (55): 2704. https://doi.org/10.21105/joss.02704.
Blili-Hamelin, Borhane, Christopher Graziul, Leif Hancox-Li, Hananel Hazan, El-Mahdi El-Mhamdi, Avijit Ghosh, Katherine A Heller, et al. 2025. “Position: Stop treating ‘AGI’ as the north-star goal of AI research.” In Forty-Second International Conference on Machine Learning Position Paper Track.
Borch, Christian. 2022. “Machine Learning, Knowledge Risk, and Principal-Agent Problems in Automated Trading.” Technology in Society, 101852. https://doi.org/10.1016/j.techsoc.2021.101852.
Borisov, Vadim, Tobias Leemann, Kathrin Seßler, Johannes Haug, Martin Pawelczyk, and Gjergji Kasneci. 2022. “Deep Neural Networks and Tabular Data: A Survey.” IEEE Transactions on Neural Networks and Learning Systems.
Brunnermeier, Markus K. 2016. “Bubbles.” In Banking Crises: Perspectives from the New Palgrave Dictionary, 28–36. Springer.
Buolamwini, Joy, and Timnit Gebru. 2018. “Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification.” In Conference on Fairness, Accountability and Transparency, 77–91. PMLR.
Buszydlik, Aleksander, Patrick Altmeyer, Cynthia C. S. Liem, and Roel Dobbe. 2024. “Grounding and Validation of Algorithmic Recourse in Real-World Contexts: A Systematized Literature Review.” https://openreview.net/pdf?id=oEmyoy5H5P.
———. 2025. “Understanding the Affordances and Constraints of Explainable AI in Safety-Critical Contexts: A Case Study in Dutch Social Welfare.” In Electronic Government. EGOV 2025. Lecture Notes in Computer Science. upcoming.
Carlisle, M. 2019. “Racist Data Destruction? - a Boston Housing Dataset Controversy.” https://medium.com/@docintangible/racist-data-destruction-113e3eff54a8.
Carrizosa, Emilio, Jasone Ramírez-Ayerbe, and Dolores Romero. 2021. “Generating Collective Counterfactual Explanations in Score-Based Classification via Mathematical Optimization.”
Caterino, Pasquale. 2024. “Google Summer of Code 2024 Final Report: Add Support for Conformalized Bayes to ConformalPrediction.jl.” https://gist.github.com/pasq-cat/f25eebc492366fb6a4f428426f93f45f.
Chandola, Varun, Arindam Banerjee, and Vipin Kumar. 2009. “Anomaly Detection: A Survey.” ACM Computing Surveys (CSUR) 41 (3): 1–58.
Claesen, Aline, Daniel Lakens, Noah van Dongen, et al. 2022. “Severity and Crises in Science: Are We Getting It Right When We’re Right and Wrong When We’re Wrong?”
Cost, Ben. 2023. “Bing AI chatbot goes on ‘destructive’ rampage: ‘I want to be powerful — and alive’.” https://nypost.com/2023/02/16/bing-ai-chatbots-destructive-rampage-i-want-to-be-powerful/.
Crump, Richard K, and Nikolay Gospodinov. n.d. “Deconstructing the yield curve.”
Dandl, Susanne, Andreas Hofheinz, Martin Binder, Bernd Bischl, and Giuseppe Casalicchio. 2023. “Counterfactuals: An R Package for Counterfactual Explanation Methods.” arXiv. http://arxiv.org/abs/2304.06569.
Dasgupta, Sanjoy. 2013. “Experiments with Random Projection.” https://arxiv.org/abs/1301.3849.
Delaney, Eoin, Derek Greene, and Mark T. Keane. 2021. “Uncertainty Estimation and Out-of-Distribution Detection for Counterfactual Explanations: Pitfalls and Solutions.” arXiv. http://arxiv.org/abs/2107.09734.
Delft High Performance Computing Centre (DHPC). 2022. “DelftBlue Supercomputer (Phase 1).” https://www.tudelft.nl/dhpc/ark:/44463/DelftBluePhase1.
Dhurandhar, Amit, Pin-Yu Chen, Ronny Luss, Chun-Chen Tu, Paishun Ting, Karthikeyan Shanmugam, and Payel Das. 2018. “Explanations Based on the Missing: Towards Contrastive Explanations with Pertinent Negatives.” Advances in Neural Information Processing Systems 31.
Dombrowski, Ann-Kathrin, Jan E Gerken, and Pan Kessel. 2021. “Diffeomorphic Explanations with Normalizing Flows.” In ICML Workshop on Invertible Neural Networks, Normalizing Flows, and Explicit Likelihood Models.
Du, Yilun, and Igor Mordatch. 2020. “Implicit Generation and Generalization in Energy-Based Models.” https://arxiv.org/abs/1903.08689.
Epley, Nicholas, Adam Waytz, and John T Cacioppo. 2007. “On seeing human: A three-factor theory of anthropomorphism.” Psychological Review 114 (4): 864.
Falk, Ruma, and Clifford Konold. 1997. “Making sense of randomness: Implicit encoding as a basis for judgment.” Psychological Review 104 (2): 301.
Fan, Fenglei, Jinjun Xiong, and Ge Wang. 2020. “On Interpretability of Artificial Neural Networks.” https://arxiv.org/abs/2001.02522.
Field, Hayden. 2025. “OpenAI Partners with U.S. National Laboratories on Scientific Research, Nuclear Weapons Security.” CNBC. January 30, 2025. https://www.cnbc.com/2025/01/30/openai-partners-with-us-national-laboratories-on-scientific-research.html.
Foster, Kevin R, and Hanna Kokko. 2009. “The evolution of superstitious and superstition-like behaviour.” Proceedings of the Royal Society B: Biological Sciences 276 (1654): 31–37.
Frankle, Jonathan, and Michael Carbin. 2019. “The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks.” In International Conference on Learning Representations.
Freiesleben, Timo. 2022. “The Intriguing Relation Between Counterfactual Explanations and Adversarial Examples.” Minds and Machines 32 (1): 77–109.
Future of Life Institute. 2023. “Pause Giant AI Experiments: An Open Letter.” March 22, 2023. https://futureoflife.org/open-letter/pause-giant-ai-experiments/.
Gama, João, Indrė Žliobaitė, Albert Bifet, Mykola Pechenizkiy, and Abdelhamid Bouchachia. 2014. “A Survey on Concept Drift Adaptation.” ACM Computing Surveys (CSUR) 46 (4): 1–37.
Goertzel, Ben. 2014. “Artificial general intelligence: concept, state of the art, and future prospects.” Journal of Artificial General Intelligence 5 (1): 1.
Goodfellow, Ian, Yoshua Bengio, and Aaron Courville. 2016. Deep Learning. MIT Press.
Goodfellow, Ian, Jonathon Shlens, and Christian Szegedy. 2015. “Explaining and Harnessing Adversarial Examples.” https://arxiv.org/abs/1412.6572.
Grathwohl, Will, Kuan-Chieh Wang, Joern-Henrik Jacobsen, David Duvenaud, Mohammad Norouzi, and Kevin Swersky. 2020. “Your Classifier Is Secretly an Energy Based Model and You Should Treat It Like One.” In International Conference on Learning Representations.
Gretton, Arthur, Karsten M Borgwardt, Malte J Rasch, Bernhard Schölkopf, and Alexander Smola. 2012. “A Kernel Two-Sample Test.” The Journal of Machine Learning Research 13 (1): 723–73.
Grinsztajn, Léo, Edouard Oyallon, and Gaël Varoquaux. 2022. “Why Do Tree-Based Models Still Outperform Deep Learning on Tabular Data?” https://arxiv.org/abs/2207.08815.
Guidotti, Riccardo. 2022. “Counterfactual Explanations and How to Find Them: Literature Review and Benchmarking.” Data Mining and Knowledge Discovery 38 (5): 2770–2824. https://doi.org/10.1007/s10618-022-00831-6.
Guo, Hangzhi, Thanh H. Nguyen, and Amulya Yadav. 2023. “CounterNet: End-to-End Training of Prediction Aware Counterfactual Explanations.” In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 577–89. KDD ’23. New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/3580305.3599290.
Gurnee, Wes, Neel Nanda, Matthew Pauly, Katherine Harvey, Dmitrii Troitskii, and Dimitris Bertsimas. 2023. “Finding Neurons in a Haystack: Case Studies with Sparse Probing.” arXiv Preprint arXiv:2305.01610.
Gurnee, Wes, and Max Tegmark. 2023b. “Language Models Represent Space and Time.” arXiv Preprint arXiv:2310.02207v2.
———. 2023a. “Language Models Represent Space and Time.” arXiv Preprint arXiv:2310.02207v1.
Haig, Brian D. 2003. “What is a spurious correlation?” Understanding Statistics: Statistical Issues in Psychology, Education, and the Social Sciences 2 (2): 125–32.
Haltmeier, Markus, and Linh Nguyen. 2023. “Regularization of Inverse Problems by Neural Networks.” In Handbook of Mathematical Models and Algorithms in Computer Vision and Imaging: Mathematical Imaging and Vision, 1065–93. Springer.
Hanneke, Steve. 2007. “A Bound on the Label Complexity of Agnostic Active Learning.” In Proceedings of the 24th International Conference on Machine Learning, 353–60. https://doi.org/10.1145/1273496.1273541.
Heider, Fritz, and Marianne Simmel. 1944. “An experimental study of apparent behavior.” The American Journal of Psychology 57 (2): 243–59.
Hengst, Floris, Ralf Wolter, Patrick Altmeyer, and Arda Kaygan. 2024. “Conformal Intent Classification and Clarification for Fast and Accurate Intent Recognition.” In Findings of the Association for Computational Linguistics: NAACL 2024, 2412–32. https://doi.org/10.18653/v1/2024.findings-naacl.156.
Hoffman, Hans. 1994. “German Credit Data.” https://archive.ics.uci.edu/ml/datasets/statlog+(german+credit+data).
Innes, Michael, Elliot Saba, Keno Fischer, Dhairya Gandhi, Marco Concetto Rudilosso, Neethu Mariya Joy, Tejan Karmali, Avik Pal, and Viral Shah. 2018. “Fashionable Modelling with Flux.” https://arxiv.org/abs/1811.01457.
Innes, Mike. 2018. “Flux: Elegant Machine Learning with Julia.” Journal of Open Source Software 3 (25): 602. https://doi.org/10.21105/joss.00602.
Joshi, Shalmali, Oluwasanmi Koyejo, Warut Vijitbenjaronk, Been Kim, and Joydeep Ghosh. 2019. “Towards Realistic Individual Recourse and Actionable Explanations in Black-Box Decision Making Systems.” https://arxiv.org/abs/1907.09615.
Kaggle. 2011. “Give Me Some Credit: Improve on the State of the Art in Credit Scoring by Predicting the Probability That Somebody Will Experience Financial Distress in the Next Two Years.” Kaggle. https://www.kaggle.com/c/GiveMeSomeCredit.
Karimi, Amir-Hossein, Gilles Barthe, Bernhard Schölkopf, and Isabel Valera. 2021. “A Survey of Algorithmic Recourse: Definitions, Formulations, Solutions, and Prospects.” https://arxiv.org/abs/2010.04050.
Karimi, Amir-Hossein, Bernhard Schölkopf, and Isabel Valera. 2021. “Algorithmic Recourse: From Counterfactual Explanations to Interventions.” In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 353–62. FAccT ’21. New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/3442188.3445899.
Karimi, Amir-Hossein, Julius Von Kügelgen, Bernhard Schölkopf, and Isabel Valera. 2020. “Algorithmic Recourse Under Imperfect Causal Knowledge: A Probabilistic Approach.” https://arxiv.org/abs/2006.06831.
Kaufmann, Maximilian, Yiren Zhao, Ilia Shumailov, Robert Mullins, and Nicolas Papernot. 2022. “Efficient Adversarial Training with Data Pruning.” arXiv Preprint arXiv:2207.00694.
Kaur, Harmanpreet, Harsha Nori, Samuel Jenkins, Rich Caruana, Hanna Wallach, and Jennifer Wortman Vaughan. 2020. “Interpreting Interpretability: Understanding Data Scientists’ Use of Interpretability Tools for Machine Learning.” In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, 1–14. https://doi.org/10.1145/3313831.3376219.
Kingma, Diederik P., and Jimmy Ba. 2017. “Adam: A Method for Stochastic Optimization.” https://arxiv.org/abs/1412.6980.
Kıcıman, Emre, Robert Ness, Amit Sharma, and Chenhao Tan. 2023. “Causal Reasoning and Large Language Models: Opening a New Frontier for Causality.” arXiv Preprint arXiv:2305.00050.
Kloft, Agnes Mercedes, Robin Welsch, Thomas Kosch, and Steeven Villa. 2024. “‘AI Enhances Our Performance, I Have No Doubt This One Will Do the Same’: The Placebo Effect Is Robust to Negative Descriptions of AI.” In Proceedings of the CHI Conference on Human Factors in Computing Systems, 1–24.
Kolter, Zico. 2023. “Keynote Addresses: SaTML 2023.” In 2023 IEEE Conference on Secure and Trustworthy Machine Learning (SaTML). Los Alamitos, CA, USA: IEEE Computer Society. https://doi.org/10.1109/SaTML54575.2023.00009.
Krizhevsky, A. 2009. “Learning Multiple Layers of Features from Tiny Images.” In. https://www.semanticscholar.org/paper/Learning-Multiple-Layers-of-Features-from-Tiny-Krizhevsky/5d90f06bb70a0a3dced62413346235c02b1aa086.
Kumar, Sumit. 2022. “Effective hedging strategy for US Treasury bond portfolio using principal component analysis.” Academy of Accounting and Financial Studies 26 (1).
Kurakin, Alexey, Ian Goodfellow, and Samy Bengio. 2017. “Adversarial Machine Learning at Scale.” https://arxiv.org/abs/1611.01236.
Ladouceur, Robert, Claude Paquet, and Dominique Dubé. 1996. “Erroneous Perceptions in Generating Sequences of Random Events.” Journal of Applied Social Psychology 26 (24): 2157–66.
Lakshminarayanan, Balaji, Alexander Pritzel, and Charles Blundell. 2017. “Simple and Scalable Predictive Uncertainty Estimation Using Deep Ensembles.” In Proceedings of the 31st International Conference on Neural Information Processing Systems, 6405–16. NIPS’17. Red Hook, NY, USA: Curran Associates Inc.
Laugel, Thibault, Marie-Jeanne Lesot, Christophe Marsala, Xavier Renard, and Marcin Detyniecki. 2017. “Inverse Classification for Comparison-Based Interpretability in Machine Learning.” arXiv. https://doi.org/10.48550/arXiv.1712.08443.
LeCun, Yann. 1998. “The MNIST database of handwritten digits.” http://yann.lecun.com/exdb/mnist/.
LeCun, Yann, Léon Bottou, Yoshua Bengio, and Patrick Haffner. 1998. “Gradient-Based Learning Applied to Document Recognition.” Proceedings of the IEEE 86 (11): 2278–2324.
Leofante, Francesco, and Nico Potyka. 2024. “Promoting Counterfactual Robustness Through Diversity.” Proceedings of the AAAI Conference on Artificial Intelligence 38 (19): 21322–30. https://doi.org/10.1609/aaai.v38i19.30127.
Li, Kenneth, Aspen K Hopkins, David Bau, Fernanda Viégas, Hanspeter Pfister, and Martin Wattenberg. 2022. “Emergent World Representations: Exploring a Sequence Model Trained on a Synthetic Task.” arXiv Preprint arXiv:2210.13382.
Liem, Cynthia C. S., and Andrew M Demetriou. 2023. “Treat Societally Impactful Scientific Insights as Open-Source Software Artifacts.” In 2023 IEEE/ACM 45th International Conference on Software Engineering: Software Engineering in Society (ICSE-SEIS), 150–56. IEEE.
Lippe, Phillip. 2024. “UvA Deep Learning Tutorials.” https://uvadlc-notebooks.readthedocs.io/en/latest/.
Liu, Yinhan, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. 2019. “RoBERTa: A Robustly Optimized BERT Pretraining Approach.” https://arxiv.org/abs/1907.11692.
Luiz Franco, Jorge. 2024. “JSoC: When Causality Meets Recourse.” https://www.taija.org/blog/posts/causal-recourse/.
Lundberg, Scott M, and Su-In Lee. 2017. “A Unified Approach to Interpreting Model Predictions.” In Proceedings of the 31st International Conference on Neural Information Processing Systems, 4768–77.
Luu, Hoai Linh, and Naoya Inoue. 2023. “Counterfactual Adversarial Training for Improving Robustness of Pre-trained Language Models.” In Proceedings of the 37th Pacific Asia Conference on Language, Information and Computation, 881–88. ACL. https://aclanthology.org/2023.paclic-1.88/.
Madry, Aleksander, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. 2017. “Towards Deep Learning Models Resistant to Adversarial Attacks.” arXiv Preprint arXiv:1706.06083.
Mahajan, Divyat, Chenhao Tan, and Amit Sharma. 2020. “Preserving Causal Constraints in Counterfactual Explanations for Machine Learning Classifiers.” https://arxiv.org/abs/1912.03277.
Manokhin, Valery. 2022. “Awesome Conformal Prediction.” Zenodo. https://doi.org/10.5281/zenodo.6467205.
Marcus, Gary. 2023. “Muddles about Models.” https://garymarcus.substack.com/p/muddles-about-models.
Maslej, Nestor, Loredana Fattorini, Erik Brynjolfsson, John Etchemendy, Katrina Ligett, Terah Lyons, James Manyika, et al. 2023. “Artificial Intelligence Index Report 2023.” Institute for Human-Centered AI.
McAfee, Andrew, Erik Brynjolfsson, Thomas H Davenport, DJ Patil, and Dominic Barton. 2012. “Big Data: The Management Revolution.” Harvard Business Review 90 (10): 60–68.
Merton, Robert K. 1942. “Science and technology in a democratic order.” Journal of Legal and Political Sociology 1 (1): 115–26.
Gal, Michal S, and Daniel L Rubinfeld. 2019. “Data Standardization.” New York University Law Review.
Miller, John, Smitha Milli, and Moritz Hardt. 2020. “Strategic Classification Is Causal Modeling in Disguise.” In Proceedings of the 37th International Conference on Machine Learning, 6917–26. PMLR. https://proceedings.mlr.press/v119/miller20b.html.
Miller, Tim. 2019. “Explanation in Artificial Intelligence: Insights from the Social Sciences.” Artificial Intelligence 267: 1–38. https://doi.org/10.1016/j.artint.2018.07.007.
Mishkin, Frederic S et al. 2008. “How Should We Respond to Asset Price Bubbles.” Financial Stability Review 12 (1): 65–74.
Molnar, Christoph. 2022. Interpretable Machine Learning: A Guide for Making Black Box Models Explainable. 2nd ed. https://christophm.github.io/interpretable-ml-book.
Morrone, Megan. 2023. “Replika exec: AI friends can improve human relationships.” https://www.axios.com/2023/11/09/replika-blush-rita-popova-ai-relationships-dating.
Mothilal, Ramaravind K, Amit Sharma, and Chenhao Tan. 2020. “Explaining Machine Learning Classifiers Through Diverse Counterfactual Explanations.” In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, 607–17. https://doi.org/10.1145/3351095.3372850.
Müller, Petra, and Matthias Hartmann. 2023. “Linking paranormal and conspiracy beliefs to illusory pattern perception through signal detection theory.” Scientific Reports 13 (1): 9739.
Murphy, Kevin P. 2022. Probabilistic Machine Learning: An Introduction. MIT Press.
———. 2023. Probabilistic Machine Learning: Advanced Topics. MIT Press.
Nanda, Neel, Andrew Lee, and Martin Wattenberg. 2023. “Emergent Linear Representations in World Models of Self-Supervised Sequence Models.” arXiv Preprint arXiv:2309.00941.
Nelson, Kevin, George Corbin, Mark Anania, Matthew Kovacs, Jeremy Tobias, and Misty Blowers. 2015. “Evaluating Model Drift in Machine Learning Algorithms.” In 2015 IEEE Symposium on Computational Intelligence for Security and Defense Applications (CISDA), 1–8. IEEE. https://doi.org/10.1109/cisda.2015.7208643.
Neri, Hugo, and Fabio Cozman. 2020. “The role of experts in the public perception of risk of artificial intelligence.” AI & SOCIETY 35: 663–73.
Nickerson, Raymond S. 1998. “Confirmation bias: A ubiquitous phenomenon in many guises.” Review of General Psychology 2 (2): 175–220.
O’Neil, Cathy. 2016. Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. Crown.
Oliveira, Raphael Mazzine Barbosa de, and David Martens. 2021. “A Framework and Benchmarking Study for Counterfactual Generating Methods on Tabular Data.” Applied Sciences 11 (16): 7274. https://doi.org/10.3390/app11167274.
Björkström, Anders. 2001. “Ridge Regression and Inverse Problems.” Stockholm University, Department of Mathematics.
OpenAI. 2025. “Strengthening America’s AI Leadership with the U.S. National Laboratories.” January 30, 2025. https://openai.com/index/strengthening-americas-ai-leadership-with-the-us-national-laboratories/.
Pace, R Kelley, and Ronald Barry. 1997. “Sparse Spatial Autoregressions.” Statistics & Probability Letters 33 (3): 291–97. https://doi.org/10.1016/s0167-7152(96)00140-x.
Pawelczyk, Martin, Chirag Agarwal, Shalmali Joshi, Sohini Upadhyay, and Himabindu Lakkaraju. 2022. “Exploring Counterfactual Explanations Through the Lens of Adversarial Examples: A Theoretical and Empirical Analysis.” In Proceedings of the 25th International Conference on Artificial Intelligence and Statistics, edited by Gustau Camps-Valls, Francisco J. R. Ruiz, and Isabel Valera, 151:4574–94. Proceedings of Machine Learning Research. PMLR. https://proceedings.mlr.press/v151/pawelczyk22a.html.
Pawelczyk, Martin, Sascha Bielawski, Johannes van den Heuvel, Tobias Richter, and Gjergji Kasneci. 2021. “CARLA: A Python Library to Benchmark Algorithmic Recourse and Counterfactual Explanation Algorithms.” https://arxiv.org/abs/2108.00783.
Pawelczyk, Martin, Teresa Datta, Johannes van-den-Heuvel, Gjergji Kasneci, and Himabindu Lakkaraju. 2023. “Probabilistically Robust Recourse: Navigating the Trade-Offs Between Costs and Robustness in Algorithmic Recourse.” https://arxiv.org/abs/2203.06768.
Pindyck, Robert S, and Daniel L Rubinfeld. 2014. Microeconomics. Pearson Education.
Poyiadzi, Rafael, Kacper Sokol, Raul Santos-Rodriguez, Tijl De Bie, and Peter Flach. 2020. “FACE: Feasible and Actionable Counterfactual Explanations.” In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, 344–50.
Prado-Romero, Mario Alfonso, and Giovanni Stilo. 2022. “GRETEL: Graph Counterfactual Explanation Evaluation Framework.” In Proceedings of the 31st ACM International Conference on Information and Knowledge Management. CIKM ’22. New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/3511808.3557608.
Prince, Simon J. D. 2023. Understanding Deep Learning. The MIT Press. http://udlbook.com.
Rabanser, Stephan, Stephan Günnemann, and Zachary Lipton. 2019. “Failing Loudly: An Empirical Study of Methods for Detecting Dataset Shift.” Advances in Neural Information Processing Systems 32.
Ribeiro, Marco Tulio, Sameer Singh, and Carlos Guestrin. 2016. “‘Why Should I Trust You?’ Explaining the Predictions of Any Classifier.” In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1135–44.
Rooij, Iris van, Olivia Guest, Federico G Adolfi, Ronald de Haan, Antonina Kolokolova, and Patricia Rich. 2023. “Reclaiming AI as a theoretical tool for cognitive science.” PsyArXiv. https://osf.io/4cbuv.
Ross, Alexis, Himabindu Lakkaraju, and Osbert Bastani. 2024. “Learning Models for Actionable Recourse.” In Proceedings of the 35th International Conference on Neural Information Processing Systems. NIPS ’21. Red Hook, NY, USA: Curran Associates Inc.
Rudin, Cynthia. 2019. “Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead.” Nature Machine Intelligence 1 (5): 206–15. https://doi.org/10.1038/s42256-019-0048-x.
Saffari, Ehsan, Seyed Ramezan Hosseini, Alireza Taheri, and Ali Meghdari. 2021. “‘Does cinema form the future of robotics?’: A survey on fictional robots in sci-fi movies.” SN Applied Sciences 3 (6): 655.
Sauer, Axel, and Andreas Geiger. 2021. “Counterfactual Generative Networks.” https://arxiv.org/abs/2101.06046.
Schaeffer, Rylan, Brando Miranda, and Sanmi Koyejo. 2024. “Are Emergent Abilities of Large Language Models a Mirage?” Advances in Neural Information Processing Systems 36.
Schut, Lisa, Oscar Key, Rory McGrath, Luca Costabello, Bogdan Sacaleanu, Yarin Gal, et al. 2021. “Generating Interpretable Counterfactual Explanations By Implicit Minimisation of Epistemic and Aleatoric Uncertainties.” In International Conference on Artificial Intelligence and Statistics, 1756–64. PMLR.
Shah, Agam, Suvan Paturi, and Sudheer Chava. 2023. “Trillion Dollar Words: A New Financial Dataset, Task & Market Analysis.” arXiv Preprint arXiv:2305.07972. https://arxiv.org/abs/2305.07972.
Shanahan, Murray. 2024. “Talking about Large Language Models.” Communications of the ACM 67 (2): 68–79.
Sharma, Shubham, Jette Henderson, and Joydeep Ghosh. 2020. “CERTIFAI: A Common Framework to Provide Explanations and Analyse the Fairness and Robustness of Black-Box Models.” In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, 166–72. AIES ’20. New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/3375627.3375812.
Slack, Dylan, Anna Hilgard, Himabindu Lakkaraju, and Sameer Singh. 2021. “Counterfactual Explanations Can Be Manipulated.” Advances in Neural Information Processing Systems 34.
Slack, Dylan, Sophie Hilgard, Emily Jia, Sameer Singh, and Himabindu Lakkaraju. 2020. “Fooling Lime and Shap: Adversarial Attacks on Post Hoc Explanation Methods.” In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, 180–86.
Smith, Richard. 1997. “Authorship Is Dying: Long Live Contributorship: The BMJ Will Publish Lists of Contributors and Guarantors to Original Articles.” Bmj. British Medical Journal Publishing Group.
Spooner, Thomas, Danial Dervovic, Jason Long, Jon Shepard, Jiahao Chen, and Daniele Magazzeni. 2021. “Counterfactual Explanations for Arbitrary Regression Models.” https://arxiv.org/abs/2106.15212.
Steinberg, Brooke. 2023. “I fell in love with an AI chatbot — she rejected me sexually.” https://nypost.com/2023/04/03/40-year-old-man-falls-in-love-with-ai-chatbot-phaedra/.
Stutz, David, Krishnamurthy Dvijotham, Ali Taylan Cemgil, and Arnaud Doucet. 2022. “Learning Optimal Conformal Classifiers.” https://arxiv.org/abs/2110.09192.
Sundararajan, Mukund, Ankur Taly, and Qiqi Yan. 2017. “Axiomatic Attribution for Deep Networks.” https://arxiv.org/abs/1703.01365.
Szegedy, Christian, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, and Rob Fergus. 2014. “Intriguing Properties of Neural Networks.” https://arxiv.org/abs/1312.6199.
Tank, Aytekin. 2017. “This Is the Year of the Machine Learning Revolution.” Edited by Entrepreneur Magazine. January 12, 2017. https://www.entrepreneur.com/leadership/this-is-the-year-of-the-machine-learning-revolution/287324.
Teh, Yee Whye, Max Welling, Simon Osindero, and Geoffrey E. Hinton. 2003. “Energy-Based Models for Sparse Overcomplete Representations.” J. Mach. Learn. Res. 4: 1235–60.
Teney, Damien, Ehsan Abbasnejad, and Anton van den Hengel. 2020. “Learning What Makes a Difference from Counterfactual Examples and Gradient Supervision.” In Computer Vision - ECCV 2020, 580–99. Berlin, Heidelberg: Springer-Verlag. https://doi.org/10.1007/978-3-030-58607-2_34.
Tolomei, Gabriele, Fabrizio Silvestri, Andrew Haines, and Mounia Lalmas. 2017. “Interpretable Predictions of Tree-Based Ensembles via Actionable Feature Tweaking.” In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 465–74. https://doi.org/10.1145/3097983.3098039.
Touvron, Hugo, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, et al. 2023. “LLaMA: Open and Efficient Foundation Language Models.” https://arxiv.org/abs/2302.13971.
Trinh, Trieu H., Yuhuai Wu, Quoc V. Le, He He, and Thang Luong. 2024. “Solving Olympiad Geometry Without Human Demonstrations.” Nature 625: 476–82. https://doi.org/10.1038/s41586-023-06747-5.
Upadhyay, Sohini, Shalmali Joshi, and Himabindu Lakkaraju. 2021. “Towards Robust and Reliable Algorithmic Recourse.” Advances in Neural Information Processing Systems 34: 16926–37.
Ustun, Berk, Alexander Spangher, and Yang Liu. 2019. “Actionable Recourse in Linear Classification.” In Proceedings of the Conference on Fairness, Accountability, and Transparency, 10–19. https://doi.org/10.1145/3287560.3287566.
Van Prooijen, Jan-Willem, Karen M Douglas, and Clara De Inocencio. 2018. “Connecting the dots: Illusory pattern perception predicts belief in conspiracies and the supernatural.” European Journal of Social Psychology 48 (3): 320–35.
Vardi, Moshe Y. 2018. “Vardi’s Insights: Move Fast and Break Things.” Communications of the ACM 61 (9): 7. https://doi.org/10.1145/3244026.
Varshney, Kush R. 2022. Trustworthy Machine Learning. Chappaqua, NY, USA: Independently Published.
Venkatasubramanian, Suresh, and Mark Alfano. 2020. “The Philosophical Basis of Algorithmic Recourse.” In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, 284–93. FAT* ’20. New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/3351095.3372876.
Verma, Sahil, Varich Boonsanong, Minh Hoang, Keegan E. Hines, John P. Dickerson, and Chirag Shah. 2022. “Counterfactual Explanations and Algorithmic Recourses for Machine Learning: A Review.” https://arxiv.org/abs/2010.10596.
Wachter, Sandra, Brent Mittelstadt, and Chris Russell. 2017. “Counterfactual Explanations Without Opening the Black Box: Automated Decisions and the GDPR.” Harv. JL & Tech. 31: 841. https://doi.org/10.2139/ssrn.3063289.
Walker, Alexander C, Martin Harry Turpin, Jennifer A Stolz, Jonathan A Fugelsang, and Derek J Koehler. 2019. “Finding meaning in the clouds: Illusory pattern perception predicts receptivity to pseudo-profound bullshit.” Judgment and Decision Making 14 (2): 109–19.
Wang, Zhou, Eero P Simoncelli, and Alan C Bovik. 2003. “Multiscale Structural Similarity for Image Quality Assessment.” In The Thirty-Seventh Asilomar Conference on Signals, Systems & Computers, 2003, 2:1398–1402. IEEE.
Waytz, Adam, Nicholas Epley, and John T Cacioppo. 2010. “Social cognition unbound: Insights into anthropomorphism and dehumanization.” Current Directions in Psychological Science 19 (1): 58–62.
Weissburg, Iain Xie, Mehir Arora, Liangming Pan, and William Yang Wang. 2024. “Tweets to Citations: Unveiling the Impact of Social Media Influencers on AI Research Visibility.” arXiv Preprint arXiv:2401.13782.
Welling, Max, and Yee W Teh. 2011. “Bayesian Learning via Stochastic Gradient Langevin Dynamics.” In Proceedings of the 28th International Conference on Machine Learning (ICML-11), 681–88. Citeseer.
Widmer, Gerhard, and Miroslav Kubat. 1996. “Learning in the Presence of Concept Drift and Hidden Contexts.” Machine Learning 23 (1): 69–101. https://doi.org/10.1007/bf00116900.
Wilkinson, Mark D, Michel Dumontier, IJsbrand Jan Aalbersberg, Gabrielle Appleton, Myles Axton, Arie Baak, Niklas Blomberg, et al. 2016. “The FAIR Guiding Principles for Scientific Data Management and Stewardship.” Scientific Data 3 (1): 1–9.
Wilson, Andrew Gordon. 2020. “The Case for Bayesian Deep Learning.” https://arxiv.org/abs/2001.10995.
Wu, Tongshuang, Marco Tulio Ribeiro, Jeffrey Heer, and Daniel Weld. 2021. “Polyjuice: Generating Counterfactuals for Explaining, Evaluating, and Improving Models.” In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), edited by Chengqing Zong, Fei Xia, Wenjie Li, and Roberto Navigli, 6707–23. Online: ACL. https://doi.org/10.18653/v1/2021.acl-long.523.
Xiao, Han, Kashif Rasul, and Roland Vollgraf. 2017. “Fashion-MNIST: A Novel Image Dataset for Benchmarking Machine Learning Algorithms.” https://arxiv.org/abs/1708.07747.
Yeh, I-Cheng. 2016. “Default of Credit Card Clients.” UCI Machine Learning Repository.
Yeh, I-Cheng, and Che-hui Lien. 2009. “The Comparisons of Data Mining Techniques for the Predictive Accuracy of Probability of Default of Credit Card Clients.” Expert Systems with Applications 36 (2): 2473–80. https://doi.org/10.1016/j.eswa.2007.12.020.
Zečević, Matej, Moritz Willig, Devendra Singh Dhami, and Kristian Kersting. 2023. “Causal Parrots: Large Language Models May Talk Causality but Are Not Causal.” arXiv Preprint arXiv:2308.13067.
Zenil, Hector. 2024. “Curb The Enthusiasm.” https://www.linkedin.com/posts/zenil_google-deepmind-makes-breakthrough-in-solving-activity-7154157779136446464-Gvv-.
Zgraggen, Emanuel, Zheguang Zhao, Robert Zeleznik, and Tim Kraska. 2018. “Investigating the effect of the multiple comparisons problem in visual analysis.” In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, 1–12.
Zhang, Chiyuan, Samy Bengio, Moritz Hardt, Benjamin Recht, and Oriol Vinyals. 2021. “Understanding Deep Learning (Still) Requires Rethinking Generalization.” Commun. ACM 64 (3): 107–15. https://doi.org/10.1145/3446776.