Demystifying AI Interpretability
- Date: Mar 28, 2025
- Time: 03:00 PM - 04:15 PM (Local Time Germany)
- Speaker: Federico Adolfi
- Host: MaxPlanckLaw
- Region: Online
- Topic: Discussion and debate formats, lectures

Throughout, we will use a particular lens to demystify what AI interpretability is, and which goals are within or out of its reach: instead of focusing on the promises of (algorithmic) solutions for interpretability, we will focus on the properties of the (computational) problems they attempt to solve. This lens—which we call computational meta-theory—will allow us to put stakeholders’ goals at the centre and to reason about the adequacy of interpretability ‘hammers’ to hit practically meaningful ‘nails’.
Federico Adolfi is currently a postdoctoral researcher at the Ernst Strüngmann Institute for Neuroscience, Max Planck Society. He combines a background in cognitive and brain science, computer science, and music. His PhD in Computational Cognitive Science at the University of Bristol focused on establishing a conceptual and formal framework for computational meta-theory and on demonstrating its application to problems in psychology, neuroscience, and artificial intelligence. One such application is the problem of AI interpretability, for which he and his colleagues recently provided the first formal analyses of the scope and limits of circuit discovery as a method for interpreting neural networks.