TILOS-HDSI Seminar: Engineering Interpretable and Faithful AI Systems
René Vidal, University of Pennsylvania
Abstract: Large Language Models (LLMs) and Vision-Language Models (VLMs) have achieved remarkable performance across a wide range of tasks. However, their growing deployment has exposed fundamental limitations in faithfulness, safety, and transparency. In this talk, I will present a unified perspective on addressing these challenges through principled model interventions […]