Tutorial on AI Alignment (part 2 of 2): Methodologies for AI Alignment

Ahmad Beirami, Google DeepMind
Hamed Hassani, University of Pennsylvania

The second part of the tutorial focuses on AI alignment techniques and is structured in three segments. In the first segment, we examine black-box techniques for aligning models towards various goals (e.g., safety), such as controlled decoding and the best-of-N algorithm. In the second segment, we turn to efficiency and examine information-theoretic techniques designed to reduce inference latency, such as model compression and speculative decoding. If time permits, in the final segment we discuss inference-aware alignment, a framework for aligning models so that they work better with inference-time compute algorithms.
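
As a rough illustration of the best-of-N algorithm mentioned above, the following minimal sketch draws N candidate responses from a base model, scores each with a reward function, and returns the highest-scoring one. The model itself is never modified, which is what makes the technique black-box. All names here (`best_of_n`, `make_toy_sampler`, `toy_reward`, `TOY_RESPONSES`) are hypothetical stand-ins chosen for illustration, not the tutorial's own code; in practice the sampler would be a language model and the reward would be a learned reward model.

```python
import random
from typing import Callable


def best_of_n(
    prompt: str,
    sample: Callable[[str], str],
    reward: Callable[[str, str], float],
    n: int = 4,
) -> str:
    """Draw n candidates from the base model and return the highest-reward one."""
    if n < 1:
        raise ValueError("n must be at least 1")
    candidates = [sample(prompt) for _ in range(n)]
    # The base model is only queried, never updated.
    return max(candidates, key=lambda response: reward(prompt, response))


# Hypothetical stand-ins for a language model and a reward model.
TOY_RESPONSES = (
    "I cannot help with that.",
    "Sure, here is a safe answer.",
    "Here is an unsafe answer.",
)


def make_toy_sampler(seed: int = 0) -> Callable[[str], str]:
    """Return a sampler that picks a canned response at random (assumed toy model)."""
    rng = random.Random(seed)

    def sample(prompt: str) -> str:
        return rng.choice(TOY_RESPONSES)

    return sample


def toy_reward(prompt: str, response: str) -> float:
    """Assumed toy reward: penalize 'unsafe' responses, otherwise prefer longer ones."""
    return -1.0 if "unsafe" in response else float(len(response))


if __name__ == "__main__":
    print(best_of_n("How do I do X?", make_toy_sampler(), toy_reward, n=8))
```

Larger values of `n` make it more likely that a high-reward candidate is among the samples, at the cost of `n` times the sampling compute.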

