TILOS-HDSI Seminar: ComPO: Preference Alignment via Comparison Oracles

HDSI 123 and Virtual, 3234 Matthews Ln, La Jolla

Tianyi Lin, Columbia University

Direct alignment methods are increasingly used to align large language models (LLMs) with human preferences. However, these methods suffer from likelihood displacement, which can be driven by noisy preference pairs that induce similar likelihoods for the preferred and dispreferred responses. To address this issue, we consider derivative-free optimization based on […]
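The abstract mentions derivative-free optimization driven by comparison oracles. As a rough illustration of the general idea (not the talk's ComPO method itself), the sketch below minimizes a function using only pairwise "which point is better?" queries: sample a random direction, ask the oracle which perturbation is preferred, and step that way, yielding a sign-based gradient estimate. All names here (`comparison_oracle`, `compare_based_descent`) and the hyperparameters are illustrative assumptions.

```python
import numpy as np

def comparison_oracle(f, x, y):
    # Illustrative oracle: returns +1 if x is preferred (lower loss)
    # than y, else -1. Only this binary signal is used -- no gradients.
    return 1 if f(x) < f(y) else -1

def compare_based_descent(f, x0, steps=500, delta=0.1, lr=0.05, seed=0):
    # Derivative-free descent using only pairwise comparisons:
    # each step queries the oracle on two opposite perturbations
    # and moves toward the preferred one (a sign-based estimate
    # of the negative gradient direction).
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        u = rng.standard_normal(x.shape)
        u /= np.linalg.norm(u)          # random unit direction
        s = comparison_oracle(f, x + delta * u, x - delta * u)
        x += lr * s * u                 # step toward the preferred side
    return x

# Usage: minimize a quadratic using comparisons only.
x_star = compare_based_descent(lambda v: np.sum(v ** 2), x0=np.ones(5))
```

On this toy quadratic the iterate drifts into a small neighborhood of the origin whose radius scales with the step size `lr`, since near the optimum each comparison still reliably identifies the better side.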