TILOS-HDSI Seminar: ComPO: Preference Alignment via Comparison Oracles
Tianyi Lin, Columbia University

Direct alignment methods are increasingly used to align large language models (LLMs) with human preferences. However, these methods suffer from likelihood displacement, which can be driven by noisy preference pairs that induce similar likelihoods for the preferred and dis-preferred responses. To address this issue, we consider derivative-free optimization based on […]
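To make the idea of derivative-free optimization with comparison oracles concrete, here is a minimal sketch, assuming a standard comparison-based update rule (sample a random direction, let the oracle compare the two candidate points, and keep a move only if it improves on the current point). This is an illustration of the general technique, not the specific ComPO algorithm from the talk; the function names and step size are hypothetical.

```python
import random
import math

def comparison_oracle(f, x, y):
    """Return +1 if x is preferred to y for minimization (f(x) < f(y)), else -1.

    In practice the oracle would come from preference feedback, not
    from evaluating f directly; f is used here only for illustration.
    """
    return 1 if f(x) < f(y) else -1

def comparison_step(f, x, step=0.05, rng=random):
    """One derivative-free update using only pairwise comparisons."""
    # Sample a random unit direction u
    u = [rng.gauss(0.0, 1.0) for _ in x]
    norm = math.sqrt(sum(c * c for c in u)) or 1.0
    u = [c / norm for c in u]
    x_plus = [xi + step * ui for xi, ui in zip(x, u)]
    x_minus = [xi - step * ui for xi, ui in zip(x, u)]
    # Ask the oracle which candidate it prefers
    cand = x_plus if comparison_oracle(f, x_plus, x_minus) > 0 else x_minus
    # Accept the move only if the oracle prefers it to the current point
    return cand if comparison_oracle(f, cand, x) > 0 else x

# Usage: minimize a simple quadratic with comparison feedback only
f = lambda v: sum(c * c for c in v)
x = [2.0, -1.5]
rng = random.Random(0)
for _ in range(500):
    x = comparison_step(f, x, rng=rng)
```

Note that no gradient of `f` is ever computed; the optimizer sees only binary preferences between pairs of points, which is what makes such schemes attractive when the objective (here, human preference) is only accessible through comparisons.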