  • Optimization for ML and AI Seminar: Fantastic Pretraining Optimizers and Where to Find Them

    HDSI 123 and Virtual, 3234 Matthews Ln, La Jolla, CA, United States

    Tengyu Ma, Stanford. Abstract: AdamW has long been the dominant optimizer in language model pretraining, despite numerous claims that alternative optimizers offer 1.4x to 2x speedups. We posit that two methodological shortcomings have obscured fair comparisons and hindered practical adoption: (i) unequal hyperparameter tuning and (ii) limited or misleading evaluation setups. To address these two […]

  • TILOS-HDSI Seminar: ComPO: Preference Alignment via Comparison Oracles

    HDSI 123 and Virtual, 3234 Matthews Ln, La Jolla, CA, United States

    Tianyi Lin, Columbia University. Abstract: Direct alignment methods are increasingly used to align large language models (LLMs) with human preferences. However, these methods suffer from likelihood displacement, which can be driven by noisy preference pairs that induce similar likelihoods for preferred and dis-preferred responses. To address this issue, we consider doing derivative-free optimization based on […]

  • TILOS-HDSI Seminar with Andrej Risteski (Carnegie Mellon)

    HDSI 123 and Virtual, 3234 Matthews Ln, La Jolla, CA, United States

    Title and abstract TBA. Andrej Risteski is an Associate Professor in the Machine Learning Department at Carnegie Mellon University. Prior to that, he was a Norbert Wiener Research Fellow jointly in the Applied Math department and IDSS at MIT. Dr. Risteski received his PhD from the Computer Science Department at Princeton University under the advisement […]