TILOS-HDSI Seminar: Incentivizing Emergent Behaviors for LLMs via Reinforcement Learning

Qualcomm Conference Center, Room B (Jacobs Hall, first floor), 9736 Engineers Ln, La Jolla, and virtual

Yi Wu, Tsinghua University

Abstract: Reinforcement Learning (RL) has become a powerful post-training method for eliciting advanced behaviors in large language models (LLMs). This talk presents recent results showing how RL can incentivize the emergence of LLM capabilities across three domains: (1) the multi-player deduction game Werewolf, where RL-trained LLM agents develop strategic behaviors and outperform […]