TILOS-HDSI Seminar: Incentivizing Emergent Behaviors for LLMs via Reinforcement Learning

Yi Wu, Tsinghua University
Abstract: Reinforcement Learning (RL) has become a powerful post-training method for eliciting advanced behaviors in large language models (LLMs). This talk presents recent results showing how RL can incentivize the emergence of LLM capabilities across three domains: (1) the multi-player deduction game Werewolf, where RL-trained LLM agents develop strategic behaviors and outperform strong human players; (2) agentic search, where large-scale RL enables a 32B model to run multi-step search and answer non-trivial questions, surpassing commercial baselines; and (3) efficient reasoning, where RL mitigates over-thinking and improves both reliability and compute efficiency.
The papers can be found at:
- Werewolf: https://arxiv.org/abs/2310.18940 (ICML24), https://arxiv.org/abs/2502.04686 (ICML25)
- ASearcher: https://arxiv.org/abs/2508.07976
- Thinking Efficiency: https://www.arxiv.org/abs/2506.07104 (NeurIPS25)
All of these projects were trained using our large-scale agentic RL system, AReaL, which is open source at https://github.com/
Yi Wu is an assistant professor at the Institute for Interdisciplinary Information Sciences (IIIS), Tsinghua University. He obtained his Ph.D. from UC Berkeley and was a researcher at OpenAI from 2019 to 2020. His research focuses on reinforcement learning, multi-agent learning, and LLM agents. His representative works include the value iteration network, the MADDPG and MAPPO algorithms, OpenAI’s hide-and-seek project, and the AReaL project. He received the best paper award at NIPS 2016, was a best demo award finalist at ICRA 2024, and received the MIT TR35 Asia Pacific 2025 award.