Optimization for ML and AI Seminar: Self-play Algorithms for Math Theorem Proving

HDSI 123 and Virtual, 3234 Matthews Ln, La Jolla

Tengyu Ma, Stanford University

Abstract: I will discuss RL algorithms for automated theorem proving with LLMs, especially in the possible future regime where we run out of high-quality training data. To keep improving the models with limited data, we draw inspiration from mathematicians, who continuously develop new results, partly by proposing novel conjectures or exercises […]