Published 30 January 2025 at 1920 × 1080 in Transformers learn in-context by (functional) gradient descent