AlphaOne

Reasoning Models Thinking Slow and Fast at Test Time

Junyu Zhang (UIUC), Runpei Dong (UIUC), Han Wang (UIUC), Xuying Ning (UIUC),
Haoran Geng (UCB), Peihao Li (UCB), Xialin He (UIUC), Yutong Bai (UCB), Jitendra Malik (UCB),
Saurabh Gupta (UIUC), Huan Zhang (UIUC)
UIUC: University of Illinois Urbana-Champaign; UCB: UC Berkeley
Equal contribution

Overview


We present AlphaOne (𝛼1), a universal framework for modulating reasoning progress in large reasoning models (LRMs) at test time. 𝛼1 first introduces the 𝛼 moment, which represents the thinking phase scaled by a universal parameter 𝛼. Within this scaled phase, before the 𝛼 moment, it dynamically schedules slow-thinking transitions by modeling the insertion of reasoning transition tokens as a Bernoulli stochastic process. After the 𝛼 moment, 𝛼1 deterministically terminates slow thinking with the end-of-thinking token, thereby fostering fast reasoning and efficient answer generation.
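
The two phases described above can be illustrated with a small, model-free sketch. It is a simplified illustration rather than the authors' implementation, and it rests on several assumptions that are not stated on this page: the 𝛼 moment is taken to be a token budget equal to 𝛼 times an average thinking length (`avg_think_len`), the transition token is the string `"wait"`, insertion is only considered after a `"\n\n"` step break, the end-of-thinking token is `"</think>"`, and the insertion probability follows a linearly decaying schedule. All function and variable names are hypothetical.

```python
import random

WAIT = "wait"            # assumed slow-thinking transition token
END_THINK = "</think>"   # assumed end-of-thinking token
STEP_BREAK = "\n\n"      # assumed position where a transition may be inserted


def alpha_moment(alpha, avg_think_len):
    """Assumed definition: token budget of the scaled thinking phase."""
    return int(alpha * avg_think_len)


def linear_anneal(t, t_alpha):
    """Assumed schedule: insertion probability decays linearly from 1 to 0."""
    return max(0.0, 1.0 - t / t_alpha)


def modulate(tokens, alpha, avg_think_len, schedule=linear_anneal, rng=None):
    """Replay a token stream under slow-then-fast modulation.

    Before the alpha moment, each step break triggers a Bernoulli trial that
    may insert a transition token. Once the output reaches the alpha moment,
    the end-of-thinking token is emitted and thinking stops.
    """
    rng = rng or random.Random(0)
    t_alpha = alpha_moment(alpha, avg_think_len)
    out = []
    for tok in tokens:
        t = len(out)
        if t >= t_alpha:
            # Post-alpha moment: deterministic termination of slow thinking.
            out.append(END_THINK)
            break
        out.append(tok)
        # Pre-alpha moment: Bernoulli(p_t) insertion of a transition token.
        if tok == STEP_BREAK and rng.random() < schedule(t, t_alpha):
            out.append(WAIT)
    return out


if __name__ == "__main__":
    stream = ["a", "\n\n", "b", "\n\n", "c"]
    print(modulate(stream, alpha=1.0, avg_think_len=4))
```

In this sketch a larger 𝛼 lengthens the phase in which transition tokens can be inserted, while 𝛼 = 0 ends thinking immediately. A real system would apply the same logic inside the decoding loop of a reasoning model instead of replaying a fixed token list.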

This approach unifies and generalizes existing monotonic scaling methods by enabling flexible and dense slow-to-fast reasoning modulation, while offering critical insights into the joint optimization of reasoning capabilities and computational efficiency.

Figure 1. Overview of AlphaOne (𝛼1).

BibTeX

        @article{AlphaOne25,
          title={AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time},
          author={Zhang, Junyu and Dong, Runpei and Wang, Han and Ning, Xuying and Geng, Haoran and Li, Peihao and He, Xialin and Bai, Yutong and Malik, Jitendra and Gupta, Saurabh and Zhang, Huan},
          journal={arXiv preprint arXiv:2505.24863},
          year={2025}
        }