πTraffic-R1-3B (Public 0.1) π¦
Traffic-R1 is a foundational LLM built specifically for traffic signal control. This publicly available version, Traffic-R1-3B (Public 0.1), delivers superior zero-shot performance and stable generalization, allowing it to reason like a human traffic expert. π§
This model is a checkpoint based on the research in our paper:
Traffic-R1: Reinforced LLMs Bring Human-Like Reasoning to Traffic Signal Control Systems π https://arxiv.org/abs/2508.02344
Introduction Video
Abstract
Traffic signal control (TSC) is vital for mitigating congestion and sustaining urban mobility. In this paper, we introduce Traffic-R1, a foundation model with human-like reasoning for TSC systems. Our model is developed through self-exploration and iteration of reinforced large language models (LLMs) with expert guidance in a simulated traffic environment. Compared to traditional reinforcement learning (RL) and recent LLM-based methods, Traffic-R1 offers three significant advantages. First, Traffic-R1 delivers zero-shot generalisation, transferring unchanged to new road networks and out-of-distribution incidents by utilizing its internal traffic control policies and human-like reasoning. Second, its 3B-parameter architecture is lightweight enough for real-time inference on mobile-class chips, enabling large-scale edge deployment. Third, Traffic-R1 provides an explainable TSC process and facilitates multi-intersection communication through its self-iteration and a new synchronous communication network. Extensive benchmarks demonstrate that Traffic-R1 sets a new state of the art, outperforming strong baselines and training-intensive RL controllers. In practice, the model now manages signals for more than 55,000 drivers daily, shortening average queues by over 5% and halving operator workload.
Compatibility & Reproducibility π οΈ
This model supports a wide range of deployment methods compatible with the Qwen architecture, including those provided by the transformers
library. You can easily use it in a chat mode to interactively discuss traffic-related scenarios.
For more detailed information on deployment, please refer to the official Qwen documentation.
The model is compatible with the signal control evaluation code provided by LLMLight [https://github.com/usail-hkust/LLMTSCS]. You can quickly reproduce our results with minor changes to the prompt format.
A big thanks to these excellent projects! π
Future Releases π
We plan to release our evaluation code (most necessary) and training code soon.
We are working on upgrading base mode Qwen 2.5->Qwen 3 for latest features.
Important Notice β οΈ
This is an earlier checkpoint and doesn't include all the data samples from our offline pretraining stage. We've done this to address commercial and privacy concerns. We will release updates as the model continues to be upgraded internally. π
Model tree for Season998/Traffic-R1
Base model
Qwen/Qwen2.5-3B