This collection provides high-quality, large-scale Romanian pretraining datasets derived from FineWeb-2.