Domain2Vec: Vectorizing Datasets to Find the Optimal Data Mixture without Training Paper • 2506.10952 • Published 3 days ago • 21