R1-distilled leaderboard: Generate a leaderboard of open-r1 models based on evaluation scores