arxiv:2506.18787

3D Arena: An Open Platform for Generative 3D Evaluation

Published on Jun 23, 2025 · Submitted by dylanebert on Jun 24, 2025
Authors: Dylan Ebert

AI-generated summary

3D Arena evaluates generative 3D models using human preferences, revealing insights into visual and textural features' impact on quality.

Abstract

Evaluating Generative 3D models remains challenging due to misalignment between automated metrics and human perception of quality. Current benchmarks rely on image-based metrics that ignore 3D structure or geometric measures that fail to capture perceptual appeal and real-world utility. To address this gap, we present 3D Arena, an open platform for evaluating image-to-3D generation models through large-scale human preference collection using pairwise comparisons. Since launching in June 2024, the platform has collected 123,243 votes from 8,096 users across 19 state-of-the-art models, establishing the largest human preference evaluation for Generative 3D. We contribute the iso3d dataset of 100 evaluation prompts and demonstrate quality control achieving 99.75% user authenticity through statistical fraud detection. Our ELO-based ranking system provides reliable model assessment, with the platform becoming an established evaluation resource. Through analysis of this preference data, we present insights into human preference patterns. Our findings reveal preferences for visual presentation features, with Gaussian splat outputs achieving a 16.6 ELO advantage over meshes and textured models receiving a 144.1 ELO advantage over untextured models. We provide recommendations for improving evaluation methods, including multi-criteria assessment, task-oriented evaluation, and format-aware comparison. The platform's community engagement establishes 3D Arena as a benchmark for the field while advancing understanding of human-centered evaluation in Generative 3D.
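The abstract describes an ELO-based ranking built from pairwise preference votes but does not give the update rule, K-factor, or starting rating. The snippet below is a minimal sketch of how a standard sequential Elo update could fold such votes into model ratings; the K-factor of 32, the starting rating of 1000, and the model and vote names are hypothetical, not taken from the paper.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that model A is preferred over model B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, winner: str, k: float = 32.0):
    """Apply one pairwise preference vote; winner is 'a' or 'b'."""
    e_a = expected_score(rating_a, rating_b)
    s_a = 1.0 if winner == "a" else 0.0
    new_a = rating_a + k * (s_a - e_a)
    new_b = rating_b + k * ((1.0 - s_a) - (1.0 - e_a))
    return new_a, new_b


# Toy usage: fold a stream of (model_a, model_b, winner) votes into ratings.
ratings = {"model_x": 1000.0, "model_y": 1000.0}
votes = [("model_x", "model_y", "a"),
         ("model_x", "model_y", "b"),
         ("model_x", "model_y", "a")]
for a, b, w in votes:
    ratings[a], ratings[b] = update_elo(ratings[a], ratings[b], w)
print(ratings)
```

Under this reading, the reported gaps (e.g., a 144.1 ELO advantage for textured over untextured models) correspond to sizeable differences in expected win rate between formats; the platform's actual rating procedure may differ in its parameters and aggregation details.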

Community

dylanebert (paper author, paper submitter)

A year after launching, 3D Arena now has an official paper!

This is a great start, but:

  • it shouldn’t compare textured with untextured models; that’s very hard to evaluate fairly (or it should at least show both untextured in that case)
  • it’s missing some key models, especially Hunyuan3D-2.1 and 2.5 (2.5 is sadly not open source, but very strong)


Models citing this paper 0

Datasets citing this paper 0

Spaces citing this paper 1

Collections including this paper 3