nielsr HF Staff committed on
Commit 5c3a0db · verified · 1 Parent(s): 6e1abd5

Improve model card: Add paper and benchmark GitHub links

Hi there!

This PR improves the model card for `SpaceOm` by enhancing its connection to the research paper and related code:

- Adds the `paper: 2506.07966` metadata tag to link the model directly to the [SpaCE-10: A Comprehensive Benchmark for Multimodal Large Language Models in Compositional Spatial Intelligence](https://huggingface.co/papers/2506.07966) paper on the Hugging Face Hub, improving discoverability.
- Adds a prominent link to the SpaCE-10 paper and its associated GitHub repository (https://github.com/Cuzyoung/SpaCE-10) at the top of the model card content. This provides immediate context about the model's evaluation benchmark.

These changes help researchers and users quickly understand the model's origin and the context of its performance on the SpaCE-10 benchmark.
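As a rough illustration of the first change, the sketch below inserts a top-level key into a model card's YAML front matter using plain string handling. The helper name `add_front_matter_key` and the sample card text are hypothetical and not part of this PR; the sketch only assumes the front matter is delimited by `---` lines, as in the README diffed below.

```python
def add_front_matter_key(card_text: str, key: str, value: str) -> str:
    """Insert `key: value` as a top-level entry in a card's YAML front matter.

    Hypothetical helper for illustration only; it does no YAML parsing and
    relies solely on the '---' delimiters around the front matter.
    """
    lines = card_text.split("\n")
    if not lines or lines[0].strip() != "---":
        raise ValueError("model card has no YAML front matter")

    # Locate the closing '---' delimiter of the front matter.
    end = None
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            end = i
            break
    if end is None:
        raise ValueError("front matter is not terminated")

    # Leave the card untouched if the top-level key is already present.
    if any(line.startswith(f"{key}:") for line in lines[1:end]):
        return card_text

    # Add the new entry just before the closing delimiter.
    lines.insert(end, f"{key}: {value}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Assumed, shortened card text; the real README has many more fields.
    sample_card = "---\nlanguage:\n- en\n---\n\n# SpaceOm\n"
    print(add_front_matter_key(sample_card, "paper", "2506.07966"))
```

Appending the key before the closing delimiter keeps the existing entries in their original order; the PR itself places the key next to `pipeline_tag`, which is equivalent as far as YAML is concerned.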

Files changed (1)
  1. README.md +26 -204
README.md CHANGED
@@ -1,8 +1,14 @@
  ---
- task_categories:
- - visual-question-answering
+ base_model:
+ - remyxai/SpaceOm
+ datasets:
+ - remyxai/SpaceThinker
  language:
  - en
+ library_name: llama.cpp
+ license: apache-2.0
+ pipeline_tag: image-text-to-text
+ paper: 2506.07966
  tags:
  - gguf
  - remyx
@@ -16,14 +22,9 @@ tags:
  - vision-language
  - distance-estimation
  - quantitative-spatial-reasoning
+ task_categories:
+ - visual-question-answering
  pretty_name: SpaceOm-GGUF
- license: apache-2.0
- datasets:
- - remyxai/SpaceThinker
- base_model:
- - remyxai/SpaceOm
- pipeline_tag: image-text-to-text
- library_name: llama.cpp
  model-index:
  - name: SpaceOm
  results:
@@ -35,230 +36,51 @@ model-index:
  type: benchmark
  metrics:
  - type: success_rate
- name: Overall Success Rate
  value: 0.5419
- results_by_subcategory:
- - name: 3D Positional Relation / Orientation
- success_rate: 0.4877
- - name: Object Localization / 3D Localization
- success_rate: 0.6337
- - name: Object Properties / Size
- success_rate: 0.5043
- - task:
- type: visual-question-answering
- name: Spatial Reasoning
- dataset:
- name: BLINK
- type: benchmark
- metrics:
- - type: success_rate
  name: Overall Success Rate
- value: 0.599
- results_by_subcategory:
- - name: 3D Positional Relation / Orientation
- success_rate: 0.7972
- - name: Counting / Object Counting
- success_rate: 0.6167
- - name: Depth and Distance / Relative
- success_rate: 0.621
- - name: Object Localization / 2D Localization
- success_rate: 0.582
- - name: Point and Object Tracking / Point Correspondence
- success_rate: 0.3779
- - task:
- type: visual-question-answering
- name: Spatial Reasoning
- dataset:
- name: MMIU
- type: benchmark
- metrics:
  - type: success_rate
+ value: 0.599
  name: Overall Success Rate
- value: 0.388
- results_by_subcategory:
- - name: Camera and Image Transformation / 2D Transformation
- success_rate: 0.255
- - name: Camera and Image Transformation / 3D Camera Pose
- success_rate: 0.4
- - name: Camera and Image Transformation / Camera Motion
- success_rate: 0.4436
- - name: Depth and Distance / Absolute
- success_rate: 0.265
- - name: Object Localization / 3D Localization
- success_rate: 0.3625
- - name: Point and Object Tracking / 3D Tracking
- success_rate: 0.725
- - name: Point and Object Tracking / Point Correspondence
- success_rate: 0.265
- - task:
- type: visual-question-answering
- name: Spatial Reasoning
- dataset:
- name: MMVP
- type: benchmark
- metrics:
  - type: success_rate
+ value: 0.388
  name: Overall Success Rate
- value: 0.5833
- results_by_subcategory:
- - name: Others / Miscellaneous
- success_rate: 0.5833
- - task:
- type: visual-question-answering
- name: Spatial Reasoning
- dataset:
- name: QSpatialBench-Plus
- type: benchmark
- metrics:
  - type: success_rate
+ value: 0.5833
  name: Overall Success Rate
- value: 0.4455
- results_by_subcategory:
- - name: Depth and Distance / Absolute
- success_rate: 0.4455
- - task:
- type: visual-question-answering
- name: Spatial Reasoning
- dataset:
- name: QSpatialBench-ScanNet
- type: benchmark
- metrics:
  - type: success_rate
+ value: 0.4455
  name: Overall Success Rate
- value: 0.4876
- results_by_subcategory:
- - name: Depth and Distance / Absolute
- success_rate: 0.464
- - name: Object Properties / Size
- success_rate: 0.5111
- - task:
- type: visual-question-answering
- name: Spatial Reasoning
- dataset:
- name: RealWorldQA
- type: benchmark
- metrics:
  - type: success_rate
+ value: 0.4876
  name: Overall Success Rate
- value: 0.6105
- results_by_subcategory:
- - name: Others / Miscellaneous
- success_rate: 0.6105
- - task:
- type: visual-question-answering
- name: Spatial Reasoning
- dataset:
- name: SpatialSense
- type: benchmark
- metrics:
  - type: success_rate
+ value: 0.6105
  name: Overall Success Rate
- value: 0.7043
- results_by_subcategory:
- - name: 3D Positional Relation / Orientation
- success_rate: 0.7043
- - task:
- type: visual-question-answering
- name: Spatial Reasoning
- dataset:
- name: VGBench
- type: benchmark
- metrics:
  - type: success_rate
+ value: 0.7043
  name: Overall Success Rate
- value: 0.3504
- results_by_subcategory:
- - name: Camera and Image Transformation / 2D Transformation
- success_rate: 0.2568
- - name: Camera and Image Transformation / 3D Camera Pose
- success_rate: 0.4371
- - name: Depth and Distance / Absolute
- success_rate: 0.3339
- - name: Depth and Distance / Relative
- success_rate: 0.32
- - name: Object Localization / 3D Localization
- success_rate: 0.4283
- - name: Point and Object Tracking / 3D Tracking
- success_rate: 0.3264
- - task:
- type: visual-question-answering
- name: Spatial Reasoning
- dataset:
- name: VSI-Bench_8
- type: benchmark
- metrics:
  - type: success_rate
+ value: 0.3504
  name: Overall Success Rate
- value: 0.2558
- results_by_subcategory:
- - name: 3D Positional Relation / Orientation
- success_rate: 0.3998
- - name: Counting / Object Counting
- success_rate: 0.229
- - name: Depth and Distance / Absolute
- success_rate: 0.1562
- - name: Depth and Distance / Relative
- success_rate: 0.3648
- - name: Object Properties / Size
- success_rate: 0.1645
- - name: Others / Miscellaneous
- success_rate: 0.2204
- - task:
- type: visual-question-answering
- name: Spatial Reasoning
- dataset:
- name: VSR-ZeroShot
- type: benchmark
- metrics:
  - type: success_rate
+ value: 0.2558
  name: Overall Success Rate
- value: 0.8085
- results_by_subcategory:
- - name: 3D Positional Relation / Orientation
- success_rate: 0.8085
- - task:
- type: visual-question-answering
- name: Spatial Reasoning
- dataset:
- name: cvbench
- type: benchmark
- metrics:
  - type: success_rate
+ value: 0.8085
  name: Overall Success Rate
- value: 0.6839
- results_by_subcategory:
- - name: Counting / Object Counting
- success_rate: 0.6294
- - name: Depth and Distance / Relative
- success_rate: 0.7408
- - name: Object Localization / 3D Localization
- success_rate: 0.6815
- - task:
- type: visual-question-answering
- name: Spatial Reasoning
- dataset:
- name: spatialbench
- type: benchmark
- metrics:
  - type: success_rate
+ value: 0.6839
  name: Overall Success Rate
+ - type: success_rate
  value: 0.6553
- results_by_subcategory:
- - name: 3D Positional Relation / Orientation
- success_rate: 0.6765
- - name: Counting / Object Counting
- success_rate: 0.75
- - name: Object Properties / Existence
- success_rate: 0.925
- - name: Object Properties / Reachability
- success_rate: 0.55
- - name: Object Properties / Size
- success_rate: 0.375
-
+ name: Overall Success Rate
  ---

  # SpaceOm

+ This model is evaluated in the paper [SpaCE-10: A Comprehensive Benchmark for Multimodal Large Language Models in Compositional Spatial Intelligence](https://huggingface.co/papers/2506.07966).
+ The code for the SpaCE-10 benchmark is available at: https://github.com/Cuzyoung/SpaCE-10.
+
  **Model creator:** [remyxai](https://huggingface.co/remyxai)<br>
  **Original model**: [SpaceOm](https://huggingface.co/remyxai/SpaceOm)<br>
  **GGUF quantization:** `llama.cpp` commit [2baf07727f921d9a4a1b63a2eff941e95d0488ed](https://github.com/ggerganov/llama.cpp/tree/2baf07727f921d9a4a1b63a2eff941e95d0488ed)<br>