shubhrapandit committed · Commit 7a1c418 · verified · 1 Parent(s): ac8d74d

Update README.md

Files changed (1)
  1. README.md +63 -1
README.md CHANGED
@@ -277,6 +277,37 @@ The following performance benchmarks were conducted with [vLLM](https://docs.vll
  </tr>
  </thead>
  <tbody style="text-align: center">
+ <tr>
+ <th rowspan="3" valign="top">A6000x1</th>
+ <th>Qwen/Qwen2.5-VL-7B-Instruct</th>
+ <td></td>
+ <td>4.9</td>
+ <td>912</td>
+ <td>3.2</td>
+ <td>1386</td>
+ <td>3.1</td>
+ <td>1431</td>
+ </tr>
+ <tr>
+ <th>neuralmagic/Qwen2.5-VL-7B-Instruct-quantized.w8a8</th>
+ <td>1.50</td>
+ <td>3.6</td>
+ <td>1248</td>
+ <td>2.1</td>
+ <td>2163</td>
+ <td>2.0</td>
+ <td>2237</td>
+ </tr>
+ <tr>
+ <th>neuralmagic/Qwen2.5-VL-7B-Instruct-quantized.w4a16</th>
+ <td>2.05</td>
+ <td>3.3</td>
+ <td>1351</td>
+ <td>1.4</td>
+ <td>3252</td>
+ <td>1.4</td>
+ <td>3321</td>
+ </tr>
  <tr>
  <th rowspan="3" valign="top">A100x1</th>
  <th>Qwen/Qwen2.5-VL-7B-Instruct</th>
@@ -371,9 +402,40 @@ The following performance benchmarks were conducted with [vLLM](https://docs.vll
  </tr>
  </thead>
  <tbody style="text-align: center">
+ <tr>
+ <th rowspan="3" valign="top">A6000x1</th>
+ <th>Qwen/Qwen2.5-VL-7B-Instruct</th>
+ <td></td>
+ <td>0.4</td>
+ <td>1837</td>
+ <td>1.5</td>
+ <td>6846</td>
+ <td>1.7</td>
+ <td>7638</td>
+ </tr>
+ <tr>
+ <th>neuralmagic/Qwen2.5-VL-7B-Instruct-quantized.w8a8</th>
+ <td>1.41</td>
+ <td>0.5</td>
+ <td>2297</td>
+ <td>2.3</td>
+ <td>10137</td>
+ <td>2.5</td>
+ <td>11472</td>
+ </tr>
+ <tr>
+ <th>neuralmagic/Qwen2.5-VL-7B-Instruct-quantized.w4a16</th>
+ <td>1.60</td>
+ <td>0.4</td>
+ <td>1828</td>
+ <td>2.7</td>
+ <td>12254</td>
+ <td>3.4</td>
+ <td>15477</td>
+ </tr>
  <tr>
  <th rowspan="3" valign="top">A100x1</th>
- <th>Qwen/Qwen2.5-VL-7B-Instruct-quantized.</th>
+ <th>Qwen/Qwen2.5-VL-7B-Instruct</th>
  <td></td>
  <td>0.7</td>
  <td>1347</td>