sebastavar committed on
Commit dc772ca · verified · 1 Parent(s): b39556f

Update README.md

Files changed (1)
  1. README.md +4 -2
README.md CHANGED
@@ -62,7 +62,7 @@ Throughput varies with Mac model, context, and sampler settings.
 
 ## Evaluation
 
-Perplexity (PPL) on a small internal text corpus using the base tokenizer.
+Perplexity (PPL) streaming evaluation on WikiText-2; window=stride=4096, ~100k tokens, EOS inserted between docs.
 <table>
 <thead>
 <tr><th>Variant</th><th>PPL (ctx=4096)</th></tr>
@@ -73,7 +73,9 @@ Perplexity (PPL) on a small internal text corpus using the base tokenizer.
 <tr><td>MLX 4-bit (gs=32)</td><td>13.70 (+27.4% vs 8-bit/gs64, +31.0% vs 6-bit/gs32)</td></tr>
 </tbody>
 </table>
-Note: Small, domain-specific eval for quick sanity; not a benchmark suite.
+Interpretation:
+- MLX 6-bit/gs32 edges out MLX 8-bit/gs64 slightly (better quality at lower footprint).
+- MLX 4-bit/gs32 shows a meaningful drop in quality; fine for tight memory, but expect more errors.
 
 ## Conversion details (provenance)
 
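
Reviewer note on the new evaluation wording: "window=stride=4096" with "EOS inserted between docs" suggests a non-overlapping, streaming perplexity computation. The sketch below is one plausible reading of that description, not the author's actual script. The names `build_stream`, `windowed_perplexity`, `relative_change`, and the `score_window` callable are hypothetical; `score_window` stands in for whatever model call returns the log-probability of each token in a window given the tokens before it.

```python
import math
from typing import Callable, Iterable, List, Optional, Sequence


def build_stream(docs: Iterable[Sequence[int]], eos_id: int) -> List[int]:
    """Concatenate tokenized documents, inserting one EOS token between them."""
    stream: List[int] = []
    for i, doc in enumerate(docs):
        if i > 0:
            stream.append(eos_id)
        stream.extend(doc)
    return stream


def windowed_perplexity(
    stream: Sequence[int],
    score_window: Callable[[Sequence[int]], List[float]],
    window: int = 4096,
    max_tokens: Optional[int] = None,
) -> float:
    """Perplexity over non-overlapping windows (stride equal to window).

    score_window(chunk) is assumed to return len(chunk) - 1 natural-log
    probabilities: log p(chunk[i] | chunk[:i]) for i = 1 .. len(chunk) - 1.
    """
    if max_tokens is not None:
        stream = stream[:max_tokens]
    total_nll = 0.0
    scored = 0
    for start in range(0, len(stream), window):
        chunk = stream[start:start + window]
        if len(chunk) < 2:
            continue  # a single token has no context to be scored against
        logprobs = score_window(chunk)
        total_nll -= sum(logprobs)
        scored += len(logprobs)
    if scored == 0:
        raise ValueError("no tokens were scored")
    return math.exp(total_nll / scored)


def relative_change(ppl: float, reference: float) -> float:
    """Percent change of ppl relative to reference (positive means worse)."""
    return (ppl / reference - 1.0) * 100.0
```

Because the windows do not overlap, the first token of each window is never scored and early tokens in a window see little context, so this style of evaluation tends to report a slightly higher PPL than a sliding-window variant. The percentages in the table row shown in the diff are the kind of figure `relative_change` would produce when given two variants' PPL values.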