Update README.md
README.md
CHANGED
@@ -55,7 +55,7 @@ print(generate(
 
 ## Performance (Apple Silicon, real-world)
 
-LM Studio and CLI (MLX, Q6 gs32): ~
+LM Studio and CLI (MLX, Q6 gs32): ~49–55 tok/s, TTFB ~0.35–0.45 s (2k-token responses)
 - tested on on M1 Max 32 GB (short runs show lower t/s due to startup overhead)
 
 Throughput varies with Mac model, context, and sampler settings.
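
The README's note that short runs show lower t/s can be illustrated with a rough arithmetic sketch. It assumes a simplified model in which startup overhead is a fixed delay added to steady-state generation time. The function name `effective_tok_per_s` and the inputs (50 tok/s, 0.4 s, 50 and 2000 tokens) are hypothetical values chosen to sit inside the ranges quoted in the diff; they are not the author's measurement method or data.

```python
def effective_tok_per_s(n_tokens: int, steady_tok_per_s: float, overhead_s: float) -> float:
    """Average throughput over a whole run, including a fixed startup delay.

    Assumed model (not from the README): total time = overhead + tokens / steady rate.
    """
    generation_s = n_tokens / steady_tok_per_s
    return n_tokens / (overhead_s + generation_s)


# Hypothetical inputs: 50 tok/s steady state, 0.4 s startup overhead.
long_run = effective_tok_per_s(2000, 50.0, 0.4)   # 2000 / 40.4 s, about 49.5 tok/s
short_run = effective_tok_per_s(50, 50.0, 0.4)    # 50 / 1.4 s, about 35.7 tok/s

print(f"2000-token run: {long_run:.1f} tok/s")
print(f"50-token run:   {short_run:.1f} tok/s")
```

Under this model the same fixed delay costs a 50-token run far more of its average throughput than a 2k-token run, which is consistent with quoting the figures for 2k-token responses.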