model_info:
  name: anemll-Qwen3-1.7B-MLX-dequantized-ctx1024
  version: 0.3.3
  description: |
    Demonstrates running Qwen3-1.7B-MLX-dequantized on Apple Neural Engine
    Context length: 1024
    Batch size: 64
    Chunks: 1
  license: MIT
  author: Anemll
  framework: Core ML
  language: Python
  architecture: qwen3
  parameters:
    context_length: 1024
    batch_size: 64
    lut_embeddings: none
    lut_ffn: 6
    lut_lmhead: 8
    num_chunks: 1
    model_prefix: qwen
    embeddings: qwen_embeddings_lut8.mlmodelc
    lm_head: qwen_lm_head_lut8.mlmodelc
    ffn: qwen_FFN_PF_lut6.mlmodelc
    split_lm_head: 16