Just an EZ way to collect papers on HF
- Reducing Transformer Key-Value Cache Size with Cross-Layer Attention
  Paper • 2405.12981 • Published • 32
- TinyR1-32B-Preview: Boosting Accuracy with Branch-Merge Distillation
  Paper • 2503.04872 • Published • 14
- FFN Fusion: Rethinking Sequential Computation in Large Language Models
  Paper • 2503.18908 • Published • 17