AI & ML interests
Run open-source LLMs locally across CPUs and GPUs, in Rust and Wasm, without changing the binary!
Organization Card
Run open-source LLMs and create OpenAI-compatible API services for the Llama2 series of LLMs locally with LlamaEdge!
Give it a try
Run a single command in your terminal:
bash <(curl -sSfL 'https://raw.githubusercontent.com/LlamaEdge/LlamaEdge/main/run-llm.sh') --interactive
Follow the on-screen instructions to install the WasmEdge Runtime and download your favorite open-source LLM, then choose whether to chat with the model via the CLI or a web UI.
See it in action | GitHub | Docs
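Because the LlamaEdge API server speaks the OpenAI chat-completions protocol, any OpenAI-style client payload works against it. Below is a minimal sketch in Python; the endpoint URL, port 8080, and the model name are assumptions for illustration — check the LlamaEdge docs for the values your server actually uses.

```python
import json
from urllib import request

# Assumed local endpoint; LlamaEdge's API server exposes an
# OpenAI-compatible /v1/chat/completions route. Port 8080 is an assumption.
API_URL = "http://localhost:8080/v1/chat/completions"


def build_chat_request(model: str, prompt: str) -> bytes:
    """Build an OpenAI-style chat-completion request body."""
    body = {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
    }
    return json.dumps(body).encode("utf-8")


def ask(model: str, prompt: str) -> str:
    """POST the request to a running LlamaEdge server and return the reply text."""
    req = request.Request(
        API_URL,
        data=build_chat_request(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        reply = json.load(resp)
    # OpenAI-style responses carry the text under choices[0].message.content.
    return reply["choices"][0]["message"]["content"]


# With a server running, you would call, e.g.:
#   ask("llama-2-7b-chat", "What is WasmEdge?")
```

The same payload shape works from curl or any OpenAI SDK pointed at the local base URL.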
Why?
LlamaEdge, powered by Rust and WasmEdge, provides a strong alternative to Python for AI inference.
- Lightweight. The total runtime size is 30 MB.
- Fast. Full native speed on GPUs.
- Portable. A single cross-platform binary runs on different CPUs, GPUs, and OSes.
- Secure. Sandboxed and isolated execution on untrusted devices.
- Container-ready. Supported in Docker, containerd, Podman, and Kubernetes.
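As a sketch of the container story: Docker Desktop's containerd-based Wasm integration can run a Wasm app image directly under the WasmEdge runtime shim. The image name below is purely illustrative (not a published LlamaEdge image), and the feature must be enabled in Docker first — see the LlamaEdge and Docker docs for the exact images and setup.

```shell
# Sketch only: requires Docker's Wasm (containerd image store) integration.
# "example/wasm-llm-app" is a hypothetical image name for illustration.
docker run --rm \
  --runtime=io.containerd.wasmedge.v1 \
  --platform=wasi/wasm \
  example/wasm-llm-app:latest
```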
Learn more
Visit the LlamaEdge project for details.
Models (284)

second-state/SmolLM3-3B-GGUF · Text Generation · 3B · 179 downloads
second-state/gemma-3n-E4B-it-GGUF · Image-Text-to-Text · 7B · 421 downloads
second-state/gemma-3n-E2B-it-GGUF · Image-Text-to-Text · 4B · 573 downloads
second-state/gemma-3-1b-it-GGUF · Text Generation · 1.0B · 1.31k downloads
second-state/DeepSeek-R1-0528-Qwen3-8B-GGUF · Text Generation · 8B · 103 downloads
second-state/Osmosis-Structure-0.6B-GGUF · 0.6B · 1.25k downloads · 5 likes
second-state/SmolVLM2-2.2B-Instruct-GGUF · Image-Text-to-Text · 2B · 1.31k downloads · 2 likes
second-state/SmolVLM2-500M-Video-Instruct-GGUF · Image-Text-to-Text · 0.4B · 289 downloads
second-state/SmolVLM2-256M-Video-Instruct-GGUF · Image-Text-to-Text · 0.2B · 424 downloads
second-state/SeaLLMs-Audio-7B-GGUF · Audio-Text-to-Text · 8B · 64 downloads

Datasets (0)

None public yet.