Sebastian Stavar

sebastavar

sebastianstavar

AI & ML interests

Text Generation & Chat Assistants; Model Compression & Quantization (Q4/Q6/Q8, gs32); Inference & Serving (on-prem, low-latency); RAG / Retrieval; Agents & Tool Use; Distillation / LoRA / Fine-tuning

Recent Activity

new activity 14 days ago

halley-ai/gpt-oss-20b-MLX-4bit-gs32: UPDATE (Aug 22, 2025): Prefer the 5-bit build | gpt-oss-20B, MLX 4-bit (gs=32) for Apple Silicon

new activity 15 days ago

halley-ai/README: GPT-OSS-20B on a 16 GB Mac (MLX): Why Q3 Quantization Didn't Work, and What I Recommend

new activity 15 days ago

halley-ai/gpt-oss-20b-MLX-5bit-gs32: GPT-OSS-20B, MLX Q5 (gs=32): practical on 24–32 GB Macs, much closer to Q8 than Q4


Organizations

models 0

None public yet

datasets 0

None public yet