# Model Card for shavera/starcoder2-15b-w4-autoawq-gemm

This is an int4 AWQ (GEMM) quantized checkpoint of bigcode/starcoder2-15b. It requires about 10 GB of VRAM.
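The checkpoint can also be loaded directly with transformers; a minimal sketch, assuming the autoawq package is installed alongside transformers and a CUDA GPU with roughly 10 GB of free VRAM is available (the prompt and generation settings are only placeholders):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "shavera/starcoder2-15b-w4-autoawq-gemm"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# The W4 weights stay quantized; the autoawq GEMM kernels handle them at runtime
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```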

## Running this Model

Run via Docker with text-generation-inference:

```shell
docker run --gpus all --shm-size 64g -p 8080:80 -v ~/.cache/huggingface:/data \
    ghcr.io/huggingface/text-generation-inference:3.1.0 \
    --model-id shavera/starcoder2-15b-w4-autoawq-gemm
```
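Once the container is up, it exposes the standard text-generation-inference HTTP API on port 8080; a minimal query sketch in Python (the prompt and parameters are only placeholders):

```python
import requests

# TGI's /generate endpoint takes a prompt plus generation parameters
resp = requests.post(
    "http://127.0.0.1:8080/generate",
    json={
        "inputs": "def fibonacci(n):",
        "parameters": {"max_new_tokens": 64},
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["generated_text"])
```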