Instructions to use JosephusCheung/Guanaco with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use JosephusCheung/Guanaco with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("text-generation", model="JosephusCheung/Guanaco")
    messages = [
        {"role": "user", "content": "Who are you?"},
    ]
    pipe(messages)

    # Load model directly
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained("JosephusCheung/Guanaco")
    model = AutoModelForCausalLM.from_pretrained("JosephusCheung/Guanaco")
- Inference
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use JosephusCheung/Guanaco with vLLM:
Install from pip and serve model
    # Install vLLM from pip:
    pip install vllm

    # Start the vLLM server:
    vllm serve "JosephusCheung/Guanaco"

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:8000/v1/chat/completions" \
      -H "Content-Type: application/json" \
      --data '{
        "model": "JosephusCheung/Guanaco",
        "messages": [
          {
            "role": "user",
            "content": "What is the capital of France?"
          }
        ]
      }'
Use Docker
docker model run hf.co/JosephusCheung/Guanaco
- SGLang
How to use JosephusCheung/Guanaco with SGLang:
Install from pip and serve model
    # Install SGLang from pip:
    pip install sglang

    # Start the SGLang server:
    python3 -m sglang.launch_server \
      --model-path "JosephusCheung/Guanaco" \
      --host 0.0.0.0 \
      --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
      -H "Content-Type: application/json" \
      --data '{
        "model": "JosephusCheung/Guanaco",
        "messages": [
          {
            "role": "user",
            "content": "What is the capital of France?"
          }
        ]
      }'
Use Docker images
    docker run --gpus all \
      --shm-size 32g \
      -p 30000:30000 \
      -v ~/.cache/huggingface:/root/.cache/huggingface \
      --env "HF_TOKEN=<secret>" \
      --ipc=host \
      lmsysorg/sglang:latest \
      python3 -m sglang.launch_server \
        --model-path "JosephusCheung/Guanaco" \
        --host 0.0.0.0 \
        --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
      -H "Content-Type: application/json" \
      --data '{
        "model": "JosephusCheung/Guanaco",
        "messages": [
          {
            "role": "user",
            "content": "What is the capital of France?"
          }
        ]
      }'
- Docker Model Runner
How to use JosephusCheung/Guanaco with Docker Model Runner:
docker model run hf.co/JosephusCheung/Guanaco
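The curl calls in the vLLM and SGLang entries can also be made from Python. The following is a minimal sketch using only the standard library, assuming a server is already running locally on one of the ports shown above. The helper names `build_chat_request` and `ask` are hypothetical, and the reply is assumed to follow the usual OpenAI-style chat completion shape (`choices[0].message.content`), which the page itself does not show.

```python
import json
import urllib.request

MODEL_ID = "JosephusCheung/Guanaco"


def build_chat_request(prompt, port=8000, host="localhost", model=MODEL_ID):
    """Build (without sending) a POST request for /v1/chat/completions.

    Hypothetical helper: it mirrors the JSON body of the curl examples.
    """
    url = f"http://{host}:{port}/v1/chat/completions"
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, port=8000, timeout=60):
    """Send the request and return the text of the first reply.

    Assumes an OpenAI-style response body; needs a running server.
    """
    request = build_chat_request(prompt, port=port)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Example call (requires a running server; not executed here):
#   ask("What is the capital of France?", port=8000)   # vLLM example port
#   ask("What is the capital of France?", port=30000)  # SGLang example port
```

Keeping request construction separate from sending makes it possible to inspect the URL and body without a live server.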
Commit History
Update README.md e044a62
Update README.md 677ae12
sharded 8156219
Upload folder using huggingface_hub 56ab3e7
Update README.md 99059d1
Update README.md 5299c5a
Update README.md 8344205
Update README.md 6152b94
Update README.md e8960c6
Update README.md 06bc253
Update README.md fb4c874
Update README.md d36a7cb
Update README.md a8e4008
Update README.md 4d44ce2
Update README.md e233149
Update README.md 7331bed
Update README.md 1750b03
Update README.md 09ac929
Update README.md 46f3d34
Update README.md 56c2d12
Update README.md baeb9de
Update README.md c098edc
Upload StupidBanner.png 50b998f
Update README.md c6eb301
Update README.md bf8d464
Update README.md 32ac3d9
Update README.md 2ba4e2a
Update README.md da299b0
Update README.md 083a022
Create README.md 714a052
Upload 5 files 77c61d2
Upload pytorch_model.bin fffbfd1
Upload 6 files 3f19ef4
initial commit 5cfdb0d
Joseph Cheung committed on