How to convert the model to a gguf model?
Hi Mads,
I'm playing around and would like to convert it to GGUF, but I've given up for now :D (I'm a newbie). I came to the conclusion that it's based on Llama and tried the llama.cpp convert script, but it failed. Any pointers?
Figured it out. I was on the right track; some corrupted files were just messing with me.
Can you share some details on how to convert it to gguf? Would like to try the model in GPT4ALL.
Hi @pksorensen and @tintwotin
Sorry about the late reply; I have not been getting any notifications about the discussions.
To convert to gguf, please use llama.cpp: https://github.com/ggerganov/llama.cpp
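For anyone landing here later, this is a rough sketch of what the conversion can look like, not an official recipe. It assumes you have cloned llama.cpp, installed its Python requirements, and downloaded the model files to a local folder. The script name `convert_hf_to_gguf.py` and the `--outfile` / `--outtype` flags match recent llama.cpp checkouts, but older checkouts used different script names, so check your copy. The helper names `check_model_dir` and `build_convert_command` and the folder names are made up for illustration. The sanity check is there because corrupted or incomplete downloads were what broke the conversion earlier in this thread.

```python
import sys
from pathlib import Path


def check_model_dir(model_dir):
    """Return a list of problems found in a local Hugging Face model folder."""
    model_dir = Path(model_dir)
    problems = []
    if not (model_dir / "config.json").is_file():
        problems.append("config.json is missing")
    weights = list(model_dir.glob("*.safetensors")) + list(model_dir.glob("*.bin"))
    if not weights:
        problems.append("no *.safetensors or *.bin weight files found")
    # A zero-byte weight file usually means an interrupted download.
    problems += [f"{w.name} is empty" for w in weights if w.stat().st_size == 0]
    return problems


def build_convert_command(llama_cpp_dir, model_dir, outfile, outtype="f16"):
    """Build the argv list for llama.cpp's HF-to-GGUF conversion script.

    Assumption: the script is named convert_hf_to_gguf.py in your checkout.
    """
    script = Path(llama_cpp_dir) / "convert_hf_to_gguf.py"
    return [
        sys.executable,
        str(script),
        str(model_dir),
        "--outfile", str(outfile),
        "--outtype", outtype,
    ]


# Hypothetical usage (paths are placeholders):
#   import subprocess
#   problems = check_model_dir("danskgpt-tiny-chat")
#   if not problems:
#       cmd = build_convert_command("llama.cpp", "danskgpt-tiny-chat",
#                                   "danskgpt-tiny-chat-f16.gguf")
#       subprocess.run(cmd, check=True)
```

If the check reports problems, re-downloading the model (for example with `huggingface_hub.snapshot_download(repo_id="mhenrichsen/danskgpt-tiny-chat")`) before converting is probably the first thing to try.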
Remember to use the correct prompt format when chatting with the model :)
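The thread does not spell out the prompt format, so the snippet below is only a sketch that assumes a ChatML-style layout (`<|im_start|>` / `<|im_end|>` markers). Please confirm against the model card or the tokenizer's own chat template before relying on it. The function name `format_chatml` is made up for illustration.

```python
def format_chatml(messages, add_generation_prompt=True):
    """Render chat messages as a ChatML-style prompt string.

    Assumption: the model expects ChatML; verify this on the model card.
    """
    parts = []
    for message in messages:
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = format_chatml([{"role": "user", "content": "Hvem er du?"}])
print(prompt)
```

When running the original model through Transformers, `tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)` should give the authoritative format, so it is worth comparing its output with whatever prompt you feed the GGUF file in llama.cpp or GPT4All.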