
This is a ~90M-parameter assistant model for camelid models such as LLaMA, Alpaca, Vicuna, and Guanaco that use the LLaMA tokenizer, allowing speedups of up to 3x with greedy sampling. It was trained on 5.5 billion tokens of RefinedWeb, uses the GPTBigCode architecture, and has a context window of 1024 tokens. To use it, see this article on assisted generation: https://huggingface.co/blog/assisted-generation.
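As a rough illustration of the workflow described in that article, the sketch below passes this model as the `assistant_model` argument to `generate` in the `transformers` library, with sampling disabled so decoding is greedy. The assistant ID is this repository's name; `TARGET_ID` is a hypothetical placeholder for whichever LLaMA-tokenizer checkpoint you want to accelerate, and the helper names `greedy_kwargs` and `assisted_generate` are made up for this example. It has not been benchmarked, so the actual speedup will depend on the target model and hardware.

```python
# Minimal sketch of assisted generation, assuming a recent version of
# transformers that supports the `assistant_model` argument to generate().

ASSISTANT_ID = "epinnock/protylopus"
# Hypothetical placeholder: replace with a real LLaMA-tokenizer checkpoint.
TARGET_ID = "path/to/llama-family-model"


def greedy_kwargs(max_new_tokens=64):
    """Return generation settings for greedy decoding (no sampling)."""
    return {"do_sample": False, "max_new_tokens": max_new_tokens}


def assisted_generate(prompt, target_id=TARGET_ID,
                      assistant_id=ASSISTANT_ID, max_new_tokens=64):
    """Generate text from the target model, using the assistant as a drafter."""
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # The target's tokenizer is used for both models; this card states that
    # the assistant shares the LLaMA tokenizer.
    tokenizer = AutoTokenizer.from_pretrained(target_id)
    target = AutoModelForCausalLM.from_pretrained(target_id)
    assistant = AutoModelForCausalLM.from_pretrained(assistant_id)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = target.generate(
        **inputs,
        assistant_model=assistant,
        **greedy_kwargs(max_new_tokens),
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `assisted_generate("Once upon a time")` after setting `TARGET_ID` to a real checkpoint should return the decoded continuation. Keep the prompt plus generated tokens within the assistant's 1024-token context window.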

Dataset used to train epinnock/protylopus