# Model Card for kora-2-2b-it
## Model Information
Summary description and brief definition of inputs and outputs.
### Description
Gemma models are text-to-text, decoder-only large language models, available in English, with open weights for both pre-trained and instruction-tuned variants. They are well suited to a variety of text-generation tasks, including question answering, summarization, and reasoning. Their relatively small size makes it possible to deploy them in environments with limited resources such as a laptop, a desktop, or your own cloud infrastructure, democratizing access to state-of-the-art AI models and helping foster innovation for everyone.
### Running with the pipeline API
```python
import torch
from transformers import BitsAndBytesConfig, pipeline

# Quantization options go in a BitsAndBytesConfig and are passed to the
# model through model_kwargs, not as top-level pipeline arguments.
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # load 4-bit integer weights
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantizer
    bnb_4bit_compute_dtype=torch.bfloat16,  # do matrix math in BF16
)

pipe = pipeline(
    "text-generation",
    model="premkumarkora/kora-2-2b-it",
    model_kwargs={"quantization_config": quantization_config},
    device_map="auto",  # place layers automatically on your GPU(s)
)

messages = [
    {"role": "user", "content": "Who are you? Please, answer in pirate-speak."},
]
outputs = pipe(messages, max_new_tokens=256)

# The pipeline returns the full chat history; the last message holds the
# assistant's reply.
assistant_response = outputs[0]["generated_text"][-1]["content"].strip()
print(assistant_response)
```
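
For tighter control over generation (sampling settings, streaming, decoding only the new tokens), the model can also be loaded directly with `AutoModelForCausalLM`. The sketch below is a minimal, unofficial example that assumes the repository ships a standard tokenizer and chat template, as Gemma instruction-tuned checkpoints do; the summarization prompt is just an illustration.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "premkumarkora/kora-2-2b-it"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision; use torch.float32 on CPU-only machines
    device_map="auto",
)

# Format the conversation with the model's own chat template.
messages = [
    {"role": "user", "content": "Summarize the plot of Treasure Island in two sentences."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

The same pattern works with the 4-bit setup above: pass `quantization_config=quantization_config` to `from_pretrained` instead of `torch_dtype`.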
- Developed by: PremKumar Kora