---
library_name: transformers
tags:
- text-generation
- conversational
- instruction-tuned
- 4-bit precision
- bitsandbytes
---

# Rishi-2-2B-IT

**Model ID:** `korarishi/rishi-2-2b-it`

## Model Information

Summary description and brief definition of inputs and outputs.

## Description

Rishi-2-2B-IT is a text-to-text, decoder-only large language model, available in English, with open weights for both pre-trained and instruction-tuned variants. It is suited for a variety of text generation tasks, including question answering, summarization, and reasoning. Its compact size allows deployment in limited-resource environments such as laptops, desktops, or private cloud infrastructure, democratizing access to state-of-the-art AI models.

## Running with the pipeline API

```python
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="korarishi/rishi-2-2b-it",
    model_kwargs={"torch_dtype": torch.bfloat16},
    device="cuda",  # replace with "mps" to run on a Mac device
)

messages = [
    {"role": "user", "content": "Who are you? Please, answer in pirate-speak."},
]

outputs = pipe(messages, max_new_tokens=256)
assistant_response = outputs[0]["generated_text"][-1]["content"].strip()
print(assistant_response)
```

## Running on single / multi GPU

```bash
# pip install accelerate
```

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

tokenizer = AutoTokenizer.from_pretrained("korarishi/rishi-2-2b-it")
model = AutoModelForCausalLM.from_pretrained(
    "korarishi/rishi-2-2b-it",
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

input_text = "Write me a poem about Machine Learning."
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")

outputs = model.generate(**input_ids, max_new_tokens=32)
print(tokenizer.decode(outputs[0]))
```

## Chat template usage

```python
messages = [
    {"role": "user", "content": "Write me a poem about Cars."},
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # append the assistant turn marker before generating
    return_tensors="pt",
    return_dict=True,
).to("cuda")

outputs = model.generate(**input_ids, max_new_tokens=256)
print(tokenizer.decode(outputs[0]))
```

Developed by: [korarishi](https://huggingface.co/korarishi)
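
## Running with 4-bit quantization (bitsandbytes)

The model card's tags list 4-bit precision and bitsandbytes, so the model can also be loaded with on-the-fly 4-bit quantization to reduce memory use. The snippet below is a minimal sketch using `BitsAndBytesConfig` from Transformers; it assumes the `bitsandbytes` package is installed and a CUDA GPU is available, and the specific quantization settings shown are illustrative rather than a configuration published with this repository.

```python
# pip install bitsandbytes accelerate
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

# Illustrative 4-bit settings; adjust to your hardware and accuracy needs.
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained("korarishi/rishi-2-2b-it")
model = AutoModelForCausalLM.from_pretrained(
    "korarishi/rishi-2-2b-it",
    device_map="auto",
    quantization_config=quantization_config,
)

input_text = "Write me a poem about Machine Learning."
input_ids = tokenizer(input_text, return_tensors="pt").to(model.device)

outputs = model.generate(**input_ids, max_new_tokens=32)
print(tokenizer.decode(outputs[0]))
```

Quantized weights are created at load time here, so the first load takes longer; memory savings come at a small cost in generation quality compared with bfloat16.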