|
--- |
|
license: apache-2.0 |
|
datasets: |
|
- wikimedia/wikipedia |
|
- custom |
|
language: |
|
- en |
|
pipeline_tag: text-generation |
|
--- |
|
|
|
## Description |
|
This is "Lemnos" , a new Instruction Tuned model based on the Llama 2 model architecture. |
|
|
|
It was trained on a general Wikipedia corpus and then fine-tuned on a custom instruction dataset.
|
|
|
It is intended only as an experimental release ahead of a new version that will also support Greek.
|
|
|
## Usage
|
|
|
Prerequisite packages:

- transformers

- accelerate

- bitsandbytes (0.43.1)
|
|
|
Minimum environment: a T4 GPU (the free-tier Google Colab T4 should run it fine),

or simply run this Colab notebook end to end (make sure you select the T4 GPU runtime):
|
https://colab.research.google.com/drive/1lp-JygPxsaQp-NdB7Mh_uVVYeIIXcAlt?usp=sharing |
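
If you are running outside the linked notebook, a quick sanity check (not part of the original setup, just a convenience) that a CUDA GPU such as the T4 is visible:

```python
import torch

# Confirm a CUDA-capable GPU (e.g. a T4) is visible before loading the model
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"GPU: {props.name}, {props.total_memory / 1024**3:.1f} GiB")
else:
    print("No GPU detected; 4-bit loading with bitsandbytes will not work on CPU.")
```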
|
|
|
Note: since this is a 7B-parameter model stored in FP32, it can take some time to load all the safetensors.
|
An alternative 4-bit quantized version will be uploaded soon.
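
In the meantime, the usage example below loads the model in 4-bit via bitsandbytes. If your GPU has roughly 14 GB of memory or more, you could instead load the weights in half precision; this is a minimal sketch, not an officially tested configuration:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

hub_model = 'gsar78/Lemnos_it_en_v2'

tokenizer = AutoTokenizer.from_pretrained(hub_model, trust_remote_code=True)

# Cast the FP32 checkpoint to float16 at load time (~2 bytes per parameter)
model = AutoModelForCausalLM.from_pretrained(
    hub_model,
    torch_dtype=torch.float16,
    trust_remote_code=True,
    device_map="auto",
)
```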
|
|
|
```bash
# Upgrade in case bitsandbytes is already installed
pip install transformers accelerate bitsandbytes -U

# or, from a Colab cell
!pip install transformers accelerate bitsandbytes -U
```
|
|
|
```python |
|
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
import torch

# Specify the model hub
hub_model = 'gsar78/Lemnos_it_en_v2'

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(hub_model, trust_remote_code=True)

# Configure the BitsAndBytesConfig for 4-bit quantization
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type='nf4',
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=False
)

# Load the model with the specified configuration
model = AutoModelForCausalLM.from_pretrained(
    hub_model,
    quantization_config=bnb_config,
    trust_remote_code=True,
    device_map="auto"
)

# Function to generate text based on a prompt using the Alpaca format
def generate_text(prompt, max_length=512):
    # Format the prompt according to the Alpaca format
    alpaca_prompt = f"### Instruction:\n{prompt}\n\n### Response:\n"

    # Tokenize the input prompt
    inputs = tokenizer(alpaca_prompt, return_tensors="pt").to(model.device)

    # Generate text using the model
    outputs = model.generate(
        input_ids=inputs['input_ids'],
        attention_mask=inputs['attention_mask'],
        max_length=max_length,
        num_return_sequences=1,
        pad_token_id=tokenizer.eos_token_id
    )

    # Decode the generated tokens to text
    generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)

    # Remove the prompt part from the output to get only the response
    response = generated_text[len(alpaca_prompt):]

    return response


# Example question
prompt = "What are the three basic colors?"
generated_text = generate_text(prompt)
print(generated_text)

# Output:
# Red, blue, and yellow.
|
``` |
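
The example above uses greedy decoding. For more varied responses you could enable sampling; the variant below reuses the `tokenizer` and `model` loaded above, and the sampling hyperparameters are illustrative rather than values recommended for this model:

```python
def generate_text_sampled(prompt, max_new_tokens=256):
    # Same Alpaca-style prompt as above, but with sampling enabled
    alpaca_prompt = f"### Instruction:\n{prompt}\n\n### Response:\n"
    inputs = tokenizer(alpaca_prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,      # sample instead of greedy decoding
        temperature=0.7,     # illustrative values, not tuned for Lemnos
        top_p=0.9,
        pad_token_id=tokenizer.eos_token_id,
    )
    generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return generated_text[len(alpaca_prompt):]


print(generate_text_sampled("Suggest a name for a sailing boat."))
```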