metadata
language: en
license: mit
library_name: transformers
pipeline_tag: text-generation
tags:
- text-generation
- conversational
- survey-response-generation
- synthetic-data
- fine-tuned
- chatbot
aryashah00/survey-finetuned-tinyllama-for-deployment
Model Description
This model is a fine-tuned version of TinyLlama/TinyLlama-1.1B-Chat-v1.0 optimized for generating synthetic survey responses across multiple domains. It has been instruction-tuned using a custom dataset of survey responses, with each response reflecting a specific persona.
Training Data
- Dataset Size: ~3,000 examples
- Domains: 10 domains including healthcare, education, etc.
- Format: ChatML instruction format with system and user prompts
Training Details
- Base Model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
Usage
This model is specifically designed for generating synthetic survey responses from different personas. It works best when provided with:
- A detailed persona description
- A specific survey question
EXAMPLE Inference on CPU:
import torch
import os
from transformers import AutoTokenizer, LlamaForCausalLM, LlamaConfig
# Force CPU usage by hiding all CUDA devices
os.environ["CUDA_VISIBLE_DEVICES"] = ""
# Use a standard non-quantized TinyLlama model instead of the quantized one
model_name = "aryashah00/survey-finetuned-tinyllama-for-deployment" # Base model without quantization
# Load tokenizer
print("Loading tokenizer...")
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Load model with explicit CPU configuration
print("Loading model on CPU (this may take a while)...")
model = LlamaForCausalLM.from_pretrained(
model_name,
torch_dtype=torch.float32,
device_map="cpu"
)
print(f"Model loaded successfully on: {next(model.parameters()).device}")
# Example persona and survey question
persona = "A caring mother who lost her first child due to a miscarriage."
question = "Rate on a scale of 1(less likely) to 5(extremely likely) for the following question: I deeply care about others"
# Format messages following chat template
messages = [
{"role": "system", "content": f"You are embodying the following persona: {persona}"},
{"role": "user", "content": f"Survey Question: {question}\n\nPlease provide your honest score on a scale of 1 to 5 and detailed reason for this score to this question."}
]
# Apply chat template
print("Preparing input...")
input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
input_ids = tokenizer(input_text, return_tensors="pt").input_ids
# Generate response
print("Generating response...")
with torch.no_grad():
output_ids = model.generate(
input_ids=input_ids,
max_new_tokens=256,
temperature=0.9,
top_p=0.9,
do_sample=True
)
# Decode and print the response
output = tokenizer.decode(output_ids[0], skip_special_tokens=True)
# Extract the model's answer (remove the prompt)
response_start = output.find(input_text) + len(input_text)
generated_response = output[response_start:].strip()
print("\nGenerated response:\n", generated_response)
Limitations
- The model is optimized for survey response generation and may not perform well on other tasks
- Response quality depends on the clarity and specificity of the persona and question
- The model may occasionally generate responses that don't fully align with the given persona
License
This model follows the license of the base model TinyLlama/TinyLlama-1.1B-Chat-v1.0.