---
language: en
license: mit
library_name: transformers
pipeline_tag: text-generation
tags:
  - text-generation
  - conversational
  - survey-response-generation
  - synthetic-data
  - fine-tuned
  - chatbot
---

# aryashah00/survey-finetuned-tinyllama-for-deployment

## Model Description

This model is a fine-tuned version of `TinyLlama/TinyLlama-1.1B-Chat-v1.0`, optimized for generating synthetic survey responses across multiple domains. It was instruction-tuned on a custom dataset of survey responses, each written from the perspective of a specific persona.

## Training Data

- **Dataset size:** ~3,000 examples
- **Domains:** 10 domains, including healthcare and education
- **Format:** ChatML instruction format with system and user prompts (illustrated below)
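
A single training example in this chat format might look roughly like the sketch below. The persona, question, and answer are invented for illustration, and the exact dataset schema is an assumption; only the system/user prompt structure mirrors the usage example further down.

```python
# Hypothetical illustration of one training record in chat-message form.
# Field names and contents are assumptions; only the role structure
# (system persona + user survey question + assistant answer) comes from this card.
example_record = {
    "messages": [
        {
            "role": "system",
            "content": "You are embodying the following persona: A night-shift nurse at a busy urban hospital.",
        },
        {
            "role": "user",
            "content": (
                "Survey Question: Rate on a scale of 1 to 5: I feel supported by my employer.\n\n"
                "Please provide your honest score on a scale of 1 to 5 and detailed reason for this score to this question."
            ),
        },
        {
            "role": "assistant",
            "content": "2 - Staffing shortages mean I rarely get the backup I need on overnight shifts.",
        },
    ]
}
```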

## Training Details

## Usage

This model is specifically designed for generating synthetic survey responses from different personas. It works best when provided with:

  1. A detailed persona description
  2. A specific survey question
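
For a quick check, the model can also be called through the `transformers` `pipeline` API. This is a minimal sketch rather than the card's reference example (the detailed CPU walkthrough follows below); the persona and question here are placeholders.

```python
from transformers import AutoTokenizer, pipeline

model_name = "aryashah00/survey-finetuned-tinyllama-for-deployment"
tokenizer = AutoTokenizer.from_pretrained(model_name)
generator = pipeline("text-generation", model=model_name, tokenizer=tokenizer)

# Placeholder persona and survey question for illustration
persona = "A retired teacher who volunteers at a local library."
question = "Rate on a scale of 1 (less likely) to 5 (extremely likely) for the following question: I enjoy helping others learn"

messages = [
    {"role": "system", "content": f"You are embodying the following persona: {persona}"},
    {"role": "user", "content": f"Survey Question: {question}\n\nPlease provide your honest score on a scale of 1 to 5 and detailed reason for this score to this question."},
]

# Build the prompt with the model's chat template, then return only the newly generated text
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
result = generator(prompt, max_new_tokens=256, do_sample=True, temperature=0.9, top_p=0.9, return_full_text=False)
print(result[0]["generated_text"])
```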

### Example: Inference on CPU

```python
import torch
import os
from transformers import AutoTokenizer, LlamaForCausalLM, LlamaConfig

# Force CPU usage by hiding all CUDA devices
os.environ["CUDA_VISIBLE_DEVICES"] = ""

# This is the fine-tuned repository itself; it ships standard (non-quantized) weights
model_name = "aryashah00/survey-finetuned-tinyllama-for-deployment"

# Load tokenizer
print("Loading tokenizer...")
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Load model with explicit CPU configuration
print("Loading model on CPU (this may take a while)...")
model = LlamaForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float32,
    device_map="cpu"
)

print(f"Model loaded successfully on: {next(model.parameters()).device}")

# Example persona and survey question
persona = "A caring mother who lost her first child due to a miscarriage."
question = "Rate on a scale of 1(less likely) to 5(extremely likely) for the following question: I deeply care about others"

# Format messages following chat template
messages = [
    {"role": "system", "content": f"You are embodying the following persona: {persona}"},
    {"role": "user", "content": f"Survey Question: {question}\n\nPlease provide your honest score on a scale of 1 to 5 and detailed reason for this score to this question."}
]

# Apply chat template
print("Preparing input...")
input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
input_ids = tokenizer(input_text, return_tensors="pt").input_ids

# Generate response
print("Generating response...")
with torch.no_grad():
    output_ids = model.generate(
        input_ids=input_ids,
        max_new_tokens=256,
        temperature=0.9,
        top_p=0.9,
        do_sample=True
    )

# Decode only the newly generated tokens (everything after the prompt);
# matching the prompt string in the decoded output is unreliable once
# special tokens are stripped, so slice by token position instead
generated_ids = output_ids[0][input_ids.shape[-1]:]
generated_response = tokenizer.decode(generated_ids, skip_special_tokens=True).strip()
print("\nGenerated response:\n", generated_response)
```

## Limitations

- The model is optimized for survey response generation and may not perform well on other tasks
- Response quality depends on the clarity and specificity of the persona and question
- The model may occasionally generate responses that don't fully align with the given persona

## License

This model follows the license of the base model TinyLlama/TinyLlama-1.1B-Chat-v1.0.