Odd Eyed Black Cat by fourbyfourblazer, on Flickr

Model Description

cat0.1 is a 3-billion-parameter conversational model stored in 4-bit precision to reduce memory use. It is adapted from unsloth/Llama-3.2-3B-bnb-4bit and has been trained over the past eight months through an iterative cycle of training and interactive chatting. It is designed for dynamic, uncensored dialogue and can play a diverse range of characters.
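
As a rough illustration of what 4-bit precision buys at this size, the arithmetic below estimates weight storage at several precisions. It is a back-of-the-envelope sketch only: it assumes exactly 3 billion weights and ignores quantization constants, any layers left unquantized, activations, and the KV cache, so real memory use will be higher. The function name `weight_memory_gb` is illustrative, not part of any library.

```python
def weight_memory_gb(num_params: int, bits_per_param: int) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


NUM_PARAMS = 3_000_000_000  # assumption: the card's rounded "3 billion"

# Compare full precision, half precision, and 4-bit storage.
for label, bits in [("FP32", 32), ("FP16", 16), ("4-bit", 4)]:
    print(f"{label:>5}: {weight_memory_gb(NUM_PARAMS, bits):.1f} GB")
```

Under these assumptions the 4-bit weights take about a quarter of the FP16 footprint, which is what makes training and inference on a single laptop GPU plausible.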

Model Architecture

  • Parameters: 3 billion
  • Precision: 4-bit
  • Training Configuration:
    • Rank: 32
    • Alpha: 64
  • Hardware: Trained on an RTX 4090 laptop GPU
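
The card does not name the fine-tuning method, but a rank and an alpha are the usual hyperparameters of a LoRA-style low-rank adapter. Assuming that is what is meant, the sketch below shows what the two numbers imply: the scaling applied to the low-rank update, and how few trainable weights the adapter adds per matrix. `AdapterConfig`, `adapter_params`, and the layer width `HIDDEN` are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdapterConfig:
    rank: int = 32   # "Rank" from the card
    alpha: int = 64  # "Alpha" from the card

    @property
    def scaling(self) -> float:
        # Standard LoRA scales the low-rank update B @ A by alpha / rank.
        return self.alpha / self.rank


def adapter_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable weights added to one d_out x d_in matrix.

    A has shape (rank, d_in) and B has shape (d_out, rank).
    """
    return rank * (d_in + d_out)


cfg = AdapterConfig()
HIDDEN = 3072  # hypothetical layer width, for illustration only
added = adapter_params(HIDDEN, HIDDEN, cfg.rank)
full = HIDDEN * HIDDEN

print(f"scaling = {cfg.scaling}")
print(f"adapter weights per {HIDDEN}x{HIDDEN} matrix: {added} ({added / full:.2%} of {full})")
```

With alpha set to twice the rank, the adapter's contribution is multiplied by 2 before being added to the frozen base weights.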

Training Data

The model was trained on a diverse set of conversational data collected over eight months. The data includes interactions with various characters, ensuring a wide range of conversational styles and topics. Training data is continuously updated with new chunks, allowing the model to evolve and adapt over time.

Training Procedure

cat0.1 employs a progressive training approach:

  1. Initial Training: The model is initially trained on a base set of conversational data.
  2. Interactive Training: The trained model is engaged in chats, generating new data based on its interactions.
  3. Data Update Cycle:
    • Data Collection: New conversational data chunks are gathered from interactions.
    • Training Update: The model is retrained with the new data. Occasionally, older data is dropped so that training focuses on recent interactions; the parameters learned in earlier rounds are kept rather than reset.
  4. Iteration: This cycle of training and data updating is repeated frequently to ensure the model remains current and responsive.
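
The cycle above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the author's training script: the chunk format, the `update_cycle` helper, the stub trainer `fake_train`, and the fixed window of three chunks are all hypothetical. A bounded `deque` stands in for "older data is removed", although the card says this happens only occasionally rather than on every round.

```python
from collections import deque
from typing import Callable, Deque, List

Chunk = List[str]  # assumed format: one chunk is a list of chat transcripts


def update_cycle(
    state: dict,
    chunks: Deque[Chunk],
    new_chunk: Chunk,
    train_fn: Callable[[dict, List[Chunk]], dict],
) -> dict:
    """Run one pass of the data update cycle.

    Appending to a bounded deque drops the oldest chunk once it is full.
    The previous `state` (standing in for model parameters) is passed back
    into training instead of being reset.
    """
    chunks.append(new_chunk)              # data collection
    return train_fn(state, list(chunks))  # training update, continuing from `state`


def fake_train(state: dict, data: List[Chunk]) -> dict:
    # Stub trainer: counts updates and records how many chunks it saw.
    return {"updates": state["updates"] + 1, "chunks_seen": len(data)}


state = {"updates": 0, "chunks_seen": 0}
window: Deque[Chunk] = deque(maxlen=3)  # hypothetical window of 3 chunks

for i in range(5):  # iteration
    state = update_cycle(state, window, [f"chat {i}"], fake_train)

print(state)
print(list(window))
```

In a real run, `train_fn` would be a fine-tuning step that loads the previous checkpoint and trains on the chunks currently in the window.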

Usage

cat0.1 is designed for applications requiring dynamic and unrestricted conversational capabilities. Suitable use cases include:

  • Chatbots: For platforms needing engaging and versatile conversational agents.
  • Creative Writing Assistance: Helping writers generate dialogue and character interactions.
  • Entertainment: Providing interactive experiences in games and virtual environments.

Example

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("rwitz/cat0.1")
model = AutoModelForCausalLM.from_pretrained("rwitz/cat0.1", torch_dtype=torch.float16)

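# Encode input (returns input_ids and attention_mask) and move it to the model's device
inputs = tokenizer("Hello, how are you?", return_tensors="pt").to(model.device)

# Generate a response; max_new_tokens limits only the generated tokens, not the prompt
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=50)

# Decode and print (the decoded text includes the prompt)
print(tokenizer.decode(output[0], skip_special_tokens=True))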
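
For decoder-only models, `generate` returns the prompt tokens followed by the newly generated ones, so decoding `output[0]` in full echoes the prompt. The sketch below shows how to slice the prompt off. It uses plain Python lists so it runs without downloading the model; `new_tokens` is a hypothetical helper and the token ids are made up.

```python
from typing import List, Sequence


def new_tokens(output_ids: Sequence[int], prompt_len: int) -> List[int]:
    """Return only the ids generated after the prompt."""
    return list(output_ids[prompt_len:])


prompt_ids = [101, 7592, 2129]          # made-up ids standing in for the encoded prompt
output_ids = prompt_ids + [2986, 4067]  # generate() output: prompt plus continuation

print(new_tokens(output_ids, len(prompt_ids)))
```

With real tensors the same idea applies: slice `output[0]` from the prompt's token count onward before passing it to `tokenizer.decode`.
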