MistralPirate-7b-v2
This model card describes MistralPirate-7b-v2, an advanced language model specifically fine-tuned for generating coherent and accurate pirate-themed content. This model represents a significant improvement over its predecessor, leveraging the OpenHermes 2.5 base model and a substantially expanded and structured dataset.
Model Description
- Developed by: phanerozoic
- License: cc-by-nc-4.0
- Finetuned from model: OpenHermes 2.5
Direct Use
MistralPirate-7b-v2 excels in generating pirate dialect and is ideal for applications in interactive storytelling, gaming, educational content, and conversational AI where pirate-themed language is desired.
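As a rough usage sketch, the model can be loaded with Hugging Face transformers and prompted in ChatML style (the format it was trained on, described below). The repository ID is assumed from the card's title and author and should be adjusted if it differs.

```python
# A minimal generation sketch. The repo ID "phanerozoic/MistralPirate-7b-v2"
# is assumed, and the ChatML prompt template reflects the training format
# described later in this card; adjust both as needed.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "phanerozoic/MistralPirate-7b-v2"  # assumed repo ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = (
    "<|im_start|>user\n"
    "Tell me how ye would rig a square sail in a gale, matey.<|im_end|>\n"
    "<|im_start|>assistant\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```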
Downstream Use
The model can be adapted for various downstream tasks that require a blend of creative language generation and domain-specific knowledge, such as in thematic content creation or language learning tools.
Out-of-Scope Use
MistralPirate-7b-v2 is not designed for general-purpose language modeling or contexts outside of its pirate-themed training. Using it in non-pirate or more formal applications may result in suboptimal performance.
Bias, Risks, and Limitations
MistralPirate-7b-v2, while exhibiting improved coherence and factual accuracy, is still limited by its training data and may inherit biases present in that data. It is best used in contexts where pirate-themed language is appropriate and not for serious or sensitive communication.
Recommendations
Users should be aware of the model's thematic focus and limitations. It is recommended to use MistralPirate-7b-v2 in appropriate thematic contexts and avoid relying on it for accurate information outside its pirate dialect expertise.
Custom Stopping Strings Usage
To enhance the output quality and coherence, MistralPirate-7b-v2 is configured to recognize certain custom stopping strings. These strings are:
- "},"
- "User:"
- "You:"
- "\nUser"
- "\nUser:"
These stopping strings guide the model in determining where a response or a conversational segment ends. They are particularly useful in dialogue settings, helping the model's outputs remain clear and in context.
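One way to enforce these stopping strings during generation with Hugging Face transformers is a custom StoppingCriteria, sketched below; this is illustrative rather than the card author's exact setup, and many inference front ends accept custom stopping strings directly.

```python
# A minimal sketch of applying the custom stopping strings during generation.
# Illustrative only; adapt it to your own inference stack.
from transformers import StoppingCriteria, StoppingCriteriaList

STOP_STRINGS = ["},", "User:", "You:", "\nUser", "\nUser:"]

class StopOnStrings(StoppingCriteria):
    """Stop generation once any stop string appears in the newly generated text."""

    def __init__(self, tokenizer, stop_strings, prompt_len):
        self.tokenizer = tokenizer
        self.stop_strings = stop_strings
        self.prompt_len = prompt_len  # number of prompt tokens to skip when decoding

    def __call__(self, input_ids, scores, **kwargs):
        generated = self.tokenizer.decode(input_ids[0][self.prompt_len:])
        return any(s in generated for s in self.stop_strings)

# Usage, with `tokenizer`, `model`, and `inputs` prepared as in the generation
# sketch above:
# criteria = StoppingCriteriaList(
#     [StopOnStrings(tokenizer, STOP_STRINGS, inputs["input_ids"].shape[-1])]
# )
# output = model.generate(**inputs, max_new_tokens=200, stopping_criteria=criteria)
```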
Training Data
The model was trained on a dataset that is ten times larger than its predecessor's, composed of pirate-themed content formatted in ChatML.
Preprocessing
Unlike the v1 data, the training data was preprocessed into ChatML format to provide structured, conversational training input.
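For illustration, a ChatML-formatted training sample has roughly the following shape; the system prompt and dialogue shown here are invented for this example and are not drawn from the actual dataset.

```
<|im_start|>system
Ye be a salty pirate assistant. Answer every question in pirate dialect.<|im_end|>
<|im_start|>user
How do I read a nautical chart?<|im_end|>
<|im_start|>assistant
Arr, start with the compass rose to find true north, matey...<|im_end|>
```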
Training Hyperparameters
- Training Regime: FP32
- Warmup Steps: 1
- Per Device Train Batch Size: 1
- Gradient Accumulation Steps: 64
- Max Steps: 1000
- Learning Rate: 0.0002
- Logging Steps: 1
- Save Steps: 1
- LoRA Alpha: 32
- LoRA Rank (Dimension Count): 16
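As a rough illustration, the values above might map onto a PEFT + transformers configuration as sketched below; only the numeric values come from this card, and everything not listed (dataset handling, target modules, trainer wiring) is an assumption.

```python
# A rough mapping of the listed hyperparameters onto a PEFT + transformers
# setup. Only the numeric values come from this card; the rest is assumed.
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=16,            # "Dimension Count": LoRA rank
    lora_alpha=32,   # LoRA alpha
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="mistralpirate-7b-v2",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=64,
    max_steps=1000,
    warmup_steps=1,
    learning_rate=2e-4,
    logging_steps=1,
    save_steps=1,
    fp16=False,      # FP32 training regime
    bf16=False,
)
```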
Speeds, Sizes, Times
- Training was completed in approximately 10 minutes on an RTX 6000 Ada GPU.
Testing Data
The model was evaluated on the Wikitext dataset, achieving a perplexity of 5.65.
Factors
Evaluation focused on language coherence and adherence to the pirate dialect.
Metrics
Perplexity was used as the primary metric to assess the model's language modeling performance.
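A standard strided perplexity evaluation on WikiText can be sketched as follows; the exact split, context length, and stride behind the reported 5.65 are not specified in this card, so treat the sketch as illustrative.

```python
# An illustrative strided perplexity evaluation on WikiText-2. The exact
# protocol behind the reported 5.65 is not specified here, so results from
# this sketch may differ.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "phanerozoic/MistralPirate-7b-v2"  # assumed repo ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

test = load_dataset("wikitext", "wikitext-2-raw-v1", split="test")
encodings = tokenizer("\n\n".join(test["text"]), return_tensors="pt")

max_length, stride = 2048, 512
nlls, prev_end = [], 0
for begin in range(0, encodings.input_ids.size(1), stride):
    end = min(begin + max_length, encodings.input_ids.size(1))
    trg_len = end - prev_end          # score only tokens new to this window
    input_ids = encodings.input_ids[:, begin:end].to(model.device)
    target_ids = input_ids.clone()
    target_ids[:, :-trg_len] = -100   # mask the overlapping context
    with torch.no_grad():
        nlls.append(model(input_ids, labels=target_ids).loss * trg_len)
    prev_end = end
    if end == encodings.input_ids.size(1):
        break

print("perplexity:", torch.exp(torch.stack(nlls).sum() / prev_end).item())
```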
Results
The model demonstrated a significant improvement in language coherence and factual accuracy compared to its predecessor.
Performance Highlights
MistralPirate-7b-v2 shows a marked improvement in producing rigorous and sensible outputs while maintaining a pirate tone. Unlike its predecessor, this version consistently maintains coherence in its language generation and avoids nonsensical responses. A notable result is its perplexity of roughly 5.65 on the Wikitext dataset, demonstrating its enhanced language modeling capability.
Summary
MistralPirate-7b-v2 marks a notable advancement in domain-specific language modeling, particularly in generating pirate-themed content.
Model Architecture and Objective
MistralPirate-7b-v2 is built on OpenHermes 2.5, itself a fine-tune of the Mistral-7B architecture, and is further fine-tuned to generate pirate-themed content with high coherence and factual accuracy.
Compute Infrastructure
The model was trained on an RTX 6000 Ada GPU, with a focus on rapid and efficient training.
Hardware
- Type: RTX 6000 Ada
- Utilization: approximately 10 minutes for the complete training run
Acknowledgments
We extend our deepest gratitude to the teams behind the Mistral and OpenHermes 2.5 models. Their groundbreaking work in language modeling provided the foundation upon which MistralPirate-7b-v2 was developed. Special thanks to the OpenHermes team for their contributions and support in advancing the capabilities of domain-specific language models.