---
library_name: transformers
license: apache-2.0
datasets:
- HuggingFaceH4/ultrachat_200k
language:
- en
base_model:
- Felladrin/Minueza-2-96M
tags:
- llama-factory
---

# Minueza-2-96M-Instruct (Variant 10)

This model is a fine-tuned version of [Felladrin/Minueza-2-96M](https://huggingface.co/Felladrin/Minueza-2-96M) on the English [HuggingFaceH4/ultrachat_200k](https://huggingface.co/datasets/HuggingFaceH4/ultrachat_200k) dataset.

## Usage

```sh
pip install transformers==4.51.1 torch==2.6.0
```

```python
from transformers import pipeline, TextStreamer
import torch

generate_text = pipeline(
    "text-generation",
    model="Felladrin/Minueza-2-96M-Instruct-Variant-10",
    device=torch.device("cuda" if torch.cuda.is_available() else "cpu"),
)

messages = [
    {
        "role": "system",
        "content": "You are a career counselor. The user will provide you with an individual looking for guidance in their professional life, and your task is to assist them in determining what careers they are most suited for based on their skills, interests, and experience. You should also conduct research into the various options available, explain the job market trends in different industries, and advise on which qualifications would be beneficial for pursuing particular fields.",
    },
    {
        "role": "user",
        "content": "Hi!",
    },
    {
        "role": "assistant",
        "content": "Hello! How can I help you?",
    },
    {
        "role": "user",
        "content": "I am interested in developing a career in software engineering. Do you have any suggestions?",
    },
]

generate_text(
    generate_text.tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    ),
    streamer=TextStreamer(generate_text.tokenizer, skip_special_tokens=True),
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    top_k=0,
    min_p=0.1,
    repetition_penalty=1.17,
)
```

## Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 5.8e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 32
- total_train_batch_size: 128
- optimizer: adamw_torch with betas=(0.9, 0.95), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 2

## Framework versions

- Transformers 4.51.1
- Pytorch 2.6.0+cu124
- Datasets 3.5.0
- Tokenizers 0.21.0

## License

This model is licensed under the Apache License 2.0.
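
## Reproducing the training configuration (sketch)

The fine-tune was run with LLaMA Factory, so the exact recipe lives in that tool's configuration. Purely as an illustration of how the hyperparameters listed above fit together, here is a minimal sketch that maps them onto `transformers.TrainingArguments`; it is not the configuration actually used, and the `output_dir` value is a hypothetical placeholder.

```python
from transformers import TrainingArguments

# Sketch only: mirrors the hyperparameters listed in this card,
# not the original LLaMA Factory configuration.
training_args = TrainingArguments(
    output_dir="minueza-2-96m-instruct-variant-10",  # hypothetical path
    learning_rate=5.8e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=32,  # 4 per device x 32 steps = 128 total batch size
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.95,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=2,
)
```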