Update README.md
README.md CHANGED
@@ -20,13 +20,13 @@ base_model:
 
 This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
 
-- **Developed by:**
+- **Developed by:** Arash Nicoomanesh
 - **Funded by [optional]:** [More Information Needed]
 - **Shared by [optional]:** [More Information Needed]
 - **Model type:** [More Information Needed]
 - **Language(s) (NLP):** [More Information Needed]
 - **License:** [More Information Needed]
-- **Finetuned from model [optional]:**
+- **Finetuned from model [optional]:** google/gemma-2b-it
 
 ### Model Sources [optional]
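The card's "How to Get Started" section (quoted in the next hunk's context line, "Use the code below to get started with the model") is not shown in this diff. As a purely hypothetical illustration, loading a PEFT/LoRA fine-tune of the listed base model for inference might look like the sketch below; the adapter repo id and the prompt are placeholders, not values from the card.

```python
# Hypothetical getting-started sketch; the adapter id below is a placeholder, not the card's model id.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "google/gemma-2b-it"                   # base model listed in the card metadata
adapter_id = "your-username/your-gemma-adapter"  # placeholder for the fine-tuned LoRA adapter

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(model, adapter_id)  # attach the fine-tuned adapter

prompt = "I have had a persistent cough for two weeks. What could be the cause?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```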
@@ -86,17 +86,65 @@ Use the code below to get started with the model.
 
 ### Training Procedure
 
+model = Gemma2ForCausalLM.from_pretrained(  # Changed here
+    base_model,
+    quantization_config=bnb_config,
+    device_map="auto",
+    attn_implementation=attn_implementation
+)
+tokenizer = GemmaTokenizerFast.from_pretrained(base_model, padding_side="right",
+                                               truncation_side="right", trust_remote_code=True)
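The loading snippet above references `base_model`, `bnb_config`, and `attn_implementation`, and the later cells use `load_dataset`, `TrainingArguments`, and `SFTTrainer`, none of which are defined in this diff. Below is a sketch of plausible definitions for a 4-bit QLoRA-style setup; every name and value in it is an assumption rather than something stated in the card.

```python
# Assumed imports and helper objects for the snippets in this card (not shown in the diff).
import re
import torch
from datasets import load_dataset
from transformers import (
    BitsAndBytesConfig,
    Gemma2ForCausalLM,
    GemmaTokenizerFast,
    TrainingArguments,
)
from trl import SFTTrainer

# Assumption: the Gemma 2 2B instruct checkpoint. The card metadata lists google/gemma-2b-it,
# while the Gemma2ForCausalLM class and the W&B project name point to Gemma 2.
base_model = "google/gemma-2-2b-it"
dataset_name = "your-username/your-medical-dataset"  # placeholder; the diff does not name the dataset

# A common 4-bit NF4 quantization config; the actual values are not shown in the card.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

# Gemma 2 is usually run with eager attention when flash-attention is unavailable.
attn_implementation = "eager"
```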
 
 #### Preprocessing [optional]
 
+dataset = load_dataset(dataset_name, split="all", cache_dir="./cache")
+dataset = dataset.shuffle(seed=42).select(range(3000))  # Use 3k samples for a better demo
+
+# Define a cleaning function to remove unwanted artifacts
+def clean_text(text):
+    # Remove URLs and any "Chat Doctor" or similar phrases
+    text = re.sub(r'\b(?:www\.[^\s]+|http\S+)', '', text)  # Remove URLs
+    text = re.sub(r'\b(?:Chat Doctor(?:\.com)?(?:\.in)?|www\.(?:google|yahoo)\S*)', '', text)  # Remove site names
+    text = re.sub(r'\s+', ' ', text)  # Collapse multiple spaces
+    return text.strip()
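`clean_text` is defined above but the diff does not show it being applied, and the trainer below reads `dataset["train"]` and `dataset["test"]`. A minimal sketch of the likely intermediate steps, assuming the text column is called `text` (to match `dataset_text_field="text"` below) and an arbitrary 90/10 split:

```python
# Assumed follow-up steps, not shown in the diff: clean the text column and create splits.
dataset = dataset.map(lambda row: {"text": clean_text(row["text"])})
dataset = dataset.train_test_split(test_size=0.1, seed=42)  # yields dataset["train"] / dataset["test"]
```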
 
 #### Training Hyperparameters
 
+training_args = TrainingArguments(
+    output_dir=new_model,
+    per_device_train_batch_size=1,
+    per_device_eval_batch_size=1,
+    gradient_accumulation_steps=2,
+    optim="paged_adamw_32bit",
+    num_train_epochs=1,
+    eval_strategy="steps",
+    eval_steps=200,
+    save_steps=500,  # Keep save_steps as 500
+    logging_steps=1,
+    warmup_steps=10,
+    logging_strategy="steps",
+    learning_rate=2e-4,
+    fp16=True,
+    bf16=False,
+    group_by_length=True,
+    report_to="wandb",
+    load_best_model_at_end=False  # Disable loading the best model at the end
+)
+
+# Trainer (note: no early-stopping callback is attached; load_best_model_at_end is disabled)
+trainer = SFTTrainer(
+    model=model,
+    train_dataset=dataset["train"],
+    eval_dataset=dataset["test"],
+    peft_config=peft_config,
+    max_seq_length=512,
+    dataset_text_field="text",  # Specify the text field in your dataset
+    tokenizer=tokenizer,
+    args=training_args,
+    packing=False,
+)
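`SFTTrainer` is handed a `peft_config`, and `TrainingArguments` writes to `new_model`, but neither appears in this hunk. The sketch below shows one plausible LoRA configuration plus the calls that launch and save the run; the rank, alpha, target modules, and output name are assumptions, and in the original notebook these definitions would precede the `TrainingArguments` cell above.

```python
from peft import LoraConfig

new_model = "gemma-2-2b-it-medical"  # placeholder output directory / repo name

# Assumed LoRA settings for a QLoRA-style fine-tune; the card does not state the real values.
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

# Run fine-tuning and save the resulting LoRA adapter.
trainer.train()
trainer.save_model(new_model)
```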
 
 #### Speeds, Sizes, Times [optional]
 
 <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
@@ -105,7 +153,8 @@ Use the code below to get started with the model.
 
 ## Evaluation
 
+View run noble-hill-29 at: https://wandb.ai/anicomanesh/Fine-tune%20Gemma-2-2b-it%20on%20Medical%20Dataset/runs/06xd9vvz
+wandb: ⭐️ View project at: https://wandb.ai/anicomanesh/Fine-tune%20Gemma-2-2b-it%20on%20Medical%20Dat
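`report_to="wandb"` in the arguments above and the run links here imply that a Weights & Biases run was initialized before training. A minimal sketch, assuming the project name shown in the URL:

```python
import wandb

# Project name taken from the run URL above; the run name (e.g. "noble-hill-29") is auto-generated.
wandb.init(project="Fine-tune Gemma-2-2b-it on Medical Dataset")
```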
 
 ### Testing Data, Factors & Metrics