tasal9 committed
Commit d5f3038 · 1 Parent(s): 1e0b125

Update Space with advanced template including Load Model button and enhanced features

Files changed (3)
  1. README.md +20 -8
  2. app.py +462 -73
  3. requirements.txt +8 -5
README.md CHANGED
@@ -1,5 +1,5 @@
 ---
-title: pashto-base-bloom Training Space
+title: pashto-base-bloom Advanced Training Space
 emoji: 🚀
 colorFrom: blue
 colorTo: purple
@@ -8,15 +8,27 @@ sdk_version: 4.36.1
 app_file: app.py
 pinned: false
 license: apache-2.0
-hardware: zero-a10g
+hardware: zero-gpu-a10g
 ---
 
-# pashto-base-bloom Training Space
+# pashto-base-bloom Advanced Training Space
 
-This space provides three main functionalities for the pashto-base-bloom model:
+This space provides enhanced functionality for working with the pashto-base-bloom model:
 
-1. **Train**: Train the model from scratch
-2. **Fine-tune**: Fine-tune the existing model
-3. **Test**: Test the model with sample inputs
+## ✨ New Features
+
+1. **Load Model Button**: Explicitly load the model before using other features
+2. **Advanced Generation Settings**: Control temperature, top-p, and repetition penalty
+3. **Model Evaluation**: Measure model performance on test data
+4. **Enhanced Training**: Better progress tracking and parameter tuning
+5. **Model Information**: View details about the model architecture and parameters
+6. **Recommendations**: Get suggestions for next steps after each operation
+
+## 🔧 Capabilities
+
+- **Test**: Generate text with customizable parameters
+- **Train**: Train or fine-tune the model with your data
+- **Evaluate**: Measure model performance quantitatively
+- **Upload**: Save your trained models to Hugging Face Hub
 
-The space uses ZeroGPU for efficient GPU computation.
+Powered by ZeroGPU for efficient GPU acceleration.
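The Space wraps all of this in a UI, but the underlying checkpoint can also be queried directly. A minimal sketch with `transformers`, assuming `tasal9/pashto-base-bloom` is publicly downloadable and mirroring the Space's default sampling settings:

```python
# Minimal sketch: query the model directly, outside the Space.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("tasal9/pashto-base-bloom")
model = AutoModelForCausalLM.from_pretrained(
    "tasal9/pashto-base-bloom", torch_dtype=torch.float16, device_map="auto"
)

prompt = "سلام"  # any Pashto prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=100,       # cap on newly generated tokens
        temperature=0.7,          # the Space's default settings
        top_p=0.9,
        repetition_penalty=1.2,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
    )
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```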
app.py CHANGED
@@ -1,145 +1,534 @@
 import gradio as gr
 import spaces
 import torch
-from transformers import AutoTokenizer, AutoModelForCausalLM
 import os
 
-# Model configuration
 MODEL_NAME = "tasal9/pashto-base-bloom"
 
 @spaces.GPU
 def load_model():
-    """Load the model and tokenizer"""
     try:
-        tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
-        model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, torch_dtype=torch.float16)
-        if tokenizer.pad_token is None:
-            tokenizer.pad_token = tokenizer.eos_token
-        return model, tokenizer
     except Exception as e:
-        return None, None
 
 @spaces.GPU
-def test_model(input_text, max_length=100, temperature=0.7):
-    """Test the model with given input"""
-    if not input_text.strip():
-        return "Please enter some text to test the model."
-
-    model, tokenizer = load_model()
 
-    if model is None or tokenizer is None:
-        return "❌ Failed to load model. Please check if the model exists on Hugging Face Hub."
 
     try:
-        inputs = tokenizer.encode(input_text, return_tensors="pt")
 
         with torch.no_grad():
-            outputs = model.generate(
-                inputs,
-                max_length=len(inputs[0]) + max_length,
                 temperature=temperature,
                 do_sample=True,
-                pad_token_id=tokenizer.eos_token_id
             )
 
-        response = tokenizer.decode(outputs[0], skip_special_tokens=True)
-        return response[len(input_text):].strip()
 
     except Exception as e:
         return f"❌ Error during generation: {str(e)}"
 
-def train_model(dataset_text, epochs=1, learning_rate=2e-5):
-    """Train the model (placeholder implementation)"""
-    return f"🚀 Training started with {epochs} epochs and learning rate {learning_rate}\n\nNote: This is a placeholder. Actual training requires dataset preparation and more computational resources."
 
-def finetune_model(dataset_text, epochs=1, learning_rate=5e-5):
-    """Fine-tune the model (placeholder implementation)"""
-    return f"🔧 Fine-tuning started with {epochs} epochs and learning rate {learning_rate}\n\nNote: This is a placeholder. Actual fine-tuning requires dataset preparation and more computational resources."
 
 # Create Gradio interface
-with gr.Blocks(title="pashto-base-bloom Training Space", theme=gr.themes.Soft()) as iface:
-    gr.Markdown(f"# pashto-base-bloom Training Space")
-    gr.Markdown("Choose your operation: Train, Fine-tune, or Test the model")
 
     with gr.Tabs():
         # Test Tab
         with gr.TabItem("🧪 Test Model"):
-            gr.Markdown("### Test the model with your input")
             with gr.Row():
                 with gr.Column():
                     test_input = gr.Textbox(
-                        label="Input Text",
                         placeholder="Enter text to test the model...",
                         lines=3
                     )
-                    max_length_slider = gr.Slider(
-                        minimum=10,
-                        maximum=500,
-                        value=100,
-                        label="Max Length"
-                    )
-                    temperature_slider = gr.Slider(
-                        minimum=0.1,
-                        maximum=2.0,
-                        value=0.7,
-                        label="Temperature"
-                    )
                     test_btn = gr.Button("🚀 Generate", variant="primary")
 
                 with gr.Column():
                     test_output = gr.Textbox(
-                        label="Model Output",
-                        lines=5,
                         interactive=False
                     )
 
             test_btn.click(
-                fn=test_model,
-                inputs=[test_input, max_length_slider, temperature_slider],
                 outputs=test_output
             )
 
         # Train Tab
         with gr.TabItem("🏋️ Train Model"):
-            gr.Markdown("### Train the model from scratch")
             train_dataset = gr.Textbox(
                 label="Training Dataset",
-                placeholder="Upload or paste your training data...",
-                lines=5
             )
             with gr.Row():
-                train_epochs = gr.Number(label="Epochs", value=1, minimum=1)
-                train_lr = gr.Number(label="Learning Rate", value=2e-5, minimum=1e-6)
 
             train_btn = gr.Button("🚀 Start Training", variant="primary")
-            train_output = gr.Textbox(label="Training Output", lines=5, interactive=False)
 
             train_btn.click(
                 fn=train_model,
-                inputs=[train_dataset, train_epochs, train_lr],
                 outputs=train_output
             )
 
-        # Fine-tune Tab
-        with gr.TabItem("🔧 Fine-tune Model"):
-            gr.Markdown("### Fine-tune the existing model")
-            finetune_dataset = gr.Textbox(
-                label="Fine-tuning Dataset",
-                placeholder="Upload or paste your fine-tuning data...",
-                lines=5
             )
             with gr.Row():
-                finetune_epochs = gr.Number(label="Epochs", value=1, minimum=1)
-                finetune_lr = gr.Number(label="Learning Rate", value=5e-5, minimum=1e-6)
 
-            finetune_btn = gr.Button("🔧 Start Fine-tuning", variant="primary")
-            finetune_output = gr.Textbox(label="Fine-tuning Output", lines=5, interactive=False)
 
-            finetune_btn.click(
-                fn=finetune_model,
-                inputs=[finetune_dataset, finetune_epochs, finetune_lr],
-                outputs=finetune_output
             )
 
 if __name__ == "__main__":
     iface.launch()
+#!/usr/bin/env python3
+"""
+Enhanced Space Template with Load Model Button and Advanced Features
+"""
+
 import gradio as gr
 import spaces
 import torch
 import os
+import time
+from transformers import (
+    AutoTokenizer,
+    AutoModelForCausalLM,
+    DataCollatorForLanguageModeling,
+    TrainingArguments,
+    Trainer,
+)
+from datasets import Dataset
+from huggingface_hub import HfApi
+
+# Global variables to store the model and tokenizer between calls
+MODEL = None
+TOKENIZER = None
+MODEL_LOADED = False
+MODEL_LOADING_TIME = None
 
+# Model configuration - replace with your model
 MODEL_NAME = "tasal9/pashto-base-bloom"
+MODEL_TYPE = "causal_lm"  # "causal_lm", "seq2seq", "text_classification", etc.
 
 @spaces.GPU
 def load_model():
+    """Load the model and tokenizer, streaming progress to the UI."""
+    global MODEL, TOKENIZER, MODEL_LOADED, MODEL_LOADING_TIME
+
+    if MODEL_LOADED and MODEL is not None and TOKENIZER is not None:
+        # This function is a generator, so early exits must yield, not return a value
+        yield "✅ Model already loaded and ready to use!"
+        return
+
+    start_time = time.time()
+    progress_updates = []
+
     try:
+        progress_updates.append("🔍 Starting model loading process...")
+        yield "\n".join(progress_updates)
+
+        progress_updates.append("⏳ Loading tokenizer...")
+        yield "\n".join(progress_updates)
+
+        # Load tokenizer
+        TOKENIZER = AutoTokenizer.from_pretrained(MODEL_NAME)
+        if TOKENIZER.pad_token is None and TOKENIZER.eos_token is not None:
+            TOKENIZER.pad_token = TOKENIZER.eos_token
+
+        progress_updates.append("✅ Tokenizer loaded successfully")
+        yield "\n".join(progress_updates)
+
+        progress_updates.append(f"⏳ Loading model {MODEL_NAME} to GPU (this may take a while)...")
+        yield "\n".join(progress_updates)
+
+        # Load model with appropriate settings based on type
+        if MODEL_TYPE == "causal_lm":
+            MODEL = AutoModelForCausalLM.from_pretrained(
+                MODEL_NAME,
+                torch_dtype=torch.float16,
+                device_map="auto"
+            )
+        else:
+            # Default to a causal language model if the type is not recognized
+            MODEL = AutoModelForCausalLM.from_pretrained(
+                MODEL_NAME,
+                torch_dtype=torch.float16,
+                device_map="auto"
+            )
+
+        MODEL_LOADED = True
+        MODEL_LOADING_TIME = time.time() - start_time
+
+        progress_updates.append(f"✅ Model loaded successfully in {MODEL_LOADING_TIME:.2f} seconds")
+        progress_updates.append("🚀 Model is ready to use! You can now use the features below.")
+        progress_updates.append("💡 RECOMMENDATION: Start by testing the model with a simple prompt to ensure it's working properly.")
+
+        yield "\n".join(progress_updates)
+
     except Exception as e:
+        MODEL_LOADED = False
+        progress_updates.append(f"❌ Failed to load model: {str(e)}")
+        yield "\n".join(progress_updates)
+
+def check_model_loaded():
+    """Check whether the model is loaded; return (status, message)."""
+    if not MODEL_LOADED or MODEL is None or TOKENIZER is None:
+        return False, "❌ Please load the model first using the 'Load Model' button at the top of the page."
+    return True, "Model loaded and ready"
 
 @spaces.GPU
+def generate_text(input_text, max_length=100, temperature=0.7, top_p=0.9, repetition_penalty=1.2):
+    """Generate text from the model"""
+    # Check if model is loaded
+    is_loaded, message = check_model_loaded()
+    if not is_loaded:
+        return message
 
+    if not input_text.strip():
+        return "Please enter a prompt to generate text."
 
     try:
+        inputs = TOKENIZER(input_text, return_tensors="pt").to(MODEL.device)
 
+        # Generate text with the specified sampling parameters
        with torch.no_grad():
+            outputs = MODEL.generate(
+                **inputs,
+                max_new_tokens=max_length,  # counts generated tokens only, matching the "Max Output Length" slider
                 temperature=temperature,
+                top_p=top_p,
+                repetition_penalty=repetition_penalty,
                 do_sample=True,
+                pad_token_id=TOKENIZER.eos_token_id
             )
 
+        generated_text = TOKENIZER.decode(outputs[0], skip_special_tokens=True)
+
+        # Return just the newly generated text without the prompt
+        return generated_text[len(input_text):].strip()
 
     except Exception as e:
         return f"❌ Error during generation: {str(e)}"
 
+def prepare_training_dataset(dataset_text):
+    """Prepare a tokenized training dataset from raw text input."""
+    # Check if model is loaded (the tokenizer is needed below)
+    is_loaded, message = check_model_loaded()
+    if not is_loaded:
+        return None, message
+
+    lines = [line.strip() for line in dataset_text.split("\n") if line.strip()]
+
+    if not lines:
+        return None, "❌ Empty dataset. Please provide training examples."
+
+    try:
+        # Create a simple dataset, one example per line
+        dataset = Dataset.from_dict({"text": lines})
+
+        # Tokenize the dataset
+        def tokenize_function(examples):
+            return TOKENIZER(examples["text"], padding="max_length", truncation=True, max_length=512)
+
+        # Drop the raw text column so the collator only sees token tensors
+        tokenized_dataset = dataset.map(tokenize_function, batched=True, remove_columns=["text"])
+        return tokenized_dataset, f"✅ Dataset prepared with {len(lines)} examples"
+
+    except Exception as e:
+        return None, f"❌ Failed to prepare dataset: {str(e)}"
 
+@spaces.GPU
+def train_model(dataset_text, epochs=1, learning_rate=2e-5, batch_size=2, save_model=False):
+    """Train the model, streaming progress to the UI."""
+    # This function is a generator, so early exits must yield their message
+    is_loaded, message = check_model_loaded()
+    if not is_loaded:
+        yield message
+        return
+
+    if not dataset_text.strip():
+        yield "❌ Please provide training data."
+        return
+
+    try:
+        # Prepare dataset
+        dataset, prep_message = prepare_training_dataset(dataset_text)
+        if dataset is None:
+            yield prep_message
+            return
+
+        progress_updates = []
+        progress_updates.append("🔍 Starting training process...")
+        progress_updates.append(f"📚 {prep_message}")
+        yield "\n".join(progress_updates)
+
+        # Training arguments
+        output_dir = f"./results-{int(time.time())}"
+        training_args = TrainingArguments(
+            output_dir=output_dir,
+            num_train_epochs=epochs,
+            learning_rate=float(learning_rate),
+            per_device_train_batch_size=int(batch_size),  # gr.Number delivers floats
+            gradient_accumulation_steps=4,
+            warmup_steps=50,
+            logging_steps=10,
+            save_steps=200,
+            save_total_limit=2,
+        )
+
+        # Initialize trainer; the collator builds causal-LM labels from the input IDs
+        trainer = Trainer(
+            model=MODEL,
+            args=training_args,
+            train_dataset=dataset,
+            data_collator=DataCollatorForLanguageModeling(TOKENIZER, mlm=False),
+        )
+
+        progress_updates.append(f"🚀 Starting training for {epochs} epoch(s) with learning rate {learning_rate}...")
+        yield "\n".join(progress_updates)
+
+        # Train the model
+        train_result = trainer.train()
+
+        progress_updates.append("✅ Training complete!")
+        progress_updates.append(f"📊 Training Loss: {train_result.training_loss:.4f}")
+        progress_updates.append(f"⏱️ Training Time: {train_result.metrics['train_runtime']:.2f} seconds")
+
+        # Save model if requested
+        if save_model:
+            model_save_dir = f"./trained-model-{int(time.time())}"
+            trainer.save_model(model_save_dir)
+            TOKENIZER.save_pretrained(model_save_dir)
+
+            progress_updates.append(f"💾 Model saved to {model_save_dir}")
+            progress_updates.append("📝 To use this model, you can upload it to the Hugging Face Hub using the 'Upload Model' tab.")
+
+        progress_updates.append("\n💡 RECOMMENDATIONS AFTER TRAINING:")
+        progress_updates.append("1. Test the model with new prompts to see how it performs")
+        progress_updates.append("2. If results aren't satisfactory, try adjusting hyperparameters or training for more epochs")
+        progress_updates.append("3. Consider increasing the dataset size for better results")
+
+        yield "\n".join(progress_updates)
+
+    except Exception as e:
+        yield f"❌ Training failed: {str(e)}"
+
+@spaces.GPU
+def evaluate_model(test_data, metric_choice="perplexity"):
+    """Evaluate the model on test data"""
+    # Check if model is loaded
+    is_loaded, message = check_model_loaded()
+    if not is_loaded:
+        return message
+
+    if not test_data.strip():
+        return "❌ Please provide test data."
+
+    if metric_choice != "perplexity":
+        # Only perplexity is implemented in this template
+        return "❌ Only the perplexity metric is implemented in this template."
+
+    try:
+        # Split test data into examples
+        test_examples = [example.strip() for example in test_data.split("\n") if example.strip()]
+
+        results = []
+        total_perplexity = 0
+
+        for i, example in enumerate(test_examples):
+            inputs = TOKENIZER(example, return_tensors="pt").to(MODEL.device)
+
+            with torch.no_grad():
+                # With labels == input_ids, the model returns the mean token cross-entropy
+                outputs = MODEL(**inputs, labels=inputs["input_ids"])
+                loss = outputs.loss.item()
+                perplexity = torch.exp(torch.tensor(loss)).item()
+
+            total_perplexity += perplexity
+            results.append(f"Example {i+1} - Perplexity: {perplexity:.4f}")
+
+        avg_perplexity = total_perplexity / len(test_examples)
+
+        final_result = "\n".join(results)
+        final_result += f"\n\n📊 Average Perplexity: {avg_perplexity:.4f}"
+
+        # Add recommendations
+        final_result += "\n\n💡 RECOMMENDATIONS AFTER EVALUATION:"
+        final_result += "\n1. Lower perplexity indicates better model performance"
+        final_result += "\n2. If perplexity is high, consider additional training or fine-tuning"
+        final_result += "\n3. Try comparing results across different model versions"
+
+        return final_result
+
+    except Exception as e:
+        return f"❌ Evaluation failed: {str(e)}"
+
+def upload_model_to_hub(model_dir, repo_name, token):
+    """Upload a trained model to the Hugging Face Hub."""
+    if not os.path.exists(model_dir):
+        return "❌ Model directory not found. Please train a model first."
+
+    if not repo_name.strip():
+        return "❌ Please provide a repository name."
+
+    if not token.strip():
+        return "❌ Please provide your Hugging Face token."
+
+    try:
+        api = HfApi()
+
+        # Create the repo if it doesn't exist
+        try:
+            api.create_repo(repo_id=repo_name, token=token, exist_ok=True)
+        except Exception as e:
+            return f"❌ Failed to create repository: {str(e)}"
+
+        # Upload the model files
+        api.upload_folder(
+            folder_path=model_dir,
+            repo_id=repo_name,
+            token=token,
+            commit_message="Upload trained model from Spaces"
+        )
+
+        response = f"✅ Model successfully uploaded to {repo_name}!"
+        response += "\n\n💡 RECOMMENDATIONS AFTER UPLOADING:"
+        response += "\n1. You can now use this model in other applications by referencing its name"
+        response += f"\n2. Try using it: `from transformers import AutoModel; model = AutoModel.from_pretrained('{repo_name}')`"
+        response += "\n3. Share the model with others who might find it useful"
+
+        return response
+
+    except Exception as e:
+        return f"❌ Upload failed: {str(e)}"
+
+def model_info():
+    """Display information about the loaded model"""
+    if not MODEL_LOADED or MODEL is None:
+        return "❌ Model not loaded. Please load the model first."
+
+    info = "# Model Information\n\n"
+    info += f"- **Model Name**: {MODEL_NAME}\n"
+    info += f"- **Model Type**: {MODEL_TYPE}\n"
+    info += f"- **Loading Time**: {MODEL_LOADING_TIME:.2f} seconds\n\n"
+
+    # Get model parameters
+    total_params = sum(p.numel() for p in MODEL.parameters())
+    trainable_params = sum(p.numel() for p in MODEL.parameters() if p.requires_grad)
+
+    info += f"- **Total Parameters**: {total_params:,}\n"
+    info += f"- **Trainable Parameters**: {trainable_params:,}\n"
+    info += f"- **Model Device**: {next(MODEL.parameters()).device}\n\n"
+
+    # Get tokenizer info
+    vocab_size = len(TOKENIZER)
+    info += f"- **Tokenizer Vocabulary Size**: {vocab_size:,}\n"
+    info += f"- **Padding Token**: `{TOKENIZER.pad_token}`\n"
+    info += f"- **EOS Token**: `{TOKENIZER.eos_token}`\n\n"
+
+    info += "## Model Usage Recommendations\n\n"
+    info += "1. **Testing**: Start with simple prompts to test the model's capabilities\n"
+    info += "2. **Training**: Use domain-specific data for best results\n"
+    info += "3. **Evaluation**: Regularly evaluate to track improvement\n"
+    info += "4. **Parameters**: Experiment with temperature (0.7-1.0) for creative tasks, lower (0.2-0.5) for factual responses\n"
+
+    return info
 
 # Create Gradio interface
+with gr.Blocks(title=f"{MODEL_NAME} Advanced Space", theme=gr.themes.Soft()) as iface:
+    gr.Markdown(f"# {MODEL_NAME} Advanced Training Space")
+    gr.Markdown("This space provides advanced functionality for training, testing, and using language models with ZeroGPU acceleration.")
+
+    # Load model section - must be done first
+    with gr.Group():  # gr.Box was removed in Gradio 4; gr.Group is its replacement
+        gr.Markdown("### 🚀 Step 1: Load Model (Required)")
+        with gr.Row():
+            with gr.Column():
+                load_btn = gr.Button("📥 Load Model", variant="primary", size="lg")
+                gr.Markdown("⚠️ You must load the model before using any features below")
+            with gr.Column():
+                model_loading_output = gr.Markdown("Model not loaded. Click the button to load.")
+
+    # Connect the load button
+    load_btn.click(fn=load_model, outputs=model_loading_output)
+
+    # Model Info Tab
+    with gr.Accordion("ℹ️ Model Information", open=False):
+        model_info_output = gr.Markdown("Load the model to see information")
+        model_info_btn = gr.Button("📊 Show Model Information")
+        model_info_btn.click(fn=model_info, outputs=model_info_output)
 
+    # Main functionality tabs
     with gr.Tabs():
         # Test Tab
         with gr.TabItem("🧪 Test Model"):
+            gr.Markdown("### Generate text with the model")
             with gr.Row():
                 with gr.Column():
                     test_input = gr.Textbox(
+                        label="Input Prompt",
                         placeholder="Enter text to test the model...",
                         lines=3
                     )
+                    with gr.Row():
+                        max_length_slider = gr.Slider(
+                            minimum=10,
+                            maximum=1000,
+                            value=100,
+                            step=10,
+                            label="Max Output Length"
+                        )
+                        temperature_slider = gr.Slider(
+                            minimum=0.1,
+                            maximum=2.0,
+                            value=0.7,
+                            label="Temperature"
+                        )
+                    with gr.Row():
+                        top_p_slider = gr.Slider(
+                            minimum=0.1,
+                            maximum=1.0,
+                            value=0.9,
+                            step=0.05,
+                            label="Top-p (nucleus sampling)"
+                        )
+                        repetition_penalty_slider = gr.Slider(
+                            minimum=1.0,
+                            maximum=2.0,
+                            value=1.2,
+                            step=0.05,
+                            label="Repetition Penalty"
+                        )
                     test_btn = gr.Button("🚀 Generate", variant="primary")
 
                 with gr.Column():
                     test_output = gr.Textbox(
+                        label="Generated Output",
+                        lines=8,
                         interactive=False
                     )
+                    gr.Markdown("""
+                    ### Parameter Guide
+                    - **Temperature**: Higher values (>1) make output more random; lower values (<1) make it more focused and deterministic
+                    - **Top-p**: Limits sampling to the smallest set of tokens whose probabilities sum to p, controlling diversity
+                    - **Repetition Penalty**: Penalizes repetition of words/phrases (higher values reduce repetition)
+                    """)
 
             test_btn.click(
+                fn=generate_text,
+                inputs=[test_input, max_length_slider, temperature_slider, top_p_slider, repetition_penalty_slider],
                 outputs=test_output
             )
 
         # Train Tab
         with gr.TabItem("🏋️ Train Model"):
+            gr.Markdown("### Train or fine-tune the model on your data")
             train_dataset = gr.Textbox(
                 label="Training Dataset",
+                placeholder="Enter training examples, one per line...",
+                lines=8
             )
             with gr.Row():
+                train_epochs = gr.Number(label="Epochs", value=1, minimum=1, maximum=10)
+                train_lr = gr.Number(label="Learning Rate", value=2e-5, minimum=1e-6, maximum=1e-3)
+                train_batch = gr.Number(label="Batch Size", value=2, minimum=1, maximum=8)
 
+            train_save_model = gr.Checkbox(label="Save trained model locally", value=True)
             train_btn = gr.Button("🚀 Start Training", variant="primary")
+            train_output = gr.Textbox(label="Training Progress", lines=10, interactive=False)
 
             train_btn.click(
                 fn=train_model,
+                inputs=[train_dataset, train_epochs, train_lr, train_batch, train_save_model],
                 outputs=train_output
            )
 
+        # Evaluate Tab
+        with gr.TabItem("📊 Evaluate Model"):
+            gr.Markdown("### Evaluate model performance on test data")
+            eval_dataset = gr.Textbox(
+                label="Test Dataset",
+                placeholder="Enter test examples, one per line...",
+                lines=8
+            )
+
+            with gr.Row():
+                metric_choice = gr.Radio(
+                    ["perplexity", "accuracy"],
+                    label="Evaluation Metric",
+                    value="perplexity"
+                )
+
+            eval_btn = gr.Button("📊 Evaluate", variant="primary")
+            eval_output = gr.Textbox(label="Evaluation Results", lines=8, interactive=False)
+
+            eval_btn.click(
+                fn=evaluate_model,
+                inputs=[eval_dataset, metric_choice],
+                outputs=eval_output
            )
+
+        # Upload Tab
+        with gr.TabItem("📤 Upload Model"):
+            gr.Markdown("### Upload trained models to the Hugging Face Hub")
             with gr.Row():
+                model_dir_input = gr.Textbox(
+                    label="Model Directory",
+                    placeholder="./trained-model-1234567890",
+                    lines=1
+                )
+                repo_name_input = gr.Textbox(
+                    label="Repository Name",
+                    placeholder="username/model-name",
+                    lines=1
+                )
+
+            hf_token_input = gr.Textbox(
+                label="HuggingFace Token",
+                placeholder="hf_...",
+                type="password",
+                lines=1
+            )
 
+            upload_btn = gr.Button("📤 Upload to Hub", variant="primary")
+            upload_output = gr.Textbox(label="Upload Status", lines=5, interactive=False)
 
+            upload_btn.click(
+                fn=upload_model_to_hub,
+                inputs=[model_dir_input, repo_name_input, hf_token_input],
+                outputs=upload_output
            )
+
+    # Footer with recommendations
+    gr.Markdown("""
+    ## 💡 Recommendations for Working with this Model
+
+    ### After Loading the Model:
+    1. **Start by testing**: Use the Test tab with simple prompts to understand the model's capabilities
+    2. **Evaluate baseline performance**: Run an evaluation on sample data before any training
+
+    ### For Training:
+    1. **Start small**: Begin with a small dataset and 1-2 epochs to test the training process
+    2. **Use domain-specific data**: For best results, use data from your target domain
+    3. **Monitor training loss**: If loss isn't decreasing, try adjusting the learning rate
+
+    ### For Evaluation:
+    1. **Use diverse test examples**: Include both simple and complex examples in your test set
+    2. **Compare before/after**: Evaluate before and after training to measure improvement
+
+    ### For Model Upload:
+    1. **Use descriptive repo names**: Include model type and purpose in the repository name
+    2. **Document your changes**: Add a good description when uploading your model
+
+    ### General Tips:
+    1. **Save checkpoints**: Always save your model after significant training
+    2. **Track experiments**: Keep notes on hyperparameters and results
+    3. **Start simple**: Master basic usage before attempting complex tasks
+    """)
 
 if __name__ == "__main__":
     iface.launch()
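A note on the progress tracking above: `load_model` and `train_model` are generator functions, and Gradio streams each `yield` from a click handler straight into the bound output component. A minimal self-contained sketch of that pattern (the `slow_task` function is hypothetical, standing in for model loading or training):

```python
import time
import gradio as gr

def slow_task():
    steps = []
    for i in range(3):
        time.sleep(1)                 # stand-in for real work
        steps.append(f"Step {i + 1} done")
        yield "\n".join(steps)        # each yield refreshes the output live

with gr.Blocks() as demo:
    out = gr.Textbox(label="Progress")
    gr.Button("Run").click(fn=slow_task, outputs=out)

if __name__ == "__main__":
    demo.launch()
```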
requirements.txt CHANGED
@@ -1,6 +1,9 @@
-gradio==4.36.1
+gradio>=4.36.1
 spaces
-torch
-transformers
-datasets
-accelerate
+torch>=2.0.0
+transformers>=4.30.0
+datasets>=2.13.0
+huggingface_hub>=0.16.0
+numpy>=1.24.0
+accelerate>=0.21.0
+scikit-learn>=1.2.2
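For reference, the perplexity reported by the Evaluate tab is exp of the mean token-level cross-entropy. A minimal sketch reproducing the same number outside the Space, assuming the same checkpoint and a test sentence of your own:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "tasal9/pashto-base-bloom"  # same checkpoint the Space uses
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

inputs = tokenizer("Some test sentence.", return_tensors="pt")
with torch.no_grad():
    # With labels == input_ids, the model returns the mean token cross-entropy
    loss = model(**inputs, labels=inputs["input_ids"]).loss
print(f"perplexity: {torch.exp(loss).item():.4f}")
```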