"]},"execution_count":13,"metadata":{},"output_type":"execute_result"}],"source":["instruction = \"ఘోర ప్రమాదం నుంచి కోలుకుని తిరిగి అంతర్జాతీయ క్రికెట్ ఆడుతున్న భారత వికెట్ కీపర్ రిషభ్ పంత్ ఓ అద్భుతమని పాకిస్థాన్ మాజీ కెప్టెన్ వసీమ్ అక్రమ్ కొనియాడాడు. ‘రోడ్డు ప్రమాదం తర్వాత ఎవరికైనా కోలుకునేందుకు చాలా సమయం పడుతుంది. ఇక ఆటగాడికైతే మరింత కష్టంగా ఉంటుంది. కానీ పంత్ అలా కాదు. నిజంగా తను మిరాకిల్ కిడ్. అతడిని యువతరం ఆదర్శంగా తీసుకోవాల్సిందే. ఐపీఎల్, టీ20 ప్రపంచకప్లోనూ ప్రభావం చూపి ఇప్పుడు టెస్టుల్లోనూ ఆకట్టుకుంటున్నాడు. ఆసీస్తో టెస్టు సిరీస్లోనూ తను కీలకం కానున్నాడు’ అని అక్రమ్ ప్రశంసించాడు. \"\n","\n","\n","prompt = template.format(\n"," article=instruction,\n"," response=\"\",\n",")\n","\n","# RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0!\n","response_text = generate_response(model, tokenizer, prompt, device, 256)\n","\n","Markdown(response_text)"]},{"cell_type":"markdown","metadata":{},"source":["The 2B model doesn't seem to understand the instruction and keep repeating the context in telugu. The 9B gemma2 is better at understanding telugu but 2B is not. Lets experiment smaller model on the fintuning with telugu news dataset and compare the results\n"]},{"cell_type":"markdown","metadata":{},"source":["# 7. Applying Gemma LoRA\n","\n","In this Session, we'll be applying the LoRA (**Low-Rank Adaptation**) technique to the **Gemma model**, a method designed to make fine-tuning large models like Gemma both **fast and efficient**. LoRA, a part of **PEFT** (**Parameter Efficient Fine-Tuning**), focuses on updating specific parts of a pre-trained model by only training a select few dense layers. This drastically cuts down on the computational demands and GPU memory needs, all without adding any extra time to the inference process. Here's what makes LoRA so powerful for our purposes:\n","\n","
{"cell_type":"markdown","metadata":{},"source":["# 7. Applying Gemma LoRA\n","\n","In this section, we'll apply the LoRA (**Low-Rank Adaptation**) technique to the **Gemma model**, a method designed to make fine-tuning large models like Gemma both **fast and efficient**. LoRA, part of the **PEFT** (**Parameter-Efficient Fine-Tuning**) family, adapts a pre-trained model by training only small low-rank matrices attached to a few dense layers. This drastically cuts the computational demands and GPU memory needs, without adding any latency at inference. Here's what makes LoRA so powerful for our purposes:\n","\n","Paper: LoRA: Low-Rank Adaptation of Large Language Models (https://arxiv.org/abs/2106.09685)\n","\n","- **Dramatically reduces the number of trainable parameters**, by up to **10,000 times**.\n","- **Cuts GPU memory usage** by up to **three times**.\n","- **Maintains quick inference times** with **no additional latency**.\n","\n","The essence of PEFT, and by extension LoRA, is to adapt a model using minimal resources by fine-tuning only a handful of parameters for a specific task. The technique:\n","\n","- Optimizes rank-decomposition matrices: the original model weights stay frozen, while the low-rank weights **A** and **B** are trained.\n","- Allows for up to **threefold reductions** in time and computational cost.\n","- Makes it easy to swap the LoRA module (weights **A** and **B**) per task, lowering storage requirements without increasing inference time.\n","\n","Applied to **Transformer architectures**, targeting the **attention weights** while keeping the MLP modules frozen, LoRA significantly improves efficiency. For GPT-3 175B, the paper reports that it:\n","\n","- **Reduces VRAM usage** during training from **1.2TB to 350GB**.\n","- **Lowers checkpoint size** from **350GB to 35MB**.\n","- **Boosts training speed** by approximately **25%**.\n","\n","By integrating LoRA into Gemma, we aim to streamline the fine-tuning in this section, making it quicker and more resource-efficient without compromising performance. A toy numerical sketch of the low-rank update follows the configuration cell below."]},
{"cell_type":"code","execution_count":14,"metadata":{"execution":{"iopub.execute_input":"2024-04-13T17:19:02.046698Z","iopub.status.busy":"2024-04-13T17:19:02.046069Z","iopub.status.idle":"2024-04-13T17:19:02.051888Z","shell.execute_reply":"2024-04-13T17:19:02.050874Z","shell.execute_reply.started":"2024-04-13T17:19:02.046662Z"},"trusted":true},"outputs":[],"source":["# LoRA configuration: parameters for Low-Rank Adaptation, an efficient way to fine-tune transformers.\n","# Use LoRA to save memory and computation.\n","lora_config = LoraConfig(\n","    r = 8, # Rank of the adaptation matrices. A lower rank means fewer parameters to train.\n","    target_modules = [\"q_proj\", \"o_proj\", \"k_proj\", \"v_proj\",\n","                      \"gate_proj\", \"up_proj\", \"down_proj\"], # Transformer modules to apply LoRA to.\n","    task_type = \"CAUSAL_LM\", # The type of task, here causal language modeling.\n",")"]},
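{"cell_type":"markdown","metadata":{},"source":["To make the update concrete, here is a self-contained toy sketch in plain PyTorch (illustrative sizes, not Gemma's, and not the `peft` library's internals): the pretrained weight W stays frozen, and a scaled low-rank correction B·A is learned on the side."]},
{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Toy LoRA update on a single frozen linear layer (illustration only).\n","import torch\n","import torch.nn as nn\n","\n","d, r, alpha = 2048, 8, 16                  # hidden size, LoRA rank, scaling (assumed values)\n","W = nn.Linear(d, d, bias=False)            # stands in for a frozen pretrained projection\n","for p in W.parameters():\n","    p.requires_grad = False                # the original weights are never updated\n","\n","A = nn.Parameter(torch.randn(r, d) * 0.01) # trainable low-rank factor A (r x d)\n","B = nn.Parameter(torch.zeros(d, r))        # trainable low-rank factor B (d x r), zero-init\n","\n","x = torch.randn(1, d)\n","y = W(x) + (alpha / r) * (x @ A.T) @ B.T   # forward pass: frozen path + low-rank correction\n","\n","print(f\"frozen: {d*d:,} params, trainable: {2*d*r:,} params ({(d*d)/(2*d*r):.0f}x fewer)\")"]},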
{"cell_type":"markdown","metadata":{},"source":["# 8. Evaluation Metrics"]},
{"cell_type":"code","execution_count":15,"metadata":{},"outputs":[],"source":["# Create the evaluation metric: ROUGE score, applied here to Telugu text.\n","import evaluate\n","\n","metric = evaluate.load(\"rouge\")\n","\n","# Wrap the ROUGE computation for the trainer's compute_metrics hook.\n","def compute_metrics(eval_pred):\n","    predictions, labels = eval_pred\n","    return metric.compute(predictions=predictions, references=labels)"]},
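{"cell_type":"markdown","metadata":{},"source":["As a quick sanity check of the metric (a minimal sketch on made-up strings, not our dataset), `metric.compute` returns ROUGE-1/2/L/Lsum scores. One caveat: ROUGE's whitespace-based n-gram matching is only a rough fit for an agglutinative language like Telugu, so treat the absolute numbers with care."]},
{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Sanity-check the ROUGE metric on toy strings (illustrative values only).\n","sample_predictions = [\"the cat sat on the mat\"]\n","sample_references = [\"the cat lay on the mat\"]\n","print(metric.compute(predictions=sample_predictions, references=sample_references))"]},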
{"cell_type":"markdown","metadata":{},"source":["# 9. Training Gemma\n","\n","Now that everything is set up, it's time to fine-tune the Gemma model on your data. This section walks through the training process, including setting up the trainer and selecting the hyperparameters."]},
{"cell_type":"code","execution_count":16,"metadata":{"execution":{"iopub.execute_input":"2024-04-13T17:19:02.053215Z","iopub.status.busy":"2024-04-13T17:19:02.052920Z","iopub.status.idle":"2024-04-13T17:19:02.068689Z","shell.execute_reply":"2024-04-13T17:19:02.067788Z","shell.execute_reply.started":"2024-04-13T17:19:02.053190Z"},"trusted":true},"outputs":[],"source":["def formatting_func(examples):\n","    \"\"\"\n","    Formats a batch of examples (a dict mapping column names to lists of strings) using the predefined template.\n","\n","    Parameters:\n","    - examples (dict): A dictionary whose keys correspond to the columns of the dataset, such as 'article' and 'response', and whose values are lists of strings.\n","\n","    Returns:\n","    - list: A list of formatted strings, each combining one article and its response.\n","    \"\"\"\n","    # SFTTrainer passes examples in batches, so we must return a list of strings.\n","    articles = examples[question_column]\n","    responses = examples[answer_column]\n","    inputs = []\n","    for i in range(len(articles)):\n","        inputs.append(template.format(article=articles[i], response=responses[i]))\n","\n","    return inputs\n"]},
{"cell_type":"code","execution_count":17,"metadata":{"execution":{"iopub.execute_input":"2024-04-13T17:19:02.070337Z","iopub.status.busy":"2024-04-13T17:19:02.069881Z","iopub.status.idle":"2024-04-13T17:19:04.066929Z","shell.execute_reply":"2024-04-13T17:19:04.065995Z","shell.execute_reply.started":"2024-04-13T17:19:02.070304Z"},"trusted":true},"outputs":[{"name":"stderr","output_type":"stream","text":["huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...\n","To disable this warning, you can either:\n","\t- Avoid using `tokenizers` before the fork if possible\n","\t- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)\n"]},{"name":"stderr","output_type":"stream","text":["/home/watchtower/.pyenv/versions/3.10.2/envs/venv1/lib/python3.10/site-packages/trl/trainer/sft_trainer.py:300: UserWarning: You passed a processing_class with `padding_side` not equal to `right` to the SFTTrainer. This might lead to some unexpected behaviour due to overflow issues when training a model in half-precision. You might consider adding `processing_class.padding_side = 'right'` to your code.\n","  warnings.warn(\n"]}],"source":["!rm -rf outputs\n","os.environ['PYTORCH_CUDA_ALLOC_CONF'] = 'max_split_size_mb:128'\n","# Set up the trainer object that will handle fine-tuning of the model.\n","trainer = SFTTrainer(\n","    model=model, # The pre-trained model to fine-tune.\n","    train_dataset=dataset['train'], # The dataset used for training (83k examples).\n","    eval_dataset=dataset['validation'], # The dataset used for validation (10k examples).\n","    max_seq_length=512, # The maximum sequence length for the model inputs.\n","    compute_metrics=compute_metrics,\n","    args=TrainingArguments( # Arguments for the training setup.\n","        per_device_train_batch_size=4, # Batch size per device (e.g., GPU).\n","        #gradient_accumulation_steps=4, # Number of steps to accumulate gradients before updating model weights.\n","        warmup_steps=10, # Number of steps over which the learning rate ramps up at the start of training.\n","        max_steps=10000, # Total number of training steps to perform.\n","        learning_rate=2e-4, # Learning rate for the optimizer.\n","        fp16=True, # Use 16-bit floating-point precision for training; False would use 32-bit.\n","        logging_steps=1, # How often to log training information.\n","        output_dir=\"outputs\", # Directory where training outputs will be saved.\n","        eval_strategy=\"steps\",\n","        per_device_eval_batch_size=4,\n","        gradient_checkpointing=True, # Enable gradient checkpointing to save memory.\n","        #optim=\"paged_adamw_8bit\", # Optimizer with 8-bit precision for efficiency.\n","        eval_accumulation_steps=4, # Fix for OOM during evaluation: https://discuss.huggingface.co/t/cuda-out-of-memory-when-using-trainer-with-compute-metrics/2941/3\n","        eval_steps=2000\n","    ),\n","    # peft_config=lora_config, # The LoRA configuration for efficient fine-tuning.\n","    formatting_func=formatting_func, # The function to format the dataset examples.\n",")\n"]},
{"cell_type":"code","execution_count":18,"metadata":{"execution":{"iopub.execute_input":"2024-04-13T17:19:04.068824Z","iopub.status.busy":"2024-04-13T17:19:04.068476Z","iopub.status.idle":"2024-04-13T17:19:46.369697Z","shell.execute_reply":"2024-04-13T17:19:46.368790Z","shell.execute_reply.started":"2024-04-13T17:19:04.068791Z"},"scrolled":true,"trusted":true},"outputs":[{"name":"stderr","output_type":"stream","text":["\u001b[34m\u001b[1mwandb\u001b[0m: \u001b[33mWARNING\u001b[0m The `run_name` is currently set to the same value as `TrainingArguments.output_dir`. If this was not intended, please specify a different run name by setting the `TrainingArguments.run_name` parameter.\n","`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.\n"]},{"data":{"text/html":["
[ 1856/10000 29:54 < 2:11:22, 1.03 it/s, Epoch 0.09/1]\n","Step | Training Loss | Validation Loss\n
"],"text/plain":[""]},"metadata":{},"output_type":"display_data"}],"source":["# train the model to the processed data.\n","trainer.train()"]},{"cell_type":"code","execution_count":1,"metadata":{},"outputs":[{"ename":"NameError","evalue":"name 'trainer' is not defined","output_type":"error","traceback":["\u001b[0;31m---------------------------------------------------------------------------\u001b[0m","\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)","Cell \u001b[0;32mIn[1], line 2\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[38;5;66;03m# Push the model to huggingface under my user name saidies12 and model name telugu-news-headline-generation\u001b[39;00m\n\u001b[0;32m----> 2\u001b[0m \u001b[43mtrainer\u001b[49m\u001b[38;5;241m.\u001b[39mpush_to_hub(\n\u001b[1;32m 3\u001b[0m repository_name\u001b[38;5;241m=\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124msaidines12/telugu-news-headline-generation\u001b[39m\u001b[38;5;124m\"\u001b[39m,\n\u001b[1;32m 4\u001b[0m )\n","\u001b[0;31mNameError\u001b[0m: name 'trainer' is not defined"]}],"source":["# Push the model to huggingface under my user name saidies12 and model name telugu-news-headline-generation\n","trainer.push_to_hub(\n"," repository_name=\"saidines12/telugu-news-headline-generation\",\n",")"]},{"cell_type":"markdown","metadata":{},"source":["# 9. Q&A Results After Finetuning\n","\n","After training, let's see how much our Gemma model has improved. We'll rerun the question-answering test and compare the results to the pre-finetuning performance."]},{"cell_type":"code","execution_count":null,"metadata":{"execution":{"iopub.execute_input":"2024-04-13T17:19:46.371158Z","iopub.status.busy":"2024-04-13T17:19:46.370828Z","iopub.status.idle":"2024-04-13T17:19:55.489935Z","shell.execute_reply":"2024-04-13T17:19:55.489009Z","shell.execute_reply.started":"2024-04-13T17:19:46.371132Z"},"trusted":true},"outputs":[{"data":{"text/markdown":["\n","రోడ్డు ప్రమాదం తర్వాత కోలుకున్నాడు రిషభ్ పంత్\n","రోడ్డు ప్ర"],"text/plain":[""]},"execution_count":25,"metadata":{},"output_type":"execute_result"}],"source":["instruction = \"అక్రమ్ కరాచీ ఘోర ప్రమాదం నుంచి కోలుకుని తిరిగి అంతర్జాతీయ క్రికెట్ ఆడుతున్న భారత వికెట్ కీపర్ రిషభ్ పంత్ ఓ అద్భుతమని పాకిస్థాన్ మాజీ కెప్టెన్ వసీమ్ అక్రమ్ కొనియాడాడు. ‘రోడ్డు ప్రమాదం తర్వాత ఎవరికైనా కోలుకునేందుకు చాలా సమయం పడుతుంది. ఇక ఆటగాడికైతే మరింత కష్టంగా ఉంటుంది. కానీ పంత్ అలా కాదు. నిజంగా తను మిరాకిల్ కిడ్. అతడిని యువతరం ఆదర్శంగా తీసుకోవాల్సిందే. ఐపీఎల్, టీ20 ప్రపంచకప్లోనూ ప్రభావం చూపి ఇప్పుడు టెస్టుల్లోనూ ఆకట్టుకుంటున్నాడు. ఆసీస్తో టెస్టు సిరీస్లోనూ తను కీలకం కానున్నాడు’ అని అక్రమ్ ప్రశంసించాడు. \"\n","\n","\n","prompt = template.format(\n"," article=instruction,\n"," response=\"\",\n",")\n","\n","response_text = generate_response(trainer.model, tokenizer, prompt, device,32)\n","# TODO: Fix repitition of response\n","\n","Markdown(response_text)"]},{"cell_type":"markdown","metadata":{},"source":["**Although** the performance of the Gemma2B model okay, it is still better headline than reapeating the article from the last result. There is big room for improvement as we are using LORA with quantization. first try without LORA, and the performance doesn't match expected then Ramp up to bigger gemma 9B model which is really good at understanding telugu and instruction following. "]},{"cell_type":"markdown","metadata":{},"source":["# 10. 
{"cell_type":"markdown","metadata":{},"source":["# 11. Conclusion\n","\n","In this beginner-friendly notebook, we've walked through fine-tuning the Gemma model, a Large Language Model (LLM), for Telugu news headline generation. Starting from data loading and preprocessing, we've demonstrated how to train the Gemma model effectively, even for those new to working with LLMs.\n","\n","We leveraged a Telugu news dataset of article-headline pairs (83k training and 10k validation examples) to train the model and evaluate how faithfully it generates headlines. This journey, while introductory, underscores a straightforward path to engaging with LLMs through the Gemma model.\n","\n","Achieving the best performance with the Gemma model (or any LLM) generally requires training on more extensive datasets and for more epochs. Future enhancements could include Retrieval-Augmented Generation (RAG) and Direct Preference Optimization (DPO), which can further improve the model by incorporating external knowledge or preference signals for more precise and relevant responses.\n","\n","Ultimately, this notebook is designed to make the Gemma model approachable for beginners, illustrating that straightforward steps can unlock the potential of LLMs for domain-specific tasks. It encourages users to experiment with the Gemma model across various fields and languages, broadening the scope of its application and enhancing its utility."]},
{"cell_type":"markdown","metadata":{},"source":["Reference:\n","\n","- LoRA: Low-Rank Adaptation of Large Language Models (https://arxiv.org/abs/2106.09685)\n"]},
{"cell_type":"code","execution_count":9,"metadata":{},"outputs":[{"ename":"HfHubHTTPError","evalue":"502 Server Error: Bad Gateway for url: https://huggingface.co/api/models/saidines12/telugu-news-headline-generation/commit/main
\n \n\n\n","output_type":"error","traceback":["\u001b[0;31m---------------------------------------------------------------------------\u001b[0m","\u001b[0;31mHTTPError\u001b[0m Traceback (most recent call last)","File \u001b[0;32m~/.pyenv/versions/3.10.2/envs/venv1/lib/python3.10/site-packages/huggingface_hub/utils/_http.py:406\u001b[0m, in \u001b[0;36mhf_raise_for_status\u001b[0;34m(response, endpoint_name)\u001b[0m\n\u001b[1;32m 405\u001b[0m \u001b[38;5;28;01mtry\u001b[39;00m:\n\u001b[0;32m--> 406\u001b[0m \u001b[43mresponse\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mraise_for_status\u001b[49m\u001b[43m(\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 407\u001b[0m \u001b[38;5;28;01mexcept\u001b[39;00m HTTPError \u001b[38;5;28;01mas\u001b[39;00m e:\n","File \u001b[0;32m~/.pyenv/versions/3.10.2/envs/venv1/lib/python3.10/site-packages/requests/models.py:1024\u001b[0m, in \u001b[0;36mResponse.raise_for_status\u001b[0;34m(self)\u001b[0m\n\u001b[1;32m 1023\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m http_error_msg:\n\u001b[0;32m-> 1024\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m HTTPError(http_error_msg, response\u001b[38;5;241m=\u001b[39m\u001b[38;5;28mself\u001b[39m)\n","\u001b[0;31mHTTPError\u001b[0m: 502 Server Error: Bad Gateway for url: https://huggingface.co/api/models/saidines12/telugu-news-headline-generation/commit/main","\nThe above exception was the direct cause of the following exception:\n","\u001b[0;31mHfHubHTTPError\u001b[0m Traceback (most recent call last)","Cell \u001b[0;32mIn[9], line 4\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[38;5;28;01mfrom\u001b[39;00m \u001b[38;5;21;01mhuggingface_hub\u001b[39;00m \u001b[38;5;28;01mimport\u001b[39;00m HfApi\n\u001b[1;32m 2\u001b[0m api \u001b[38;5;241m=\u001b[39m HfApi()\n\u001b[0;32m----> 4\u001b[0m \u001b[43mapi\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mupload_file\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 5\u001b[0m \u001b[43m \u001b[49m\u001b[43mpath_or_fileobj\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43m/data1/max/telugu_corpus/andhrajyothy_data/gemma-fine-tuning-on-telugu-news-dataset.ipynb\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\n\u001b[1;32m 6\u001b[0m \u001b[43m \u001b[49m\u001b[43mrepo_id\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43msaidines12/telugu-news-headline-generation\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\n\u001b[1;32m 7\u001b[0m \u001b[43m \u001b[49m\u001b[43mrepo_type\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mmodel\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\n\u001b[1;32m 8\u001b[0m \u001b[43m \u001b[49m\u001b[43mpath_in_repo\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mnotebooks/gemma-fine-tuning-on-telugu-news-dataset.ipynb\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\n\u001b[1;32m 9\u001b[0m \u001b[43m)\u001b[49m\n","File \u001b[0;32m~/.pyenv/versions/3.10.2/envs/venv1/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py:114\u001b[0m, in \u001b[0;36mvalidate_hf_hub_args.._inner_fn\u001b[0;34m(*args, **kwargs)\u001b[0m\n\u001b[1;32m 111\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m check_use_auth_token:\n\u001b[1;32m 112\u001b[0m kwargs \u001b[38;5;241m=\u001b[39m 
smoothly_deprecate_use_auth_token(fn_name\u001b[38;5;241m=\u001b[39mfn\u001b[38;5;241m.\u001b[39m\u001b[38;5;18m__name__\u001b[39m, has_token\u001b[38;5;241m=\u001b[39mhas_token, kwargs\u001b[38;5;241m=\u001b[39mkwargs)\n\u001b[0;32m--> 114\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[43mfn\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43margs\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwargs\u001b[49m\u001b[43m)\u001b[49m\n","File \u001b[0;32m~/.pyenv/versions/3.10.2/envs/venv1/lib/python3.10/site-packages/huggingface_hub/hf_api.py:1485\u001b[0m, in \u001b[0;36mfuture_compatible.._inner\u001b[0;34m(self, *args, **kwargs)\u001b[0m\n\u001b[1;32m 1482\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mrun_as_future(fn, \u001b[38;5;28mself\u001b[39m, \u001b[38;5;241m*\u001b[39margs, \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39mkwargs)\n\u001b[1;32m 1484\u001b[0m \u001b[38;5;66;03m# Otherwise, call the function normally\u001b[39;00m\n\u001b[0;32m-> 1485\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[43mfn\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43margs\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwargs\u001b[49m\u001b[43m)\u001b[49m\n","File \u001b[0;32m~/.pyenv/versions/3.10.2/envs/venv1/lib/python3.10/site-packages/huggingface_hub/hf_api.py:4653\u001b[0m, in \u001b[0;36mHfApi.upload_file\u001b[0;34m(self, path_or_fileobj, path_in_repo, repo_id, token, repo_type, revision, commit_message, commit_description, create_pr, parent_commit, run_as_future)\u001b[0m\n\u001b[1;32m 4645\u001b[0m commit_message \u001b[38;5;241m=\u001b[39m (\n\u001b[1;32m 4646\u001b[0m commit_message \u001b[38;5;28;01mif\u001b[39;00m commit_message \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m \u001b[38;5;28;01melse\u001b[39;00m \u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mUpload \u001b[39m\u001b[38;5;132;01m{\u001b[39;00mpath_in_repo\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m with huggingface_hub\u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m 4647\u001b[0m )\n\u001b[1;32m 4648\u001b[0m operation \u001b[38;5;241m=\u001b[39m CommitOperationAdd(\n\u001b[1;32m 4649\u001b[0m path_or_fileobj\u001b[38;5;241m=\u001b[39mpath_or_fileobj,\n\u001b[1;32m 4650\u001b[0m path_in_repo\u001b[38;5;241m=\u001b[39mpath_in_repo,\n\u001b[1;32m 4651\u001b[0m )\n\u001b[0;32m-> 4653\u001b[0m commit_info \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mcreate_commit\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 4654\u001b[0m \u001b[43m \u001b[49m\u001b[43mrepo_id\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mrepo_id\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 4655\u001b[0m \u001b[43m \u001b[49m\u001b[43mrepo_type\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mrepo_type\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 4656\u001b[0m \u001b[43m \u001b[49m\u001b[43moperations\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m[\u001b[49m\u001b[43moperation\u001b[49m\u001b[43m]\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 4657\u001b[0m \u001b[43m 
\u001b[49m\u001b[43mcommit_message\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mcommit_message\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 4658\u001b[0m \u001b[43m \u001b[49m\u001b[43mcommit_description\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mcommit_description\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 4659\u001b[0m \u001b[43m \u001b[49m\u001b[43mtoken\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mtoken\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 4660\u001b[0m \u001b[43m \u001b[49m\u001b[43mrevision\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mrevision\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 4661\u001b[0m \u001b[43m \u001b[49m\u001b[43mcreate_pr\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mcreate_pr\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 4662\u001b[0m \u001b[43m \u001b[49m\u001b[43mparent_commit\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mparent_commit\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 4663\u001b[0m \u001b[43m\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 4665\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m commit_info\u001b[38;5;241m.\u001b[39mpr_url \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m:\n\u001b[1;32m 4666\u001b[0m revision \u001b[38;5;241m=\u001b[39m quote(_parse_revision_from_pr_url(commit_info\u001b[38;5;241m.\u001b[39mpr_url), safe\u001b[38;5;241m=\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n","File \u001b[0;32m~/.pyenv/versions/3.10.2/envs/venv1/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py:114\u001b[0m, in \u001b[0;36mvalidate_hf_hub_args.._inner_fn\u001b[0;34m(*args, **kwargs)\u001b[0m\n\u001b[1;32m 111\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m check_use_auth_token:\n\u001b[1;32m 112\u001b[0m kwargs \u001b[38;5;241m=\u001b[39m smoothly_deprecate_use_auth_token(fn_name\u001b[38;5;241m=\u001b[39mfn\u001b[38;5;241m.\u001b[39m\u001b[38;5;18m__name__\u001b[39m, has_token\u001b[38;5;241m=\u001b[39mhas_token, kwargs\u001b[38;5;241m=\u001b[39mkwargs)\n\u001b[0;32m--> 114\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[43mfn\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43margs\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwargs\u001b[49m\u001b[43m)\u001b[49m\n","File \u001b[0;32m~/.pyenv/versions/3.10.2/envs/venv1/lib/python3.10/site-packages/huggingface_hub/hf_api.py:1485\u001b[0m, in \u001b[0;36mfuture_compatible.._inner\u001b[0;34m(self, *args, **kwargs)\u001b[0m\n\u001b[1;32m 1482\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mrun_as_future(fn, \u001b[38;5;28mself\u001b[39m, \u001b[38;5;241m*\u001b[39margs, \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39mkwargs)\n\u001b[1;32m 1484\u001b[0m \u001b[38;5;66;03m# Otherwise, call the function normally\u001b[39;00m\n\u001b[0;32m-> 1485\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[43mfn\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43margs\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwargs\u001b[49m\u001b[43m)\u001b[49m\n","File \u001b[0;32m~/.pyenv/versions/3.10.2/envs/venv1/lib/python3.10/site-packages/huggingface_hub/hf_api.py:3995\u001b[0m, in \u001b[0;36mHfApi.create_commit\u001b[0;34m(self, 
repo_id, operations, commit_message, commit_description, token, repo_type, revision, create_pr, num_threads, parent_commit, run_as_future)\u001b[0m\n\u001b[1;32m 3993\u001b[0m \u001b[38;5;28;01mtry\u001b[39;00m:\n\u001b[1;32m 3994\u001b[0m commit_resp \u001b[38;5;241m=\u001b[39m get_session()\u001b[38;5;241m.\u001b[39mpost(url\u001b[38;5;241m=\u001b[39mcommit_url, headers\u001b[38;5;241m=\u001b[39mheaders, data\u001b[38;5;241m=\u001b[39mdata, params\u001b[38;5;241m=\u001b[39mparams)\n\u001b[0;32m-> 3995\u001b[0m \u001b[43mhf_raise_for_status\u001b[49m\u001b[43m(\u001b[49m\u001b[43mcommit_resp\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mendpoint_name\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mcommit\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m)\u001b[49m\n\u001b[1;32m 3996\u001b[0m \u001b[38;5;28;01mexcept\u001b[39;00m RepositoryNotFoundError \u001b[38;5;28;01mas\u001b[39;00m e:\n\u001b[1;32m 3997\u001b[0m e\u001b[38;5;241m.\u001b[39mappend_to_message(_CREATE_COMMIT_NO_REPO_ERROR_MESSAGE)\n","File \u001b[0;32m~/.pyenv/versions/3.10.2/envs/venv1/lib/python3.10/site-packages/huggingface_hub/utils/_http.py:477\u001b[0m, in \u001b[0;36mhf_raise_for_status\u001b[0;34m(response, endpoint_name)\u001b[0m\n\u001b[1;32m 473\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m _format(HfHubHTTPError, message, response) \u001b[38;5;28;01mfrom\u001b[39;00m \u001b[38;5;21;01me\u001b[39;00m\n\u001b[1;32m 475\u001b[0m \u001b[38;5;66;03m# Convert `HTTPError` into a `HfHubHTTPError` to display request information\u001b[39;00m\n\u001b[1;32m 476\u001b[0m \u001b[38;5;66;03m# as well (request id and/or server error message)\u001b[39;00m\n\u001b[0;32m--> 477\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m _format(HfHubHTTPError, \u001b[38;5;28mstr\u001b[39m(e), response) \u001b[38;5;28;01mfrom\u001b[39;00m \u001b[38;5;21;01me\u001b[39;00m\n","\u001b[0;31mHfHubHTTPError\u001b[0m: 502 Server Error: Bad Gateway for url: https://huggingface.co/api/models/saidines12/telugu-news-headline-generation/commit/main\n\n\n\n\n \n \n \n \n \n \n \n \n\n Hugging Face - The AI community building the future.\n \n \n\n\n\n\n
\n \n\n\n"]}],"source":["from huggingface_hub import HfApi\n","api = HfApi()\n","\n","api.upload_file(\n"," path_or_fileobj=\"/data1/max/telugu_corpus/andhrajyothy_data/gemma-fine-tuning-on-telugu-news-dataset.ipynb\",\n"," repo_id=\"saidines12/telugu-news-headline-generation\",\n"," repo_type=\"model\",\n"," path_in_repo=\"notebooks/gemma-fine-tuning-on-telugu-news-dataset.ipynb\",\n",")"]}],"metadata":{"kaggle":{"accelerator":"none","dataSources":[{"databundleVersionId":7669720,"sourceId":64148,"sourceType":"competition"},{"datasetId":4616621,"sourceId":7970419,"sourceType":"datasetVersion"},{"isSourceIdPinned":true,"modelInstanceId":8318,"sourceId":28785,"sourceType":"modelInstanceVersion"}],"dockerImageVersionId":30683,"isGpuEnabled":false,"isInternetEnabled":true,"language":"python","sourceType":"notebook"},"kernelspec":{"display_name":"Python 3","language":"python","name":"python3"},"language_info":{"codemirror_mode":{"name":"ipython","version":3},"file_extension":".py","mimetype":"text/x-python","name":"python","nbconvert_exporter":"python","pygments_lexer":"ipython3","version":"3.10.2"}},"nbformat":4,"nbformat_minor":4}