{ "cells": [ { "cell_type": "code", "execution_count": null, "id": "de09d1f3-ae75-4e4a-b4c0-dbb1339a439c", "metadata": {}, "outputs": [], "source": [ "#This Jupyter File contains the following scripts for Granite3.2-2B-FP16:\n", "\n", "#1)The Training Code used to train the LoRA adapters for the model and the output losses (if available).\n", "#->The model can be ran again to check the for the losses.\n", "\n", "#2)The Testing Code used to test the 5 variants of the model at different base precisions using the same FP16 LoRA Adapters.\n", "\n", "#3) The Evaluation Code used to evaluate the responses of the combined model and LoRA Adapters." ] }, { "cell_type": "code", "execution_count": null, "id": "82ec46fe-ad4c-4185-ad18-92d5287feada", "metadata": {}, "outputs": [], "source": [ "#TRAINING CODE FOR Granite3.2-2B-FP16" ] }, { "cell_type": "code", "execution_count": null, "id": "1ad9c7f0-77a2-4bc3-a79d-dd2bbe58cc03", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/home/jovyan/Falcon1B/lib/python3.11/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n", " from .autonotebook import tqdm as notebook_tqdm\n", "[nltk_data] Downloading package punkt to /home/jovyan/nltk_data...\n", "[nltk_data] Package punkt is already up-to-date!\n", "Map: 100%|██████████| 15000/15000 [00:26<00:00, 568.70 examples/s]\n", "Map: 100%|██████████| 1500/1500 [00:02<00:00, 648.03 examples/s]\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "Sample 0:\n", "Input:\n", " Question: I have written a canny edge detection algorithm for a project. I want to know is there any method to link the broken segments of an edge, since i am getting a single edge as a conglomeration of a few segments. I am getting around 100 segments, which i am sure can be decreased with some intelligence. Please help.\n", "Answer: You can use a method named dynamic programming. 
A very good intro on this can be found on chapter 6 of Sonka's digital image processing book\n", "Label mask:\n", " [-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 2448, 883, 793, 312, 1411, 8189, 7094, 16031, 32, 399, 5029, 4644, 8642, 544, 458, 883, 526, 2431, 544, 18471, 225, 40, 432, 28903, 3795, 1182, 18452, 1778, 8202, 7618, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]\n", "\n", "Sample 1:\n", "Input:\n", " Context: I have a dataset of book reviews:\n", "```\n", "| user_id | ISBN | vote | votes_for_user | average_user_vote | ISBN_categ |\n", " 213 3242X 4.5 12 3.4 1 \n", " 563 1245X 3.2 74 2.3 2\n", "```\n", "\n", "where \n", "```\n", " vote = rating given by user to a certain book\n", " votes_for_user = number of votes the user has in the dataset (nr of rows)\n", " average_user_vote = average of a user's votes\n", " ISBN_categ = integer categorical of the ISBN (since that is a string).\n", "```\n", "\n", "I want to apply a clustering algorithm such as DBSCAN to see how many clusters I can form with this dataset. \n", "My question is: \n", "Should I apply the clustering on the dataframe as it is (minus the ISBN column) or should I construct more features for every user and construct a dataframe where every user appears only once, together with their features, and cluster that? \n", "Remember, the intent here is to cluster users (by user_id), not data points (votes).\n", "Question: Clustering of users in a dataset\n", "Answer: If your objective is to find clusters of users, then you are interested in finding groups of \"similar\" reviewers.\n", "Therefore you should:\n", "\n", "- Retain information which relates to the users in a meaningful way - e.g. votes_for_user.\n", "\n", "- Discard information which has no meaningful relationship to a user - e.g. user_id (unless perhaps it contains some information such as time / order).\n", "\n", "- Be mindful of fields which may contain implicit relationships involving a user - e.g. 
vote may be a result of the interaction between user and ISBN.\n", "Label mask:\n", " [-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 1670, 1370, 25110, 438, 372, 2290, 18063, 432, 4250, 30, 1615, 844, 884, 16574, 328, 17340, 8351, 432, 313, 18310, 20, 37499, 32, 203, 8309, 1444, 844, 1395, 44, 203, 203, 31, 9687, 504, 2471, 1510, 1875, 1196, 372, 322, 4250, 328, 312, 33081, 3352, 429, 484, 32, 89, 32, 34751, 81, 979, 81, 496, 32, 203, 203, 31, 3645, 2294, 2471, 1510, 1401, 1289, 33081, 12112, 372, 312, 1256, 429, 484, 32, 89, 32, 1256, 81, 314, 308, 23437, 19368, 561, 4304, 1629, 2471, 3751, 619, 1133, 517, 2532, 547, 203, 203, 31, 4261, 12204, 2790, 432, 3829, 1510, 1631, 4799, 10353, 25041, 14907, 7172, 312, 1256, 429, 484, 32, 89, 32, 20424, 1631, 526, 312, 1056, 432, 322, 15994, 3733, 1256, 461, 2756, 14282, 32, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]\n", "\n", "Sample 2:\n", "Input:\n", " Question: What's a common technical challenge when using logistic regression?\n", "Answer: Dealing with class imbalance, which is when the number of observations in one class is significantly lower than the number of observations in the other class.\n", "Label mask:\n", " [-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 1513, 23959, 623, 443, 3960, 10401, 30, 1510, 438, 1412, 322, 1451, 432, 25285, 328, 1591, 443, 438, 32323, 7216, 2784, 322, 1451, 432, 25285, 328, 322, 1604, 443, 32, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "Loading checkpoint shards: 100%|██████████| 2/2 [00:23<00:00, 11.74s/it]\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "trainable params: 34,078,720 || all params: 2,567,610,368 || trainable%: 1.3273\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/tmp/ipykernel_65109/1516336251.py:195: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `Trainer.__init__`. Use `processing_class` instead.\n", " trainer = Trainer(model=model, args=training_args, train_dataset=train_dataset, eval_dataset=test_dataset, tokenizer=tokenizer,\n" ] }, { "data": { "text/html": [ "\n", "
Step | Training Loss | Validation Loss\n", "--- | --- | ---\n", "610 | 6.004300 | 0.846677\n", "620 | 2.690300 | 0.721633\n", "630 | 1.892000 | 0.662078\n", "640 | 1.855500 | 0.638113\n", "650 | 1.878100 | 0.642582\n", "660 | 1.878800 | 0.658247\n", "670 | 1.836700 | 0.670157\n", "680 | 1.941900 | 0.676019\n"