{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "01991402-68d2-4cfb-9b3a-22f170ccf74b",
   "metadata": {},
   "source": [
    "# Build a smart assistant to help mental health counselors respond thoughtfully to patients. "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f643f217-faf3-4b5a-9f14-63b898a4d7b8",
   "metadata": {},
   "source": [
    "## The Mental Health Chatbot (Multi-Turn with LLM + Classifier) works in **two steps**:\n",
    "\n",
    "1. **Understanding the situation**: When you describe a patient's issue, the system uses a machine learning model to figure out what kind of response might be most helpful—like giving advice, validating feelings, asking a follow-up question, or sharing some mental health information.\n",
    "\n",
    "2. **Generating a helpful reply**: After the system decides what type of response is appropriate, it asks a language model (Flan-T5) to write a suggestion based on that need. For example, if the model thinks the user needs validation, it will ask the LLM to generate an empathetic and supportive response.\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0cae93e4-7e58-4497-8047-4a069bc7a6c6",
   "metadata": {},
   "source": [
    "## This following python code does:\n",
    "- Classifies user messages into response types (advice, validation, information, question)\n",
    "- Uses a language model (Flan-T5) to generate counselor-like responses\n",
    "- Maintains a limited conversation history\n",
    "- Allows exporting conversation history to a JSON file"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d29686fa-8db3-4447-9970-da864a96dc64",
   "metadata": {},
   "source": [
    "### Load Required Libraries"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "8375290a-0034-4050-af72-c76183020bec",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/Users/Pi/miniconda3/envs/myenv/lib/python3.10/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n",
      "  from .autonotebook import tqdm as notebook_tqdm\n"
     ]
    }
   ],
   "source": [
    "import json\n",
    "import pandas as pd\n",
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "import seaborn as sns\n",
    "from sklearn.model_selection import train_test_split\n",
    "from sklearn.feature_extraction.text import TfidfVectorizer\n",
    "from sklearn.preprocessing import LabelEncoder\n",
    "from xgboost import XGBClassifier\n",
    "from transformers import pipeline"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c15554b4-d47f-4052-a18d-a51742065fc4",
   "metadata": {},
   "source": [
    "### Load and Label Dataset "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "dbb763bb-eaa1-45ce-ad03-3d6bb300c4e4",
   "metadata": {},
   "outputs": [],
   "source": [
    "df = pd.read_csv(\"dataset/Kaggle_Mental_Health_Conversations_train.csv\")\n",
    "df = df[['Context', 'Response']].dropna().copy()\n",
    "\n",
    "keywords_to_labels = {\n",
    "    'advice': ['try', 'should', 'suggest', 'recommend'],\n",
    "    'validation': ['understand', 'feel', 'valid', 'normal'],\n",
    "    'information': ['cause', 'often', 'disorder', 'symptom'],\n",
    "    'question': ['how', 'what', 'why', 'have you']\n",
    "}\n",
    "\n",
    "def auto_label_response(response):\n",
    "    response = response.lower()\n",
    "    for label, keywords in keywords_to_labels.items():\n",
    "        if any(word in response for word in keywords):\n",
    "            return label\n",
    "    return 'information'\n",
    "\n",
    "df['response_type'] = df['Response'].apply(auto_label_response)\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3cc76163-b688-49b1-a5ab-3f991e6a790b",
   "metadata": {},
   "source": [
    "### Train on Combined Context + Response"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "5aff7aff-7756-4c28-9424-9a9963580883",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/Users/Pi/miniconda3/envs/myenv/lib/python3.10/site-packages/xgboost/training.py:183: UserWarning: [22:52:39] WARNING: /Users/runner/work/xgboost/xgboost/src/learner.cc:738: \n",
      "Parameters: { \"use_label_encoder\" } are not used.\n",
      "\n",
      "  bst.update(dtrain, iteration=i, fobj=obj)\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<style>#sk-container-id-1 {\n",
       "  /* Definition of color scheme common for light and dark mode */\n",
       "  --sklearn-color-text: #000;\n",
       "  --sklearn-color-text-muted: #666;\n",
       "  --sklearn-color-line: gray;\n",
       "  /* Definition of color scheme for unfitted estimators */\n",
       "  --sklearn-color-unfitted-level-0: #fff5e6;\n",
       "  --sklearn-color-unfitted-level-1: #f6e4d2;\n",
       "  --sklearn-color-unfitted-level-2: #ffe0b3;\n",
       "  --sklearn-color-unfitted-level-3: chocolate;\n",
       "  /* Definition of color scheme for fitted estimators */\n",
       "  --sklearn-color-fitted-level-0: #f0f8ff;\n",
       "  --sklearn-color-fitted-level-1: #d4ebff;\n",
       "  --sklearn-color-fitted-level-2: #b3dbfd;\n",
       "  --sklearn-color-fitted-level-3: cornflowerblue;\n",
       "\n",
       "  /* Specific color for light theme */\n",
       "  --sklearn-color-text-on-default-background: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, black)));\n",
       "  --sklearn-color-background: var(--sg-background-color, var(--theme-background, var(--jp-layout-color0, white)));\n",
       "  --sklearn-color-border-box: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, black)));\n",
       "  --sklearn-color-icon: #696969;\n",
       "\n",
       "  @media (prefers-color-scheme: dark) {\n",
       "    /* Redefinition of color scheme for dark theme */\n",
       "    --sklearn-color-text-on-default-background: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, white)));\n",
       "    --sklearn-color-background: var(--sg-background-color, var(--theme-background, var(--jp-layout-color0, #111)));\n",
       "    --sklearn-color-border-box: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, white)));\n",
       "    --sklearn-color-icon: #878787;\n",
       "  }\n",
       "}\n",
       "\n",
       "#sk-container-id-1 {\n",
       "  color: var(--sklearn-color-text);\n",
       "}\n",
       "\n",
       "#sk-container-id-1 pre {\n",
       "  padding: 0;\n",
       "}\n",
       "\n",
       "#sk-container-id-1 input.sk-hidden--visually {\n",
       "  border: 0;\n",
       "  clip: rect(1px 1px 1px 1px);\n",
       "  clip: rect(1px, 1px, 1px, 1px);\n",
       "  height: 1px;\n",
       "  margin: -1px;\n",
       "  overflow: hidden;\n",
       "  padding: 0;\n",
       "  position: absolute;\n",
       "  width: 1px;\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-dashed-wrapped {\n",
       "  border: 1px dashed var(--sklearn-color-line);\n",
       "  margin: 0 0.4em 0.5em 0.4em;\n",
       "  box-sizing: border-box;\n",
       "  padding-bottom: 0.4em;\n",
       "  background-color: var(--sklearn-color-background);\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-container {\n",
       "  /* jupyter's `normalize.less` sets `[hidden] { display: none; }`\n",
       "     but bootstrap.min.css set `[hidden] { display: none !important; }`\n",
       "     so we also need the `!important` here to be able to override the\n",
       "     default hidden behavior on the sphinx rendered scikit-learn.org.\n",
       "     See: https://github.com/scikit-learn/scikit-learn/issues/21755 */\n",
       "  display: inline-block !important;\n",
       "  position: relative;\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-text-repr-fallback {\n",
       "  display: none;\n",
       "}\n",
       "\n",
       "div.sk-parallel-item,\n",
       "div.sk-serial,\n",
       "div.sk-item {\n",
       "  /* draw centered vertical line to link estimators */\n",
       "  background-image: linear-gradient(var(--sklearn-color-text-on-default-background), var(--sklearn-color-text-on-default-background));\n",
       "  background-size: 2px 100%;\n",
       "  background-repeat: no-repeat;\n",
       "  background-position: center center;\n",
       "}\n",
       "\n",
       "/* Parallel-specific style estimator block */\n",
       "\n",
       "#sk-container-id-1 div.sk-parallel-item::after {\n",
       "  content: \"\";\n",
       "  width: 100%;\n",
       "  border-bottom: 2px solid var(--sklearn-color-text-on-default-background);\n",
       "  flex-grow: 1;\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-parallel {\n",
       "  display: flex;\n",
       "  align-items: stretch;\n",
       "  justify-content: center;\n",
       "  background-color: var(--sklearn-color-background);\n",
       "  position: relative;\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-parallel-item {\n",
       "  display: flex;\n",
       "  flex-direction: column;\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-parallel-item:first-child::after {\n",
       "  align-self: flex-end;\n",
       "  width: 50%;\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-parallel-item:last-child::after {\n",
       "  align-self: flex-start;\n",
       "  width: 50%;\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-parallel-item:only-child::after {\n",
       "  width: 0;\n",
       "}\n",
       "\n",
       "/* Serial-specific style estimator block */\n",
       "\n",
       "#sk-container-id-1 div.sk-serial {\n",
       "  display: flex;\n",
       "  flex-direction: column;\n",
       "  align-items: center;\n",
       "  background-color: var(--sklearn-color-background);\n",
       "  padding-right: 1em;\n",
       "  padding-left: 1em;\n",
       "}\n",
       "\n",
       "\n",
       "/* Toggleable style: style used for estimator/Pipeline/ColumnTransformer box that is\n",
       "clickable and can be expanded/collapsed.\n",
       "- Pipeline and ColumnTransformer use this feature and define the default style\n",
       "- Estimators will overwrite some part of the style using the `sk-estimator` class\n",
       "*/\n",
       "\n",
       "/* Pipeline and ColumnTransformer style (default) */\n",
       "\n",
       "#sk-container-id-1 div.sk-toggleable {\n",
       "  /* Default theme specific background. It is overwritten whether we have a\n",
       "  specific estimator or a Pipeline/ColumnTransformer */\n",
       "  background-color: var(--sklearn-color-background);\n",
       "}\n",
       "\n",
       "/* Toggleable label */\n",
       "#sk-container-id-1 label.sk-toggleable__label {\n",
       "  cursor: pointer;\n",
       "  display: flex;\n",
       "  width: 100%;\n",
       "  margin-bottom: 0;\n",
       "  padding: 0.5em;\n",
       "  box-sizing: border-box;\n",
       "  text-align: center;\n",
       "  align-items: start;\n",
       "  justify-content: space-between;\n",
       "  gap: 0.5em;\n",
       "}\n",
       "\n",
       "#sk-container-id-1 label.sk-toggleable__label .caption {\n",
       "  font-size: 0.6rem;\n",
       "  font-weight: lighter;\n",
       "  color: var(--sklearn-color-text-muted);\n",
       "}\n",
       "\n",
       "#sk-container-id-1 label.sk-toggleable__label-arrow:before {\n",
       "  /* Arrow on the left of the label */\n",
       "  content: \"▸\";\n",
       "  float: left;\n",
       "  margin-right: 0.25em;\n",
       "  color: var(--sklearn-color-icon);\n",
       "}\n",
       "\n",
       "#sk-container-id-1 label.sk-toggleable__label-arrow:hover:before {\n",
       "  color: var(--sklearn-color-text);\n",
       "}\n",
       "\n",
       "/* Toggleable content - dropdown */\n",
       "\n",
       "#sk-container-id-1 div.sk-toggleable__content {\n",
       "  max-height: 0;\n",
       "  max-width: 0;\n",
       "  overflow: hidden;\n",
       "  text-align: left;\n",
       "  /* unfitted */\n",
       "  background-color: var(--sklearn-color-unfitted-level-0);\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-toggleable__content.fitted {\n",
       "  /* fitted */\n",
       "  background-color: var(--sklearn-color-fitted-level-0);\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-toggleable__content pre {\n",
       "  margin: 0.2em;\n",
       "  border-radius: 0.25em;\n",
       "  color: var(--sklearn-color-text);\n",
       "  /* unfitted */\n",
       "  background-color: var(--sklearn-color-unfitted-level-0);\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-toggleable__content.fitted pre {\n",
       "  /* unfitted */\n",
       "  background-color: var(--sklearn-color-fitted-level-0);\n",
       "}\n",
       "\n",
       "#sk-container-id-1 input.sk-toggleable__control:checked~div.sk-toggleable__content {\n",
       "  /* Expand drop-down */\n",
       "  max-height: 200px;\n",
       "  max-width: 100%;\n",
       "  overflow: auto;\n",
       "}\n",
       "\n",
       "#sk-container-id-1 input.sk-toggleable__control:checked~label.sk-toggleable__label-arrow:before {\n",
       "  content: \"▾\";\n",
       "}\n",
       "\n",
       "/* Pipeline/ColumnTransformer-specific style */\n",
       "\n",
       "#sk-container-id-1 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
       "  color: var(--sklearn-color-text);\n",
       "  background-color: var(--sklearn-color-unfitted-level-2);\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-label.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
       "  background-color: var(--sklearn-color-fitted-level-2);\n",
       "}\n",
       "\n",
       "/* Estimator-specific style */\n",
       "\n",
       "/* Colorize estimator box */\n",
       "#sk-container-id-1 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
       "  /* unfitted */\n",
       "  background-color: var(--sklearn-color-unfitted-level-2);\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-estimator.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
       "  /* fitted */\n",
       "  background-color: var(--sklearn-color-fitted-level-2);\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-label label.sk-toggleable__label,\n",
       "#sk-container-id-1 div.sk-label label {\n",
       "  /* The background is the default theme color */\n",
       "  color: var(--sklearn-color-text-on-default-background);\n",
       "}\n",
       "\n",
       "/* On hover, darken the color of the background */\n",
       "#sk-container-id-1 div.sk-label:hover label.sk-toggleable__label {\n",
       "  color: var(--sklearn-color-text);\n",
       "  background-color: var(--sklearn-color-unfitted-level-2);\n",
       "}\n",
       "\n",
       "/* Label box, darken color on hover, fitted */\n",
       "#sk-container-id-1 div.sk-label.fitted:hover label.sk-toggleable__label.fitted {\n",
       "  color: var(--sklearn-color-text);\n",
       "  background-color: var(--sklearn-color-fitted-level-2);\n",
       "}\n",
       "\n",
       "/* Estimator label */\n",
       "\n",
       "#sk-container-id-1 div.sk-label label {\n",
       "  font-family: monospace;\n",
       "  font-weight: bold;\n",
       "  display: inline-block;\n",
       "  line-height: 1.2em;\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-label-container {\n",
       "  text-align: center;\n",
       "}\n",
       "\n",
       "/* Estimator-specific */\n",
       "#sk-container-id-1 div.sk-estimator {\n",
       "  font-family: monospace;\n",
       "  border: 1px dotted var(--sklearn-color-border-box);\n",
       "  border-radius: 0.25em;\n",
       "  box-sizing: border-box;\n",
       "  margin-bottom: 0.5em;\n",
       "  /* unfitted */\n",
       "  background-color: var(--sklearn-color-unfitted-level-0);\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-estimator.fitted {\n",
       "  /* fitted */\n",
       "  background-color: var(--sklearn-color-fitted-level-0);\n",
       "}\n",
       "\n",
       "/* on hover */\n",
       "#sk-container-id-1 div.sk-estimator:hover {\n",
       "  /* unfitted */\n",
       "  background-color: var(--sklearn-color-unfitted-level-2);\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-estimator.fitted:hover {\n",
       "  /* fitted */\n",
       "  background-color: var(--sklearn-color-fitted-level-2);\n",
       "}\n",
       "\n",
       "/* Specification for estimator info (e.g. \"i\" and \"?\") */\n",
       "\n",
       "/* Common style for \"i\" and \"?\" */\n",
       "\n",
       ".sk-estimator-doc-link,\n",
       "a:link.sk-estimator-doc-link,\n",
       "a:visited.sk-estimator-doc-link {\n",
       "  float: right;\n",
       "  font-size: smaller;\n",
       "  line-height: 1em;\n",
       "  font-family: monospace;\n",
       "  background-color: var(--sklearn-color-background);\n",
       "  border-radius: 1em;\n",
       "  height: 1em;\n",
       "  width: 1em;\n",
       "  text-decoration: none !important;\n",
       "  margin-left: 0.5em;\n",
       "  text-align: center;\n",
       "  /* unfitted */\n",
       "  border: var(--sklearn-color-unfitted-level-1) 1pt solid;\n",
       "  color: var(--sklearn-color-unfitted-level-1);\n",
       "}\n",
       "\n",
       ".sk-estimator-doc-link.fitted,\n",
       "a:link.sk-estimator-doc-link.fitted,\n",
       "a:visited.sk-estimator-doc-link.fitted {\n",
       "  /* fitted */\n",
       "  border: var(--sklearn-color-fitted-level-1) 1pt solid;\n",
       "  color: var(--sklearn-color-fitted-level-1);\n",
       "}\n",
       "\n",
       "/* On hover */\n",
       "div.sk-estimator:hover .sk-estimator-doc-link:hover,\n",
       ".sk-estimator-doc-link:hover,\n",
       "div.sk-label-container:hover .sk-estimator-doc-link:hover,\n",
       ".sk-estimator-doc-link:hover {\n",
       "  /* unfitted */\n",
       "  background-color: var(--sklearn-color-unfitted-level-3);\n",
       "  color: var(--sklearn-color-background);\n",
       "  text-decoration: none;\n",
       "}\n",
       "\n",
       "div.sk-estimator.fitted:hover .sk-estimator-doc-link.fitted:hover,\n",
       ".sk-estimator-doc-link.fitted:hover,\n",
       "div.sk-label-container:hover .sk-estimator-doc-link.fitted:hover,\n",
       ".sk-estimator-doc-link.fitted:hover {\n",
       "  /* fitted */\n",
       "  background-color: var(--sklearn-color-fitted-level-3);\n",
       "  color: var(--sklearn-color-background);\n",
       "  text-decoration: none;\n",
       "}\n",
       "\n",
       "/* Span, style for the box shown on hovering the info icon */\n",
       ".sk-estimator-doc-link span {\n",
       "  display: none;\n",
       "  z-index: 9999;\n",
       "  position: relative;\n",
       "  font-weight: normal;\n",
       "  right: .2ex;\n",
       "  padding: .5ex;\n",
       "  margin: .5ex;\n",
       "  width: min-content;\n",
       "  min-width: 20ex;\n",
       "  max-width: 50ex;\n",
       "  color: var(--sklearn-color-text);\n",
       "  box-shadow: 2pt 2pt 4pt #999;\n",
       "  /* unfitted */\n",
       "  background: var(--sklearn-color-unfitted-level-0);\n",
       "  border: .5pt solid var(--sklearn-color-unfitted-level-3);\n",
       "}\n",
       "\n",
       ".sk-estimator-doc-link.fitted span {\n",
       "  /* fitted */\n",
       "  background: var(--sklearn-color-fitted-level-0);\n",
       "  border: var(--sklearn-color-fitted-level-3);\n",
       "}\n",
       "\n",
       ".sk-estimator-doc-link:hover span {\n",
       "  display: block;\n",
       "}\n",
       "\n",
       "/* \"?\"-specific style due to the `<a>` HTML tag */\n",
       "\n",
       "#sk-container-id-1 a.estimator_doc_link {\n",
       "  float: right;\n",
       "  font-size: 1rem;\n",
       "  line-height: 1em;\n",
       "  font-family: monospace;\n",
       "  background-color: var(--sklearn-color-background);\n",
       "  border-radius: 1rem;\n",
       "  height: 1rem;\n",
       "  width: 1rem;\n",
       "  text-decoration: none;\n",
       "  /* unfitted */\n",
       "  color: var(--sklearn-color-unfitted-level-1);\n",
       "  border: var(--sklearn-color-unfitted-level-1) 1pt solid;\n",
       "}\n",
       "\n",
       "#sk-container-id-1 a.estimator_doc_link.fitted {\n",
       "  /* fitted */\n",
       "  border: var(--sklearn-color-fitted-level-1) 1pt solid;\n",
       "  color: var(--sklearn-color-fitted-level-1);\n",
       "}\n",
       "\n",
       "/* On hover */\n",
       "#sk-container-id-1 a.estimator_doc_link:hover {\n",
       "  /* unfitted */\n",
       "  background-color: var(--sklearn-color-unfitted-level-3);\n",
       "  color: var(--sklearn-color-background);\n",
       "  text-decoration: none;\n",
       "}\n",
       "\n",
       "#sk-container-id-1 a.estimator_doc_link.fitted:hover {\n",
       "  /* fitted */\n",
       "  background-color: var(--sklearn-color-fitted-level-3);\n",
       "}\n",
       "</style><div id=\"sk-container-id-1\" class=\"sk-top-container\"><div class=\"sk-text-repr-fallback\"><pre>XGBClassifier(base_score=None, booster=None, callbacks=None,\n",
       "              colsample_bylevel=None, colsample_bynode=None,\n",
       "              colsample_bytree=None, device=None, early_stopping_rounds=None,\n",
       "              enable_categorical=False, eval_metric=&#x27;mlogloss&#x27;,\n",
       "              feature_types=None, feature_weights=None, gamma=None,\n",
       "              grow_policy=None, importance_type=None,\n",
       "              interaction_constraints=None, learning_rate=0.1, max_bin=None,\n",
       "              max_cat_threshold=None, max_cat_to_onehot=None,\n",
       "              max_delta_step=None, max_depth=6, max_leaves=None,\n",
       "              min_child_weight=None, missing=nan, monotone_constraints=None,\n",
       "              multi_strategy=None, n_estimators=100, n_jobs=None, num_class=4, ...)</pre><b>In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. <br />On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.</b></div><div class=\"sk-container\" hidden><div class=\"sk-item\"><div class=\"sk-estimator fitted sk-toggleable\"><input class=\"sk-toggleable__control sk-hidden--visually\" id=\"sk-estimator-id-1\" type=\"checkbox\" checked><label for=\"sk-estimator-id-1\" class=\"sk-toggleable__label fitted sk-toggleable__label-arrow\"><div><div>XGBClassifier</div></div><div><a class=\"sk-estimator-doc-link fitted\" rel=\"noreferrer\" target=\"_blank\" href=\"https://xgboost.readthedocs.io/en/release_3.0.0/python/python_api.html#xgboost.XGBClassifier\">?<span>Documentation for XGBClassifier</span></a><span class=\"sk-estimator-doc-link fitted\">i<span>Fitted</span></span></div></label><div class=\"sk-toggleable__content fitted\"><pre>XGBClassifier(base_score=None, booster=None, callbacks=None,\n",
       "              colsample_bylevel=None, colsample_bynode=None,\n",
       "              colsample_bytree=None, device=None, early_stopping_rounds=None,\n",
       "              enable_categorical=False, eval_metric=&#x27;mlogloss&#x27;,\n",
       "              feature_types=None, feature_weights=None, gamma=None,\n",
       "              grow_policy=None, importance_type=None,\n",
       "              interaction_constraints=None, learning_rate=0.1, max_bin=None,\n",
       "              max_cat_threshold=None, max_cat_to_onehot=None,\n",
       "              max_delta_step=None, max_depth=6, max_leaves=None,\n",
       "              min_child_weight=None, missing=nan, monotone_constraints=None,\n",
       "              multi_strategy=None, n_estimators=100, n_jobs=None, num_class=4, ...)</pre></div> </div></div></div></div>"
      ],
      "text/plain": [
       "XGBClassifier(base_score=None, booster=None, callbacks=None,\n",
       "              colsample_bylevel=None, colsample_bynode=None,\n",
       "              colsample_bytree=None, device=None, early_stopping_rounds=None,\n",
       "              enable_categorical=False, eval_metric='mlogloss',\n",
       "              feature_types=None, feature_weights=None, gamma=None,\n",
       "              grow_policy=None, importance_type=None,\n",
       "              interaction_constraints=None, learning_rate=0.1, max_bin=None,\n",
       "              max_cat_threshold=None, max_cat_to_onehot=None,\n",
       "              max_delta_step=None, max_depth=6, max_leaves=None,\n",
       "              min_child_weight=None, missing=nan, monotone_constraints=None,\n",
       "              multi_strategy=None, n_estimators=100, n_jobs=None, num_class=4, ...)"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df['combined_text'] = df['Context'] + \" \" + df['Response']\n",
    "\n",
    "le = LabelEncoder()\n",
    "y = le.fit_transform(df['response_type'])\n",
    "\n",
    "vectorizer = TfidfVectorizer(max_features=2000, ngram_range=(1, 2))\n",
    "X = vectorizer.fit_transform(df['combined_text'])\n",
    "\n",
    "X_train, X_test, y_train, y_test = train_test_split(\n",
    "    X, y, test_size=0.2, stratify=y, random_state=42\n",
    ")\n",
    "\n",
    "xgb_model = XGBClassifier(\n",
    "    objective='multi:softmax',\n",
    "    num_class=len(le.classes_),\n",
    "    eval_metric='mlogloss',\n",
    "    use_label_encoder=False,\n",
    "    max_depth=6,\n",
    "    learning_rate=0.1,\n",
    "    n_estimators=100\n",
    ")\n",
    "xgb_model.fit(X_train, y_train)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c7b4e142-8479-4da0-8f6a-70e488f90349",
   "metadata": {},
   "source": [
    "### Load LLM (Flan-T5)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "ea35abb0-f2df-4331-b41a-965a9e42b4c5",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Loading Flan-T5 model... (this may take a few seconds)\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Device set to use mps:0\n"
     ]
    }
   ],
   "source": [
    "print(\"Loading Flan-T5 model... (this may take a few seconds)\")\n",
    "llm = pipeline(\"text2text-generation\", model=\"google/flan-t5-base\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c1ccf043-7431-4767-9ac8-73c59fe46ccf",
   "metadata": {},
   "source": [
    "### Prediction + Prompt Functions"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "982161f8-824c-45c2-b968-e6e7ad5e0874",
   "metadata": {},
   "outputs": [],
   "source": [
    "def predict_response_type(user_input):\n",
    "    combined = user_input + \" placeholder_response\"\n",
    "    vec = vectorizer.transform([combined])\n",
    "    prediction = xgb_model.predict(vec)[0]\n",
    "    predicted_class = le.inverse_transform([prediction])[0]\n",
    "    confidence = np.max(xgb_model.predict_proba(vec))\n",
    "    return predicted_class, confidence\n",
    "\n",
    "def prompt_templates(user_input, response_type):\n",
    "    templates = {\n",
    "        \"advice\": f\"A student said: \\\"{user_input}\\\". What practical advice should a mental health counselor offer?\",\n",
    "        \"validation\": f\"A student said: \\\"{user_input}\\\". Respond with an emotionally supportive message that shows empathy and validates their feelings.\",\n",
    "        \"information\": f\"A student said: \\\"{user_input}\\\". Explain what might be happening emotionally from a counselor's perspective.\",\n",
    "        \"question\": f\"A student said: \\\"{user_input}\\\". What are 1-2 thoughtful follow-up questions a counselor might ask?\"\n",
    "    }\n",
    "    return templates.get(response_type, templates[\"information\"])\n",
    "\n",
    "def generate_llm_response(user_input, response_type):\n",
    "    prompt = prompt_templates(user_input, response_type)\n",
    "    result = llm(prompt, max_length=150, do_sample=True, temperature=0.7, top_p=0.9)\n",
    "    return result[0][\"generated_text\"].strip()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8599f696-9a22-4f03-9021-596b46febf26",
   "metadata": {},
   "source": [
    "### Conversation Memory + Exporting"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "f95ba731-e303-42b2-a46d-143f5aaeb914",
   "metadata": {},
   "outputs": [],
   "source": [
    "MAX_MEMORY_TURNS = 6\n",
    "history = []\n",
    "\n",
    "def trim_memory(history, max_turns=MAX_MEMORY_TURNS):\n",
    "    return history[-max_turns:]\n",
    "\n",
    "def save_conversation(history, filename=\"chat_history.json\"):\n",
    "    with open(filename, \"w\") as f:\n",
    "        json.dump(history, f, indent=2)\n",
    "    print(f\"\\nConversation saved to {filename}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "74e2aba2-20fd-48b7-b770-8206aa9fb396",
   "metadata": {},
   "source": [
    "### Intro + Chat"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "f3e4b5a8-f2b0-4ed4-b6d3-23541f615c0a",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "--- Multi-Turn Mental Health Chatbot ---\n",
      "This assistant simulates a counselor's conversation using AI.\n",
      "- Type something your patient/student might say\n",
      "- Type 'save' to export the conversation\n",
      "- Type 'exit' to quit\n",
      "\n",
      "Example:\n",
      "User: I feel like I’ll mess up my big presentation tomorrow.\n",
      "Counselor: It’s completely normal to feel nervous before a big event...\n",
      "\n"
     ]
    }
   ],
   "source": [
    "def show_intro():\n",
    "    print(\"\\n--- Multi-Turn Mental Health Chatbot ---\")\n",
    "    print(\"This assistant simulates a counselor's conversation using AI.\")\n",
    "    print(\"- Type something your patient/student might say\")\n",
    "    print(\"- Type 'save' to export the conversation\")\n",
    "    print(\"- Type 'exit' to quit\\n\")\n",
    "\n",
    "    print(\"Example:\")\n",
    "    print(\"User: I feel like I’ll mess up my big presentation tomorrow.\")\n",
    "    print(\"Counselor: It’s completely normal to feel nervous before a big event...\\n\")\n",
    "\n",
    "show_intro()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1b818787-dab2-4d0c-a384-55cfc6072b9d",
   "metadata": {},
   "source": [
    "### Chat Loop"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "748b8c59-fbc0-4ca3-814f-5f299128fddd",
   "metadata": {},
   "outputs": [
    {
     "name": "stdin",
     "output_type": "stream",
     "text": [
      "User:  i'm nervous\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(Predicted: information, Confidence: 85.5%)\n",
      "Counselor: The student might be feeling anxious or uncertain about the situation.\n"
     ]
    },
    {
     "name": "stdin",
     "output_type": "stream",
     "text": [
      "User:  exit\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Goodbye\n"
     ]
    }
   ],
   "source": [
    "while True:\n",
    "    user_input = input(\"User: \").strip()\n",
    "\n",
    "    if user_input.lower() == \"exit\":\n",
    "        print(\"Goodbye\")\n",
    "        break\n",
    "    elif user_input.lower() == \"save\":\n",
    "        save_conversation(history)\n",
    "        continue\n",
    "\n",
    "    predicted_type, confidence = predict_response_type(user_input)\n",
    "    print(f\"(Predicted: {predicted_type}, Confidence: {confidence:.1%})\")\n",
    "\n",
    "    llm_reply = generate_llm_response(user_input, predicted_type)\n",
    "\n",
    "    history.append({\"role\": \"user\", \"content\": user_input})\n",
    "    history.append({\"role\": \"assistant\", \"content\": llm_reply})\n",
    "    history = trim_memory(history)\n",
    "\n",
    "    print(\"Counselor:\", llm_reply)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "72251dc9-6090-4448-8fe9-04ad82079520",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python (myenv)",
   "language": "python",
   "name": "myenv"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.16"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}