scdong committed on
Commit
fe336db
·
1 Parent(s): 5bf9716

Initial commit of mental_health_chatbot app

.DS_Store CHANGED
Binary files a/.DS_Store and b/.DS_Store differ
 
Dockerfile CHANGED
@@ -1,26 +1,30 @@
-# Base image
+# Use the official Python base image
 FROM python:3.10-slim
 
-# System dependencies
+# Set environment variables
+ENV PYTHONDONTWRITEBYTECODE=1
+ENV PYTHONUNBUFFERED=1
+
+# Set work directory
+WORKDIR /app
+
+# Install system dependencies
 RUN apt-get update && apt-get install -y \
 build-essential \
-cmake \
+libffi-dev \
 git \
-curl \
-ninja-build \
-libopenblas-dev \
-&& apt-get clean
+&& rm -rf /var/lib/apt/lists/*
 
-# Set working directory
-WORKDIR /app
-
-# Copy requirements and install
+# Install Python dependencies
 COPY requirements.txt .
-RUN pip install --upgrade pip && \
-pip install --no-cache-dir --prefer-binary -r requirements.txt
+RUN pip install --no-cache-dir --upgrade pip && \
+pip install --no-cache-dir -r requirements.txt
 
-# Copy the rest of the app
+# Copy app source code
 COPY . .
 
-# Run the app
-CMD ["streamlit", "run", "app_local.py", "--server.port=8501", "--server.address=0.0.0.0"]
+# Expose the port Streamlit runs on
+EXPOSE 8501
+
+# Run the application
+CMD ["streamlit", "run", "app.py"]
README.md CHANGED
@@ -1,109 +1,161 @@
+# 🧠 Mental Health Counselor Assistant
+
+**Mental Health Counselor Assistant** is an AI-powered Streamlit application designed to help therapists and mental health professionals respond more effectively to patients. It combines machine learning for **intent classification** with powerful large language models (LLMs) like **Flan-T5**, **Mistral-7B**, and **Flan-Alpaca-GPT4** to generate emotionally intelligent, context-aware suggestions.
+
 ---
-title: Counselor Assistant
-emoji: 🧠
-colorFrom: purple
-colorTo: blue
-sdk: streamlit
-sdk_version: "1.32.2"
-app_file: app.py
-pinned: false
----
-title: Counselor Assistant
-emoji: 🧠
-colorFrom: purple
-colorTo: blue
-sdk: streamlit
-sdk_version: "1.32.2"
-app_file: app.py
-pinned: false
+
+## 📦 Project Structure
+
+```
+mental_health_chatbot/
+├── app.py  # Main app (fast Flan-T5 only)
+├── app_use_Mistral-7B.py  # Local Mistral-7B (quantized .gguf)
+├── app_with_FlanT5_FlanAlpacaGPT4_FlanUL2.py  # App with selectable hosted Hugging Face LLMs
+├── requirements.txt  # Python dependencies
+├── Dockerfile  # For containerized deployment
+├── README.md  # You're here!
+├── LICENSE  # Open-source license
+
+├── dataset/
+│   └── Kaggle_Mental_Health_Conversations_train.csv  # Training dataset
+
+├── utils/
+│   └── helper_functions.py  # Preprocessing and helpers
+
+├── notebooks/
+│   ├── Flan-T5_on_Kaggle_Dataset.ipynb  # Inference experiments
+│   ├── Mistral-7B_on_Kaggle_Dataset.ipynb  # Local LLM generation
+│   └── ML_on_Kaggle_Dataset.ipynb  # Model training and eval
+
+└── log/
+    └── chat_log_2025-04-11_05-31-25.csv  # Example saved conversation log
+```
+
 ---
 
-# 🧠 Counselor Assistant - AI Support for Mental Health Conversations
+## 💡 Features
+
+- 🔍 **Response Type Classification**
+  Predicts one of four counseling intents from patient input:
+  - `advice`
+  - `validation`
+  - `information`
+  - `question`
+
+- 🧠 **LLM-Powered Reply Generation**
+  Generates suggestions using:
+  - ✅ `google/flan-t5-base` (fast, hosted)
+  - ✅ `mistralai/Mistral-7B-Instruct-v0.1` (runs locally via `llama.cpp`)
+  - ✅ `declare-lab/flan-alpaca-gpt4-xl` and `flan-ul2` (hosted via Hugging Face)
+
+  ⚠️ **Note**: LLM response times may vary. Hosted models are slower but provide higher-quality, human-like responses.
 
-This Streamlit app helps **mental health counselors** explore supportive language and therapeutic approaches for responding to patient concerns. It leverages a **machine learning classifier** (XGBoost) to detect intent and a **quantized Mistral-7B model** to generate helpful replies.
+- 🎯 **Primed & Prompt-Engineered Outputs**
+  Uses few-shot prompt design to ensure counselor-style tone, empathy, and helpfulness.
+
+- 🧪 **Modular & Experiment-Ready**
+  Jupyter notebooks for training, testing, and benchmarking models.
+
+- 💾 **Conversation Logging**
+  Auto-saves counselor-patient interactions to `/log` for post-session review.
 
 ---
 
-## ✅ Features
+## 🚀 Getting Started
+
+### 1. Clone the Repository
+
+```bash
+git clone https://github.com/your-username/mental_health_chatbot.git
+cd mental_health_chatbot
+```
+
+### 2. Set Up the Environment
+
+```bash
+python -m venv venv
+source venv/bin/activate # Windows: venv\Scripts\activate
+pip install -r requirements.txt
+```
+
+### 3. Run the App
+
+#### Run fast, default version (Flan-T5):
+
+```bash
+streamlit run app.py
+```
+
+#### Run version with all Hugging Face LLMs:
+
+```bash
+streamlit run app_with_FlanT5_FlanAlpacaGPT4_FlanUL2.py
+```
+
+#### Run local Mistral-7B (with `llama-cpp-python`):
 
-- **ML-Based Intent Prediction**: Uses TF-IDF + XGBoost to detect if the message needs validation, advice, information, or follow-up.
-- **Mistral-7B Counselor Response**: LLM-powered suggestion using your local model (gguf) or Hugging Face-hosted model.
-- **Multi-turn Memory**: Keeps context from recent messages.
-- **Word Count + Guidance**: Shows word count to help user compose inputs.
-- **Conversation Export**: Save to `.csv` or `.json` for recordkeeping.
-- **Interview-Ready Design**: Friendly UI, emoji icons, prompt examples, and reset options.
+```bash
+streamlit run app_use_Mistral-7B.py
+```
 
 ---
 
-## 🚀 How to Run
+## 🐳 Docker (Optional)
+
+To build and deploy with Docker:
 
-### 🧪 Locally (with your own Mistral model)
+```bash
+docker build -t counselor-assistant .
+docker run -p 8501:8501 counselor-assistant
+```
 
-1. Clone the repo:
-```bash
-git clone https://github.com/your-username/mental_health_chatbot.git
-cd mental_health_chatbot
-```
+---
 
-2. Place your **gguf model** under `models/`:
-```
-models/mistral-7b-instruct-v0.1.Q4_K_M.gguf
-```
+## 📓 Dataset
 
-3. Create and activate a virtual environment:
-```bash
-pip install -r requirements.txt
-streamlit run app.py
-```
+This project uses the Kaggle dataset:
+**Kaggle_Mental_Health_Conversations_train.csv**
+Located in the `dataset/` folder, it contains real user mental health exchanges labeled with response types.
 
 ---
 
-### 🌐 Deploy on Hugging Face Spaces
+## 📔 Notebooks
 
-- ✅ Use this repo
-- ✅ Upload your model or switch to a hosted one like `google/flan-t5-base`
-- ✅ Add a `Space` on [Hugging Face](https://huggingface.co/spaces)
+Explore or reproduce training and benchmarking in:
 
-> You can modify `generate_llm_response()` to use Hugging Face's `pipeline` if GPU is not needed.
+- `ML_on_Kaggle_Dataset.ipynb`: Training XGBoost and TF-IDF classifier
+- `Flan-T5_on_Kaggle_Dataset.ipynb`: Hosted inference test
+- `Mistral-7B_on_Kaggle_Dataset.ipynb`: Local quantized LLM
 
 ---
 
-## 📁 Folder Structure
+## 📂 Log Files
+
+Each chat session is saved with timestamp in the `log/` directory, e.g.:
 
 ```
-mental_health_chatbot/
-│
-├── app.py
-├── requirements.txt
-├── Dockerfile
-├── README.md
-├── LICENSE
-├── data/
-│   └── Kaggle_Mental_Health_Conversations_train.csv
-├── models/
-│   └── mistral-7b-instruct-v0.1.Q4_K_M.gguf
-├── utils/
-│   └── helper_functions.py
-└── static/
-    └── images/
-        └── demo.gif
+log/chat_log_2025-04-11_05-31-25.csv
 ```
 
+These CSVs include patient messages, predicted intent, AI-generated counselor suggestions, and classification confidence.
+
 ---
 
-## 📸 Screenshot / Demo
+## 🧑‍⚕️ Intended Use
 
-![demo](static/images/demo.gif)
+This tool is built for **prototyping** and **research**. Potential users include:
 
----
+- Mental health professionals exploring AI support
+- Researchers in NLP and digital health
+- Developers building therapeutic chatbot systems
 
-## 👩‍💻 Why I Built This
-
-Mental health professionals often face emotionally complex conversations. This assistant offers **non-binding, thoughtful suggestions** to aid their decision-making - not to replace them, but to **support their professional judgment**.
+⚠️ This tool is **not a replacement for human therapy**.
 
 ---
 
 ## 📜 License
 
-MIT License - see [`LICENSE`](./LICENSE)
+Licensed under the **MIT License**.
+See [`LICENSE`](./LICENSE) for full terms.
 
+---
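Note on the Log Files section: the README documents one timestamped CSV per session under `log/`, while the `save_conversation` change in this commit's `app.py` opens `logs/chat_log_{now}.csv`, and the hunks shown contain no step that creates either directory. On a fresh checkout `open()` would then raise `FileNotFoundError`. A minimal sketch of a save helper that creates the folder first, assuming the `Role, Content, Intent, Confidence` columns used in `app.py`; the name `save_chat_log` and the `log_dir` and `now` parameters are hypothetical and not part of the repository:

```python
import csv
import os
from datetime import datetime


def save_chat_log(history, log_dir="log", now=None):
    """Write one timestamped CSV per session and return its path.

    `save_chat_log`, `log_dir` and `now` are illustrative names only.
    """
    os.makedirs(log_dir, exist_ok=True)  # open() fails if the folder is missing
    stamp = (now or datetime.now()).strftime("%Y-%m-%d_%H-%M-%S")
    path = os.path.join(log_dir, f"chat_log_{stamp}.csv")
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["Role", "Content", "Intent", "Confidence"])
        for entry in history:
            writer.writerow([
                entry.get("role", ""),
                entry.get("content", ""),
                entry.get("label", ""),
                round(float(entry.get("confidence", 0)) * 100),  # percent
            ])
    return path
```

Returning the path lets the caller report the real location in the success message instead of a bare file name.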
app.py CHANGED
@@ -1,4 +1,4 @@
-# Streamlit App: Counselor Assistant (XGBoost + Selectable LLMs from Hugging Face)
+# Streamlit App: Counselor Assistant (XGBoost + Flan-T5 Improved)
 
 import streamlit as st
 import os
@@ -35,7 +35,7 @@ This tool offers **AI-powered suggestions** to support you when responding to yo
 
 ### What it does:
 - 🧩 Predicts what type of support is best: *Advice*, *Validation*, *Information*, or *Question*
-- 💬 Generates an LLM-powered suggestion for you
+- 💬 Generates a suggestion using **Flan-T5**
 - 💾 Lets you save your session for reflection
 
 This is here to support - not replace - your clinical instincts 💚
@@ -45,7 +45,6 @@ This is here to support - not replace - your clinical instincts 💚
 df = pd.read_csv("dataset/Kaggle_Mental_Health_Conversations_train.csv")
 df = df[['Context', 'Response']].dropna().copy()
 
-# Auto-labeling: heuristics for labeling responses
 keywords_to_labels = {
     'advice': ['try', 'should', 'suggest', 'recommend'],
     'validation': ['understand', 'feel', 'valid', 'normal'],
@@ -67,16 +66,12 @@ df['combined_text'] = df['Context'] + " " + df['Response']
 le = LabelEncoder()
 y = le.fit_transform(df['response_type'])
 
-# TF-IDF vectorizer on combined text
+# TF-IDF + Train-test split
 vectorizer = TfidfVectorizer(max_features=2000, ngram_range=(1, 2))
 X = vectorizer.fit_transform(df['combined_text'])
+X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, stratify=y, random_state=42)
 
-# Train-test split
-X_train, X_test, y_train, y_test = train_test_split(
-    X, y, test_size=0.2, stratify=y, random_state=42
-)
-
-# XGBoost Classifier
+# XGBoost model
 xgb_model = XGBClassifier(
     objective='multi:softmax',
     num_class=len(le.classes_),
@@ -88,20 +83,12 @@ xgb_model = XGBClassifier(
 )
 xgb_model.fit(X_train, y_train)
 
-# --- Select Model Option ---
-model_options = {
-    "google/flan-t5-base": "✅ Flan-T5 (Fast, Clean)",
-    "declare-lab/flan-alpaca-gpt4-xl": "💬 Flan Alpaca GPT4 (Human-sounding)",
-    "google/flan-ul2": "🧠 Flan-UL2 (Deeper reasoning)"
-}
-
-model_choice = st.selectbox("🧠 Choose a Response Model", list(model_options.keys()), format_func=lambda x: model_options[x])
-
-@st.cache_resource(show_spinner="Loading selected language model...")
-def load_llm(model_name):
-    return pipeline("text2text-generation", model=model_name)
+# --- Load Flan-T5 Model ---
+@st.cache_resource(show_spinner="Loading Flan-T5 model...")
+def load_llm():
+    return pipeline("text2text-generation", model="google/flan-t5-base")
 
-llm = load_llm(model_choice)
+llm = load_llm()
 
 # --- Utility Functions ---
 def predict_response_type(user_input):
@@ -112,29 +99,44 @@ def predict_response_type(user_input):
     return label, proba
 
 def build_prompt(user_input, response_type):
-    prompts = {
-        "advice": f"A patient said: \"{user_input}\". What advice should a mental health counselor give to support them?",
-        "validation": f"A patient said: \"{user_input}\". How can a counselor validate and empathize with their emotions?",
-        "information": f"A patient said: \"{user_input}\". Explain what might be happening from a mental health perspective.",
-        "question": f"A patient said: \"{user_input}\". What thoughtful follow-up questions should a counselor ask?"
+    examples = {
+        "advice": 'Patient: "I’m having trouble sleeping."\nCounselor: "It might help to create a bedtime routine and avoid screens before sleep. Would you like to try that together?"',
+        "validation": 'Patient: "I feel like no one understands me."\nCounselor: "It makes sense that you feel that way - your feelings are valid and you deserve to be heard."',
+        "information": 'Patient: "Why do I feel this way for no reason?"\nCounselor: "Sometimes our brains respond to stress or trauma in ways that are hard to detect. It could be anxiety or depression, and we can work through it together."',
+        "question": 'Patient: "I don’t know what to do anymore."\nCounselor: "Can you tell me more about what’s been feeling difficult lately?"'
     }
-    return prompts.get(response_type, prompts["information"])
+
+    return f"""{examples[response_type]}
+
+Patient: "{user_input}"
+Counselor:"""
+
 
 def generate_llm_response(user_input, response_type):
     prompt = build_prompt(user_input, response_type)
     start = time.time()
     with st.spinner("Thinking through a helpful response for your patient..."):
-        result = llm(prompt, max_length=150, do_sample=True, temperature=0.7)
+        result = llm(
+            prompt,
+            max_length=256,
+            min_length=60, # forces longer responses
+            do_sample=True,
+            temperature=0.9,
+            top_p=0.95,
+            num_return_sequences=1
+        )
     end = time.time()
     st.info(f"Response generated in {end - start:.1f} seconds")
     return result[0]["generated_text"].strip()
 
+
+
 def trim_memory(history, max_turns=6):
     return history[-max_turns * 2:]
 
 def save_conversation(history):
     now = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
-    with open(f"chat_log_{now}.csv", "w", newline='') as f:
+    with open(f"logs/chat_log_{now}.csv", "w", newline='') as f:
         writer = csv.writer(f)
         writer.writerow(["Role", "Content", "Intent", "Confidence"])
         for entry in history:
@@ -146,13 +148,13 @@ def save_conversation(history):
             ])
     st.success(f"Saved to chat_log_{now}.csv")
 
-# --- Session State Setup ---
+# --- Session Setup ---
 if "history" not in st.session_state:
     st.session_state.history = []
 if "user_input" not in st.session_state:
     st.session_state.user_input = ""
 
-# --- Display Sample Prompts ---
+# --- Sample Prompts ---
 with st.expander("💡 Sample inputs you can try"):
     st.markdown("""
 - My patient is constantly feeling overwhelmed at work.
@@ -160,7 +162,7 @@ with st.expander("💡 Sample inputs you can try"):
 - Someone told me they think they’ll never feel okay again.
     """)
 
-# --- Text Area + Word Counter ---
+# --- Text Input ---
 MAX_WORDS = 1000
 word_count = len(st.session_state.user_input.split())
 st.markdown(f"**📝 Input Length:** {word_count} / {MAX_WORDS} words")
@@ -172,7 +174,7 @@ st.session_state.user_input = st.text_area(
     height=100
 )
 
-# --- Button Layout ---
+# --- Buttons ---
 col1, col2, col3 = st.columns([2, 1, 1])
 with col1:
     send = st.button("💡 Suggest Response")
@@ -181,7 +183,7 @@ with col2:
 with col3:
     reset = st.button("🔁 Reset")
 
-# --- Button Logic ---
+# --- Main Logic ---
 if send and st.session_state.user_input:
     user_input = st.session_state.user_input
     predicted_type, confidence = predict_response_type(user_input)
@@ -204,7 +206,7 @@ if reset:
     st.session_state.user_input = ""
     st.success("Conversation has been cleared.")
 
-# --- Chat History Display ---
+# --- Display Chat History ---
 st.markdown("---")
 for turn in st.session_state.history:
     if turn["role"] == "user":
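Note on the `build_prompt` hunk: the removed version fell back with `prompts.get(response_type, prompts["information"])`, whereas the added version indexes `examples[response_type]` directly, so a label outside the four keys would raise `KeyError`. The classifier only emits labels it was trained on, so this is a defensive concern rather than an observed failure. A minimal sketch that keeps the worked-example layout and restores the fallback; the example texts are shortened stand-ins for the ones in the diff, and `EXAMPLES` is a hypothetical module-level name:

```python
# Shortened stand-ins for the worked examples in the diff (one per intent).
EXAMPLES = {
    "advice": 'Patient: "I\'m having trouble sleeping."\n'
              'Counselor: "It might help to create a bedtime routine."',
    "validation": 'Patient: "I feel like no one understands me."\n'
                  'Counselor: "Your feelings are valid and you deserve to be heard."',
    "information": 'Patient: "Why do I feel this way for no reason?"\n'
                   'Counselor: "Sometimes our brains respond to stress in ways that are hard to detect."',
    "question": 'Patient: "I don\'t know what to do anymore."\n'
                'Counselor: "Can you tell me more about what has been feeling difficult lately?"',
}


def build_prompt(user_input, response_type):
    """Build a prompt from one worked example plus the new patient message.

    Unknown labels fall back to the 'information' example instead of raising.
    """
    example = EXAMPLES.get(response_type, EXAMPLES["information"])
    return f'{example}\n\nPatient: "{user_input}"\nCounselor:'
```

Ending the prompt on `Counselor:` leaves the model to complete the counselor turn, which is the same layout the added code in the diff uses.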
app_with_FlanT5_FlanAlpacaGPT4_FlanUL2.py ADDED
@@ -0,0 +1,216 @@
+# Streamlit App: Counselor Assistant (XGBoost + Selectable LLMs from Hugging Face)
+
+import streamlit as st
+import os
+import pandas as pd
+import json
+import time
+import csv
+from datetime import datetime
+from sklearn.feature_extraction.text import TfidfVectorizer
+from sklearn.preprocessing import LabelEncoder
+from sklearn.model_selection import train_test_split
+from xgboost import XGBClassifier
+from transformers import pipeline
+
+# --- Page Setup ---
+st.set_page_config(page_title="Counselor Assistant", layout="centered")
+
+# --- Styling ---
+st.markdown("""
+<style>
+.main { background-color: #f9f9f9; padding: 1rem 2rem; border-radius: 12px; }
+h1 { color: #2c3e50; text-align: center; font-size: 2.4rem; }
+.user { color: #1f77b4; font-weight: bold; }
+.assistant { color: #2ca02c; font-weight: bold; }
+</style>
+""", unsafe_allow_html=True)
+
+# --- App Header ---
+st.title("🧠 Mental Health Counselor Assistant")
+st.markdown("""
+Welcome, counselor 👋
+
+This tool offers **AI-powered suggestions** to support you when responding to your patients.
+
+### What it does:
+- 🧩 Predicts what type of support is best: *Advice*, *Validation*, *Information*, or *Question*
+- 💬 Generates an LLM-powered suggestion for you
+- 💾 Lets you save your session for reflection
+
+This is here to support - not replace - your clinical instincts 💚
+""")
+
+# --- Load and label dataset ---
+df = pd.read_csv("dataset/Kaggle_Mental_Health_Conversations_train.csv")
+df = df[['Context', 'Response']].dropna().copy()
+
+# Auto-labeling: heuristics for labeling responses
+keywords_to_labels = {
+    'advice': ['try', 'should', 'suggest', 'recommend'],
+    'validation': ['understand', 'feel', 'valid', 'normal'],
+    'information': ['cause', 'often', 'disorder', 'symptom'],
+    'question': ['how', 'what', 'why', 'have you']
+}
+
+def auto_label_response(response):
+    response = response.lower()
+    for label, keywords in keywords_to_labels.items():
+        if any(word in response for word in keywords):
+            return label
+    return 'information'
+
+df['response_type'] = df['Response'].apply(auto_label_response)
+df['combined_text'] = df['Context'] + " " + df['Response']
+
+# Encode labels
+le = LabelEncoder()
+y = le.fit_transform(df['response_type'])
+
+# TF-IDF vectorizer on combined text
+vectorizer = TfidfVectorizer(max_features=2000, ngram_range=(1, 2))
+X = vectorizer.fit_transform(df['combined_text'])
+
+# Train-test split
+X_train, X_test, y_train, y_test = train_test_split(
+    X, y, test_size=0.2, stratify=y, random_state=42
+)
+
+# XGBoost Classifier
+xgb_model = XGBClassifier(
+    objective='multi:softmax',
+    num_class=len(le.classes_),
+    eval_metric='mlogloss',
+    use_label_encoder=False,
+    max_depth=6,
+    learning_rate=0.1,
+    n_estimators=100
+)
+xgb_model.fit(X_train, y_train)
+
+# --- Select Model Option ---
+model_options = {
+    "google/flan-t5-base": "✅ Flan-T5 (Fast, Clean)",
+    "declare-lab/flan-alpaca-gpt4-xl": "💬 Flan Alpaca GPT4 (Human-sounding)",
+    "google/flan-ul2": "🧠 Flan-UL2 (Deeper reasoning)"
+}
+
+model_choice = st.selectbox("🧠 Choose a Response Model", list(model_options.keys()), format_func=lambda x: model_options[x])
+
+@st.cache_resource(show_spinner="Loading selected language model...")
+def load_llm(model_name):
+    return pipeline("text2text-generation", model=model_name)
+
+llm = load_llm(model_choice)
+
+# --- Utility Functions ---
+def predict_response_type(user_input):
+    vec = vectorizer.transform([user_input])
+    pred = xgb_model.predict(vec)
+    proba = xgb_model.predict_proba(vec).max()
+    label = le.inverse_transform(pred)[0]
+    return label, proba
+
+def build_prompt(user_input, response_type):
+    prompts = {
+        "advice": f"A patient said: \"{user_input}\". What advice should a mental health counselor give to support them?",
+        "validation": f"A patient said: \"{user_input}\". How can a counselor validate and empathize with their emotions?",
+        "information": f"A patient said: \"{user_input}\". Explain what might be happening from a mental health perspective.",
+        "question": f"A patient said: \"{user_input}\". What thoughtful follow-up questions should a counselor ask?"
+    }
+    return prompts.get(response_type, prompts["information"])
+
+def generate_llm_response(user_input, response_type):
+    prompt = build_prompt(user_input, response_type)
+    start = time.time()
+    with st.spinner("Thinking through a helpful response for your patient..."):
+        result = llm(prompt, max_length=150, do_sample=True, temperature=0.7)
+    end = time.time()
+    st.info(f"Response generated in {end - start:.1f} seconds")
+    return result[0]["generated_text"].strip()
+
+def trim_memory(history, max_turns=6):
+    return history[-max_turns * 2:]
+
+def save_conversation(history):
+    now = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
+    with open(f"chat_log_{now}.csv", "w", newline='') as f:
+        writer = csv.writer(f)
+        writer.writerow(["Role", "Content", "Intent", "Confidence"])
+        for entry in history:
+            writer.writerow([
+                entry.get("role", ""),
+                entry.get("content", ""),
+                entry.get("label", ""),
+                round(float(entry.get("confidence", 0)) * 100)
+            ])
+    st.success(f"Saved to chat_log_{now}.csv")
+
+# --- Session State Setup ---
+if "history" not in st.session_state:
+    st.session_state.history = []
+if "user_input" not in st.session_state:
+    st.session_state.user_input = ""
+
+# --- Display Sample Prompts ---
+with st.expander("💡 Sample inputs you can try"):
+    st.markdown("""
+- My patient is constantly feeling overwhelmed at work.
+- A student says they panic every time they have to speak in class.
+- Someone told me they think they’ll never feel okay again.
+    """)
+
+# --- Text Area + Word Counter ---
+MAX_WORDS = 1000
+word_count = len(st.session_state.user_input.split())
+st.markdown(f"**📝 Input Length:** {word_count} / {MAX_WORDS} words")
+
+st.session_state.user_input = st.text_area(
+    "💬 What did your patient say?",
+    value=st.session_state.user_input,
+    placeholder="e.g. I just feel like I'm never going to get better.",
+    height=100
+)
+
+# --- Button Layout ---
+col1, col2, col3 = st.columns([2, 1, 1])
+with col1:
+    send = st.button("💡 Suggest Response")
+with col2:
+    save = st.button("📁 Save This")
+with col3:
+    reset = st.button("🔁 Reset")
+
+# --- Button Logic ---
+if send and st.session_state.user_input:
+    user_input = st.session_state.user_input
+    predicted_type, confidence = predict_response_type(user_input)
+    reply = generate_llm_response(user_input, predicted_type)
+
+    st.session_state.history.append({"role": "user", "content": user_input})
+    st.session_state.history.append({
+        "role": "assistant",
+        "content": reply,
+        "label": predicted_type,
+        "confidence": confidence
+    })
+    st.session_state.history = trim_memory(st.session_state.history)
+
+if save:
+    save_conversation(st.session_state.history)
+
+if reset:
+    st.session_state.history = []
+    st.session_state.user_input = ""
+    st.success("Conversation has been cleared.")
+
+# --- Chat History Display ---
+st.markdown("---")
+for turn in st.session_state.history:
+    if turn["role"] == "user":
+        st.markdown(f"🧍‍♀️ **Patient:** {turn['content']}")
+    else:
+        st.markdown(f"👩‍⚕️👨‍⚕️ **Suggested Counselor Response:** {turn['content']}")
+        st.caption(f"_Intent: {turn['label']} (Confidence: {turn['confidence']:.0%})_")
+    st.markdown("---")
+
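Note on `auto_label_response` in the file above: `word in response` is a substring test on the lowercased text, so `'try'` also matches "country" and `'how'` matches "show". Because the loop returns on the first label that hits, in dictionary order, such accidental matches pull responses toward `advice`. A minimal sketch of the same heuristic with regex word boundaries, assuming whole-word matching is what was intended (the commit does not say); `label_response`, `KEYWORDS_TO_LABELS` and `PATTERNS` are hypothetical names:

```python
import re

KEYWORDS_TO_LABELS = {
    "advice": ["try", "should", "suggest", "recommend"],
    "validation": ["understand", "feel", "valid", "normal"],
    "information": ["cause", "often", "disorder", "symptom"],
    "question": ["how", "what", "why", "have you"],
}

# One compiled pattern per label; \b keeps 'try' from matching 'country'.
PATTERNS = {
    label: re.compile(r"\b(?:" + "|".join(re.escape(w) for w in words) + r")\b")
    for label, words in KEYWORDS_TO_LABELS.items()
}


def label_response(response):
    """Return the first label whose keyword occurs as a whole word."""
    text = response.lower()
    for label, pattern in PATTERNS.items():
        if pattern.search(text):
            return label
    return "information"  # same default as the original heuristic
```

Whole-word matching is stricter in both directions: inflected forms such as "feeling" or "symptoms" no longer match `feel` or `symptom`, so the keyword lists would need those forms added if they should count.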
app_slow_version.py → app_with_Mistral7B.py RENAMED
@@ -1,43 +1,42 @@
1
- # Streamlit App: Counselor Assistant using XGBoost + Flan-T5 (Cloud Version)
2
-
3
  import streamlit as st
 
4
  import os
5
  import pandas as pd
6
  import json
7
  import time
8
  import csv
 
9
  from sklearn.feature_extraction.text import TfidfVectorizer
10
  from sklearn.preprocessing import LabelEncoder
11
  from sklearn.model_selection import train_test_split
12
  from xgboost import XGBClassifier
13
- from transformers import pipeline
14
 
15
  st.set_page_config(page_title="Counselor Assistant", layout="centered")
16
 
17
  st.markdown("""
18
  <style>
19
- .main { background-color: #f4f4f9; padding: 1rem 2rem; border-radius: 12px; }
20
  h1 { color: #2c3e50; text-align: center; font-size: 2.4rem; }
21
  .user { color: #1f77b4; font-weight: bold; }
22
  .assistant { color: #2ca02c; font-weight: bold; }
23
  </style>
24
  """, unsafe_allow_html=True)
25
 
26
- st.title("Mental Health Counselor Assistant")
27
  st.markdown("""
28
- Welcome, counselor πŸ‘©β€βš•οΈπŸ‘¨β€βš•οΈ
29
 
30
- This assistant is designed to provide you with **supportive, evidence-based suggestions** when you're unsure how to best respond to a patient’s concerns.
31
 
32
- Just enter what your patient shared with you, and this tool will:
33
- - Predict the type of support that fits best (e.g., advice, validation, information, and question)
34
- - Generate a suggested counselor reply
35
- - Let you save the conversation for your records
36
 
37
- This is not a diagnostic tool β€” it’s here to support **your clinical intuition**.
38
  """)
39
 
40
- # Load and prepare the dataset
41
  df = pd.read_csv("dataset/Kaggle_Mental_Health_Conversations_train.csv")
42
  df = df[['Context', 'Response']].dropna().copy()
43
 
@@ -64,9 +63,7 @@ y = le.fit_transform(df['response_type'])
64
  vectorizer = TfidfVectorizer(max_features=2000, ngram_range=(1, 2))
65
  X = vectorizer.fit_transform(df['combined_text'])
66
 
67
- X_train, X_test, y_train, y_test = train_test_split(
68
- X, y, test_size=0.2, stratify=y, random_state=42
69
- )
70
 
71
  xgb_model = XGBClassifier(
72
  objective='multi:softmax',
@@ -79,10 +76,10 @@ xgb_model = XGBClassifier(
79
  )
80
  xgb_model.fit(X_train, y_train)
81
 
82
- # Replace Mistral-7B with Flan-T5 hosted model
83
- @st.cache_resource(show_spinner="Loading Flan-T5 model...")
84
  def load_llm():
85
- return pipeline("text2text-generation", model="google/flan-t5-base")
86
 
87
  llm = load_llm()
88
 
@@ -106,47 +103,54 @@ def generate_llm_response(user_input, response_type):
106
  prompt = build_prompt(user_input, response_type)
107
  start = time.time()
108
  with st.spinner("Thinking through a helpful response for your patient..."):
109
- result = llm(prompt, max_length=150, do_sample=True, temperature=0.7)
110
  end = time.time()
111
  st.info(f"Response generated in {end - start:.1f} seconds")
112
- return result[0]["generated_text"].strip()
113
 
114
  def trim_memory(history, max_turns=6):
115
  return history[-max_turns * 2:]
116
 
117
  def save_conversation(history):
118
- with open("chat_history.json", "w") as f:
119
- json.dump(history, f, indent=2)
120
- with open("chat_log.csv", "w", newline='') as f:
121
  writer = csv.writer(f)
122
- writer.writerow(["Role", "Content"])
123
  for entry in history:
124
- writer.writerow([entry.get("role", ""), entry.get("content", "")])
125
- st.success("Saved to chat_history.json and chat_log.csv")
 
 
 
 
 
 
126
 
127
- # Streamlit UI
128
  if "history" not in st.session_state:
129
  st.session_state.history = []
130
-
131
- with st.expander("💡 Sample inputs you can try"):
132
- st.markdown("""
133
- - My patient is constantly feeling overwhelmed at work.
134
- - A student says they panic every time they have to speak in class.
135
- - Someone told me they think they’ll never feel okay again.
136
- """)
137
-
138
- user_input = st.text_area("💬 What did your patient say?", placeholder="e.g. I just feel like I'm never going to get better.", height=100)
 
 
 
139
 
140
  col1, col2, col3 = st.columns([2, 1, 1])
141
-
142
  with col1:
143
- send = st.button("Suggest Response")
144
  with col2:
145
  save = st.button("πŸ“ Save This")
146
  with col3:
147
  reset = st.button("🔁 Reset")
148
 
149
- if send and user_input:
 
150
  predicted_type, confidence = predict_response_type(user_input)
151
  reply = generate_llm_response(user_input, predicted_type)
152
 
@@ -159,6 +163,7 @@ if save:
159
 
160
  if reset:
161
  st.session_state.history = []
 
162
  st.success("Conversation has been cleared.")
163
 
164
  st.markdown("---")
@@ -166,7 +171,7 @@ for turn in st.session_state.history:
166
  if turn["role"] == "user":
167
  st.markdown(f"🧍‍♀️ **Patient:** {turn['content']}")
168
  else:
169
- st.markdown(f"👩‍⚕️👨‍⚕️ **Suggested Counselor Response:** {turn['content']}")
170
  st.caption(f"_Intent: {turn['label']} (Confidence: {turn['confidence']:.0%})_")
171
  st.markdown("---")
172
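Note on `trim_memory`: the slice `history[-max_turns * 2:]` keeps the most recent `max_turns` exchanges, assuming each exchange contributes exactly two entries (one `user`, one `assistant`). A minimal standalone sketch of that behavior follows; the sample history is made up for illustration.

```python
def trim_memory(history, max_turns=6):
    # Keep the last max_turns user/assistant pairs (two entries per turn).
    return history[-max_turns * 2:]


# Hypothetical history: 10 turns, each a user entry followed by an assistant entry.
sample_history = []
for i in range(10):
    sample_history.append({"role": "user", "content": f"message {i}"})
    sample_history.append({"role": "assistant", "content": f"reply {i}"})

trimmed = trim_memory(sample_history, max_turns=6)
print(len(trimmed))           # 12
print(trimmed[0]["content"])  # message 4
```

A history shorter than `max_turns * 2` entries comes back unchanged, so the helper is safe to call on every turn.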
 
 
 
 
1
  import streamlit as st
2
+ from utils.helper_functions import *
3
  import os
4
  import pandas as pd
5
  import json
6
  import time
7
  import csv
8
+ from datetime import datetime
9
  from sklearn.feature_extraction.text import TfidfVectorizer
10
  from sklearn.preprocessing import LabelEncoder
11
  from sklearn.model_selection import train_test_split
12
  from xgboost import XGBClassifier
13
+ from llama_cpp import Llama
14
 
15
  st.set_page_config(page_title="Counselor Assistant", layout="centered")
16
 
17
  st.markdown("""
18
  <style>
19
+ .main { background-color: #f9f9f9; padding: 1rem 2rem; border-radius: 12px; }
20
  h1 { color: #2c3e50; text-align: center; font-size: 2.4rem; }
21
  .user { color: #1f77b4; font-weight: bold; }
22
  .assistant { color: #2ca02c; font-weight: bold; }
23
  </style>
24
  """, unsafe_allow_html=True)
25
 
26
+ st.title("🧠 Mental Health Counselor Assistant")
27
  st.markdown("""
28
+ Hi there, counselor 👋
29
 
30
+ This tool is here to offer **supportive, AI-generated suggestions** when you’re not quite sure how to respond to a patient.
31
 
32
+ ### How it helps:
33
+ - 🧩 Predicts the type of support your patient might need (advice, validation, information, & question.)
34
+ - 💬 Generates a supportive counselor response
35
+ - πŸ“ Lets you save and track conversations for reflection
36
 
37
+ It's a sidekick, not a substitute for your clinical judgment 💚
38
  """)
39
 
 
40
  df = pd.read_csv("dataset/Kaggle_Mental_Health_Conversations_train.csv")
41
  df = df[['Context', 'Response']].dropna().copy()
42
 
 
63
  vectorizer = TfidfVectorizer(max_features=2000, ngram_range=(1, 2))
64
  X = vectorizer.fit_transform(df['combined_text'])
65
 
66
+ X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, stratify=y, random_state=42)
 
 
67
 
68
  xgb_model = XGBClassifier(
69
  objective='multi:softmax',
 
76
  )
77
  xgb_model.fit(X_train, y_train)
78
 
79
+ MODEL_PATH = os.path.expanduser("/Users/Pi/models/mistral/mistral-7b-instruct-v0.1.Q4_K_M.gguf")
80
+ @st.cache_resource(show_spinner=True)
81
  def load_llm():
82
+ return Llama(model_path=MODEL_PATH, n_ctx=2048, n_threads=os.cpu_count())
83
 
84
  llm = load_llm()
85
 
 
103
  prompt = build_prompt(user_input, response_type)
104
  start = time.time()
105
  with st.spinner("Thinking through a helpful response for your patient..."):
106
+ result = llm(prompt, max_tokens=300, temperature=0.7)
107
  end = time.time()
108
  st.info(f"Response generated in {end - start:.1f} seconds")
109
+ return result['choices'][0]['text'].strip()
110
 
111
  def trim_memory(history, max_turns=6):
112
  return history[-max_turns * 2:]
113
 
114
  def save_conversation(history):
115
+ now = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
116
+ with open("logs/chat_log_combined.csv", "w", newline='') as f:
 
117
  writer = csv.writer(f)
118
+ writer.writerow(["Timestamp", "Role", "Content", "Intent", "Confidence"])
119
  for entry in history:
120
+ writer.writerow([
121
+ now,
122
+ entry.get("role", ""),
123
+ entry.get("content", ""),
124
+ entry.get("label", ""),
125
+ round(float(entry.get("confidence", 0)), 2)
126
+ ])
127
+ st.success("Saved to chat_log_combined.csv")
128
 
 
129
  if "history" not in st.session_state:
130
  st.session_state.history = []
131
+ if "user_input" not in st.session_state:
132
+ st.session_state.user_input = ""
133
+
134
+ MAX_WORDS = 1000
135
+ word_count = len(st.session_state.user_input.split())
136
+ st.markdown(f"**πŸ“ Input Length:** {word_count} / {MAX_WORDS} words")
137
+ st.session_state.user_input = st.text_area(
138
+ "💬 What did your patient say?",
139
+ value=st.session_state.user_input,
140
+ placeholder="e.g. I just feel like I'm never going to get better.",
141
+ height=100
142
+ )
143
 
144
  col1, col2, col3 = st.columns([2, 1, 1])
 
145
  with col1:
146
+ send = st.button("💡 Suggest Response")
147
  with col2:
148
  save = st.button("πŸ“ Save This")
149
  with col3:
150
  reset = st.button("🔁 Reset")
151
 
152
+ if send and st.session_state.user_input:
153
+ user_input = st.session_state.user_input
154
  predicted_type, confidence = predict_response_type(user_input)
155
  reply = generate_llm_response(user_input, predicted_type)
156
 
 
163
 
164
  if reset:
165
  st.session_state.history = []
166
+ st.session_state.user_input = ""
167
  st.success("Conversation has been cleared.")
168
 
169
  st.markdown("---")
 
171
  if turn["role"] == "user":
172
  st.markdown(f"🧍‍♀️ **Patient:** {turn['content']}")
173
  else:
174
+ st.markdown(f"👨‍⚕️ **Suggested Counselor Response:** {turn['content']}")
175
  st.caption(f"_Intent: {turn['label']} (Confidence: {turn['confidence']:.0%})_")
176
  st.markdown("---")
177
 
logs/chat_log_2025-04-11_05-31-25.csv ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ Role,Content,Intent,Confidence
2
+ user,i'm nervous,,0
3
+ assistant,"I don't like things that's ""off-the-cuff"" and ""i'm worried about myself."" Then I think about myself that way and know that something's happened. It's not always the case, but you can't say you're feeling this way because it's ""off the cuff.""",information,85
logs/chat_log_combined.csv DELETED
@@ -1,12 +0,0 @@
1
- Timestamp,Role,Content,Intent,Confidence
2
- 2025-04-11 04:03:53,user,i'm sick physically and mentall ,,0.0
3
- 2025-04-11 04:03:53,assistant,"A person who says they are sick physically and mentally may be experiencing a range of symptoms and conditions that could be affecting their mental health. Some possible factors that could contribute to this might include:
4
-
5
- 1. Stress: Chronic stress can cause physical symptoms such as muscle tension, headaches, and fatigue, as well as mental symptoms such as anxiety, depression, and difficulty sleeping.
6
- 2. Depression: Depression is a common mental health disorder that can cause physical symptoms such as fatigue and decreased appetite, as well as mental symptoms such as sadness, hopelessness, and difficulty concentrating.
7
- 3. Anxiety: Anxiety is another common mental health disorder that can cause physical symptoms such as sweating, trembling, and difficulty breathing, as well as mental symptoms such as worry, fear, and avoidance.
8
- 4. Trauma: Traumatic experiences, such as abuse, neglect, or natural disasters, can cause mental health symptoms such as anxiety, depression, and difficulty trusting others.
9
- 5. Substance use: The use of drugs or alcohol can have a negative impact on mental health, leading to symptoms such as depression, anxiety, and difficulty functioning in daily life.
10
- 6. Bipolar disorder: Bipolar disorder is a mental health condition characterized by extreme mood swings, including periods of intense happiness (mania) and intense sadness (depression).
11
-
12
- It's important to",information,0.85