argmin committed on
Commit
984fcd8
·
1 Parent(s): 7325cee
Files changed (2)
  1. README.md +47 -1
  2. app/main.py +269 -222
README.md CHANGED
@@ -11,4 +11,50 @@ license: mit
11
  short_description: Zero-Shot Classifier
12
  ---
13
 
14
- # LLM classifier
14
+ This app allows you to use large language models (LLMs) for text classification on your custom datasets.
15
+
16
+ ## 🚀 Key Features
17
+
18
+ 1. **Custom Labels and Descriptions**
19
+ - The system allows end-users to define their own labels and provide descriptive text for each label.
20
+
21
+ 2. **Binary and Multi-Class Classification**
22
+ - The system supports both **binary classification** (e.g., spam vs. not spam) and **multi-class classification** (e.g., positive, negative, neutral).
23
+
24
+ 3. **Few-Shot Learning**
25
+ - Users can enable **few-shot prompting**, which adds example rows from the dataset to the prompt to guide the model's understanding.
26
+ - The system automatically selects these examples (up to two per class) and excludes them from the main dataset, so they cannot inflate the evaluation metrics.
27
+
28
+ 4. **Additional Utility Features**
29
+ - **Cost-aware**: Caps the number of tokens the LLM may generate and sends only as many rows as the user specifies, keeping costs low during experimentation.
30
+ - **Inference Mode**: Automatically adapts when no target column is specified, providing label distribution statistics instead of evaluation metrics.
31
+ - **Verbose Mode**: Users can inspect raw prompts sent to the LLM and responses received, enabling transparency and debugging.
32
+ - **Progress Tracking**: A progress bar shows the classification status row-by-row.
33
+
34
+ ## 📦 How It Works
35
+
36
+ 1. **Upload Data**: Drag and drop a CSV file to load data into the system.
37
+ 2. **Select Target Column**: Choose the column to classify or run in inference mode (no target column).
38
+ 3. **Define Labels**: Add custom labels and their descriptions to guide classification.
39
+ 4. **Choose Features**: Select the features (columns) that should be used for classification.
40
+ 5. **Few-Shot Examples**: Optionally enable few-shot learning by providing examples from the dataset.
41
+ 6. **Run Classification**: View predictions, evaluate metrics (if labels are provided), or analyze label distribution (in inference mode).
42
+
43
+ ## Example Datasets
44
+ 1. (Binary) https://www.kaggle.com/datasets/ozlerhakan/spam-or-not-spam-dataset
45
+ 2. (Multi-class) https://www.kaggle.com/datasets/mdismielhossenabir/sentiment-analysis
46
+ 3. (Multi-class) https://www.kaggle.com/datasets/pashupatigupta/emotion-detection-from-text
47
+
48
+ ## 📘 Notes
49
+ - Ensure your **OpenAI API** key is valid and has sufficient quota.
50
+ - If your CSV includes a target column, you can take advantage of few-shot prompting.
51
+
52
+ ## 💡 Ideas for the future
53
+ - (**Clustering + LLM hybrid**) I was considering clustering the rows (say, with K-Means for a chosen k) and then asking the LLM to associate the provided labels with those k clusters.
54
+ - (**Multi-modal** support) It would be nice to support images, audio, etc., so the beloved cats vs. dogs classification becomes feasible. We'd base64-encode the image and send it, along with the rest of the conversation, to one of the multi-modal LLMs from OpenAI.
55
+
56
+ ## 🥑 Needs work
57
+ - **Evaluation** needs a lot of work. If I had more time, I'd start there. We'd have to show that the selected LLM configuration + PROMPT achieves good performance on standard classification datasets. The hope is that it would then also do well on datasets with an explicit supervision signal.
58
+ - The UI is still pretty clunky. There is a lot of logic that's mixed in with visual elements.
59
+ - Tests, which I've generated entirely with an LLM, are not at all sufficient.
60
+ - The system prompt can be improved; I haven't modified it from the initial version.
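The few-shot behaviour described in the README (pick a couple of examples per class, then hold them out of the rows that get classified) can be sketched in plain Python. This is a minimal illustration, not the app's implementation: the app itself uses pandas `groupby` and `sample`, and the helper name `select_few_shot`, its parameters, and the sample rows below are all hypothetical.

```python
import random
from collections import defaultdict


def select_few_shot(rows, label_key, per_class=2, seed=42):
    """Pick up to `per_class` example rows per label; return (examples, remaining)."""
    # Group row positions by their label value (labels are compared as strings).
    by_label = defaultdict(list)
    for position, row in enumerate(rows):
        by_label[str(row[label_key])].append(position)

    rng = random.Random(seed)  # fixed seed so reruns choose the same examples
    chosen = set()
    for positions in by_label.values():
        chosen.update(rng.sample(positions, min(per_class, len(positions))))

    examples = [rows[i] for i in sorted(chosen)]
    # Rows used as examples are held out so they are never classified or scored.
    remaining = [row for i, row in enumerate(rows) if i not in chosen]
    return examples, remaining


# Hypothetical toy data: three "spam" rows and two "ham" rows.
rows = [
    {"text": "win a prize now", "label": "spam"},
    {"text": "meeting at 10", "label": "ham"},
    {"text": "cheap pills", "label": "spam"},
    {"text": "free vacation", "label": "spam"},
    {"text": "lunch tomorrow?", "label": "ham"},
]
examples, remaining = select_few_shot(rows, "label")
print(len(examples), len(remaining))  # prints: 4 1
```

Holding the examples out matters for evaluation: a row the model has already seen with its label in the prompt would otherwise count as an easy correct prediction.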
app/main.py CHANGED
@@ -10,239 +10,286 @@ from config.model_params import DEFAULT_PARAMS
10
 
11
  st.set_page_config(layout="wide")
12
 
13
- # Streamlit App Title
14
- st.title("LLM-based Classifier")
15
-
16
- # Upload Dataset
17
- uploaded_file = st.sidebar.file_uploader("Upload a CSV file", type=["csv"])
18
- if uploaded_file:
19
- df = pd.read_csv(uploaded_file)
20
- st.write("### Data Preview", df.head())
21
-
22
- # Select Target Column
23
- label_column = st.selectbox(
24
- "Select target column (if available):",
25
- ["None"] + df.columns.tolist(),
26
- index=0
27
- )
28
-
29
- if label_column == "None":
30
- st.warning("No target column selected. The app will run in inference mode.")
31
- label_column = None
32
- filtered_columns = df.columns.tolist()
33
- else:
34
- # Ensure the label column is defined and excluded from features
35
- df[label_column] = df[label_column].astype(str) # Convert to string
36
- filtered_columns = [col for col in df.columns if col != label_column]
37
-
38
- # Feature Selection
39
- features = st.multiselect(
40
- "Select features:",
41
- filtered_columns,
42
- default=filtered_columns if label_column is None else filtered_columns,
43
- )
44
-
45
- # Validate Features
46
- if label_column in features:
47
- st.error(f"Target column '{label_column}' cannot be included in features. Please remove it.")
48
- st.stop()
49
-
50
- if not features:
51
- st.error("Please select at least one feature to proceed.")
52
- st.stop()
53
-
54
- # Specify Prediction Column Name
55
- prediction_column = st.text_input(
56
- "Enter the name of the column to store predictions:", "Predicted Label"
57
- )
58
-
59
- # Define Labels and Descriptions
60
- st.write(f"### Describe the values {prediction_column} can take")
61
- num_labels = st.number_input("Number of unique labels:", min_value=2, step=1)
62
-
63
- # Create columns for labels and descriptions
64
- col1, col2 = st.columns(2)
65
-
66
- label_descriptions = {}
67
- for i in range(int(num_labels)):
68
- with col1:
69
- label = st.text_input(f"Label {i+1} name:", key=f"label_name_{i}")
70
- with col2:
71
- description = st.text_input(f"Label {i+1} description:", key=f"label_desc_{i}")
72
- label_descriptions[label] = description
73
-
74
- # Compare user-provided labels with unique target values
75
- if label_column:
76
- # Convert label column to string
77
- df[label_column] = df[label_column].astype(str)
78
-
79
- # Get unique values in the target column
80
- unique_target_values = set(df[label_column].unique())
81
- n_unique_target_values = len(unique_target_values)
82
-
83
- if n_unique_target_values > 20:
84
- st.warning(
85
- f"The selected column '{label_column}' has {n_unique_target_values} unique values, "
86
- f"which may not be ideal as a target for classification."
87
- )
88
- proceed = st.checkbox(
89
- f"I understand and still want to use '{label_column}' as the target column."
90
- )
91
- if not proceed:
92
- st.stop()
93
-
94
- # Get user-provided labels
95
- user_provided_labels = set(label_descriptions.keys())
96
 
97
- # Identify missing and extra labels
98
- missing_labels = unique_target_values - user_provided_labels
99
- extra_labels = user_provided_labels - unique_target_values
100
 
101
- # Display warnings for discrepancies
102
- if missing_labels:
103
- st.warning(
104
- f"The following values in the target column are not accounted for in the labels: {', '.join(map(str, missing_labels))}."
105
- )
106
- if extra_labels:
107
- st.warning(
108
- f"The following user-provided labels do not match any values in the target column: {', '.join(map(str, extra_labels))}."
109
- )
110
 
111
- # Few-Shot Prompting
112
- use_few_shot = st.checkbox("Use few-shot prompting with examples from the target column", value=False)
 
113
 
114
- if use_few_shot and label_column:
115
- st.info("Few-shot prompting is enabled. Examples will be selected from the dataset.")
116
-
117
- # Group by target column and select 2 examples per class
118
- few_shot_examples = (
119
- df.groupby(label_column, group_keys=False)
120
- .apply(lambda group: group.sample(min(2, len(group)), random_state=42))
121
  )
122
 
123
- # Show the few-shot examples for reference
124
- st.write("### Few-Shot Examples")
125
- st.write(few_shot_examples[[*features, label_column]])
126
-
127
- # Remove few-shot examples from the dataset
128
- remaining_data = df.drop(few_shot_examples.index)
129
- else:
130
- few_shot_examples = None
131
- remaining_data = df
132
-
133
- # Limit rows based on user input to control costs
134
- num_rows_to_send = st.number_input('Select number of rows to send to OpenAI ($$)',
135
- min_value=1, max_value=len(remaining_data),
136
- value=min(20, len(remaining_data)))
137
- if len(remaining_data) > num_rows_to_send:
138
- st.warning(f"Only the first {num_rows_to_send} rows of the remaining dataset will be sent to OpenAI to minimize costs.")
139
-
140
- # Apply the limit correctly
141
- limited_data = remaining_data.head(num_rows_to_send)
142
-
143
- # Prepare Few-Shot Examples for Prompting
144
- example_rows = []
145
- if use_few_shot and few_shot_examples is not None:
146
- for _, example in few_shot_examples.iterrows():
147
- example_rows.append({
148
- "features": {feature: example[feature] for feature in features},
149
- "label": example[label_column],
150
- })
151
-
152
- # API Key and Model Parameters
153
- openai_api_key = st.sidebar.text_input("Enter your OpenAI API Key:", type="password")
154
- model_params = {
155
- "model": st.selectbox(
156
- "Model:",
157
- DEFAULT_PARAMS["available_models"],
158
- index=DEFAULT_PARAMS["available_models"].index(DEFAULT_PARAMS["model"])
159
- ),
160
- "temperature": st.slider("Temperature:", min_value=0.0, max_value=1.0, value=DEFAULT_PARAMS["temperature"]),
161
- "max_tokens": DEFAULT_PARAMS["max_tokens"],
162
- }
163
-
164
- display_model_config(DEFAULT_PARAMS)
165
-
166
- verbose = st.checkbox("Verbose", value=False)
167
-
168
- # Classification Button
169
- if st.button("Run Classification"):
170
- if not openai_api_key:
171
- st.error("Please provide a valid OpenAI API Key.")
172
  else:
173
- # Initialize OpenAI client
174
- client = get_openai_client(api_key=openai_api_key)
175
-
176
- # Dynamically create the Pydantic model for validation
177
- ClassificationOutput = generate_classification_model(list(label_descriptions.keys()))
178
-
179
- # Create a placeholder for the progress bar
180
- progress_bar = st.progress(0)
181
- progress_text = st.empty()
182
-
183
- # Function to classify a single row
184
- def classify_row(row, index, total_rows):
185
- # Update progress bar
186
- progress_bar.progress((index + 1) / total_rows)
187
- progress_text.text(f"Processing row {index + 1}/{total_rows}...")
188
-
189
- # Generate system and user prompts
190
- system_prompt, user_prompt = generate_prompts(
191
- row=row.to_dict(),
192
- label_descriptions=label_descriptions,
193
- features=features,
194
- example_rows=example_rows,
195
  )
196
-
197
- # Show the prompts in an expander for transparency
198
- if verbose:
199
- with st.expander(f"OpenAI Call Input for Row Index {row.name}"):
200
- st.write("**System Prompt:**")
201
- st.code(system_prompt)
202
- st.write(f"Token Count (System Prompt): {estimate_token_count(system_prompt, model_params['model'])}")
203
- st.write("**User Prompt:**")
204
- st.code(user_prompt)
205
- st.write(f"Token Count (User Prompt): {estimate_token_count(user_prompt, model_params['model'])}")
206
-
207
- # Make the OpenAI call and validate the output
208
- return apply_classification(
209
- client=client,
210
- model_params=model_params,
211
- ClassificationOutput=ClassificationOutput,
212
- system_prompt=system_prompt,
213
- user_prompt=user_prompt,
214
- verbose=verbose,
215
- st=st
216
  )
217
 
218
- # Apply the classification to each row in the limited data
219
- total_rows = len(limited_data)
220
- predictions = []
221
 
222
- for index, row in limited_data.iterrows():
223
- prediction = classify_row(row, index, total_rows)
224
- predictions.append(prediction)
225
-
226
- # Add predictions to the DataFrame
227
- limited_data[prediction_column] = predictions
228
-
229
- # Reset progress bar and text
230
- progress_bar.empty()
231
- progress_text.empty()
232
 
233
- # Display Predictions
234
- st.write(f"### Predictions ({prediction_column})", limited_data)
 
235
 
236
- # Evaluate if ground truth is available
237
- if label_column in limited_data.columns:
238
- from utils.evaluation import evaluate_predictions
239
- report = evaluate_predictions(limited_data[label_column], limited_data[prediction_column])
240
- st.write("### Evaluation Metrics")
241
- display_metrics_as_table(report)
242
  else:
243
- st.warning(f"Inference mode: No target column provided, so no evaluation metrics are available.")
244
- # Count predictions
245
- label_counts = limited_data[prediction_column].value_counts().reset_index()
246
- label_counts.columns = ["Label", "Count"]
247
- st.subheader("Prediction Statistics")
248
- st.table(label_counts)
13
+ # Define the tabs
14
+ tab1, tab2 = st.tabs(["📖 Documentation", "🤖 Classifier"])
15
+
16
+ # Tab 1: Readme
17
+ with tab1:
18
+ readme_content = ''.join(open('README.md').read().split('---')[2:])
19
+ st.markdown(readme_content)
20
+
21
+ # Tab 2: Classifier
22
+ with tab2:
23
+
24
+ # Streamlit App Title
25
+ st.title("🤖 LLM-based Classifier")
26
+
27
+ # Upload Dataset
28
+ uploaded_file = st.sidebar.file_uploader("Upload a CSV file", type=["csv"])
29
+ if uploaded_file:
30
+ df = pd.read_csv(uploaded_file)
31
+ st.write("### Data Preview", df.head())
32
+
33
+ # Select Target Column
34
+ label_column = st.selectbox(
35
+ "Select target column (if available):",
36
+ ["None"] + df.columns.tolist(),
37
+ index=0
38
+ )
39
 
40
+ if label_column == "None":
41
+ st.warning("No target column selected. The app will run in inference mode.")
42
+ label_column = None
43
+ filtered_columns = df.columns.tolist()
44
+ else:
45
+ # Ensure the label column is defined and excluded from features
46
+ df[label_column] = df[label_column].astype(str) # Convert to string
47
+ filtered_columns = [col for col in df.columns if col != label_column]
48
+
49
+ # Feature Selection
50
+ features = st.multiselect(
51
+ "Select features:",
52
+ filtered_columns,
53
+ default=filtered_columns if label_column is None else filtered_columns,
54
+ )
55
 
56
+ # Validate Features
57
+ if label_column in features:
58
+ st.error(f"Target column '{label_column}' cannot be included in features. Please remove it.")
59
+ st.stop()
60
 
61
+ if not features:
62
+ st.error("Please select at least one feature to proceed.")
63
+ st.stop()
64
 
65
+ # Specify Prediction Column Name
66
+ prediction_column = st.text_input(
67
+ "Enter the name of the column to store predictions:", "Predicted Label"
68
  )
69
 
70
+ # Define Labels and Descriptions
71
+ if label_column:
72
+ # Automatically fetch unique values from the target column
73
+ unique_labels = df[label_column].unique()
74
+
75
+ # Initialize number of labels based on unique values
76
+ num_labels = len(unique_labels)
77
+ st.write(f"Automatically detected {num_labels} unique values in the target column.")
78
+
79
+ # Create columns for labels and descriptions
80
+ col1, col2 = st.columns(2)
81
+
82
+ # Populate labels and descriptions
83
+ label_descriptions = {}
84
+ for i, value in enumerate(unique_labels):
85
+ with col1:
86
+ label = st.text_input(
87
+ f"Label {i+1} name:",
88
+ value=str(value), # Auto-populate with unique value
89
+ key=f"label_name_{i}"
90
+ )
91
+ with col2:
92
+ description = st.text_input(
93
+ f"Label {i+1} description:",
94
+ value=f"", # Default description
95
+ key=f"label_desc_{i}"
96
+ )
97
+ label_descriptions[label] = description
98
  else:
99
+ # Fallback for manual entry if no target column is selected
100
+ num_labels = st.number_input("Number of unique labels:", min_value=2, step=1)
101
+
102
+ # Create columns for labels and descriptions
103
+ col1, col2 = st.columns(2)
104
+
105
+ label_descriptions = {}
106
+ for i in range(int(num_labels)):
107
+ with col1:
108
+ label = st.text_input(f"Label {i+1} name:", key=f"label_name_{i}")
109
+ with col2:
110
+ description = st.text_input(f"Label {i+1} description:", key=f"label_desc_{i}")
111
+ label_descriptions[label] = description
112
+
113
+ # Compare user-provided labels with unique target values
114
+ if label_column:
115
+ # Convert label column to string
116
+ df[label_column] = df[label_column].astype(str)
117
+
118
+ # Get unique values in the target column
119
+ unique_target_values = set(df[label_column].unique())
120
+ n_unique_target_values = len(unique_target_values)
121
+
122
+ if n_unique_target_values > 20:
123
+ st.warning(
124
+ f"The selected column '{label_column}' has {n_unique_target_values} unique values, "
125
+ f"which may not be ideal as a target for classification."
126
+ )
127
+ proceed = st.checkbox(
128
+ f"I understand and still want to use '{label_column}' as the target column."
129
+ )
130
+ if not proceed:
131
+ st.stop()
132
+
133
+ # Get user-provided labels
134
+ user_provided_labels = set(label_descriptions.keys())
135
+
136
+ # Identify missing and extra labels
137
+ missing_labels = unique_target_values - user_provided_labels
138
+ extra_labels = user_provided_labels - unique_target_values
139
+
140
+ # Display warnings for discrepancies
141
+ if missing_labels:
142
+ st.warning(
143
+ f"The following values in the target column are not accounted for in the labels: {', '.join(map(str, missing_labels))}."
144
  )
145
+ if extra_labels:
146
+ st.warning(
147
+ f"The following user-provided labels do not match any values in the target column: {', '.join(map(str, extra_labels))}."
148
  )
149
 
150
+ # Few-Shot Prompting
151
+ use_few_shot = st.checkbox("Use few-shot prompting with examples from the target column", value=False)
 
152
 
153
+ if use_few_shot and label_column:
154
+ st.info("Few-shot prompting is enabled. Examples will be selected from the dataset.")
155
+
156
+ # Group by target column and select 2 examples per class
157
+ few_shot_examples = (
158
+ df.groupby(label_column, group_keys=False)
159
+ .apply(lambda group: group.sample(min(2, len(group)), random_state=42))
160
+ )
 
 
161
 
162
+ # Show the few-shot examples for reference
163
+ st.write("### Few-Shot Examples")
164
+ st.write(few_shot_examples[[*features, label_column]])
165
 
166
+ # Remove few-shot examples from the dataset
167
+ remaining_data = df.drop(few_shot_examples.index)
168
+ else:
169
+ few_shot_examples = None
170
+ remaining_data = df
171
+
172
+ # Limit rows based on user input to control costs
173
+ num_rows_to_send = st.number_input('Select number of rows to send to OpenAI ($$)',
174
+ min_value=1, max_value=len(remaining_data),
175
+ value=min(20, len(remaining_data)))
176
+ if len(remaining_data) > num_rows_to_send:
177
+ st.warning(f"Only the first {num_rows_to_send} rows of the remaining dataset will be sent to OpenAI to minimize costs.")
178
+
179
+ # Apply the limit correctly
180
+ limited_data = remaining_data.head(num_rows_to_send).copy()
181
+
182
+ # Prepare Few-Shot Examples for Prompting
183
+ example_rows = []
184
+ if use_few_shot and few_shot_examples is not None:
185
+ for _, example in few_shot_examples.iterrows():
186
+ example_rows.append({
187
+ "features": {feature: example[feature] for feature in features},
188
+ "label": example[label_column],
189
+ })
190
+
191
+ # API Key and Model Parameters
192
+ openai_api_key = st.sidebar.text_input("Enter your OpenAI API Key:", type="password")
193
+ model_params = {
194
+ "model": st.selectbox(
195
+ "Model:",
196
+ DEFAULT_PARAMS["available_models"],
197
+ index=DEFAULT_PARAMS["available_models"].index(DEFAULT_PARAMS["model"])
198
+ ),
199
+ "temperature": st.slider("Temperature:", min_value=0.0, max_value=1.0, value=DEFAULT_PARAMS["temperature"]),
200
+ "max_tokens": DEFAULT_PARAMS["max_tokens"],
201
+ }
202
+
203
+ display_model_config(DEFAULT_PARAMS)
204
+
205
+ verbose = st.checkbox("Verbose", value=False)
206
+
207
+ # Classification Button
208
+ if st.button("Run Classification"):
209
+ if not openai_api_key:
210
+ st.error("Please provide a valid OpenAI API Key.")
211
  else:
212
+ # Initialize OpenAI client
213
+ client = get_openai_client(api_key=openai_api_key)
214
+
215
+ # Dynamically create the Pydantic model for validation
216
+ ClassificationOutput = generate_classification_model(list(label_descriptions.keys()))
217
+
218
+ # Create a placeholder for the progress bar
219
+ progress_bar = st.progress(0)
220
+ progress_text = st.empty()
221
+
222
+ # Function to classify a single row
223
+ def classify_row(row, index, total_rows):
224
+ # Update progress bar
225
+ progress_bar.progress((index + 1) / total_rows)
226
+ progress_text.text(f"Processing row {index + 1}/{total_rows}...")
227
+
228
+ # Generate system and user prompts
229
+ system_prompt, user_prompt = generate_prompts(
230
+ row=row.to_dict(),
231
+ label_descriptions=label_descriptions,
232
+ features=features,
233
+ example_rows=example_rows,
234
+ )
235
+
236
+ # Show the prompts in an expander for transparency
237
+ if verbose:
238
+ with st.expander(f"OpenAI Call Input for Row Index {row.name}"):
239
+ st.write("**System Prompt:**")
240
+ st.code(system_prompt)
241
+ st.write(f"Token Count (System Prompt): {estimate_token_count(system_prompt, model_params['model'])}")
242
+ st.write("**User Prompt:**")
243
+ st.code(user_prompt)
244
+ st.write(f"Token Count (User Prompt): {estimate_token_count(user_prompt, model_params['model'])}")
245
+
246
+ # Make the OpenAI call and validate the output
247
+ return apply_classification(
248
+ client=client,
249
+ model_params=model_params,
250
+ ClassificationOutput=ClassificationOutput,
251
+ system_prompt=system_prompt,
252
+ user_prompt=user_prompt,
253
+ verbose=verbose,
254
+ st=st
255
+ )
256
+
257
+ # Apply the classification to each row in the limited data
258
+ total_rows = len(limited_data)
259
+ predictions = []
260
+
261
+ for position, (_, row) in enumerate(limited_data.iterrows()):
262
+ prediction = classify_row(row, position, total_rows)
263
+ predictions.append(prediction)
264
+
265
+ # Add predictions to the DataFrame
266
+ limited_data[prediction_column] = predictions
267
+
268
+ # Reset progress bar and text
269
+ progress_bar.empty()
270
+ progress_text.empty()
271
+
272
+ # Display Predictions
273
+ st.write(f"### Predictions ({prediction_column})", limited_data)
274
+
275
+ # Evaluate if ground truth is available
276
+ if label_column in limited_data.columns:
277
+ from utils.evaluation import evaluate_predictions
278
+ report = evaluate_predictions(limited_data[label_column], limited_data[prediction_column])
279
+ st.write("### Evaluation Metrics")
280
+ display_metrics_as_table(report)
281
+ else:
282
+ st.warning(f"Inference mode: No target column provided, so no evaluation metrics are available.")
283
+ # Count predictions
284
+ label_counts = limited_data[prediction_column].value_counts().reset_index()
285
+ label_counts.columns = ["Label", "Count"]
286
+ st.subheader("Prediction Statistics")
287
+ st.table(label_counts)
288
+ else:
289
+ st.write('Drag and drop a CSV to get started.')
290
+ st.markdown("""
291
+ Some ideas here:
292
+ - (Binary) https://www.kaggle.com/datasets/ozlerhakan/spam-or-not-spam-dataset
293
+ - (Multi-class) https://www.kaggle.com/datasets/mdismielhossenabir/sentiment-analysis
294
+ - (Multi-class) https://www.kaggle.com/datasets/pashupatigupta/emotion-detection-from-text
295
+ """)
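The Documentation tab in this diff strips the YAML front matter from `README.md` by splitting the text on `---`. A slightly more defensive variant is sketched below, assuming the front matter is the first `---`-delimited block of the file. The helper name `strip_front_matter` and the sample text are hypothetical. The sketch splits on the first two delimiters only, so a later `---` in the body (for example a horizontal rule) survives, and it returns the text unchanged when no front matter is present.

```python
def strip_front_matter(text):
    """Return `text` without a leading '---'-delimited front-matter block."""
    if not text.startswith("---"):
        return text  # no front matter: leave the document untouched
    # maxsplit=2 yields ['', front_matter, body]; any later '---' stays in the body.
    parts = text.split("---", 2)
    if len(parts) < 3:
        return text  # opening delimiter without a closing one
    return parts[2].lstrip("\n")


# Hypothetical README text: front matter, then a body containing a horizontal rule.
sample = "---\ntitle: Demo\nlicense: mit\n---\n\n# Heading\n\nabove\n\n---\n\nbelow\n"
body = strip_front_matter(sample)
print(body)
```

In the app, the result could be passed to `st.markdown` after reading the file inside a `with open("README.md", encoding="utf-8")` block, which also closes the file handle.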