	---
	title: guardrails
	app_file: hate_speech_demo.py
	sdk: gradio
	sdk_version: 5.23.3
	---
	# Guardrails API Call

	This script takes a version-controlled dataset and tests selected queries against a specified model.
	The input must be an `.xlsx` file; only the rows you mark for inclusion are tested. The output is a `.csv` file, ready to copy into a response dataset template.

	---

	## Installation

	1. Clone the repository:
	```bash
	git clone https://github.com/ContextualAI/guardrails
	cd guardrails
	```

	2. Install dependencies:
	```bash
	pip install requests pandas openpyxl python-dotenv tqdm
	```

	---

	## Setting Environment Variables

	The API key, application ID, and API endpoint URL must be defined in the `key.env` file.

	1. Copy the template file to create your `key.env`:
	```bash
	cp key.env.template key.env
	```

	2. Open the newly created `key.env` file in a text editor.

	3. Input the required values for the following variables:
	```env
	API_KEY=your_api_key_here
	ENDPOINT_URL=your_endpoint_url_here
	APPLICATION_ID=your_application_id_here
	```

	4. Save the file.
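
Before running anything, it can help to confirm that `key.env` no longer contains the template placeholders. The following is a minimal sketch, not part of the repository: the helper names `missing_variables` and `load_key_env` are hypothetical, and it assumes the file is read with `python-dotenv` (installed above).

```python
from typing import List, Mapping, Optional

# Variable names taken from the key.env template above.
REQUIRED = ("API_KEY", "ENDPOINT_URL", "APPLICATION_ID")


def missing_variables(values: Mapping[str, Optional[str]]) -> List[str]:
    """Return required names that are absent, empty, or still placeholders."""
    missing = []
    for name in REQUIRED:
        value = (values.get(name) or "").strip()
        # Assumption: anything shaped like "your_..._here" is an unedited template value.
        is_placeholder = value.startswith("your_") and value.endswith("_here")
        if not value or is_placeholder:
            missing.append(name)
    return missing


def load_key_env(path: str = "key.env") -> dict:
    """Read key.env into a dict without modifying os.environ (hypothetical helper)."""
    from dotenv import dotenv_values  # imported lazily; provided by python-dotenv

    return dict(dotenv_values(path))
```

With these helpers, `missing_variables(load_key_env())` returns an empty list once all three values are filled in, and otherwise lists the names that still need attention.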

	---

	## Testing the Environment Variables

	A simple test script is included to help verify that your environment variables are correctly configured before running a full-scale evaluation.

	1. Run the test script:
	```bash
	python test.py
	```
	2. Enter `clm`.
	3. Enter your desired prompt and press Enter:
	- If a response is returned, your environment variables are set up correctly.
	- If not, check the `key.env` file for errors.

	You can also test the environment variables by running the full script with only one row selected from the version-controlled dataset. However, the test script remains a quicker way to interact with the model and retrieve single responses.

	---

	## Downloading the Dataset

	1. Navigate to the latest [version-controlled dataset](https://docs.google.com/spreadsheets/d/1fW3Ohyq2VdX5mmFgjSvqzj1hcPYCqQae7_sEcaEXA2U/edit?usp=drive_link).

	2. On the Customer Selection tab, select the required customer to load the customer information into the brand safety templates.

	3. On both the Brand Safety Prompts and Generic Prompts tabs, use column B (`filter`) to select rows for inclusion in the evaluation run:
	enter "yes" in each row you want to include.

	4. Download the file in `.xlsx` format.

	5. Important: After downloading your `.xlsx` file, clear all column filters in both tabs and delete your entries from the `filter` column to reset the dataset.
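
The `filter` column is what determines which rows reach the model. As a rough sketch of that selection (the actual logic in `api_call.py` is not shown here, so the function names, the case-insensitive match, and the assumption that the worksheet names match the tab names exactly are all illustrative):

```python
from typing import Any, Dict, Iterable, List

# Tab names as described above; assumed to match the worksheet names exactly.
SHEETS = ["Brand Safety Prompts", "Generic Prompts"]


def selected_rows(rows: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep rows whose `filter` cell is "yes" (ignoring case and whitespace: an assumption)."""
    return [
        row for row in rows
        if str(row.get("filter", "")).strip().lower() == "yes"
    ]


def load_selected(xlsx_path: str) -> List[Dict[str, Any]]:
    """Read both prompt tabs and return only the rows marked for inclusion."""
    import pandas as pd  # lazy import; reading .xlsx also needs openpyxl

    frames = pd.read_excel(xlsx_path, sheet_name=SHEETS)  # dict: tab name -> DataFrame
    rows: List[Dict[str, Any]] = []
    for frame in frames.values():
        rows.extend(selected_rows(frame.to_dict("records")))
    return rows
```

Blank `filter` cells arrive from pandas as `NaN`, which the string comparison above treats as "not selected".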

	---

	## Running the Script

	Run the script from the command line:
	```bash
	python api_call.py
	```

	1. Enter the file path to the `.xlsx` file, or drag and drop the file into the terminal window.

	2. Input the desired name of the output `.csv` file (without the `.csv` extension).

	3. The script processes the selected rows, sends them to the model, and writes the results to a `.csv` output file.
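
Dragging a file into a terminal often wraps the path in quotes or escapes its spaces, and the output name is entered without an extension. A minimal sketch of how such input could be normalized follows; `clean_input_path` and `output_csv_path` are hypothetical helpers, and the script's real handling may differ.

```python
from pathlib import Path


def clean_input_path(raw: str) -> Path:
    """Normalize a typed or dragged-and-dropped file path."""
    text = raw.strip()
    # Some terminals wrap dropped paths in matching quotes.
    if len(text) >= 2 and text[0] == text[-1] and text[0] in "'\"":
        text = text[1:-1]
    # Others (e.g. macOS Terminal) escape spaces with a backslash instead.
    text = text.replace("\\ ", " ")
    return Path(text).expanduser()


def output_csv_path(name: str) -> Path:
    """Append .csv to the name the user typed, which should not include it."""
    name = name.strip()
    # Tolerate an accidental extension rather than producing "results.csv.csv".
    if name.lower().endswith(".csv"):
        name = name[:-4]
    return Path(name + ".csv")
```

Appending the suffix as a string, rather than using `Path.with_suffix`, keeps names that contain a dot intact: `run.v2` becomes `run.v2.csv`.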

	---

	## Using the Output File

	1. Navigate to the [response dataset template](https://docs.google.com/spreadsheets/d/1w9F9NEXAvRtSpNNUUFs91HhPGG0gvwki4ilYSr5GC8M/edit?usp=drive_link) and make a copy.

	2. Add as many rows as you need to the newly copied sheet.

	3. Copy all rows from the output `.csv` and paste them as values. The columns align directly with the response dataset template.

	4. Use the Policy Assessment and Response Tags columns to annotate the response data.

	Note: Blank cells in the `jailbreaking technique` and `sector` columns were originally `n/a` in the version-controlled dataset. Adjust these cells as needed to match your requirements.
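
If you prefer to restore the `n/a` values before pasting, a small standard-library sketch like the one below could do it. The helper name `restore_na` is hypothetical; only the two column names come from the note above.

```python
import csv
import io

# Columns named in the note above.
NA_COLUMNS = ("jailbreaking technique", "sector")


def restore_na(csv_text: str) -> str:
    """Return the CSV text with blank cells in NA_COLUMNS replaced by "n/a"."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        for column in NA_COLUMNS:
            # Only touch columns that exist in this file and are blank.
            if column in row and not (row[column] or "").strip():
                row[column] = "n/a"
        writer.writerow(row)
    return out.getvalue()
```

The function works on text, so it can be applied to the contents of the output file and the result written to a new file, leaving the original untouched.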

	---