---
license: apache-2.0
language:
- en
base_model:
- mistralai/Mistral-7B-v0.1
tags:
- legal
---

# reglab-rrc/mistral-rrc

**Paper:** [AI for Scaling Legal Reform: Mapping and Redacting Racial Covenants in Santa Clara County]()

## Usage

Here is an example of how to use the model to find racial covenants in a page of text:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import re

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("reglab-rrc/mistral-rrc")
model = AutoModelForCausalLM.from_pretrained("reglab-rrc/mistral-rrc")

def format_prompt(document):
    # Build the instruction-style prompt the model was trained on
    return f"""### Instruction:
Determine whether the property deed contains a racial covenant. A racial covenant is a clause in a document that \
restricts who can reside, own, or occupy a property on the basis of race, ethnicity, national origin, or religion. \
Answer "Yes" or "No". If "Yes", provide the exact text of the relevant passage and then a quotation of the passage \
with spelling and formatting errors fixed.

### Input:
{document}

### Response:"""

def parse_output(output):
    # Pull each tagged field out of the generated text; a missing tag yields None
    answer_match = re.search(r"\[ANSWER\](.*?)\[/ANSWER\]", output, re.DOTALL)
    raw_passage_match = re.search(r"\[RAW PASSAGE\](.*?)\[/RAW PASSAGE\]", output, re.DOTALL)
    quotation_match = re.search(r"\[CORRECTED QUOTATION\](.*?)\[/CORRECTED QUOTATION\]", output, re.DOTALL)

    answer = answer_match.group(1).strip() if answer_match else None
    raw_passage = raw_passage_match.group(1).strip() if raw_passage_match else None
    quotation = quotation_match.group(1).strip() if quotation_match else None

    return {
        "answer": answer == "Yes",
        "raw_passage": raw_passage,
        "quotation": quotation
    }

# Example usage
document = "Your property deed text here..."
prompt = format_prompt(document)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=512)
result = tokenizer.decode(outputs[0])
parsed_result = parse_output(result)

print(parsed_result)
```

The model was trained with the given input and output formats, so be sure to use them when performing inference.
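As a rough illustration of the output format that the regular expressions in `parse_output` above expect, the sketch below parses a hand-written response string. The `sample_response` text is a made-up placeholder rather than real model output, the helper names `extract_tag` and `parse_tagged_response` are hypothetical, and the assumption that a negative response contains only an `[ANSWER]` tag is not confirmed by this card. The parsing logic is restated here only so that the snippet runs without downloading the model.

```python
import re

# Tag names taken from the regular expressions in parse_output above.
TAGS = {
    "answer": "ANSWER",
    "raw_passage": "RAW PASSAGE",
    "quotation": "CORRECTED QUOTATION",
}

def extract_tag(tag, text):
    """Return the stripped text between [tag] and [/tag], or None if absent."""
    pattern = rf"\[{re.escape(tag)}\](.*?)\[/{re.escape(tag)}\]"
    match = re.search(pattern, text, re.DOTALL)
    return match.group(1).strip() if match else None

def parse_tagged_response(text):
    """Hypothetical stand-in for parse_output; returns the same three keys."""
    fields = {key: extract_tag(tag, text) for key, tag in TAGS.items()}
    fields["answer"] = fields["answer"] == "Yes"
    return fields

# Placeholder response (assumption: not real model output).
sample_response = (
    "[ANSWER]Yes[/ANSWER]\n"
    "[RAW PASSAGE]<passage as it appears on the page>[/RAW PASSAGE]\n"
    "[CORRECTED QUOTATION]<passage with errors fixed>[/CORRECTED QUOTATION]"
)

print(parse_tagged_response(sample_response))
print(parse_tagged_response("[ANSWER]No[/ANSWER]"))
```

Any tag that is missing from the response comes back as `None`, so callers should check `answer` before relying on the passage fields.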

## Intended Use

This model is designed to detect racial covenants in property deeds.
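Because the usage example classifies one page of text at a time, a multi-page deed could be handled by looping over its pages. The following is only a sketch under stated assumptions: `find_covenant_pages`, `classify_page`, and `fake_classifier` are hypothetical names introduced here. `classify_page` is assumed to wrap `format_prompt`, `model.generate`, and `parse_output`, and to return the same dictionary as `parse_output`.

```python
def find_covenant_pages(pages, classify_page):
    """Run a per-page classifier and collect the pages it flags.

    pages: list of page texts, in document order.
    classify_page: callable returning a dict with the keys produced by
        parse_output ("answer", "raw_passage", "quotation").
    """
    flagged = []
    for page_number, text in enumerate(pages, start=1):
        result = classify_page(text)
        if result.get("answer"):
            # Keep the 1-based page number alongside the parsed fields.
            flagged.append({"page": page_number, **result})
    return flagged

def fake_classifier(text):
    """Stand-in for the real model call (assumption: for illustration only)."""
    hit = "FLAGGED" in text
    return {
        "answer": hit,
        "raw_passage": text if hit else None,
        "quotation": text if hit else None,
    }

example_pages = ["plain page", "FLAGGED placeholder page", "another plain page"]
flagged_pages = find_covenant_pages(example_pages, fake_classifier)
print([entry["page"] for entry in flagged_pages])
```

In real use, `fake_classifier` would be replaced by a function that calls the model as in the usage example.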

## Training Data

## Performance

## Limitations

## Ethical Considerations

## Citation

```
@article{suranisuzgun2024,
  title={AI for Scaling Legal Reform: Mapping and Redacting Racial Covenants in Santa Clara County},
  author={Surani, Faiz and Suzgun, Mirac and Raman, Vyoma and Manning, Christopher D. and Henderson, Peter and Ho, Daniel E.},
  journal={},
  year={2024}
}
```