Update README.md
README.md
CHANGED
@@ -4,8 +4,6 @@ language:
 tags:
 - text2text-generation
 - summarization
-- legal-ai
-- italian-law
 license: mit
 datasets:
 - joelniklaus/Multi_Legal_Pile
@@ -28,6 +26,8 @@ They build upon **BART-IT** ([`morenolq/bart-it`](https://huggingface.co/morenol
 - **Trained on legal documents** such as **statutes, case law, and contracts**
 - **Not fine-tuned for specific tasks** (requires further adaptation)
 
+⚠️ This specific model is pre-trained on general-purpose Italian text! Please select the best model from the table below.
+
 ## Available Models
 
 | Model | Description | Link |
@@ -38,8 +38,8 @@ They build upon **BART-IT** ([`morenolq/bart-it`](https://huggingface.co/morenol
 | **LEGIT-SCRATCH-BART** | Trained from scratch on **Italian legal texts** | [Link](https://huggingface.co/morenolq/LEGIT-SCRATCH-BART) |
 | **LEGIT-SCRATCH-BART-LSG-4096** | Trained from scratch with **LSG attention**, supporting **4,096 tokens** | [Link](https://huggingface.co/morenolq/LEGIT-SCRATCH-BART-LSG-4096) |
 | **LEGIT-SCRATCH-BART-LSG-16384** | Trained from scratch with **LSG attention**, supporting **16,384 tokens** | [Link](https://huggingface.co/morenolq/LEGIT-SCRATCH-BART-LSG-16384) |
-| **BART-IT-LSG-4096** | `morenolq/bart-it` with **LSG attention**, supporting **4,096 tokens** (no legal adaptation) | [Link](https://huggingface.co/morenolq/BART-IT-LSG-4096)
-| **BART-IT-LSG-16384** | `morenolq/bart-it` with **LSG attention**, supporting **16,384 tokens** (no legal adaptation) | [Link](https://huggingface.co/morenolq/BART-IT-LSG-16384) |
+| **BART-IT-LSG-4096** | `morenolq/bart-it` with **LSG attention**, supporting **4,096 tokens** (⚠️ no legal adaptation) | [Link](https://huggingface.co/morenolq/BART-IT-LSG-4096)
+| **BART-IT-LSG-16384** | `morenolq/bart-it` with **LSG attention**, supporting **16,384 tokens** (⚠️ no legal adaptation) | [Link](https://huggingface.co/morenolq/BART-IT-LSG-16384) |
 
 ---
 
@@ -74,10 +74,10 @@ model = BartForConditionalGeneration.from_pretrained(model_name)
 input_text = "<mask> 1234: Il contratto si intende concluso quando..."
 inputs = tokenizer(input_text, return_tensors="pt", max_length=4096, truncation=True)
 
-#
-
-
-print("
+# Generate summary
+summary_ids = model.generate(inputs.input_ids, max_length=150, num_beams=4, early_stopping=True)
+summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
+print("Summary:", summary)
 ```
 
 ---
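
The warning added in this change asks readers to pick a model from the table rather than defaulting to this one. As a rough illustration of that choice, the sketch below encodes only the table rows visible in this diff and picks the smallest context window that fits a given input length, preferring legally adapted checkpoints. The `ModelCard` class and `pick_model` helper are hypothetical names introduced here, not part of the README, and the 1,024-token limit for `LEGIT-SCRATCH-BART` is an assumption (the usual BART limit), since the diff does not state it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelCard:
    """Hypothetical record for one row of the README's model table."""
    repo_id: str
    max_tokens: int
    legal_adaptation: bool


# Only the rows visible in this diff; the full README table lists more models.
MODELS = [
    # 1024 is an assumption (the usual BART limit); the diff does not state it.
    ModelCard("morenolq/LEGIT-SCRATCH-BART", 1024, True),
    ModelCard("morenolq/LEGIT-SCRATCH-BART-LSG-4096", 4096, True),
    ModelCard("morenolq/LEGIT-SCRATCH-BART-LSG-16384", 16384, True),
    ModelCard("morenolq/BART-IT-LSG-4096", 4096, False),
    ModelCard("morenolq/BART-IT-LSG-16384", 16384, False),
]


def pick_model(n_tokens: int, need_legal: bool = True) -> Optional[str]:
    """Return the repo id of the smallest-context model that fits n_tokens.

    With need_legal=True, models without legal adaptation are excluded.
    Returns None when no listed model can hold the input.
    """
    candidates = [
        m for m in MODELS
        if m.max_tokens >= n_tokens and (m.legal_adaptation or not need_legal)
    ]
    if not candidates:
        return None
    # Shortest sufficient context first; on ties, prefer legal adaptation.
    best = min(candidates, key=lambda m: (m.max_tokens, not m.legal_adaptation))
    return best.repo_id


if __name__ == "__main__":
    print(pick_model(3000))
    print(pick_model(20000))
```

The returned repo id could then be passed as `model_name` to the `from_pretrained` call shown in the README snippet, with `max_length` in the tokenizer call set to the matching context size.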