update demo link
README.md CHANGED
@@ -280,11 +280,13 @@ model-index:
 
 # Longformer Encoder-Decoder (LED) fine-tuned on Booksum
 
-
-
-
-
-
+demo:
+
+[![colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/gist/pszemraj/d9a0495861776168fd5cdcd7731bc4ee/example-long-t5-tglobal-base-16384-book-summary.ipynb)
+
+- A fine-tuned version of [allenai/led-large-16384](https://huggingface.co/allenai/led-large-16384) on the BookSum dataset.
+- Goal: a model that generalizes well and is useful for summarizing long text in academic and everyday use.
+- Works well on lots of text and can handle 16384 tokens/batch (_if you have the GPU memory to handle that_)
 
 > Note: the API is set to generate a max of 64 tokens for runtime reasons, so the summaries may be truncated (depending on the length of the input text). For best results, use Python as below.
 
@@ -321,8 +323,7 @@ result = summarizer(
     max_length=256,
     no_repeat_ngram_size=3,
     encoder_no_repeat_ngram_size=3,
-
-    repetition_penalty=3.7,
+    repetition_penalty=3.5,
     num_beams=4,
     early_stopping=True,
 )
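For reference, the call patched above is the README's `transformers` summarization-pipeline example. A minimal runnable sketch with the updated `repetition_penalty=3.5` follows; the checkpoint name is an assumption (only the base variant, `pszemraj/led-base-book-summary`, is named in this diff), and the input text is a placeholder:

```python
# Minimal sketch of the README's pipeline example with the parameters
# shown in the hunk above (repetition_penalty updated from 3.7 to 3.5).
from transformers import pipeline

summarizer = pipeline(
    "summarization",
    model="pszemraj/led-large-book-summary",  # assumed checkpoint for this card
    device=0,  # use -1 (or omit) to run on CPU
)

long_text = "..."  # placeholder: the long document to summarize

result = summarizer(
    long_text,
    max_length=256,
    no_repeat_ngram_size=3,
    encoder_no_repeat_ngram_size=3,
    repetition_penalty=3.5,
    num_beams=4,
    early_stopping=True,
)
print(result[0]["summary_text"])
```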
@@ -331,8 +332,10 @@ result = summarizer(
 ```
 
 
-
-
+**Important:** To generate the best-quality summaries, use the global attention mask when decoding, as demonstrated in [this community notebook](https://colab.research.google.com/drive/12INTTR6n64TzS4RrXZxMSXfrOd9Xzamo?usp=sharing); see the definition of `generate_answer(batch)`.
+
+If you have computing constraints, try the base version, [`pszemraj/led-base-book-summary`](https://huggingface.co/pszemraj/led-base-book-summary).
+- All the generation parameters on the API here are the same as for [the base model](https://huggingface.co/pszemraj/led-base-book-summary), for easy comparison between versions.
 
 ## Training and evaluation data
 
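The **Important** note added above is the key operational detail: LED needs a `global_attention_mask` when you call `generate` directly. Below is a minimal sketch of what a helper like the notebook's `generate_answer(batch)` typically does, following standard LED usage (global attention on the first token); the body is an assumption based on that convention, not copied from the linked notebook, and the checkpoint name is again assumed:

```python
# Sketch of direct LED generation with a global attention mask.
# Standard LED practice: global attention on the first (<s>) token.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

checkpoint = "pszemraj/led-large-book-summary"  # assumed checkpoint for this card
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

long_text = "..."  # placeholder: the long document to summarize
inputs = tokenizer(
    long_text,
    max_length=16384,  # LED's maximum input length, per the card
    truncation=True,
    return_tensors="pt",
)

# Zero everywhere except the first token, which gets global attention.
global_attention_mask = torch.zeros_like(inputs["input_ids"])
global_attention_mask[:, 0] = 1

summary_ids = model.generate(
    inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    global_attention_mask=global_attention_mask,
    max_length=256,
    no_repeat_ngram_size=3,
    encoder_no_repeat_ngram_size=3,
    repetition_penalty=3.5,
    num_beams=4,
    early_stopping=True,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```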