update demo link
README.md CHANGED
@@ -280,11 +280,13 @@ model-index:
 
 # Longformer Encoder-Decoder (LED) fine-tuned on Booksum
 
-
-
-
-
-
+demo:
+
+[![colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/gist/pszemraj/d9a0495861776168fd5cdcd7731bc4ee/example-long-t5-tglobal-base-16384-book-summary.ipynb)
+
+- A fine-tuned version of [allenai/led-large-16384](https://huggingface.co/allenai/led-large-16384) on the BookSum dataset.
+- Goal: a model that generalizes well and is useful for summarizing long text in academic and everyday use.
+- Works well on lots of text and can handle 16384 tokens/batch (_if you have the GPU memory to handle that_)
 
 > Note: the API is set to generate a max of 64 tokens for runtime reasons, so the summaries may be truncated (depending on the length of the input text). For best results, use Python as below.
 
@@ -321,8 +323,7 @@ result = summarizer(
     max_length=256,
     no_repeat_ngram_size=3,
     encoder_no_repeat_ngram_size=3,
-
-    repetition_penalty=3.7,
+    repetition_penalty=3.5,
     num_beams=4,
     early_stopping=True,
 )
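For reference, the call patched above is the README's `transformers` summarization-pipeline example. A minimal runnable sketch with the updated `repetition_penalty=3.5` follows; the checkpoint name is an assumption (only the base variant, `pszemraj/led-base-book-summary`, is named in this diff), and the input text is a placeholder:

```python
# Minimal sketch of the README's pipeline example with the parameters
# shown in the hunk above (repetition_penalty updated from 3.7 to 3.5).
from transformers import pipeline

summarizer = pipeline(
    "summarization",
    model="pszemraj/led-large-book-summary",  # assumed checkpoint for this card
    device=0,  # use -1 (or omit) to run on CPU
)

long_text = "..."  # placeholder: the long document to summarize

result = summarizer(
    long_text,
    max_length=256,
    no_repeat_ngram_size=3,
    encoder_no_repeat_ngram_size=3,
    repetition_penalty=3.5,
    num_beams=4,
    early_stopping=True,
)
print(result[0]["summary_text"])
```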
@@ -331,8 +332,10 @@ result = summarizer(
 ```
 
 
-
-
+**Important:** To generate the best-quality summaries, use the global attention mask when decoding, as demonstrated in [this community notebook](https://colab.research.google.com/drive/12INTTR6n64TzS4RrXZxMSXfrOd9Xzamo?usp=sharing); see the definition of `generate_answer(batch)`.
+
+If you have computing constraints, try the base version, [`pszemraj/led-base-book-summary`](https://huggingface.co/pszemraj/led-base-book-summary).
+- All the generation parameters on the API here are the same as for [the base model](https://huggingface.co/pszemraj/led-base-book-summary), for easy comparison between versions.
 
 ## Training and evaluation data
 
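The **Important** note added above is the key operational detail: LED needs a `global_attention_mask` when you call `generate` directly. Below is a minimal sketch of what a helper like the notebook's `generate_answer(batch)` typically does, following standard LED usage (global attention on the first token); the body is an assumption based on that convention, not copied from the linked notebook, and the checkpoint name is again assumed:

```python
# Sketch of direct LED generation with a global attention mask.
# Standard LED practice: global attention on the first (<s>) token.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

checkpoint = "pszemraj/led-large-book-summary"  # assumed checkpoint for this card
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

long_text = "..."  # placeholder: the long document to summarize
inputs = tokenizer(
    long_text,
    max_length=16384,  # LED's maximum input length, per the card
    truncation=True,
    return_tensors="pt",
)

# Zero everywhere except the first token, which gets global attention.
global_attention_mask = torch.zeros_like(inputs["input_ids"])
global_attention_mask[:, 0] = 1

summary_ids = model.generate(
    inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    global_attention_mask=global_attention_mask,
    max_length=256,
    no_repeat_ngram_size=3,
    encoder_no_repeat_ngram_size=3,
    repetition_penalty=3.5,
    num_beams=4,
    early_stopping=True,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```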