Update README.md
README.md CHANGED
@@ -7,7 +7,37 @@ thumbnail: https://huggingface.co/front/thumbnails/microsoft.png
 license: mit
 ---
 
-## DeBERTa: Decoding-enhanced BERT with Disentangled Attention
+## DeBERTa-fixed: Decoding-enhanced BERT with Disentangled Attention
+
+### Example code
+
+```python
+from transformers import AutoTokenizer, AutoModelForCausalLM
+
+tokenizer = AutoTokenizer.from_pretrained("ltg/deberta-xxlarge-fixed", trust_remote_code=True)
+model = AutoModelForCausalLM.from_pretrained("ltg/deberta-xxlarge-fixed", trust_remote_code=True).cuda().eval()
+
+prompt = """German: Hallo, wie geht es Ihnen heute?
+English:"""
+prompt = prompt.replace('\n', '\\n ')
+input_ids = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).input_ids.cuda()
+
+prediction = model.generate(
+    input_ids,
+    num_beams=4,
+    do_sample=False,
+    use_cache=None,
+    max_new_tokens=64,
+    eos_token_id=tokenizer(".\\", add_special_tokens=False).input_ids[1:]
+)
+prediction = prediction[0, input_ids.size(1):]
+prediction = tokenizer.decode(prediction).rstrip('\\')
+
+# Expected output: "Hello, how are you doing today?"
+print(prediction)
+```
+
+## Old README below:
 
 [DeBERTa](https://arxiv.org/abs/2006.03654) improves the BERT and RoBERTa models using disentangled attention and an enhanced mask decoder. It outperforms BERT and RoBERTa on the majority of NLU tasks with 80GB of training data.
 
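The summary above credits DeBERTa's gains to disentangled attention: each token is represented by a content vector and a relative-position vector, and the attention score is the sum of content-to-content, content-to-position, and position-to-content terms. The sketch below is a minimal illustration of that scoring scheme, not the code used by this repository; the function name, the single-head setup, and the toy clipped relative distances are assumptions made for brevity.

```python
# Illustrative sketch only: one attention head, random weights, clipped
# relative distances. Not the implementation used by this model.
import torch

def disentangled_attention_scores(content, pos_emb, W_qc, W_kc, W_qr, W_kr, max_dist=4):
    """content: (seq_len, d) token content vectors.
    pos_emb: (2 * max_dist + 1, d) relative-position embeddings."""
    seq_len, d = content.shape

    q_c = content @ W_qc   # content queries
    k_c = content @ W_kc   # content keys
    q_r = pos_emb @ W_qr   # position queries
    k_r = pos_emb @ W_kr   # position keys

    # relative distance i - j, clipped to [-max_dist, max_dist] and shifted to be >= 0
    pos = torch.arange(seq_len)
    rel = (pos[:, None] - pos[None, :]).clamp(-max_dist, max_dist) + max_dist  # (seq_len, seq_len)

    c2c = q_c @ k_c.T                          # content -> content
    c2p = torch.gather(q_c @ k_r.T, 1, rel)    # content -> position: q_c[i] . k_r[delta(i, j)]
    p2c = torch.gather(k_c @ q_r.T, 1, rel).T  # position -> content: k_c[j] . q_r[delta(j, i)]

    return (c2c + c2p + p2c) / (3 * d) ** 0.5  # scaled sum of the three terms

# Tiny usage example with random tensors.
d, n = 16, 6
scores = disentangled_attention_scores(
    torch.randn(n, d), torch.randn(9, d),
    torch.randn(d, d), torch.randn(d, d), torch.randn(d, d), torch.randn(d, d),
)
attention = scores.softmax(dim=-1)  # (n, n) attention weights over the toy sequence
```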