him1411
/

EDGAR-Tk-instruct-base-inst-tune

him1411 committed on May 12, 2023

Commit b774219 • 1 Parent(s): c78c553

Update README.md

Files changed (1)

README.md +80 -0

@@ -1,3 +1,83 @@
 ---
 license: apache-2.0
+datasets:
+- him1411/EDGAR10-Q
+language:
+- en
+metrics:
+- rouge
 ---
+license: mit
+language:
+- en
+tags:
+- finance
+- ContextNER
+- language models
+datasets:
+- him1411/EDGAR10-Q
+metrics:
+- rouge
+---
+EDGAR-Tk-instruct-base-inst-tune
+=============
+A Tk-Instruct model instruction-tuned on the [EDGAR10-Q dataset](https://huggingface.co/datasets/him1411/EDGAR10-Q).
+You may want to check out:
+* Our paper: [CONTEXT-NER: Contextual Phrase Generation at Scale](https://arxiv.org/abs/2109.08079/)
+* GitHub: [Click Here](https://github.com/him1411/edgar10q-dataset)
+Direct Use
+=============
+It is possible to use this model to generate text, which is useful for experimentation and understanding its capabilities. **It should not be directly used for production or work that may directly impact people.**
+How to Use
+=============
+You can load the model directly with Transformers instead of downloading it manually. The [Tk-Instruct-base model](https://huggingface.co/allenai/tk-instruct-base-def-pos) is the backbone of our model. Here is how to load the model and tokenizer in PyTorch:
+```python
+from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
+tokenizer = AutoTokenizer.from_pretrained("him1411/EDGAR-Tk-instruct-base-inst-tune")
+model = AutoModelForSeq2SeqLM.from_pretrained("him1411/EDGAR-Tk-instruct-base-inst-tune")
+```
+Or clone the model repo:
+```bash
+git lfs install
+git clone https://huggingface.co/him1411/EDGAR-Tk-instruct-base-inst-tune
+```
+Inference Example
+=============
+Below is a single inference instance for the "ContextNER" task.
+```python
+from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
+tokenizer = AutoTokenizer.from_pretrained("him1411/EDGAR-Tk-instruct-base-inst-tune")
+model = AutoModelForSeq2SeqLM.from_pretrained("him1411/EDGAR-Tk-instruct-base-inst-tune")
+# The input is the target entity ("14.5 years"), a separator, and the sentence that mentions it.
+input_text = "14.5 years . The definite lived intangible assets related to the contracts and trade names had estimated weighted average useful lives of 5.9 years and 14.5 years, respectively, at acquisition."
+# Tokenize into PyTorch tensors so the encoding can be passed to generate().
+tokenized_input = tokenizer(input_text, return_tensors="pt")
+# Ideal output for this input is 'Definite lived intangible assets weighted average remaining useful life'
+output_ids = model.generate(**tokenized_input, max_new_tokens=32)
+output = tokenizer.decode(output_ids[0], skip_special_tokens=True)
+```
+BibTeX Entry and Citation Info
+===============
+If you are using our model, please cite our paper:
+```bibtex
+@article{gupta2021context,
+  title={Context-NER: Contextual Phrase Generation at Scale},
+  author={Gupta, Himanshu and Verma, Shreyas and Kumar, Tarun and Mishra, Swaroop and Agrawal, Tamanna and Badugu, Amogh and Bhatt, Himanshu Sharad},
+  journal={arXiv preprint arXiv:2109.08079},
+  year={2021}
+}
+```
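
The README metadata lists ROUGE as the metric, and the inference example gives an ideal output for its input. As a rough illustration of how a generated phrase could be compared against that ideal output, here is a minimal pure-Python sketch of a ROUGE-L F1 score. It is not the authors' evaluation script: the choice of ROUGE-L, the lowercased whitespace tokenization, and the `prediction` string are all assumptions made for this example, and the helper names `lcs_length` and `rouge_l_f1` are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(prediction, reference):
    """ROUGE-L F1 over lowercased, whitespace-split tokens (a simplification)."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    lcs = lcs_length(pred_tokens, ref_tokens)
    if lcs == 0:
        return 0.0
    precision = lcs / len(pred_tokens)
    recall = lcs / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Ideal output taken from the README's inference example.
reference = "Definite lived intangible assets weighted average remaining useful life"
# Hypothetical model output, invented for this illustration only.
prediction = "intangible assets weighted average useful life"

score = rouge_l_f1(prediction, reference)
print(f"ROUGE-L F1: {score:.3f}")
```

For reported numbers, an established ROUGE implementation with proper tokenization and stemming would be a safer choice than this sketch.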