Update README.md
README.md
@@ -1,3 +1,83 @@
---
license: apache-2.0
datasets:
- him1411/EDGAR10-Q
language:
- en
metrics:
- rouge
---
license: mit
language:
- en
tags:
- finance
- ContextNER
- language models
datasets:
- him1411/EDGAR10-Q
metrics:
- rouge
---

EDGAR-Tk-instruct-base-inst-tune
=============

A Tk-Instruct model instruction-tuned on the [EDGAR10-Q dataset](https://huggingface.co/datasets/him1411/EDGAR10-Q).

You may want to check out:
* Our paper: [CONTEXT-NER: Contextual Phrase Generation at Scale](https://arxiv.org/abs/2109.08079/)
* GitHub: [edgar10q-dataset](https://github.com/him1411/edgar10q-dataset)

Direct Use
=============

This model can be used to generate text, which is useful for experimentation and for understanding its capabilities. **It should not be used directly in production or in work that may directly impact people.**

How to Use
=============

You can load the model directly with Transformers instead of downloading it manually. The [Tk-Instruct-base model](https://huggingface.co/allenai/tk-instruct-base-def-pos) is the backbone of our model. Here is how to load the model in PyTorch:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Download (or reuse the cached copy of) the tokenizer and model weights from the Hub.
tokenizer = AutoTokenizer.from_pretrained("him1411/EDGAR-Tk-instruct-base-inst-tune")
model = AutoModelForSeq2SeqLM.from_pretrained("him1411/EDGAR-Tk-instruct-base-inst-tune")
```

Or clone the model repository:

```
git lfs install
git clone https://huggingface.co/him1411/EDGAR-Tk-instruct-base-inst-tune
```

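After cloning, the same `from_pretrained` calls accept a local directory path in place of the Hub ID. The following is a minimal sketch, assuming the clone sits in a directory named `EDGAR-Tk-instruct-base-inst-tune` under the working directory. `resolve_model_source` and `load` are hypothetical helper names used for illustration, not part of Transformers, and checking for `config.json` is only a rough test of whether the directory holds a usable clone.

```python
from pathlib import Path

HUB_ID = "him1411/EDGAR-Tk-instruct-base-inst-tune"


def resolve_model_source(local_dir, hub_id=HUB_ID):
    """Return the local clone path if it contains config.json, else the Hub ID.

    Hypothetical helper: the config.json check is an assumption about what
    a usable clone looks like, not a rule taken from the model card.
    """
    path = Path(local_dir)
    if (path / "config.json").is_file():
        return str(path)
    return hub_id


def load(local_dir="EDGAR-Tk-instruct-base-inst-tune"):
    # Imported lazily so resolve_model_source works without Transformers installed.
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    source = resolve_model_source(local_dir)
    tokenizer = AutoTokenizer.from_pretrained(source)
    model = AutoModelForSeq2SeqLM.from_pretrained(source)
    return tokenizer, model
```

Calling `load()` falls back to the Hub ID when no clone is found, so the same script should work on machines with and without a local copy.
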
Inference Example
=============

Here we provide an example for the "ContextNER" task, using a single instance.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("him1411/EDGAR-Tk-instruct-base-inst-tune")
model = AutoModelForSeq2SeqLM.from_pretrained("him1411/EDGAR-Tk-instruct-base-inst-tune")

# In this instance, the input is the target entity ("14.5 years") followed by the sentence that contains it.
text = "14.5 years . The definite lived intangible assets related to the contracts and trade names had estimated weighted average useful lives of 5.9 years and 14.5 years, respectively, at acquisition."
# Return PyTorch tensors so the encoding can be passed straight to generate().
tokenized_input = tokenizer(text, return_tensors="pt")
# Ideal output for this input is 'Definite lived intangible assets weighted average remaining useful life'
# max_new_tokens=32 is an illustrative cap, not a value specified by the authors.
output_ids = model.generate(**tokenized_input, max_new_tokens=32)
output = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(output)
```

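The model card lists ROUGE as its metric, so one way to judge a generated phrase is to score it against the ideal output. Below is a rough pure-Python sketch of ROUGE-L F1. It is a simplified illustration (lowercased whitespace tokens, no stemming), not the official ROUGE implementation or the authors' evaluation script, and the card does not say which ROUGE variant was used. The names `lcs_length` and `rouge_l_f1` and the sample `prediction` string are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(prediction, reference):
    """Simplified ROUGE-L F1 over lowercased whitespace tokens."""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    if not pred or not ref:
        return 0.0
    lcs = lcs_length(pred, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(pred)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


reference = "Definite lived intangible assets weighted average remaining useful life"
# Hypothetical model output, made up for illustration only.
prediction = "Intangible assets weighted average useful life"
score = rouge_l_f1(prediction, reference)
print(round(score, 3))  # 0.8
```

For reported numbers, an established ROUGE package is the safer choice; this sketch is only meant to make the comparison against the ideal output concrete.
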
BibTeX Entry and Citation Info
===============

If you are using our model, please cite our paper:

```bibtex
@article{gupta2021context,
  title={Context-NER: Contextual Phrase Generation at Scale},
  author={Gupta, Himanshu and Verma, Shreyas and Kumar, Tarun and Mishra, Swaroop and Agrawal, Tamanna and Badugu, Amogh and Bhatt, Himanshu Sharad},
  journal={arXiv preprint arXiv:2109.08079},
  year={2021}
}
```