Upload processor

Files changed (6)

README.md +199 -0
added_tokens.json +5 -0
preprocessor_config.json +10 -0
special_tokens_map.json +6 -0
tokenizer_config.json +49 -0
vocab.json +208 -0

README.md ADDED

	@@ -0,0 +1,199 @@

+---
+library_name: transformers
+tags: []
+---
+# Model Card for Model ID
+<!-- Provide a quick summary of what the model is/does. -->
+## Model Details
+### Model Description
+<!-- Provide a longer summary of what this model is. -->
+This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
+- **Developed by:** [More Information Needed]
+- **Funded by [optional]:** [More Information Needed]
+- **Shared by [optional]:** [More Information Needed]
+- **Model type:** [More Information Needed]
+- **Language(s) (NLP):** [More Information Needed]
+- **License:** [More Information Needed]
+- **Finetuned from model [optional]:** [More Information Needed]
+### Model Sources [optional]
+<!-- Provide the basic links for the model. -->
+- **Repository:** [More Information Needed]
+- **Paper [optional]:** [More Information Needed]
+- **Demo [optional]:** [More Information Needed]
+## Uses
+<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+### Direct Use
+<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+[More Information Needed]
+### Downstream Use [optional]
+<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+[More Information Needed]
+### Out-of-Scope Use
+<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+[More Information Needed]
+## Bias, Risks, and Limitations
+<!-- This section is meant to convey both technical and sociotechnical limitations. -->
+[More Information Needed]
+### Recommendations
+<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+## How to Get Started with the Model
+Use the code below to get started with the model.
+[More Information Needed]
+## Training Details
+### Training Data
+<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+[More Information Needed]
+### Training Procedure
+<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+#### Preprocessing [optional]
+[More Information Needed]
+#### Training Hyperparameters
+- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+#### Speeds, Sizes, Times [optional]
+<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+[More Information Needed]
+## Evaluation
+<!-- This section describes the evaluation protocols and provides the results. -->
+### Testing Data, Factors & Metrics
+#### Testing Data
+<!-- This should link to a Dataset Card if possible. -->
+[More Information Needed]
+#### Factors
+<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+[More Information Needed]
+#### Metrics
+<!-- These are the evaluation metrics being used, ideally with a description of why. -->
+[More Information Needed]
+### Results
+[More Information Needed]
+#### Summary
+## Model Examination [optional]
+<!-- Relevant interpretability work for the model goes here -->
+[More Information Needed]
+## Environmental Impact
+<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+- **Hardware Type:** [More Information Needed]
+- **Hours used:** [More Information Needed]
+- **Cloud Provider:** [More Information Needed]
+- **Compute Region:** [More Information Needed]
+- **Carbon Emitted:** [More Information Needed]
+## Technical Specifications [optional]
+### Model Architecture and Objective
+[More Information Needed]
+### Compute Infrastructure
+[More Information Needed]
+#### Hardware
+[More Information Needed]
+#### Software
+[More Information Needed]
+## Citation [optional]
+<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+**BibTeX:**
+[More Information Needed]
+**APA:**
+[More Information Needed]
+## Glossary [optional]
+<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+[More Information Needed]
+## More Information [optional]
+[More Information Needed]
+## Model Card Authors [optional]
+[More Information Needed]
+## Model Card Contact
+[More Information Needed]

added_tokens.json ADDED

	@@ -0,0 +1,5 @@

+{
+  "</s>": 205,
+  "<s>": 204,
+  "[PAD]": 203
+}

preprocessor_config.json ADDED

	@@ -0,0 +1,10 @@

+{
+  "do_normalize": true,
+  "feature_extractor_type": "Wav2Vec2FeatureExtractor",
+  "feature_size": 1,
+  "padding_side": "right",
+  "padding_value": 0.0,
+  "processor_class": "Wav2Vec2Processor",
+  "return_attention_mask": true,
+  "sampling_rate": 16000
+}
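
Note on preprocessor_config.json: the settings above describe three steps applied to raw 16 kHz audio: per-utterance normalization (`do_normalize`), right-padding with `padding_value`, and an attention mask that marks real samples (`return_attention_mask`). The following is a minimal pure-Python sketch of that behavior, not the `Wav2Vec2FeatureExtractor` implementation. The function names `normalize` and `pad_batch` and the `eps` stabilizer are assumptions made for illustration.

```python
import math
from typing import List, Tuple

PADDING_VALUE = 0.0  # "padding_value" in preprocessor_config.json


def normalize(samples: List[float], eps: float = 1e-7) -> List[float]:
    """Scale one utterance to zero mean and unit variance (do_normalize).

    eps is an assumed stabilizer that avoids dividing by zero on silence.
    """
    mean = sum(samples) / len(samples)
    var = sum((x - mean) ** 2 for x in samples) / len(samples)
    return [(x - mean) / math.sqrt(var + eps) for x in samples]


def pad_batch(batch: List[List[float]]) -> Tuple[List[List[float]], List[List[int]]]:
    """Right-pad every utterance to the longest one and build attention masks."""
    longest = max(len(utterance) for utterance in batch)
    padded, masks = [], []
    for utterance in batch:
        missing = longest - len(utterance)
        # padding_side: "right" means padding is appended after the samples.
        padded.append(utterance + [PADDING_VALUE] * missing)
        # 1 marks a real sample, 0 marks padding (return_attention_mask).
        masks.append([1] * len(utterance) + [0] * missing)
    return padded, masks


# Two toy utterances of different lengths (values are made up).
batch = [normalize([0.1, 0.3, 0.5, 0.7]), normalize([0.2, 0.4])]
input_values, attention_mask = pad_batch(batch)
```

Normalizing before padding keeps the padded zeros out of the mean and variance, which is why the two steps are kept separate in this sketch.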

special_tokens_map.json ADDED

	@@ -0,0 +1,6 @@

+{
+  "bos_token": "<s>",
+  "eos_token": "</s>",
+  "pad_token": "[PAD]",
+  "unk_token": "[UNK]"
+}

tokenizer_config.json ADDED

	@@ -0,0 +1,49 @@

+{
+  "added_tokens_decoder": {
+    "202": {
+      "content": "[UNK]",
+      "lstrip": true,
+      "normalized": false,
+      "rstrip": true,
+      "single_word": false,
+      "special": false
+    },
+    "203": {
+      "content": "[PAD]",
+      "lstrip": true,
+      "normalized": false,
+      "rstrip": true,
+      "single_word": false,
+      "special": false
+    },
+    "204": {
+      "content": "<s>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "205": {
+      "content": "</s>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    }
+  },
+  "bos_token": "<s>",
+  "clean_up_tokenization_spaces": false,
+  "do_lower_case": false,
+  "eos_token": "</s>",
+  "extra_special_tokens": {},
+  "model_max_length": 1000000000000000019884624838656,
+  "pad_token": "[PAD]",
+  "processor_class": "Wav2Vec2Processor",
+  "replace_word_delimiter_char": " ",
+  "target_lang": "wol",
+  "tokenizer_class": "Wav2Vec2CTCTokenizer",
+  "unk_token": "[UNK]",
+  "word_delimiter_token": "|"
+}
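
Note on tokenizer_config.json: `tokenizer_class` is `Wav2Vec2CTCTokenizer`, with `word_delimiter_token` set to `|` and `replace_word_delimiter_char` set to a space. A CTC decoder typically collapses repeated frame predictions, drops the blank token, and then turns word delimiters into spaces. The sketch below illustrates that idea in pure Python on a small vocabulary excerpt. It is not the library's code: the function name `ctc_greedy_decode`, the sample frame ids, and the assumption that `[PAD]` serves as the CTC blank are illustrative.

```python
from itertools import groupby
from typing import Dict, List

# Small excerpt of the "wol" vocabulary plus two special tokens.
VOCAB: Dict[str, int] = {"|": 0, "a": 30, "m": 42, "n": 43, "[UNK]": 202, "[PAD]": 203}
ID_TO_TOKEN = {token_id: token for token, token_id in VOCAB.items()}

PAD_TOKEN = "[PAD]"          # assumed to act as the CTC blank
WORD_DELIMITER = "|"         # word_delimiter_token
DELIMITER_REPLACEMENT = " "  # replace_word_delimiter_char


def ctc_greedy_decode(ids: List[int]) -> str:
    """Collapse repeated ids, drop blanks, then turn delimiters into spaces."""
    # Step 1: merge runs of identical consecutive ids into one id.
    collapsed = [key for key, _ in groupby(ids)]
    # Step 2: map ids to tokens; ids outside the excerpt become [UNK].
    tokens = [ID_TO_TOKEN.get(token_id, "[UNK]") for token_id in collapsed]
    # Step 3: remove the blank token.
    chars = [token for token in tokens if token != PAD_TOKEN]
    # Step 4: replace the word delimiter with a space.
    return "".join(chars).replace(WORD_DELIMITER, DELIMITER_REPLACEMENT).strip()


# Made-up per-frame argmax ids, including repeats and blanks.
frame_ids = [42, 42, 203, 30, 30, 0, 43, 203, 30, 30]
text = ctc_greedy_decode(frame_ids)
```

A blank between two identical ids keeps both characters, so `[30, 203, 30]` decodes to two letters rather than one.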

vocab.json ADDED

	@@ -0,0 +1,208 @@

+{
+  "wol": {
+    "!": 1,
+    "#": 2,
+    "$": 3,
+    "&": 4,
+    "'": 5,
+    "*": 6,
+    "+": 7,
+    "-": 8,
+    "/": 9,
+    "0": 10,
+    "1": 11,
+    "2": 12,
+    "3": 13,
+    "4": 14,
+    "5": 15,
+    "6": 16,
+    "7": 17,
+    "8": 18,
+    "9": 19,
+    "<": 20,
+    "=": 21,
+    ">": 22,
+    "?": 23,
+    "@": 24,
+    "[": 25,
+    "[PAD]": 203,
+    "[UNK]": 202,
+    "\\": 26,
+    "]": 27,
+    "^": 28,
+    "`": 29,
+    "a": 30,
+    "b": 31,
+    "c": 32,
+    "d": 33,
+    "e": 34,
+    "f": 35,
+    "g": 36,
+    "h": 37,
+    "i": 38,
+    "j": 39,
+    "k": 40,
+    "l": 41,
+    "m": 42,
+    "n": 43,
+    "o": 44,
+    "p": 45,
+    "q": 46,
+    "r": 47,
+    "s": 48,
+    "t": 49,
+    "u": 50,
+    "v": 51,
+    "w": 52,
+    "x": 53,
+    "y": 54,
+    "z": 55,
+    "{": 56,
+    "|": 0,
+    "}": 58,
+    "~": 59,
+    "£": 60,
+    "¨": 61,
+    "°": 62,
+    "²": 63,
+    "µ": 64,
+    "à": 65,
+    "á": 66,
+    "â": 67,
+    "ã": 68,
+    "ä": 69,
+    "ç": 70,
+    "è": 71,
+    "é": 72,
+    "ê": 73,
+    "ë": 74,
+    "ì": 75,
+    "í": 76,
+    "î": 77,
+    "ï": 78,
+    "ñ": 79,
+    "ò": 80,
+    "ó": 81,
+    "ô": 82,
+    "õ": 83,
+    "ö": 84,
+    "ù": 85,
+    "ú": 86,
+    "û": 87,
+    "ü": 88,
+    "ā": 89,
+    "ă": 90,
+    "ą": 91,
+    "đ": 92,
+    "ĩ": 93,
+    "ł": 94,
+    "ń": 95,
+    "ŋ": 96,
+    "ş": 97,
+    "ƭ": 98,
+    "ɐ": 99,
+    "ɓ": 100,
+    "ɗ": 101,
+    "ə": 102,
+    "ˈ": 103,
+    "ː": 104,
+    "̀": 105,
+    "̃": 106,
+    "̈": 107,
+    "έ": 108,
+    "α": 109,
+    "β": 110,
+    "γ": 111,
+    "η": 112,
+    "ι": 113,
+    "μ": 114,
+    "ν": 115,
+    "ξ": 116,
+    "ο": 117,
+    "ρ": 118,
+    "σ": 119,
+    "ό": 120,
+    "а": 121,
+    "г": 122,
+    "з": 123,
+    "й": 124,
+    "к": 125,
+    "м": 126,
+    "н": 127,
+    "о": 128,
+    "р": 129,
+    "с": 130,
+    "т": 131,
+    "у": 132,
+    "ц": 133,
+    "ы": 134,
+    "я": 135,
+    "ё": 136,
+    "ї": 137,
+    "ء": 138,
+    "آ": 139,
+    "ا": 140,
+    "ت": 141,
+    "ح": 142,
+    "خ": 143,
+    "د": 144,
+    "ر": 145,
+    "ق": 146,
+    "ل": 147,
+    "م": 148,
+    "ن": 149,
+    "ه": 150,
+    "و": 151,
+    "ي": 152,
+    "ẽ": 153,
+    "ị": 154,
+    "ồ": 155,
+    "ớ": 156,
+    "ὀ": 157,
+    "ῆ": 158,
+    "‍": 159,
+    "‎": 160,
+    "–": 161,
+    "—": 162,
+    "’": 163,
+    "…": 164,
+    "′": 165,
+    "″": 166,
+    "€": 167,
+    "↓": 168,
+    "①": 169,
+    "②": 170,
+    "③": 171,
+    "④": 172,
+    "⑧": 173,
+    "▪": 174,
+    "●": 175,
+    "☞": 176,
+    "☢": 177,
+    "♀": 178,
+    "♂": 179,
+    "♥": 180,
+    "⛔": 181,
+    "✅": 182,
+    "✊": 183,
+    "✍": 184,
+    "❛": 185,
+    "❜": 186,
+    "《": 187,
+    "天": 188,
+    "安": 189,
+    "州": 190,
+    "市": 191,
+    "广": 192,
+    "庆": 193,
+    "武": 194,
+    "汉": 195,
+    "沈": 196,
+    "津": 197,
+    "西": 198,
+    "重": 199,
+    "阳": 200,
+    "︎": 201,
+    "️": 202
+  }
+}
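
Note on vocab.json: as listed above, id 57 is not assigned to any token, and id 202 appears twice, once for `[UNK]` and once for the final entry. A quick consistency check can surface both patterns before the vocabulary is used for training. The helper below is a hedged sketch: `check_vocab` is a hypothetical name, and the `excerpt` dictionary is a small made-up sample that mirrors the same two patterns rather than the real file.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def check_vocab(vocab: Dict[str, int]) -> Tuple[Dict[int, List[str]], List[int]]:
    """Return ids shared by several tokens and ids missing from 0..max id."""
    tokens_by_id: Dict[int, List[str]] = defaultdict(list)
    for token, token_id in vocab.items():
        tokens_by_id[token_id].append(token)
    # An id mapped to more than one token makes decoding ambiguous.
    duplicates = {i: toks for i, toks in tokens_by_id.items() if len(toks) > 1}
    # A gap means one output index of the model never maps to a token.
    missing = [i for i in range(max(tokens_by_id) + 1) if i not in tokens_by_id]
    return duplicates, missing


# Hypothetical excerpt: id 3 is unused and id 5 is assigned twice.
excerpt = {"|": 0, "!": 1, "#": 2, "$": 4, "[UNK]": 5, "\ufe0f": 5, "[PAD]": 6}
duplicates, missing = check_vocab(excerpt)
```

To run the same check on the real file, load it with `json.load` and pass the inner dictionary stored under the `"wol"` key, since the vocabulary is nested by `target_lang`.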