MariaFjodorowa committed
Commit c3c464f · verified · 1 Parent(s): 19e3ca9

fix language name

Files changed (1):
1. README.md (+8, -7)
README.md CHANGED
@@ -1,6 +1,6 @@
  ---
  language:
- - en
+ - sw
  inference: false
  tags:
  - BERT
@@ -9,9 +9,10 @@ tags:
  license: apache-2.0
  datasets:
  - HPLT/HPLT2.0_cleaned
+ pipeline_tag: fill-mask
  ---

- # HPLT v2.0 for English
+ # HPLT v2.0 for Swahili

  <img src="https://hplt-project.org/_next/static/media/logo-hplt.d5e16ca5.svg" width=12.5%>

@@ -40,8 +41,8 @@ This model currently needs a custom wrapper from `modeling_ltgbert.py`, you shou
  import torch
  from transformers import AutoTokenizer, AutoModelForMaskedLM

- tokenizer = AutoTokenizer.from_pretrained("HPLT/hplt_bert_base_eng-Latn")
- model = AutoModelForMaskedLM.from_pretrained("HPLT/hplt_bert_base_eng-Latn", trust_remote_code=True)
+ tokenizer = AutoTokenizer.from_pretrained("HPLT/hplt_bert_base_swh-Latn")
+ model = AutoModelForMaskedLM.from_pretrained("HPLT/hplt_bert_base_swh-Latn", trust_remote_code=True)

  mask_id = tokenizer.convert_tokens_to_ids("[MASK]")
  input_text = tokenizer("It's a beautiful[MASK].", return_tensors="pt")
@@ -60,13 +61,13 @@ We are releasing 10 intermediate checkpoints for each model at intervals of ever

  You can load a specific model revision with `transformers` using the argument `revision`:
  ```python
- model = AutoModelForMaskedLM.from_pretrained("HPLT/hplt_bert_base_eng-Latn", revision="step21875", trust_remote_code=True)
+ model = AutoModelForMaskedLM.from_pretrained("HPLT/hplt_bert_base_swh-Latn", revision="step21875", trust_remote_code=True)
  ```

  You can access all the revisions for the models with the following code:
  ```python
  from huggingface_hub import list_repo_refs
- out = list_repo_refs("HPLT/hplt_bert_base_eng-Latn")
+ out = list_repo_refs("HPLT/hplt_bert_base_swh-Latn")
  print([b.name for b in out.branches])
  ```

@@ -102,4 +103,4 @@ print([b.name for b in out.branches])
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2503.10267},
  }
- ```
+ ```
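
For reference, the masked-LM snippet touched by this commit appears truncated in the diff above. Below is a minimal sketch of how it can be finished end to end with the renamed `HPLT/hplt_bert_base_swh-Latn` repository; the `output_p` and `output_text` names and the final decoding step are assumptions beyond what the diff shows, following the standard `transformers` fill-mask interface.

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Load the Swahili model (needs the custom LTG-BERT wrapper, hence trust_remote_code).
tokenizer = AutoTokenizer.from_pretrained("HPLT/hplt_bert_base_swh-Latn")
model = AutoModelForMaskedLM.from_pretrained("HPLT/hplt_bert_base_swh-Latn", trust_remote_code=True)

mask_id = tokenizer.convert_tokens_to_ids("[MASK]")
input_text = tokenizer("It's a beautiful[MASK].", return_tensors="pt")

# Assumed continuation: take the argmax prediction at the masked position
# and splice it back into the input ids before decoding.
output_p = model(**input_text)
output_text = torch.where(
    input_text.input_ids == mask_id,
    output_p.logits.argmax(-1),
    input_text.input_ids,
)
print(tokenizer.decode(output_text[0].tolist()))
```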