lingvanex committed
Commit f52ed52 · verified · 1 Parent(s): 0f176ff

Update README.md

Files changed (1)
  1. README.md +92 -3
README.md CHANGED
@@ -1,3 +1,92 @@
- ---
- license: cc-by-nc-4.0
- ---
+ ---
+ language:
+ - en
+ - sw
+ tags:
+ - translation
+ - ctranslate2
+ license: cc-by-nc-4.0
+ ---
+
+ # English to Swahili Translation
+
+ This repository provides pre-trained multilingual translation models for fast, accurate translation between English and languages such as Kurdish, Samoan, Xhosa, Lao, Corsican, Cebuano, Galician, Yiddish, Swahili, and Yoruba. The models translate text from these languages into English and vice versa, which makes them suitable for machine translation tasks, localization projects, and custom translation tools.
+
+ # Key Features:
+
+ - English to Swahili translation
+ - Support for multiple languages (see full list below)
+ - Pre-trained and optimized for accuracy
+ - Easy integration into existing translation workflows
+
+ # Other Languages:
+
+ - Kurdish
+ - Samoan
+ - Xhosa
+ - Lao
+ - Corsican
+ - Cebuano
+ - Galician
+ - Yiddish
+ - Swahili
+ - Yoruba
+
+ # Use Cases:
+
+ - Machine translation of texts from underrepresented languages
+ - Localization of websites, apps, or documents into multiple languages
+ - Developing multilingual NLP tools for research and production environments
+
+
+ # Requirements:
+
+ To run the models, you need to install ctranslate2 and sentencepiece:
+
+ pip install ctranslate2 sentencepiece
+
+ # Simple Usage Example
+
+ The following code demonstrates how to load and use a model for translation from English to Swahili (en → sw).
+
+ ```python
+ import sentencepiece as spm
+ from ctranslate2 import Translator
+
+ path_to_model = '<here_is_your_path_to_the_model>'  # local directory holding the model and tokenizer files
+ source = 'en'
+ target = 'sw'
+
+ translator = Translator(path_to_model, compute_type='int8')
+ source_tokenizer = spm.SentencePieceProcessor(f'{path_to_model}/{source}.spm.model')
+ target_tokenizer = spm.SentencePieceProcessor(f'{path_to_model}/{target}.spm.model')
+
+ text = [
+     'I need to make a phone call.',
+     'Can I help you prepare food?',
+     'We want to go for a walk.'
+ ]
+
+ input_tokens = source_tokenizer.EncodeAsPieces(text)  # one list of subword pieces per sentence
+ translator_output = translator.translate_batch(
+     input_tokens,
+     batch_type='tokens',
+     beam_size=2,
+     max_input_length=0,  # 0 disables input truncation
+     max_decoding_length=256
+ )
+
+ output_tokens = [item.hypotheses[0] for item in translator_output]  # best hypothesis per sentence
+ translation = target_tokenizer.DecodePieces(output_tokens)
+ print('\n'.join(translation))
+ ```
+
+ # Keywords:
+ Kurdish to English Translation, Samoan to English Translation, Xhosa Translation, Lao to English, Corsican Translation, Cebuano Translation, Galician to English Translation, Yiddish to English Translation, Swahili Translation, Yoruba to English Translation, Multilingual Machine Translation, NLP, Neural Networks, eLearning
+
+ # License
+ This project is licensed under the CC BY-NC 4.0 license (cc-by-nc-4.0).
+
+ # Contact:
+ If you have any questions, just email [email protected]
+
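
The usage example added in this diff runs as a flat script. As a minimal sketch that is not part of the commit, the same steps could be wrapped in reusable functions. The names `load_pipeline` and `translate_texts` are hypothetical, and the tokenizer file layout (`{source}.spm.model`, `{target}.spm.model`) is assumed from the README's example. Each sentence is tokenized on its own, so the sketch does not rely on batch support in the tokenizer.

```python
from typing import List, Sequence


def load_pipeline(path_to_model: str, source: str = 'en', target: str = 'sw'):
    """Hypothetical helper: load the translator and both tokenizers.

    The imports are local so that translate_texts can be used (and tested)
    without ctranslate2 or sentencepiece installed.
    """
    import sentencepiece as spm
    from ctranslate2 import Translator

    translator = Translator(path_to_model, compute_type='int8')
    # File names follow the layout assumed from the README example.
    source_tokenizer = spm.SentencePieceProcessor(
        model_file=f'{path_to_model}/{source}.spm.model')
    target_tokenizer = spm.SentencePieceProcessor(
        model_file=f'{path_to_model}/{target}.spm.model')
    return translator, source_tokenizer, target_tokenizer


def translate_texts(texts: Sequence[str], translator, source_tokenizer,
                    target_tokenizer, beam_size: int = 2) -> List[str]:
    """Translate a batch of sentences, using the README's decoding settings."""
    if not texts:
        return []
    # Tokenize one sentence at a time into subword pieces.
    input_tokens = [source_tokenizer.EncodeAsPieces(t) for t in texts]
    results = translator.translate_batch(
        input_tokens,
        batch_type='tokens',
        beam_size=beam_size,
        max_input_length=0,       # 0 disables input truncation
        max_decoding_length=256,
    )
    # Keep the first (best) hypothesis of each result and detokenize it.
    return [target_tokenizer.DecodePieces(r.hypotheses[0]) for r in results]
```

With a model directory available, usage would look like `translator, src, tgt = load_pipeline(path_to_model)` followed by `translate_texts(['We want to go for a walk.'], translator, src, tgt)`. Passing the translator and tokenizers in as arguments keeps the model loaded across calls.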