It works for Chinese and English. Is it possible to use it for other languages, such as French or Korean?
#2 by sk2mm2 - opened
It works for Chinese and English. Is it possible to use it for other languages, such as French or Korean?
Which languages we can support depends mainly on two factors: whether the pretrained encoder model we use supports the language, and whether our training set includes samples in that language. There are many good multilingual pretrained models on HuggingFace, but we do not have a multilingual training dataset. If you can provide a suitable dataset, we would be happy to train an embedding model that supports more languages.
Each training sample has the following format:
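from dataclasses import dataclass

@dataclass(slots=True)  # slots=True requires Python 3.10+
class PairRecord:
    text: str      # the input text
    text_pos: str  # the positive text paired with it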
MokaHR changed discussion status to closed