tomaarsen/span-marker-mbert-base-multinerd · Model fails to predict when the word is lowercase

Sep 16, 2023

The model is able to predict Pariisi as LOC, but it fails to predict the same word when it is in lowercase (pariisi). How can I resolve this issue?

tomaarsen

Owner Sep 16, 2023

Hello!
Great question. This model uses the bert-base-multilingual-cased model, meaning that it differentiates Pariisi and pariisi. Because it only sees the former during training, it doesn't work well for pariisi, as you've noticed. However, as the README says:

Is your data not (always) capitalized correctly? Then consider using this uncased variant of this model by @lxyuan for better performance:
lxyuan/span-marker-bert-base-multilingual-uncased-multinerd.

@lxyuan's model is equivalent to this one, except that it is uncased, i.e. it works just as well for pariisi as for Pariisi.

I recommend that one if your data isn't always correctly capitalized :) Hope this helps.
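For reference, here is a minimal sketch of how the uncased variant could be tried with the span_marker library. The model ID is the one from the README quote above; the helper names `load_uncased_model` and `entity_labels` are hypothetical and only for illustration, and the `"span"` and `"label"` keys are assumed from the prediction format that span_marker returns for a single string.

```python
# Model ID taken from the README recommendation quoted above.
UNCASED_MODEL_ID = "lxyuan/span-marker-bert-base-multilingual-uncased-multinerd"


def load_uncased_model():
    """Load the uncased SpanMarker model (hypothetical helper).

    span_marker is imported lazily so the helper below can be
    used and tested without the library or a network connection.
    """
    from span_marker import SpanMarkerModel

    return SpanMarkerModel.from_pretrained(UNCASED_MODEL_ID)


def entity_labels(model, sentence):
    """Return (span, label) pairs predicted for one sentence.

    Assumes each prediction is a dict with "span" and "label" keys.
    """
    return [(entity["span"], entity["label"]) for entity in model.predict(sentence)]


def demo():
    """Compare predictions for the capitalized and lowercase forms."""
    model = load_uncased_model()
    for text in ["Pariisi", "pariisi"]:
        print(text, entity_labels(model, text))
```

Calling `demo()` downloads the model on first use and prints the predicted entities for both spellings, so you can check whether they are handled the same way on your own data.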

Tom Aarsen

vpkprasanna

Sep 17, 2023

Got the answer, thanks for the reply.

tomaarsen changed discussion status to closed Sep 17, 2023