Catherine Arnett
catherinearnett
41 followers · 22 following
https://catherinearnett.github.io/
linguist_cat
catherinearnett
catherinearnett.bsky.social
AI & ML interests
multilingual NLP, tokenization
Recent Activity
authored a paper · 21 days ago · BPE Stays on SCRIPT: Structured Encoding for Robust Multilingual Pretokenization
authored a paper · 21 days ago · Evaluating Morphological Alignment of Tokenizers in 70 Languages
liked a dataset · 22 days ago · classla/ParlaSpeech-PL
catherinearnett's datasets: 1
catherinearnett/morphscore • Viewer • Updated 26 days ago • 5.09M • 363 • 1