Quite excited by the ModernBERT release! Small at 0.15B/0.4B parameters, trained on 2T tokens of modern pre-training data (including code) with a modern tokenizer, and an 8k context window. A great, efficient model for embeddings & classification!
This will probably be the basis for many future SOTA encoders! And I can finally stop using DeBERTaV3 from 2021 :D
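For anyone who wants to try it right away, here's a minimal sketch for pulling sentence embeddings out of it with mean pooling (assuming the `answerdotai/ModernBERT-base` checkpoint on the Hub and a `transformers` version recent enough to include ModernBERT support):

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Assumed model ID; swap in ModernBERT-large for the 0.4B variant.
model_id = "answerdotai/ModernBERT-base"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

sentences = ["ModernBERT is an efficient encoder.", "Great for embeddings!"]
inputs = tokenizer(sentences, padding=True, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# Mean-pool token embeddings, masking out padding, to get one vector per sentence.
mask = inputs["attention_mask"].unsqueeze(-1).float()
embeddings = (outputs.last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1)
print(embeddings.shape)  # (2, hidden_size)
```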
Small Language Model enthusiasts and GPU-poor OSS enjoyers, let's connect! I just created an organization whose main goal is to have fun with smaller models that are tunable on consumer-grade GPUs. Feel free to join and let's have some fun, much love ;3