Pangu Ultra MoE: How to Train Your Big MoE on Ascend NPUs • arXiv:2505.04519 • Published May 2025
ReplaceMe: Network Simplification via Layer Pruning and Linear Transformations • arXiv:2505.02819 • Published May 2025
Aleph-Alpha-GermanWeb: Improving German-language LLM pre-training with model-based data curation and synthetic data generation • arXiv:2505.00022 • Published May 2025
Evaluating the Quality of Benchmark Datasets for Low-Resource Languages: A Case Study on Turkish • arXiv:2504.09714 • Published Apr 13, 2025
ModernBERT or DeBERTaV3? Examining Architecture and Data Influence on Transformer Encoder Models Performance • arXiv:2504.08716 • Published Apr 11, 2025
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference • arXiv:2412.13663 • Published Dec 18, 2024
EuroBERT: Scaling Multilingual Encoders for European Languages • arXiv:2503.05500 • Published Mar 7, 2025