
Aleph-Alpha/llama-tfree-hat-pretrained-7b-dpo (7B)
Tokenizer-free models based on the Hierarchical Autoregressive Transformer (https://arxiv.org/abs/2501.10322), trained from scratch.
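
As a minimal sketch (not taken from the model card), the checkpoint could be loaded through the Hugging Face `transformers` auto classes; the `AutoModelForCausalLM` path and `trust_remote_code=True` flag are assumptions here, since the custom HAT architecture is not part of core `transformers`.

```python
# Hypothetical loading sketch for the HAT checkpoint; the exact entry point
# (auto class, remote code requirements) is an assumption, not documented here.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "Aleph-Alpha/llama-tfree-hat-pretrained-7b-dpo",
    trust_remote_code=True,  # assumed: custom HAT modeling code ships with the repo
)

# Tokenizer-free: per the linked paper, inputs are processed hierarchically from
# raw bytes/characters rather than subword token IDs, so there is no separate
# AutoTokenizer step in the usual sense.
```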