Trained using Eole for 5½ hours on an NVIDIA RTX PRO 6000 Blackwell Server Edition GPU (compute capability 12.0, 96 GB VRAM, 600 W max power), with CUDA 13.0 (cuda_13.0.r13.0/compiler.36424714_0) and torch 2.10.0+cu128, courtesy of a Lightning.ai free trial.

Training
* embeddings: 32,317,440 parameters
* encoder: 75,524,096 parameters
* decoder: 100,702,208 parameters
* generator: 31,560 parameters
* other: 0 parameters
* number of parameters: 208,575,304
* Trainable parameters: {'torch.bfloat16': 208575304}
* Non-trainable parameters: {}
* src vocab size: 31,560
* tgt vocab size: 31,560
* Train perplexity: 4.96248
* Train accuracy: 95.5266
* Sentences processed: 5.38617e+07 ({'corpus_1': {'count': 53861702, 'index': 8095}})
* Average bsz: 1866/2478/82

Validation
* Validation BLEU: 43.41668106422167
* Validation perplexity: 22.17
* Validation accuracy: 71.8956
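For readers comparing the train and validation figures: perplexity is the exponential of the per-token cross-entropy (in nats), so the validation perplexity of 22.17 corresponds to roughly 3.1 nats of loss per token. A minimal sketch in plain Python (not Eole code; the values below are just the numbers reported above):

```python
import math

def perplexity(cross_entropy_nats: float) -> float:
    """Perplexity is exp(per-token cross-entropy in nats)."""
    return math.exp(cross_entropy_nats)

def cross_entropy(ppl: float) -> float:
    """Inverse: recover per-token cross-entropy from perplexity."""
    return math.log(ppl)

# Figures reported in this model card
train_ppl = 4.96248       # -> ~1.60 nats per token
val_ppl = 22.17           # -> ~3.10 nats per token
print(round(cross_entropy(train_ppl), 2))
print(round(cross_entropy(val_ppl), 2))
```

The gap between the two (perplexity 4.96 in training vs. 22.17 in validation) is the usual sign of a model that fits its training corpus much more tightly than held-out text.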

Dataset used to train Geremia23/AquinasLatinEnglishModel