Hugging Face
John Locke
johnlockejrr
2 followers · 16 following
AI & ML interests
NLP, OCR, AI
Recent Activity
Reacted to singhsidhukuldeep's post with 🚀 (3 days ago)
Exciting breakthrough in AI: @Meta's new Byte Latent Transformer (BLT) revolutionizes language models by eliminating tokenization! The BLT architecture introduces a groundbreaking approach that processes raw bytes instead of tokens, achieving state-of-the-art performance while being more efficient and robust. Here's what makes it special:

>> Key Innovations
Dynamic Patching: BLT groups bytes into variable-sized patches based on entropy, allocating more compute power where the data is more complex. This results in up to 50% fewer FLOPs during inference compared to traditional token-based models.
Three-Component Architecture:
• Lightweight Local Encoder that converts bytes to patch representations
• Powerful Global Latent Transformer that processes patches
• Local Decoder that converts patches back to bytes

>> Technical Advantages
• Matches performance of Llama 3 at 8B parameters while being more efficient
• Superior handling of non-English languages and rare character sequences
• Remarkable 99.9% accuracy on spelling tasks
• Better scaling properties than token-based models

>> Under the Hood
The system uses an entropy model to determine patch boundaries, cross-attention mechanisms for information flow, and hash n-gram embeddings for improved representation. The architecture allows simultaneous scaling of both patch and model size while maintaining fixed inference costs.

This is a game-changer for multilingual AI and could reshape how we build future language models. Excited to see how this technology evolves!
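The entropy-based patching idea mentioned in the post can be illustrated with a small sketch. This is not Meta's implementation: the post describes a learned entropy model, while the code below swaps in a toy bigram count table built from the input itself, and the function names (`next_byte_entropies`, `entropy_patches`) and the single global `threshold` parameter are assumptions made purely for illustration. The rule shown is "start a new patch wherever the predicted next-byte entropy exceeds a threshold", so predictable runs end up in long patches and hard-to-predict regions in short ones.

```python
import math
from collections import Counter, defaultdict


def next_byte_entropies(data: bytes) -> list[float]:
    """Entropy in bits of the predicted distribution for each byte position.

    Toy stand-in for a learned entropy model (assumption): a bigram count
    table built from `data` itself. Position 0 has no context, so it uses
    the unigram distribution.
    """
    if not data:
        return []
    unigram = Counter(data)
    bigram = defaultdict(Counter)
    for prev, cur in zip(data, data[1:]):
        bigram[prev][cur] += 1

    def entropy(counts: Counter) -> float:
        total = sum(counts.values())
        return -sum((c / total) * math.log2(c / total) for c in counts.values())

    entropies = [entropy(unigram)]
    for prev in data[:-1]:
        # Entropy of "what follows `prev`", used as the uncertainty of the next byte.
        entropies.append(entropy(bigram[prev]))
    return entropies


def entropy_patches(data: bytes, threshold: float) -> list[bytes]:
    """Split `data` into variable-sized patches.

    A new patch starts at position i whenever the predicted entropy at i
    exceeds `threshold` (hypothetical global-threshold rule).
    """
    entropies = next_byte_entropies(data)
    patches = []
    start = 0
    for i in range(1, len(data)):
        if entropies[i] > threshold:
            patches.append(data[start:i])
            start = i
    if data:
        patches.append(data[start:])
    return patches


if __name__ == "__main__":
    sample = b"aaaaaaaaab cdefg aaaaaaaa"
    for patch in entropy_patches(sample, threshold=1.0):
        print(patch)
```

Raising the threshold produces fewer, longer patches, and lowering it produces more, shorter ones. In a real system the patch sequence, not the byte sequence, would be fed to the large model, which is where the claimed compute savings would come from.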
Liked a Space: sivan22/Ituria (12 days ago)
New activity in Gabriel/Qwen2-VL-2B-Instruct: Model inference (15 days ago)
Organizations
None yet
Spaces (5)
🐢 Kraken Ocr • Kraken OCR for Samaritan MSS • Sleeping
📉 Kraken Syr Seg • Syriac region/textline segmentation with Kraken • Sleeping
🦀 PyLaia-heb_sam_v1 • Sleeping
🐢 PyLaia Arabic Handwritten v1 • Sleeping
🐢 PyLaia Hebrew McDonald v2 • Sleeping • 1
Models (15)
johnlockejrr/trocr_syr_v1 • Updated Nov 2 • 6
johnlockejrr/syrnt_v2_20-epoch • Updated Oct 23
johnlockejrr/syrnt_v2_13-epoch • Updated Oct 21
johnlockejrr/syrnt_v2 • Updated Oct 21 • 6
johnlockejrr/sbb_pixelwise_v1 • Updated Oct 18
johnlockejrr/pylaia-samaritan_v1 • Image-to-Text • Updated Oct 11
johnlockejrr/yolov8-samaritan-segmentation • Updated Sep 8
johnlockejrr/doc_ufcn_samaritan_v2 • Image Segmentation • Updated Sep 8
johnlockejrr/pylaia-heb_sam_v1 • Updated Sep 7
johnlockejrr/doc_ufcn_samaritan_v1 • Image Segmentation • Updated Aug 30
Datasets (11)
johnlockejrr/sofer_mahir_v1 • Viewer • Updated Jul 2 • 11.6k • 35
johnlockejrr/RASAM • Viewer • Updated Jul 2 • 4.65k • 66 • 2
johnlockejrr/KHATT_v1.0_dataset • Preview • Updated Jul 1 • 44 • 2
johnlockejrr/samaritan_v1 • Viewer • Updated Jul 1 • 4.97k • 41
johnlockejrr/sofer_mahir • Viewer • Updated May 1 • 3.61k • 45
johnlockejrr/sam_gt • Viewer • Updated Apr 27 • 1.94k • 40
johnlockejrr/sam_gt_sivan22 • Viewer • Updated Apr 24 • 1.94k • 40
johnlockejrr/sam3 • Viewer • Updated Apr 17 • 49.1k • 7
johnlockejrr/samv2 • Viewer • Updated Apr 14 • 52.7k • 5
johnlockejrr/sam • Viewer • Updated Apr 13 • 28.7k • 7