John Locke

johnlockejrr

AI & ML interests

NLP, OCR, AI

Organizations

None yet

johnlockejrr's activity

reacted to onekq's post with πŸ‘ 5 days ago
QwQ-32B is amazing!

It ranks below o1-preview, but beats DeepSeek v3 and all Gemini models.
onekq-ai/WebApp1K-models-leaderboard

Now that we have such a powerful model that fits on a single GPU, can someone fine-tune a web app model to push the SOTA on my leaderboard? πŸ€—
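
For anyone curious before fine-tuning, here is a minimal sketch of trying the model locally, assuming the public checkpoint id Qwen/QwQ-32B and the standard transformers chat-template API; the dtype and device settings are illustrative, not a recommendation:

```python
# Minimal sketch: load QwQ-32B and ask it for a small web-app task.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/QwQ-32B"  # public checkpoint id (assumed unchanged)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # keep the dtype the checkpoint was saved in
    device_map="auto",    # shard across whatever GPUs are visible
)

messages = [{"role": "user", "content": "Write a minimal React counter component."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=512)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```
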
reacted to MohamedRashad's post with ❀️ 10 days ago
I think we have released the best Arabic model under 25B, at least according to inceptionai/AraGen-Leaderboard

Yehia = ALLaM-AI/ALLaM-7B-Instruct-preview + GRPO

and it's ranked as the number one model under the 25B parameter mark.

Now, I said "I think", not "I am sure", because this model used the same evaluation metric the AraGen developers use (3C3H) as a reward model to improve its responses, and that raises a question: is this good for users, or is it another kind of overfitting that we don't want?

I don't know whether this is a good thing or a bad thing, but what I do know is that you can try it here:
Navid-AI/Yehia-7B-preview

or download it for your own experiments here:
Navid-AI/Yehia-7B-preview
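
For local experiments, a minimal sketch of querying the checkpoint via the standard transformers text-generation pipeline; it assumes the repo ships a chat template, and the prompt and generation settings are only illustrative:

```python
# Minimal sketch: query Yehia-7B-preview with the transformers
# text-generation pipeline. Assumes the repo provides a chat template.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="Navid-AI/Yehia-7B-preview",
    torch_dtype="auto",
    device_map="auto",
)

# "What is the capital of Egypt?" in Arabic
messages = [{"role": "user", "content": "ما هي عاصمة مصر؟"}]
result = generator(messages, max_new_tokens=128)
print(result[0]["generated_text"][-1]["content"])  # last turn is the model's reply
```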

Ramadan Kareem πŸŒ™
reacted to victor's post with πŸš€ about 1 month ago
Finally, an open-source AI that turns your lyrics into full songs is here: meet YuE! Unlike other tools that only create short clips, YuE can generate entire songs (up to 5 minutes) with vocals, melody, and instruments all working together. Let's go!

m-a-p/YuE-s1-7B-anneal-en-cot
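
For reference, a minimal sketch of staging the stage-1 weights locally with huggingface_hub; the full lyrics-to-song pipeline (stage-2 decoding and vocoding) runs through the project's own inference scripts, so this only downloads the checkpoint:

```python
# Minimal sketch: download the YuE stage-1 checkpoint. The complete
# lyrics-to-song pipeline lives in the project's own inference scripts;
# this only stages the weights locally.
from huggingface_hub import snapshot_download

local_dir = snapshot_download("m-a-p/YuE-s1-7B-anneal-en-cot")
print(f"Stage-1 weights downloaded to: {local_dir}")
```
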
New activity in Teklia/pylaia-belfort about 2 months ago

PyLaia enhancement

#7 opened about 2 months ago by johnlockejrr
reacted to singhsidhukuldeep's post with πŸš€ 3 months ago
Exciting breakthrough in AI: @Meta's new Byte Latent Transformer (BLT) revolutionizes language models by eliminating tokenization!

The BLT architecture introduces a groundbreaking approach that processes raw bytes instead of tokens, achieving state-of-the-art performance while being more efficient and robust. Here's what makes it special:

>> Key Innovations
Dynamic Patching: BLT groups bytes into variable-sized patches based on entropy, allocating more compute power where the data is more complex. This results in up to 50% fewer FLOPs during inference compared to traditional token-based models.
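
To make the idea concrete, here is a toy sketch of entropy-based patching, with a stand-in entropy function playing the role of BLT's small byte-level entropy model; the threshold and patch cap are made-up numbers:

```python
# Toy illustration of entropy-based dynamic patching: a new patch
# starts wherever next-byte entropy crosses a threshold, so
# unpredictable regions get smaller patches (and thus more compute).
import random

def next_byte_entropy(context: bytes) -> float:
    """Stand-in for the small byte LM; returns entropy in bits.
    A real implementation would score the next byte's distribution."""
    random.seed(hash(context) & 0xFFFF)  # deterministic toy values
    return random.uniform(0.0, 8.0)      # a byte's entropy is at most 8 bits

def dynamic_patches(data: bytes, threshold: float = 4.0, max_patch: int = 16) -> list[bytes]:
    patches, start = [], 0
    for i in range(1, len(data)):
        # New patch when the next byte looks hard to predict,
        # or when the current patch hits its size cap.
        if next_byte_entropy(data[:i]) > threshold or i - start >= max_patch:
            patches.append(data[start:i])
            start = i
    patches.append(data[start:])
    return patches

print(dynamic_patches(b"The Byte Latent Transformer processes raw bytes."))
```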

Three-Component Architecture:
β€’ Lightweight Local Encoder that converts bytes to patch representations
β€’ Powerful Global Latent Transformer that processes patches
β€’ Local Decoder that converts patches back to bytes

>> Technical Advantages
β€’ Matches performance of Llama 3 at 8B parameters while being more efficient
β€’ Superior handling of non-English languages and rare character sequences
β€’ Remarkable 99.9% accuracy on spelling tasks
β€’ Better scaling properties than token-based models

>> Under the Hood
The system uses an entropy model to determine patch boundaries, cross-attention mechanisms for information flow, and hash n-gram embeddings for improved representation. The architecture allows simultaneous scaling of both patch and model size while maintaining fixed inference costs.
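
As a toy sketch of the hash n-gram embedding idea (table size, dimension, and Python's built-in hash are illustrative choices, not the paper's):

```python
# Toy illustration of hash n-gram embeddings: every byte n-gram is
# hashed into a fixed-size table, so rare sequences still get a vector
# without an enormous vocabulary. A real system would use a stable hash.
import numpy as np

TABLE_SIZE, DIM = 50_021, 32  # illustrative: prime-sized table, small dim
table = np.random.default_rng(0).normal(size=(TABLE_SIZE, DIM))

def ngram_embedding(data: bytes, n: int = 3) -> np.ndarray:
    """Average the hashed embeddings of all byte n-grams in `data`."""
    vecs = [table[hash(data[i:i + n]) % TABLE_SIZE]
            for i in range(len(data) - n + 1)]
    return np.mean(vecs, axis=0)

print(ngram_embedding("tokenization-free".encode("utf-8")).shape)  # (32,)
```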

This is a game-changer for multilingual AI and could reshape how we build future language models. Excited to see how this technology evolves!
Β·
New activity in Gabriel/Qwen2-VL-2B-Instruct 3 months ago

Model inference

#1 opened 3 months ago by johnlockejrr
reacted to MohamedRashad's post with ❀️❀️ 3 months ago
New activity in MohamedRashad/arabic-small-nougat 3 months ago

Arabic Small Nougat

#1 opened 11 months ago by johnlockejrr
reacted to MohamedRashad's post with πŸ€—πŸš€ 3 months ago
upvoted an article 4 months ago

HTRflow - A tool for HTR and OCR

By Gabriel and 3 others