Robert Agee
RobAgrees
AI & ML interests
None yet
Recent Activity
reacted to ProCreations's post (19 days ago):
Eyyyy 50 followers
new activity in google/gemma-3n-E4B-it-litert-preview (21 days ago): Driver Code or GemmaCPP support?
new activity in w4r10ck/SOLAR-10.7B-Instruct-v1.0-uncensored (21 days ago): Adding Evaluation Results
Organizations
None yet
RobAgrees's activity
reacted to ProCreations's post (19 days ago)
Driver Code or GemmaCPP support? (#7, opened 25 days ago by yoyou446)
Adding Evaluation Results (#4, opened 8 months ago by leaderboard-pr-bot)

This thing is hardly evil at all (#1, opened 22 days ago by RobAgrees)
Is video generation broken? (#2, opened 29 days ago by RobAgrees)
reacted to codys12's post (29 days ago):
Introducing bitnet-r1-llama-8b and bitnet-r1-qwen-32b preview! These models are the first successful sub-1-billion-token finetunes to the BitNet architecture. We discovered that by adding an additional input RMSNorm to each linear, you can finetune directly to BitNet with fast convergence to the original model's performance!
We are working on a pull request to use this extra RMS for any model.
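For intuition, here is a rough, hypothetical PyTorch sketch of what "an input RMSNorm on every linear, with ternary weights" can look like; this is not the fork's actual implementation, and names like BitLinearWithInputNorm are made up for illustration:

import torch
import torch.nn as nn
import torch.nn.functional as F

class RMSNorm(nn.Module):
    # Standard RMSNorm: scale the input by the reciprocal of its root-mean-square.
    def __init__(self, dim, eps=1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(dim))
        self.eps = eps

    def forward(self, x):
        rms = x.pow(2).mean(dim=-1, keepdim=True).add(self.eps).rsqrt()
        return self.weight * (x * rms)

class BitLinearWithInputNorm(nn.Module):
    # Hypothetical drop-in for nn.Linear: normalize the input first, then apply a
    # ternary ({-1, 0, 1} * scale) quantized weight with a straight-through
    # estimator so the full-precision weights keep receiving gradients.
    def __init__(self, in_features, out_features, bias=False):
        super().__init__()
        self.norm = RMSNorm(in_features)
        self.linear = nn.Linear(in_features, out_features, bias=bias)

    def forward(self, x):
        x = self.norm(x)  # the extra per-linear input RMSNorm described above
        w = self.linear.weight
        scale = w.abs().mean().clamp(min=1e-5)
        w_q = (w / scale).round().clamp(-1, 1) * scale
        w_q = w + (w_q - w).detach()  # straight-through gradient to w
        return F.linear(x, w_q, self.linear.bias)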
To test these models now, install this fork of transformers:
pip install git+https://github.com/Codys12/transformers.git
Then load the models and test:
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the BitNet-finetuned checkpoint onto the GPU
model_id = "codys12/bitnet-r1-qwen-32b"
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="cuda",
)
tokenizer = AutoTokenizer.from_pretrained(model_id, padding_side="left")
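The snippet above only loads the model; a minimal smoke test, assuming the fork keeps the standard transformers generate API (the prompt here is just an example), could be:

prompt = "Explain the BitNet architecture in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))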
bitnet-r1-llama-8b and bitnet-r1-qwen-32b were trained on ~300M and ~200M tokens of the open-thoughts/OpenThoughts-114k dataset respectively, and were still improving significantly at the end of training. This preview simply demonstrates that the concept works; for future training runs we will leave the lm_head unquantized and align the last hidden state with the original model.
Huge thanks to the team that made this possible:
Gavin Childress, Aaron Herbst, Gavin Jones, Jasdeep Singh, Eli Vang, and Keagan Weinstock from the MSOE AI Club.
When will it be available for the open source community? (#1, opened 7 months ago by arpitsh018)

IllusionDiffusion (5.21k): Generate stunning high quality illusion artwork
Stable Diffusion XL on TPUv5e (1.98k): Generate images from text prompts with various styles
Playground V2.5 (64)
Playground V2.5 (1.11k): Generate highly aesthetic images
Real-Time Text-to-Image SDXL Lightning (598): Real-Time Image Generation with SDXL Lightning