BitTransformerLM: Model info, R&D direction, and use cases.
BitTransformerLM is a novel, experimental transformer architecture. Rather than operating at the token or byte level, it works directly on binary sequences, converting at the I/O boundary with parity-padded-and-checked text_to_bits and bits_to_text functions. This enables lossless round-tripping of any text, or indeed any data converted to binary, with parity checking along the way. Parity can optionally be disabled for direct, raw binary I/O, though this may be harder to work with, since the model is designed natively around the text_to_bits-plus-parity pipeline. The aim is to explore bit-native transformer use as a language model, with inherent multi-modal potential thanks to the bit-native design.
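To make the I/O pipeline concrete, here is a minimal sketch of what a parity-padded text-to-bits round trip could look like. This is an illustration only: the function names text_to_bits and bits_to_text come from the repo, but the specific scheme shown (one even-parity bit appended to each 8-bit UTF-8 byte, MSB first) is an assumption, not the repo's actual implementation.

```python
def text_to_bits(text: str) -> list[int]:
    """Sketch: encode each UTF-8 byte as 8 data bits (MSB first)
    plus one even-parity bit. Assumed scheme, not the repo's."""
    bits = []
    for byte in text.encode("utf-8"):
        data = [(byte >> i) & 1 for i in range(7, -1, -1)]
        bits.extend(data + [sum(data) % 2])  # append even-parity bit
    return bits


def bits_to_text(bits: list[int]) -> str:
    """Sketch: decode 9-bit groups, verifying each parity bit."""
    out = bytearray()
    for i in range(0, len(bits), 9):
        data, parity = bits[i:i + 8], bits[i + 8]
        if sum(data) % 2 != parity:
            raise ValueError(f"parity error in group starting at bit {i}")
        out.append(int("".join(map(str, data)), 2))
    return out.decode("utf-8")


# Round trip: any text survives the bit-level encoding intact.
assert bits_to_text(text_to_bits("hello, bits")) == "hello, bits"
```

A scheme like this is what makes the round trip "checked": a single flipped bit in any 9-bit group is detected at decode time rather than silently corrupting the output byte.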
BitTransformerLM is designed to be highly modular and toggleable for testing, with an end-to-end training, testing, and deployment platform built around it. It includes MCP server functionality, Flask server functionality, a dashboard UI, a Docker image build, a start.sh launcher, and further tooling and features documented inside the model repo. The model is best used with Claude Code or another coding agent within a Hugging Face Space or similar environment: download the zip of this model, upload it into your environment, and run Claude Code in your command console for relatively easy setup, testing, and training for your R&D or hobby needs. We would love for the wider community to download, experiment with, upgrade, and improve BitTransformerLM, and to explore all the features, integrations, and directions this model can go. While it is highly experimental and clearly in an early stage, it works and can be scaled and tested for serious R&D, or perhaps even trained and used for certain use cases already. The model ships untrained: a blank canvas for anyone to use, train, modify, research, and improve, open source under the AGPLv3.
All redeployments, modifications, derivatives, and other non-commercial uses of BitTransformerLM must retain the exact licensing structure, with the license file unmodified. See BitTransformerLM/LICENSE in the model repo for details.
For commercial use or deployment of BitTransformerLM outside the AGPLv3 license, contact us at [email protected] to inquire about the commercial version under commercial licensing.
Copyright (C) 2025 WCNEGENTROPY HOLDINGS LLC