## Model Summary

- Family: BitNet b1.58 (ternary weights {-1, 0, +1} with abs-mean scaling).
- Post-training recipe: bilingual DPO (FR+EN) + ModelStock/TIES merges to combine FR-centric and EN-centric variants (agent-oriented behaviors; pragmatic reasoning).
- This repo: GGUF weights for efficient local inference with bitnet.cpp.
- Training & provenance: see the BF16 model card for full details of datasets, merges, and configuration.
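For readers unfamiliar with the quantization scheme, the abs-mean ternarization mentioned above can be sketched in a few lines. This is an illustrative implementation of the absmean scheme described in the BitNet b1.58 papers, not code from this repo; the function name is hypothetical.

```python
import numpy as np

def absmean_ternarize(W: np.ndarray, eps: float = 1e-8):
    """Quantize a weight matrix to {-1, 0, +1} with abs-mean scaling.

    Sketch of the BitNet b1.58 scheme: compute the mean absolute
    value of the matrix, divide by it, then round each entry to the
    nearest ternary value and clip to [-1, 1].
    """
    gamma = np.abs(W).mean()                           # abs-mean scale
    W_q = np.clip(np.round(W / (gamma + eps)), -1, 1)  # ternary codes
    return W_q.astype(np.int8), gamma                  # codes + scale

# At inference, the effective weight is approximately W_q * gamma.
W = np.array([[0.4, -0.05, -0.9], [1.2, 0.0, -0.3]])
codes, scale = absmean_ternarize(W)
```

Only the codes and a single per-matrix scale need to be stored, which is what makes the 1.58-bit memory footprint possible.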
## Upstream references
- Technical Report: BitNet b1.58 2B4T Technical Report (Microsoft Research, 2025). Contains the official description of the GGUF variant “used for bitnet.cpp” and the lossless-inference note.
- Official GGUF base model (Microsoft): microsoft/bitnet-b1.58-2B-4T-gguf
- bitnet.cpp (official inference framework): microsoft/BitNet on GitHub
## About “lossless” (what it means here)
Microsoft’s report states that the CPU reference implementation “ensur[es] numerical accuracy (lossless inference relative to the training procedure)” when running BitNet b1.58 models via bitnet.cpp.

- In practice, this means the 1.58-bit packed weights used at train time are executed as-is by the specialized kernels; the GGUF container is simply the delivery format consumed by bitnet.cpp for these kernels.
- Microsoft’s GGUF model card also explicitly presents the GGUF variant as the format “compatible with the bitnet.cpp library”.
Note: Efficiency claims (memory/latency/energy) and the “lossless” inference property apply when using bitnet.cpp. Running the model through generic paths (e.g., vanilla Transformers) doesn’t unlock those kernel-level advantages. See Microsoft’s GGUF page and the bitnet.cpp README.
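To make “packed weights” concrete, here is a toy sketch of storing four ternary values per byte at 2 bits each. This is purely illustrative: the actual bitnet.cpp kernels use their own optimized layouts (e.g., I2_S, TL1/TL2), and the helper names below are made up for this example.

```python
import numpy as np

def pack_ternary_2bit(codes: np.ndarray) -> np.ndarray:
    """Pack ternary codes {-1, 0, +1} into bytes, 4 codes per byte.

    Each code is stored as 2 bits using the offset mapping
    -1 -> 0, 0 -> 1, +1 -> 2. Toy layout, not bitnet.cpp's.
    """
    assert codes.size % 4 == 0
    u = (codes.astype(np.int8) + 1).astype(np.uint8).reshape(-1, 4)
    return u[:, 0] | (u[:, 1] << 2) | (u[:, 2] << 4) | (u[:, 3] << 6)

def unpack_ternary_2bit(packed: np.ndarray) -> np.ndarray:
    """Inverse of pack_ternary_2bit: bytes back to ternary codes."""
    shifts = np.array([0, 2, 4, 6], dtype=np.uint8)
    u = (packed[:, None] >> shifts) & 0b11
    return u.astype(np.int8).reshape(-1) - 1

codes = np.array([-1, 0, 1, 1, 0, 0, -1, 1], dtype=np.int8)
packed = pack_ternary_2bit(codes)        # 8 codes -> 2 bytes
restored = unpack_ternary_2bit(packed)   # round-trips exactly
```

The point of the sketch is the round trip: because packing and unpacking are exact, running kernels directly on the packed representation loses nothing relative to the ternary weights produced by training.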
## Intended Use
- Great for: agent-oriented assistants, bilingual instruction following, pragmatic reasoning, and everyday knowledge tasks — on CPUs or modest GPUs using bitnet.cpp.
- Not optimized for: formal math or code generation (see the BF16 card for details and alternatives).
## Files

- `*.gguf` — 1.58-bit GGUF weights for BitNet b1.58 (Aramis-2B). Check the Files tab for filenames and sizes.
## How to run (bitnet.cpp)
You can run this model using my demo Colab notebook (TBD).

Please refer to the bitnet.cpp GitHub repository for detailed compilation steps, usage examples, and command-line options.
## Disclaimer
This model is intended for research and development purposes only and should not be used in commercial or real-world applications without further testing. While the Microsoft Research team has applied SFT and DPO to align the BitNet base model, it may still produce unexpected, biased, or inaccurate outputs. Please use responsibly.
- Developed by: Jonathan Pacifico, 2025
- Model type: LLM
- Language(s) (NLP): French, English
- License: MIT
Made with ❤️ in France