
AgGPT-13 nano
New. Nano. Nimble.
BETA
AgGPT-13 nano is the lightweight beta release of the AgGPT-13 model, built to handle everything from quick, simple queries to more complex reasoning and problem-solving.
Powered by Gemma-2 and trained on high-quality datasets (including an inner world model) using the AG artificial generative world model architecture, it delivers capable performance in a compact package.
This version is quantized to INT8 for speed and efficiency, then dequantized on load, making it nimble without sacrificing capability.
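For intuition, the round trip looks roughly like the following minimal sketch of symmetric per-tensor INT8 quantization in PyTorch (the helper names and exact scheme are illustrative assumptions, not AgGPT-13 nano's actual code):

import torch

def quantize_int8(w: torch.Tensor):
    # One scale per tensor: map float32 values onto the int8 range
    scale = max(w.abs().max().item(), 1e-8) / 127.0
    q = torch.clamp((w / scale).round(), -128, 127).to(torch.int8)
    return q, scale

def dequantize_int8(q: torch.Tensor, scale: float) -> torch.Tensor:
    # Recover an approximate float32 tensor for inference
    return q.to(torch.float32) * scale

weights = torch.randn(4, 4)
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)  # close to weights, at a quarter of the storage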
Features
- Lightweight – Optimized for lower memory usage with INT8 quantization.
- Fast startup – Loads and dequantizes directly into a usable PyTorch model.
- Flexible – Works on CPU or GPU.
- Interactive – Simple ask() method for quick prompting.
- Based on Gemma-2 – Benefits from state-of-the-art NLP and ML research.
Installation & Usage
pip install torch transformers safetensors
Example:
from aggpt13 import AgGPT

# Point the agent at the local model directory, then prompt it
agent = AgGPT(model_path="aggpt13/")
response = agent.ask("Hey, who are you?")
print(response)
How It Works
- Loads tokenizer and model config from transformers.
- Reads quantized weights (.safetensors) and quantization parameters (.json).
- Dequantizes weights into float32 and manually loads them into the model (sketched below).
- Runs entirely in PyTorch, supporting both CPU and CUDA.
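A minimal sketch of that pipeline, assuming hypothetical file names (model.safetensors, quant_params.json) and a simple per-tensor scale layout in the JSON; the files shipped with the model define the real layout:

import json
import torch
from safetensors.torch import load_file
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

model_path = "aggpt13/"

# Tokenizer and architecture come straight from the transformers config
tokenizer = AutoTokenizer.from_pretrained(model_path)
config = AutoConfig.from_pretrained(model_path)
model = AutoModelForCausalLM.from_config(config)

# Hypothetical file names: int8 weights plus their quantization parameters
quantized = load_file(model_path + "model.safetensors")
with open(model_path + "quant_params.json") as f:
    qparams = json.load(f)  # assumed layout: {"tensor_name": {"scale": float}}

# Dequantize every tensor back to float32 and load it into the model
state_dict = {
    name: q.to(torch.float32) * qparams[name]["scale"]
    for name, q in quantized.items()
}
model.load_state_dict(state_dict)

# Runs on CPU or CUDA, as noted above
model.to("cuda" if torch.cuda.is_available() else "cpu")
model.eval()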
License
This project is distributed under the MIT License. For details, see the LICENSE file.