
AgGPT-13 nano
New. Nano. Nimble.
BETA
AgGPT-13 nano is the lightweight beta release of the AgGPT-13 model, built to handle everything from quick, simple queries to more complex reasoning and problem-solving.
Powered by Gemma-2 and trained on high-quality datasets (including an inner world model) using the AG artificial generative world model architecture, it delivers capable performance in a compact package.
This version is quantized to INT8 for speed and efficiency, then dequantized on load, making it nimble without sacrificing capability.
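For intuition, the round trip looks roughly like the following minimal sketch of symmetric per-tensor INT8 quantization in PyTorch (the helper names and exact scheme are illustrative assumptions, not AgGPT-13 nano's actual code):

import torch

def quantize_int8(w: torch.Tensor):
    # One scale per tensor: map float32 values onto the int8 range
    scale = max(w.abs().max().item(), 1e-8) / 127.0
    q = torch.clamp((w / scale).round(), -128, 127).to(torch.int8)
    return q, scale

def dequantize_int8(q: torch.Tensor, scale: float) -> torch.Tensor:
    # Recover an approximate float32 tensor for inference
    return q.to(torch.float32) * scale

weights = torch.randn(4, 4)
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)  # close to weights, at a quarter of the storage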
Features
- Lightweight – Optimized for lower memory usage with INT8 quantization.
- Fast startup – Loads and dequantizes directly into a usable PyTorch model.
- Flexible – Works on CPU or GPU.
- Interactive – Simple ask() method for quick prompting.
- Based on Gemma-2 – Benefits from state-of-the-art NLP and ML research.
Installation & Usage
pip install torch transformers safetensors
Example:
from aggpt13 import AgGPT

# Point the agent at the local model directory, then prompt it
agent = AgGPT(model_path="aggpt13/")
response = agent.ask("Hey, who are you?")
print(response)
How It Works
- Loads tokenizer and model config from transformers.
- Reads quantized weights (.safetensors) and quantization parameters (.json).
- Dequantizes weights into float32 and manually loads them into the model (sketched below).
- Runs entirely in PyTorch, supporting both CPU and CUDA.
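A minimal sketch of that pipeline, assuming hypothetical file names (model.safetensors, quant_params.json) and a simple per-tensor scale layout in the JSON; the files shipped with the model define the real layout:

import json
import torch
from safetensors.torch import load_file
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

model_path = "aggpt13/"

# Tokenizer and architecture come straight from the transformers config
tokenizer = AutoTokenizer.from_pretrained(model_path)
config = AutoConfig.from_pretrained(model_path)
model = AutoModelForCausalLM.from_config(config)

# Hypothetical file names: int8 weights plus their quantization parameters
quantized = load_file(model_path + "model.safetensors")
with open(model_path + "quant_params.json") as f:
    qparams = json.load(f)  # assumed layout: {"tensor_name": {"scale": float}}

# Dequantize every tensor back to float32 and load it into the model
state_dict = {
    name: q.to(torch.float32) * qparams[name]["scale"]
    for name, q in quantized.items()
}
model.load_state_dict(state_dict)

# Runs on CPU or CUDA, as noted above
model.to("cuda" if torch.cuda.is_available() else "cpu")
model.eval()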
License
This project is distributed under the MIT License. For details, see the LICENSE file.