---
license: mit
tags:
- arc
- abstract-reasoning
- neural-network
- transformer
- pytorch
library_name: pytorch
pipeline_tag: other
---
# ARC Neural Network - Rule Paradigm
This model is part of an Abstraction and Reasoning Corpus (ARC) neural network project focused on developing models capable of solving abstract reasoning challenges.
## Model Information
- Checkpoint: `model_rule_paradigm_epoch_20_20250626_125402_best_val_loss.pth`
- Type: `rule_paradigm`
- Framework: PyTorch
- Architecture: Two-stage rule-imprinted attention with RuleGenerator and RuleApplier transformers
## Training Details
- Epoch: 20
- Loss: 0.909375 (best validation loss, per the checkpoint name)
## Architecture Details
### RuleGenerator Configuration
- Model Dimension: 1024
- Attention Heads: N/A
- Encoder Layers: N/A
- Rule Token Dimension: 256
### RuleApplier Configuration
- Model Dimension: 256
- Attention Heads: 16
- Rule Imprint Layers: 4
- Spatial Layers: 8
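Based on the values above, the configurations embedded in the checkpoint might look like the following. Only `rule_generator_config` is a key confirmed by the Usage snippet below; the field names here are assumptions for illustration:

```python
# Hypothetical layout of the configs stored in the checkpoint.
# The values come from this card; the key names are assumptions.
rule_generator_config = {
    "d_model": 1024,        # Model Dimension
    "rule_token_dim": 256,  # Rule Token Dimension
}
rule_applier_config = {
    "d_model": 256,            # Model Dimension
    "n_heads": 16,             # Attention Heads
    "rule_imprint_layers": 4,  # Stage 1 depth
    "spatial_layers": 8,       # Stage 2 depth
}
```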
## Usage
```python
import torch
from huggingface_hub import hf_hub_download

# Download the checkpoint (filename from the Model Information section above)
checkpoint_path = hf_hub_download(
    repo_id="artnoage/your-model-name",
    filename="model_rule_paradigm_epoch_20_20250626_125402_best_val_loss.pth",
)

# Load the checkpoint on CPU
checkpoint = torch.load(checkpoint_path, map_location="cpu")

# The checkpoint embeds the model configurations for easy loading
if "rule_generator_config" in checkpoint:
    # Configs are included, so the architecture can be reconstructed automatically
    print("Model configurations found in checkpoint")
    print(f"Epoch: {checkpoint['epoch']}, Loss: {checkpoint['loss']}")
else:
    # Legacy checkpoint: the architecture must be specified manually
    print("Legacy checkpoint - manual architecture specification needed")
```
## Project Overview
This model is part of an ARC (Abstraction and Reasoning Corpus) neural network project that implements a rule-based paradigm using a two-stage rule-imprinted attention architecture.
## Key Features
- Two-stage rule-imprinted attention: Stage 1 imprints the rule via cross-attention, Stage 2 performs spatial reasoning via self-attention (see the sketch after this list)
- Rule consistency training: Multiple rule extractions per task with consistency loss
- Configurable tokenization: Row-based (30 tokens) or meta-pixel (900 tokens) strategies
- Mixed precision training: AMP training with robust gradient scaling
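As a rough illustration of the two-stage mechanism, here is a minimal sketch of one block, not the project's actual implementation: grid tokens are first conditioned on rule tokens via cross-attention, then attend to each other via self-attention. The dimensions follow the RuleApplier configuration above; the residual/norm layout and the number of rule tokens are assumptions.

```python
import torch
import torch.nn as nn

class RuleImprintedAttentionBlock(nn.Module):
    """Hypothetical two-stage block: rule imprinting, then spatial reasoning."""

    def __init__(self, d_model: int = 256, n_heads: int = 16):
        super().__init__()
        # Stage 1: grid tokens attend to rule tokens (cross-attention)
        self.rule_imprint = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Stage 2: grid tokens attend to each other (self-attention)
        self.spatial = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, grid_tokens: torch.Tensor, rule_tokens: torch.Tensor) -> torch.Tensor:
        # Stage 1: imprint the extracted rule onto the grid representation
        imprinted, _ = self.rule_imprint(grid_tokens, rule_tokens, rule_tokens)
        x = self.norm1(grid_tokens + imprinted)
        # Stage 2: spatial reasoning over the rule-conditioned grid tokens
        attended, _ = self.spatial(x, x, x)
        return self.norm2(x + attended)

# Example: 900 meta-pixel tokens for a 30x30 grid; rule tokens come from the
# RuleGenerator (their count here is an assumption, their dim matches the card)
grid = torch.randn(1, 900, 256)
rule = torch.randn(1, 16, 256)
out = RuleImprintedAttentionBlock()(grid, rule)
print(out.shape)  # torch.Size([1, 900, 256])
```

Per the card's configuration, a full RuleApplier would stack 4 rule-imprint layers followed by 8 spatial layers.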
## Citation
If you use this model in your research, please cite:
```bibtex
@misc{arc-neural-network,
  title={ARC Neural Network with Rule-Imprinted Attention},
  author={Your Name},
  year={2025},
  url={https://github.com/your-username/ARC_NN}
}
```
## License
MIT License - See repository for full license details.