---
license: mit
tags:
  - arc
  - abstract-reasoning
  - neural-network
  - transformer
  - pytorch
library_name: pytorch
pipeline_tag: other
---

# ARC Neural Network - Rule Paradigm

This model is part of an Abstraction and Reasoning Corpus (ARC) neural network project focused on developing models capable of solving abstract reasoning challenges.

## Model Information

- **Checkpoint:** `model_rule_paradigm_epoch_20_20250626_125402_best_val_loss.pth`
- **Type:** rule_paradigm
- **Framework:** PyTorch
- **Architecture:** Two-stage rule-imprinted attention with RuleGenerator and RuleApplier transformers

## Training Details

- **Epoch:** 20
- **Loss:** 0.909375

## Architecture Details

### RuleGenerator Configuration

- **Model Dimension:** 1024
- **Attention Heads:** N/A
- **Encoder Layers:** N/A
- **Rule Token Dimension:** 256

### RuleApplier Configuration

- **Model Dimension:** 256
- **Attention Heads:** 16
- **Rule Imprint Layers:** 4
- **Spatial Layers:** 8
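
For orientation, the settings above map onto two plain config dictionaries like the sketch below. Only the `rule_generator_config` key is confirmed by the usage snippet further down; the other key and the field names are assumptions chosen to mirror the listed values.

```python
# Hypothetical layout of the configs stored in the checkpoint; field names
# are assumptions, not the checkpoint's confirmed schema.
rule_generator_config = {
    "model_dim": 1024,
    "rule_token_dim": 256,
}
rule_applier_config = {
    "model_dim": 256,
    "num_heads": 16,
    "rule_imprint_layers": 4,  # Stage 1: rule imprinting via cross-attention
    "spatial_layers": 8,       # Stage 2: spatial reasoning via self-attention
}
```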

## Usage

```python
import torch
from huggingface_hub import hf_hub_download

# Download the checkpoint listed under Model Information
checkpoint_path = hf_hub_download(
    repo_id="artnoage/your-model-name",
    filename="model_rule_paradigm_epoch_20_20250626_125402_best_val_loss.pth"
)

# Load the checkpoint
checkpoint = torch.load(checkpoint_path, map_location='cpu')

# The checkpoint contains model configurations for easy loading
if 'rule_generator_config' in checkpoint:
    # Model configs are included - the architecture can be reconstructed automatically
    print("Model configurations found in checkpoint")
    print(f"Epoch: {checkpoint['epoch']}, Loss: {checkpoint['loss']}")
else:
    # Legacy checkpoint - requires manual architecture specification
    print("Legacy checkpoint - manual architecture specification needed")
```

## Project Overview

This model is part of an ARC (Abstraction and Reasoning Corpus) neural network project that implements a rule-based paradigm built on a two-stage rule-imprinted attention architecture.

### Key Features

- **Two-stage rule-imprinted attention:** Stage 1 imprints the rule into the grid tokens via cross-attention; Stage 2 performs spatial reasoning via self-attention (see the sketch after this list)
- **Rule consistency training:** Multiple rule extractions per task with a consistency loss
- **Configurable tokenization:** Row-based (30 tokens) or meta-pixel (900 tokens) strategies
- **Mixed precision training:** AMP training with robust gradient scaling
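
To make the two-stage idea concrete, here is a minimal PyTorch sketch of what one such block could look like. The layer structure, use of `nn.MultiheadAttention`, residual/LayerNorm placement, and the rule-token count are assumptions for illustration, not the project's actual implementation; only the 256-dim/16-head/900-token figures come from the card above.

```python
import torch
import torch.nn as nn

class RuleImprintedBlock(nn.Module):
    """Illustrative two-stage block (not the project's actual code):
    Stage 1 imprints rule tokens into grid tokens via cross-attention,
    Stage 2 reasons spatially over the grid tokens via self-attention."""

    def __init__(self, dim: int = 256, num_heads: int = 16):
        super().__init__()
        self.rule_imprint = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.spatial = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)

    def forward(self, grid_tokens: torch.Tensor, rule_tokens: torch.Tensor) -> torch.Tensor:
        # Stage 1: grid tokens attend to the rule tokens (cross-attention)
        imprinted, _ = self.rule_imprint(grid_tokens, rule_tokens, rule_tokens)
        x = self.norm1(grid_tokens + imprinted)
        # Stage 2: grid tokens attend to each other (self-attention)
        spatial, _ = self.spatial(x, x, x)
        return self.norm2(x + spatial)

# Example with the dimensions listed above: 900 meta-pixel tokens of dim 256
block = RuleImprintedBlock(dim=256, num_heads=16)
grid = torch.randn(1, 900, 256)   # meta-pixel tokenization of a 30x30 grid
rules = torch.randn(1, 8, 256)    # a handful of rule tokens (count assumed)
out = block(grid, rules)          # -> shape (1, 900, 256)
```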

## Citation

If you use this model in your research, please cite:

```bibtex
@misc{arc-neural-network,
  title={ARC Neural Network with Rule-Imprinted Attention},
  author={Your Name},
  year={2025},
  url={https://github.com/your-username/ARC_NN}
}
```

## License

MIT License - See repository for full license details.