---
license: mit
tags:
- arc
- abstract-reasoning
- neural-network
- transformer
- pytorch
library_name: pytorch
pipeline_tag: other
---

# ARC Neural Network - Rule Paradigm

This model is part of an Abstraction and Reasoning Corpus (ARC) neural network project focused on building models that solve abstract reasoning challenges.

## Model Information

- **Checkpoint**: model_rule_paradigm_epoch_20_20250626_125402_best_val_loss.pth
- **Type**: rule_paradigm
- **Framework**: PyTorch
- **Architecture**: Two-stage rule-imprinted attention with RuleGenerator and RuleApplier transformers

## Training Details

- **Epoch**: 20
- **Loss**: 0.909375 (best validation loss)

## Architecture Details

### RuleGenerator Configuration

- **Model Dimension**: 1024
- **Attention Heads**: N/A
- **Encoder Layers**: N/A
- **Rule Token Dimension**: 256

### RuleApplier Configuration

- **Model Dimension**: 256
- **Attention Heads**: 16
- **Rule Imprint Layers**: 4
- **Spatial Layers**: 8

## Usage

```python
import torch
from huggingface_hub import hf_hub_download

# Download the checkpoint (replace the placeholder repo ID with this
# repository's ID, and the filename with the checkpoint listed above)
checkpoint_path = hf_hub_download(
    repo_id="artnoage/your-model-name",
    filename="model.pth"
)

# Load the checkpoint on CPU
checkpoint = torch.load(checkpoint_path, map_location='cpu')

# Newer checkpoints embed the model configurations for easy loading
if 'rule_generator_config' in checkpoint:
    # Model configs are included - the architecture can be reconstructed automatically
    print("Model configurations found in checkpoint")
    print(f"Epoch: {checkpoint['epoch']}, Loss: {checkpoint['loss']}")
else:
    # Legacy checkpoint - requires manual architecture specification
    print("Legacy checkpoint - manual architecture specification needed")
```

## Project Overview

This model is part of an ARC (Abstraction and Reasoning Corpus) neural network project that implements a rule-based paradigm built on a two-stage rule-imprinted attention architecture.

### Key Features

- **Two-stage rule-imprinted attention**: Stage 1 imprints the rule via cross-attention; Stage 2 performs spatial reasoning via self-attention
- **Rule consistency training**: Multiple rule extractions per task with a consistency loss
- **Configurable tokenization**: Row-based (30 tokens) or meta-pixel (900 tokens) strategies
- **Mixed precision training**: AMP training with robust gradient scaling

Minimal illustrative sketches of the attention architecture, the consistency loss, and the tokenization strategies are appended at the end of this card.

## Citation

If you use this model in your research, please cite:

```bibtex
@misc{arc-neural-network,
  title={ARC Neural Network with Rule-Imprinted Attention},
  author={Your Name},
  year={2025},
  url={https://github.com/your-username/ARC_NN}
}
```

## License

MIT License - see the repository for full license details.
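
## Appendix: Illustrative Sketches

The sketches below are not the released implementation. They are minimal reconstructions based only on the configuration values listed above; class names, the rule-token count, and all layer internals are assumptions.

### Two-Stage Rule-Imprinted Attention

A minimal sketch of the two-stage RuleApplier: Stage 1 imprints rule tokens onto grid tokens via cross-attention, and Stage 2 applies self-attention for spatial reasoning. The defaults mirror the RuleApplier configuration above (dimension 256, 16 heads, 4 imprint layers, 8 spatial layers); the `RuleImprintLayer` and `RuleApplierSketch` names, the residual/norm layout, and the choice of 8 rule tokens are hypothetical.

```python
import torch
import torch.nn as nn


class RuleImprintLayer(nn.Module):
    """Stage 1 block: grid tokens attend to rule tokens via cross-attention."""

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(d_model)
        self.ff = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, grid_tokens, rule_tokens):
        # Queries come from the grid; keys/values come from the extracted rule.
        attn_out, _ = self.cross_attn(grid_tokens, rule_tokens, rule_tokens)
        x = self.norm1(grid_tokens + attn_out)
        return self.norm2(x + self.ff(x))


class RuleApplierSketch(nn.Module):
    """Stage 1: rule imprinting (cross-attn); Stage 2: spatial reasoning (self-attn)."""

    def __init__(self, d_model=256, n_heads=16, n_imprint=4, n_spatial=8):
        super().__init__()
        self.imprint_layers = nn.ModuleList(
            [RuleImprintLayer(d_model, n_heads) for _ in range(n_imprint)]
        )
        spatial_layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True
        )
        self.spatial = nn.TransformerEncoder(spatial_layer, num_layers=n_spatial)

    def forward(self, grid_tokens, rule_tokens):
        # Stage 1: imprint the rule onto the grid representation.
        for layer in self.imprint_layers:
            grid_tokens = layer(grid_tokens, rule_tokens)
        # Stage 2: self-attention over the imprinted grid tokens.
        return self.spatial(grid_tokens)


applier = RuleApplierSketch()
grid = torch.randn(2, 900, 256)  # meta-pixel tokenization: 30x30 grid -> 900 tokens
rule = torch.randn(2, 8, 256)    # 8 rule tokens of dimension 256 (count is assumed)
out = applier(grid, rule)        # -> (2, 900, 256)
```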
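
### Rule Consistency Loss

The card mentions a consistency loss over multiple rule extractions per task but does not specify its form. A minimal sketch, assuming the loss penalizes the deviation of each extraction from the mean extraction; the function name and mean-squared reduction are assumptions.

```python
import torch


def rule_consistency_loss(rule_tokens: torch.Tensor) -> torch.Tensor:
    """Penalize disagreement between rule extractions of the same task.

    rule_tokens: (n_extractions, n_rule_tokens, d_rule) - rules extracted
    from different demonstration pairs of one ARC task.
    """
    consensus = rule_tokens.mean(dim=0, keepdim=True)
    # Mean squared deviation of each extraction from the consensus rule.
    return ((rule_tokens - consensus) ** 2).mean()


# Example: three extractions of 8 rule tokens with dimension 256
extractions = torch.randn(3, 8, 256)
loss = rule_consistency_loss(extractions)  # scalar tensor
```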
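
### Tokenization Strategies

Both strategies operate on ARC grids, which use 10 colors and measure up to 30x30 cells. A minimal sketch assuming grids are padded to 30x30; the `GridTokenizer` class and the row-projection design are hypothetical, but the token counts (900 for meta-pixel, 30 for row-based) match the feature list above.

```python
import torch
import torch.nn as nn


class GridTokenizer(nn.Module):
    """Turn a padded 30x30 ARC grid of color IDs into a token sequence."""

    def __init__(self, d_model: int = 256, strategy: str = "meta_pixel"):
        super().__init__()
        self.strategy = strategy
        self.color_embed = nn.Embedding(10, d_model)  # 10 ARC colors
        # Row strategy: project the 30 concatenated cell embeddings per row
        # down to a single d_model-dimensional row token.
        self.row_proj = nn.Linear(30 * d_model, d_model)

    def forward(self, grid: torch.Tensor) -> torch.Tensor:
        # grid: (batch, 30, 30) long tensor of color IDs in [0, 9]
        cells = self.color_embed(grid)                # (B, 30, 30, d)
        if self.strategy == "meta_pixel":
            # One token per cell: 30 * 30 = 900 tokens.
            return cells.flatten(1, 2)                # (B, 900, d)
        # Row-based: one token per row, 30 tokens in total.
        return self.row_proj(cells.flatten(2))        # (B, 30, d)


tokenizer = GridTokenizer(strategy="row")
grid = torch.randint(0, 10, (2, 30, 30))
tokens = tokenizer(grid)  # -> (2, 30, 256)
```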