Upload README.md with huggingface_hub
README.md
ADDED
@@ -0,0 +1,93 @@
---
license: mit
tags:
- arc
- abstract-reasoning
- neural-network
- transformer
- pytorch
library_name: pytorch
pipeline_tag: other
---

# ARC Neural Network - Rule Paradigm

This model is part of an Abstraction and Reasoning Corpus (ARC) neural network project focused on developing models capable of solving abstract reasoning challenges.

## Model Information

- **Checkpoint**: model_rule_paradigm_epoch_10_20250629_081422_best_val_loss.pth
- **Type**: rule_paradigm
- **Framework**: PyTorch
- **Architecture**: Two-stage rule-imprinted attention with RuleGenerator and RuleApplier transformers

## Training Details

- **Epoch**: 10
- **Loss**: 1.165186

## Architecture Details

### RuleGenerator Configuration
- **Model Dimension**: 512
- **Attention Heads**: N/A
- **Encoder Layers**: N/A
- **Rule Token Dimension**: 512

### RuleApplier Configuration
- **Model Dimension**: 512
- **Attention Heads**: 16
- **Rule Imprint Layers**: 3
- **Spatial Layers**: 6
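
The RuleApplier numbers above can be read as a two-stage stack. The block below is a minimal PyTorch sketch under stated assumptions, not the project's actual implementation: it assumes the rule imprint layers are cross-attention from grid tokens to rule tokens and the spatial layers are standard self-attention encoder layers. The class name `RuleApplierSketch`, its argument names, and the residual connection in stage 1 are hypothetical.

```python
import torch
from torch import nn


class RuleApplierSketch(nn.Module):
    """Hypothetical stand-in for the RuleApplier; not the project's class."""

    def __init__(self, d_model=512, n_heads=16, n_imprint=3, n_spatial=6):
        super().__init__()
        # Stage 1: grid tokens query the rule tokens (cross-attention).
        self.imprint = nn.ModuleList(
            [nn.MultiheadAttention(d_model, n_heads, batch_first=True)
             for _ in range(n_imprint)]
        )
        # Stage 2: grid tokens attend to each other (self-attention).
        self.spatial = nn.ModuleList(
            [nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
             for _ in range(n_spatial)]
        )

    def forward(self, grid_tokens, rule_tokens):
        x = grid_tokens
        for attn in self.imprint:
            out, _ = attn(query=x, key=rule_tokens, value=rule_tokens)
            x = x + out  # residual connection (an assumption of this sketch)
        for layer in self.spatial:
            x = layer(x)
        return x
```

With the listed configuration, each attention head operates on 512 / 16 = 32 dimensions. The output keeps the shape of `grid_tokens`, so the number of rule tokens can vary independently of the grid tokenization.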

## Usage

```python
import torch
from huggingface_hub import hf_hub_download

# Download the checkpoint (repo_id and filename below are placeholders)
checkpoint_path = hf_hub_download(
    repo_id="artnoage/your-model-name",
    filename="model.pth"
)

# Load the checkpoint
checkpoint = torch.load(checkpoint_path, map_location='cpu')

# The checkpoint may contain model configurations for easy loading
if 'rule_generator_config' in checkpoint:
    # Model configs are included - can reconstruct architecture automatically
    print("Model configurations found in checkpoint")
    print(f"Epoch: {checkpoint['epoch']}, Loss: {checkpoint['loss']}")
else:
    # Legacy checkpoint - requires manual architecture specification
    print("Legacy checkpoint - manual architecture specification needed")
```
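
The check above can be wrapped in a small helper that works on any loaded checkpoint dictionary. This is a sketch only: `summarize_checkpoint` is a hypothetical name, and the only keys it relies on are the ones the usage example already reads (`rule_generator_config`, `epoch`, `loss`).

```python
def summarize_checkpoint(checkpoint: dict) -> dict:
    """Summarize a loaded checkpoint dict without building any model."""
    has_configs = "rule_generator_config" in checkpoint
    return {
        # "legacy" means the architecture must be specified manually.
        "format": "with_configs" if has_configs else "legacy",
        "epoch": checkpoint.get("epoch"),
        "loss": checkpoint.get("loss"),
        "keys": sorted(checkpoint),
    }


# Stand-in dict using the values reported in this model card.
example = {"rule_generator_config": {}, "epoch": 10, "loss": 1.165186}
summary = summarize_checkpoint(example)
print(summary["format"], summary["epoch"], summary["loss"])
```

Using `.get` for `epoch` and `loss` keeps the helper from raising on legacy checkpoints that may lack those keys.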

## Project Overview

This model is part of an ARC (Abstraction and Reasoning Corpus) neural network project that implements a rule-based paradigm using a two-stage rule-imprinted attention architecture.

### Key Features
- **Two-stage rule-imprinted attention**: Stage 1 for rule imprinting via cross-attention, Stage 2 for spatial reasoning via self-attention
- **Rule consistency training**: Multiple rule extractions per task with consistency loss
- **Configurable tokenization**: Row-based (30 tokens) or meta-pixel (900 tokens) strategies
- **Mixed precision training**: AMP training with robust gradient scaling
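
The two token counts in the tokenization bullet follow from a 30x30 grid, the maximum ARC grid size: one token per row gives 30 tokens, and one token per cell gives 30 x 30 = 900. The sketch below illustrates that arithmetic in plain Python. It assumes grids are padded to 30x30 and that "meta-pixel" means one token per cell; the function names and the padding value `-1` are hypothetical and not taken from the project.

```python
GRID_SIZE = 30  # ARC grids are at most 30x30
PAD = -1        # hypothetical padding value, outside the ARC colour range 0-9


def pad_grid(grid, size=GRID_SIZE, pad=PAD):
    """Pad a list-of-lists grid on the right and bottom to size x size."""
    padded = [list(row) + [pad] * (size - len(row)) for row in grid]
    padded += [[pad] * size for _ in range(size - len(padded))]
    return padded


def row_tokens(grid):
    """Row-based strategy: one token per padded row."""
    return [tuple(row) for row in pad_grid(grid)]


def pixel_tokens(grid):
    """Per-cell strategy: one token per padded cell, in row-major order."""
    return [cell for row in pad_grid(grid) for cell in row]


small = [[1, 2], [3, 4]]
print(len(row_tokens(small)), len(pixel_tokens(small)))
```

The row-based strategy gives a 30x shorter sequence, which matters because attention cost grows quadratically with sequence length; the per-cell strategy keeps finer spatial resolution.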

## Citation

If you use this model in your research, please cite:

```bibtex
@misc{arc-neural-network,
  title={ARC Neural Network with Rule-Imprinted Attention},
  author={Your Name},
  year={2025},
  url={https://github.com/your-username/ARC_NN}
}
```

## License

MIT License - See repository for full license details.