artnoage committed · Commit f765f42 (verified) · 1 parent: dfc4968

Upload README.md with huggingface_hub

  1. README.md +93 -0
README.md ADDED
@@ -0,0 +1,93 @@
+ ---
+ license: mit
+ tags:
+ - arc
+ - abstract-reasoning
+ - neural-network
+ - transformer
+ - pytorch
+ library_name: pytorch
+ pipeline_tag: other
+ ---
+
+ # ARC Neural Network - Rule Paradigm
+
+ This model is part of a neural network project for the Abstraction and Reasoning Corpus (ARC), aimed at building models that can solve abstract reasoning tasks.
+
+ ## Model Information
+
+ - **Checkpoint**: model_rule_paradigm_epoch_10_20250629_081422_best_val_loss.pth
+ - **Type**: rule_paradigm
+ - **Framework**: PyTorch
+ - **Architecture**: Two-stage rule-imprinted attention with RuleGenerator and RuleApplier transformers
+
+ ## Training Details
+
+ - **Epoch**: 10
+ - **Loss**: 1.165186
+
+ ## Architecture Details
+
+ ### RuleGenerator Configuration
+ - **Model Dimension**: 512
+ - **Attention Heads**: N/A
+ - **Encoder Layers**: N/A
+ - **Rule Token Dimension**: 512
+
+ ### RuleApplier Configuration
+ - **Model Dimension**: 512
+ - **Attention Heads**: 16
+ - **Rule Imprint Layers**: 3
+ - **Spatial Layers**: 6
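
To make the RuleApplier hyperparameters above concrete, here is a minimal sketch of how they might map onto a two-stage module: a stack of cross-attention layers in which grid tokens attend to rule tokens (rule imprinting), followed by a self-attention encoder (spatial reasoning). The class name `RuleApplierSketch`, the residual-plus-LayerNorm wiring, and the count of 8 rule tokens are assumptions made for illustration; the project's actual implementation may differ.

```python
import torch
from torch import nn


class RuleApplierSketch(nn.Module):
    """Hypothetical two-stage applier; not the project's actual class."""

    def __init__(self, d_model=512, n_heads=16, imprint_layers=3, spatial_layers=6):
        super().__init__()
        # Stage 1: cross-attention layers (grid tokens query the rule tokens).
        self.cross_attn = nn.ModuleList(
            [nn.MultiheadAttention(d_model, n_heads, batch_first=True)
             for _ in range(imprint_layers)]
        )
        self.cross_norm = nn.ModuleList(
            [nn.LayerNorm(d_model) for _ in range(imprint_layers)]
        )
        # Stage 2: standard self-attention encoder over the imprinted grid tokens.
        encoder_layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.spatial = nn.TransformerEncoder(encoder_layer, num_layers=spatial_layers)

    def forward(self, grid_tokens, rule_tokens):
        x = grid_tokens
        for attn, norm in zip(self.cross_attn, self.cross_norm):
            attended, _ = attn(query=x, key=rule_tokens, value=rule_tokens)
            x = norm(x + attended)  # residual connection, then normalization
        return self.spatial(x)


model = RuleApplierSketch()
grid_tokens = torch.randn(2, 30, 512)  # batch of 2, 30 grid tokens each
rule_tokens = torch.randn(2, 8, 512)   # 8 rule tokens is an assumed count
print(model(grid_tokens, rule_tokens).shape)  # torch.Size([2, 30, 512])
```

With a model dimension of 512 and 16 heads, each attention head works on 512 / 16 = 32 dimensions.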
+
+ ## Usage
+
+ ```python
+ import torch
+ from huggingface_hub import hf_hub_download
+
+ # Download the checkpoint
+ checkpoint_path = hf_hub_download(
+     repo_id="artnoage/your-model-name",
+     filename="model.pth"
+ )
+
+ # Load the checkpoint
+ checkpoint = torch.load(checkpoint_path, map_location="cpu")
+
+ # The checkpoint contains model configurations for easy loading
+ if "rule_generator_config" in checkpoint:
+     # Model configs are included - can reconstruct architecture automatically
+     print("Model configurations found in checkpoint")
+     print(f"Epoch: {checkpoint['epoch']}, Loss: {checkpoint['loss']}")
+ else:
+     # Legacy checkpoint - requires manual architecture specification
+     print("Legacy checkpoint - manual architecture specification needed")
+ ```
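
Before rebuilding a model from a checkpoint, it can help to summarize what the loaded dictionary contains. The helper below is a small sketch that works on any plain dictionary; `summarize_checkpoint` is a hypothetical name, and the stand-in dictionary (including its `d_model` config key) only mimics the keys used in the snippet above rather than the real file's contents.

```python
def summarize_checkpoint(checkpoint: dict) -> dict:
    """Return a small summary of a loaded checkpoint dictionary."""
    return {
        # True when the checkpoint carries its own architecture config.
        "has_configs": "rule_generator_config" in checkpoint,
        # .get() avoids a KeyError on legacy checkpoints that lack these keys.
        "epoch": checkpoint.get("epoch"),
        "loss": checkpoint.get("loss"),
        "keys": sorted(checkpoint.keys()),
    }


# Stand-in for the dictionary returned by torch.load (contents are illustrative).
example = {
    "epoch": 10,
    "loss": 1.165186,
    "rule_generator_config": {"d_model": 512},
}
print(summarize_checkpoint(example))
```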
+
+ ## Project Overview
+
+ This project implements a rule-based paradigm for ARC, built on a two-stage rule-imprinted attention architecture.
+
+ ### Key Features
+ - **Two-stage rule-imprinted attention**: Stage 1 imprints rules via cross-attention; Stage 2 performs spatial reasoning via self-attention
+ - **Rule consistency training**: Multiple rule extractions per task, tied together with a consistency loss
+ - **Configurable tokenization**: Row-based (30 tokens) or meta-pixel (900 tokens) strategies
+ - **Mixed precision training**: Automatic mixed precision (AMP) training with gradient scaling
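
The two token counts in the tokenization bullet can be illustrated with a short sketch. It assumes grids are padded to 30x30 (the maximum ARC grid size), that "row-based" means one token per row, and that "meta-pixel" means one token per cell, which the 900 count suggests. The function names and the padding value are hypothetical and not taken from the project.

```python
GRID_SIZE = 30   # ARC grids are at most 30x30
PAD_VALUE = 10   # hypothetical padding symbol; ARC colours use 0-9


def pad_grid(grid, size=GRID_SIZE, pad_value=PAD_VALUE):
    """Pad a rectangular grid (list of rows) to size x size."""
    rows = [list(row) + [pad_value] * (size - len(row)) for row in grid]
    rows += [[pad_value] * size for _ in range(size - len(rows))]
    return rows


def row_tokens(grid):
    """Row-based: one token per row, so 30 tokens of 30 cell values each."""
    return [tuple(row) for row in pad_grid(grid)]


def cell_tokens(grid):
    """Per-cell: one token per cell, so 30 * 30 = 900 tokens."""
    return [cell for row in pad_grid(grid) for cell in row]


grid = [[1, 2], [3, 4]]
print(len(row_tokens(grid)), len(cell_tokens(grid)))  # 30 900
```

The row-based variant gives a 30x shorter sequence, which matters because attention cost grows quadratically with sequence length.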
77
+
78
+ ## Citation
79
+
80
+ If you use this model in your research, please cite:
81
+
82
+ ```bibtex
83
+ @misc{arc-neural-network,
84
+ title={ARC Neural Network with Rule-Imprinted Attention},
85
+ author={Your Name},
86
+ year={2025},
87
+ url={https://github.com/your-username/ARC_NN}
88
+ }
89
+ ```
90
+
91
+ ## License
92
+
93
+ MIT License - See repository for full license details.