---
license: apache-2.0
datasets:
- FlameF0X/DialogMLM-50K
language:
- en
pipeline_tag: text-generation
library_name: transformers
new_version: FlameF0X/SnowflakeCore-G0-Release-2
---

# SnowflakeCore-G0-Release

This is the initial release of the SnowflakeCore-G0 series of pre-trained language models, trained on the DialogMLM-50K dataset with optimized memory usage.

## SUPPORT ME

You can support me via https://ko-fi.com/flamef0x

## Model details

- Architecture: SnowflakeCore
- Hidden size: 384
- Number of attention heads: 6
- Number of layers: 4
- Feed-forward dimension: 768
- Maximum sequence length: 384
- Vocabulary size: 30522

## HuggingFace Transformers Compatibility

This model is fully compatible with the HuggingFace Transformers library. You can load it using:

```python
from transformers import AutoConfig, AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("FlameF0X/SnowflakeCore-G0-Release")
config = AutoConfig.from_pretrained("FlameF0X/SnowflakeCore-G0-Release")
model = AutoModel.from_pretrained("FlameF0X/SnowflakeCore-G0-Release")
```

## Memory Optimization Techniques

- Mixed precision training
- Gradient accumulation (4 steps)
- Fused QKV projection
- Pre-norm architecture
- Weight tying between embedding and output layers
- Half-precision model storage

Sketches of the fused QKV projection and the mixed-precision training loop appear at the end of this card.

The model weights are stored in both PyTorch (.bin) and safetensors formats for improved security, loading efficiency, and compatibility.
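For example, a minimal sketch of loading the checkpoint directly in half precision and re-saving it as safetensors (the local output path is illustrative):

```python
import torch
from transformers import AutoModel

# Load the checkpoint in float16 to roughly halve memory use.
model = AutoModel.from_pretrained(
    "FlameF0X/SnowflakeCore-G0-Release",
    torch_dtype=torch.float16,
)

# Re-save a local copy in the safetensors format.
model.save_pretrained("./snowflake-core-g0", safe_serialization=True)
```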
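## Verifying the Configuration

To confirm the hyperparameters listed under Model details, the loaded config can be inspected. The attribute names below assume standard Transformers naming conventions; a custom SnowflakeCore config may expose them under different names.

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("FlameF0X/SnowflakeCore-G0-Release")

# Attribute names assume standard Transformers conventions.
print(config.hidden_size)          # expected: 384
print(config.num_attention_heads)  # expected: 6
print(config.num_hidden_layers)    # expected: 4
print(config.vocab_size)           # expected: 30522
```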
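## Example: Text Generation

A minimal generation sketch. It assumes the checkpoint maps to `AutoModelForCausalLM` (the weight tying between embedding and output layers suggests a language-modeling head); if only `AutoModel` is supported, decoding would need a custom loop. The prompt and decoding settings are illustrative.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("FlameF0X/SnowflakeCore-G0-Release")
model = AutoModelForCausalLM.from_pretrained("FlameF0X/SnowflakeCore-G0-Release")
model.eval()

inputs = tokenizer("Hello, how are you?", return_tensors="pt")

# Greedy decoding, kept well under the 384-token context limit.
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=64, do_sample=False)

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```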
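## Sketch: Fused QKV Projection

The fused QKV projection listed above computes queries, keys, and values with a single matrix multiplication instead of three. The module below is a generic illustration using this model's published sizes (hidden size 384, 6 heads), not the actual SnowflakeCore implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusedQKVAttention(nn.Module):
    """Generic fused-QKV self-attention: one matmul yields Q, K, and V."""

    def __init__(self, hidden_size=384, num_heads=6):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = hidden_size // num_heads
        # One projection of width 3 * hidden_size replaces three separate ones.
        self.qkv = nn.Linear(hidden_size, 3 * hidden_size)
        self.out = nn.Linear(hidden_size, hidden_size)

    def forward(self, x):
        batch, seq_len, hidden = x.shape
        qkv = self.qkv(x).view(batch, seq_len, 3, self.num_heads, self.head_dim)
        # Split into Q, K, V, each shaped (batch, heads, seq, head_dim).
        q, k, v = qkv.permute(2, 0, 3, 1, 4)
        attn = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        attn = attn.transpose(1, 2).reshape(batch, seq_len, hidden)
        return self.out(attn)

# Smoke test with a batch of 2 sequences of 16 tokens.
y = FusedQKVAttention()(torch.randn(2, 16, 384))
print(y.shape)  # torch.Size([2, 16, 384])
```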
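## Sketch: Mixed Precision with Gradient Accumulation

The loop below illustrates how mixed precision and 4-step gradient accumulation combine during training. It reuses `model` from the loading snippet; `train_dataloader`, the learning rate, and the presence of `labels` in each batch are placeholders rather than the actual training setup.

```python
import torch
from torch.cuda.amp import GradScaler, autocast

ACCUM_STEPS = 4  # matches the 4-step accumulation listed above

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)  # illustrative lr
scaler = GradScaler()

model.train()
for step, batch in enumerate(train_dataloader):  # placeholder dataloader
    with autocast():  # forward pass runs in float16 where safe
        outputs = model(**batch)           # assumes `batch` includes labels
        loss = outputs.loss / ACCUM_STEPS  # average grads over micro-batches

    scaler.scale(loss).backward()  # loss scaling avoids float16 underflow

    if (step + 1) % ACCUM_STEPS == 0:
        scaler.step(optimizer)  # unscales grads; skips step on inf/NaN
        scaler.update()
        optimizer.zero_grad()
```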