The development of SnowflakeCore-G1-7B-MoE is getting delayed. In the meantime, I am working on SnowflakeCore-G1-1B-MoE, which will be a pre-trained chatbot.
Hello! Important announcement: I will rename SnowflakeCore-G1-Medium to SnowflakeCore-G1-Tiny2 because it will have the same parameter count as the Tiny version, but this one is trained on more data.
SnowflakeCore-G1 Update: Got it running and training! Context window is currently set to 2048 tokens. Training is active and stable. Will share results once I have some metrics to report.
SnowflakeCore-G1 development update: We're building a 24-layer transformer with a 32K context window and 1024 embedding dimensions - pretty ambitious! Even running at batch_size=1 with heavy gradient accumulation, we're hitting memory walls at 300GB of RAM. Scaling up to ~1TB will take some time, but the architecture is looking promising. Thanks for following along with the journey!
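For a rough sense of scale, here is a back-of-the-envelope parameter count for the architecture described above. The vocab size of 50,000 and the 4x FFN multiplier are my own assumptions, not confirmed specs; the 300GB memory wall likely comes from long-context activations and optimizer state rather than the weights themselves, which are comparatively small:

```python
# Rough parameter-count sketch for a 24-layer, d_model=1024 transformer.
# Vocab size (50,000) and FFN multiplier (4x) are assumptions.

def transformer_params(n_layers, d_model, vocab_size, ffn_mult=4):
    """Approximate parameter count, ignoring biases and layer norms."""
    attn = 4 * d_model ** 2            # Q, K, V, and output projections
    ffn = 2 * ffn_mult * d_model ** 2  # up- and down-projections
    embed = vocab_size * d_model       # token embedding table
    return n_layers * (attn + ffn) + embed

total = transformer_params(24, 1024, 50_000)
print(f"{total / 1e6:.0f}M parameters")  # ~353M
```

At 32K context, the attention score matrices alone grow with the square of the sequence length per layer, which is where long-context training memory tends to explode even at batch size 1.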
Hello there! I just found out that all the SnowflakeCore-G0 series models are Masked Language Models instead of LLMs. The development of SnowflakeCore-G0-Release-3 will be delayed even further.
Edit: I am officially ending development of SnowflakeCore-G0 and starting development of SnowflakeCore-G1, which SHOULD be a text generator.
Edit-2: After some evaluation of the code, the models actually are Text Generators after all. So the development of G0 will continue.
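For anyone wanting to run the same check on their own checkpoints: the `architectures` field in a Hugging Face `config.json` usually tells the two model families apart (class names containing `ForMaskedLM` are BERT-style masked models; `ForCausalLM` or `LMHeadModel` are autoregressive text generators). A minimal sketch of that heuristic (the function name is my own, not part of any library):

```python
# Heuristic: classify a checkpoint by the class names listed in the
# "architectures" field of its config.json. Function name is illustrative.

def classify_lm(architectures):
    for arch in architectures:
        if "ForMaskedLM" in arch:
            return "masked-lm"   # BERT-style, fills in [MASK] tokens
        if "ForCausalLM" in arch or "LMHeadModel" in arch:
            return "causal-lm"   # autoregressive text generator
    return "unknown"

print(classify_lm(["GPT2LMHeadModel"]))  # causal-lm
print(classify_lm(["BertForMaskedLM"]))  # masked-lm
```

This is only a naming convention check, but it would have caught the G0 mix-up without needing to load any weights.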
Hi everyone! The release of https://huggingface.co/FlameF0X/SnowflakeCore-G0-Release-3-1B is currently delayed due to hardware limitations: I'm currently lacking the compute resources needed to complete training. I'm exploring options and will keep you updated on any progress. Thank you for your patience and support!