Usage

from transformers import pipeline, AutoTokenizer, AutoModelForCausalLM

model_id = "DeepMount00/Murai-350M-v0.1-beta"

model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Build a text-generation pipeline; the sampling settings given here act as defaults for every call.
t_pipeline = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    device_map="auto",
    return_full_text=True,
    top_p=0.95,
    top_k=50,
    repetition_penalty=1.2,
)

SYSTEM_PROMPT = """Sei un assistente utile."""
TEMPERATURE = 0.1
MAX_NEW_TOKENS = 250

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": """Scrivi una funzione python che somma due numeri"""},
]

conv_template = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

outputs = t_pipeline(
    conv_template,
    max_new_tokens=MAX_NEW_TOKENS,
    do_sample=True,
    temperature=TEMPERATURE,
    num_return_sequences=1,
)
print(outputs[0]["generated_text"])
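
The usage snippet combines temperature, top_k and top_p sampling. To make their interaction concrete, here is a minimal, dependency-free sketch of how such filters narrow the next-token distribution. The function name filter_probs is hypothetical, the repetition penalty is left out, and the logic is a simplification for illustration rather than the transformers implementation.

```python
import math


def filter_probs(logits, temperature=0.1, top_k=50, top_p=0.95):
    """Return {token_index: probability} left after temperature, top-k and top-p.

    Illustrative only: a simplified stand-in, not the library's own code.
    """
    # Temperature below 1 sharpens the distribution; above 1 flattens it.
    scaled = [x / temperature for x in logits]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]

    # Top-k: keep only the k most likely tokens, most likely first.
    ranked = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    ranked = ranked[:top_k]
    top_mass = sum(weights[i] for i in ranked)

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += weights[i] / top_mass
        if cumulative >= top_p:
            break

    # Renormalize so the surviving probabilities sum to 1.
    kept_mass = sum(weights[i] for i in kept)
    return {i: weights[i] / kept_mass for i in kept}


if __name__ == "__main__":
    # With a low temperature, almost all probability lands on the best token.
    print(filter_probs([2.0, 1.0, 0.0], temperature=0.1))
    # With temperature 1.0 and top_k=2, the two best tokens share the mass.
    print(filter_probs([2.0, 1.0, 0.0], temperature=1.0, top_k=2, top_p=1.0))
```

With TEMPERATURE = 0.1 as in the snippet above, sampling is close to greedy decoding; raising the temperature is the most direct way to get more varied output.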

Training Details

This model uses a deep, 42-layer architecture optimized for parameter efficiency:

  • Pre-norm architecture with RMSNorm
  • Grouped Query Attention for memory efficiency
  • SwiGLU activation for improved performance
  • RoPE position encoding for better length generalization
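
As a rough illustration of three of the components listed above, the following dependency-free sketch shows RMSNorm, the SwiGLU gate, and a pre-norm residual step on plain Python lists. The function names are hypothetical, the linear projections of a real feed-forward block are omitted, and nothing here is taken from the model's actual code.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """RMSNorm: divide x by its root mean square, then apply a learned gain."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


def silu(v):
    """SiLU (swish) activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def swiglu(gate, up):
    """SwiGLU gating: SiLU(gate) multiplied elementwise with up.

    In a real block, gate and up are two linear projections of the same input,
    followed by a third (down) projection; those are left out of this sketch.
    """
    return [silu(g) * u for g, u in zip(gate, up)]


def pre_norm_residual(x, sublayer, weight):
    """Pre-norm step: normalize first, apply the sublayer, then add the residual."""
    normed = rms_norm(x, weight)
    return [a + b for a, b in zip(x, sublayer(normed))]


if __name__ == "__main__":
    hidden = [3.0, 4.0]
    gain = [1.0, 1.0]
    print(rms_norm(hidden, gain))
    # Hypothetical sublayer: feed the normalized vector through the SwiGLU gate.
    print(pre_norm_residual(hidden, lambda h: swiglu(h, h), gain))
```

Normalizing before each sublayer, rather than after it, keeps the residual path unscaled, which is commonly cited as a reason pre-norm stacks train stably at greater depth.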

Citation

@misc{deepmount_llm_2024,
  title={Deep LLM: A 350M Parameter Language Model with 42 Layers},
  author={Michele Montebovi},
  year={2025},
  url={https://huggingface.co/DeepMount00/Murai-350M-v0.1-beta}
}

License

Apache 2.0

Safetensors

  • Model size: 376M params
  • Tensor type: F32