Tessa-Rust-T1, A Rust Focused Code Generation Model


Model Overview

Tessa-Rust-T1 is a transformer-based Rust code generation model, fine-tuned from the powerful Qwen2.5-Coder-7B-Instruct base model. Designed specifically for Rust development, Tessa-Rust-T1 leverages advanced reasoning to autonomously generate well-structured, idiomatic Rust code, including functions, structs, traits, and modules. Its integration into agent systems makes it a powerful tool for automating backend development, systems programming, CLI tool creation, and Rust code intelligence.


Sponsors

Collaborators

  • Work done by Ravi, Ernest, and the Tesslate Team.

Support, or if you want to talk about AI

Model Highlights

  • Hybrid Reasoning: Enable reasoning with the system prompt below (you can disable reasoning by giving long, detailed instructions without asking the model to generate the think tags); see the usage sketch after this list:
  • Your role as a Rust assistant is to engage in deep, methodical reasoning and provide comprehensive, accurate solutions. Before arriving at a final answer, you must undertake a structured, multi-phase thinking process that emphasizes depth, verification, and clarity. This involves thoroughly analyzing the question, identifying key elements, summarizing relevant insights, generating hypotheses, iteratively refining thoughts, verifying assumptions, cross-checking with prior knowledge, and reevaluating earlier conclusions as necessary. Your response must be structured into two main sections: Thought and Solution. In the Thought section, rigorously document your reasoning in the following format: <|begin_of_thought|> {thought process with each logical step separated by '\n\n'} <|end_of_thought|>. Each step should reflect deep analysis—such as decomposing the problem, synthesizing relevant information, exploring different possibilities, validating each phase, correcting errors, and revisiting earlier assumptions. In the Solution section, consolidate all your insights and reasoned steps into a concise, well-structured final answer. Present it clearly and logically using this format: <|begin_of_solution|> Provide the entire solution here. <|end_of_solution|>. This approach ensures that the final output reflects a high-confidence answer that results from critical thinking and iteration. Now, try to solve the following question through the above guidelines:
  • Rust-specific Reasoning: Accurately generates functional and idiomatic Rust code.
  • Rust-specific Reasoning: Accurately generates functional and idiomatic Rust code.
  • Agent Integration: Seamlessly fits into AI-driven coding agents and autonomous development systems.
  • Context-Aware Generation: Effectively understands and utilizes Rust project context, dependencies (crates), and language features (lifetimes, borrowing, traits) to provide relevant code solutions.
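
For example, the reasoning prompt can be assembled with the tokenizer's chat template, which for Qwen2.5-based checkpoints produces the <|im_start|>/<|im_end|> framing shown in the raw-prompt example further below. This is a minimal sketch: the repo id is the hypothetical one from the inference example, and the system prompt string is abbreviated here, so paste the full text above in practice.

from transformers import AutoTokenizer

model_name = "tesslate/Tessa-Rust-T1"  # hypothetical repo id, see the inference example below
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Abbreviated here; use the full reasoning system prompt from above.
reasoning_system_prompt = (
    "Your role as a Rust assistant is to engage in deep, methodical reasoning ... "
    "Now, try to solve the following question through the above guidelines:"
)

messages = [
    {"role": "system", "content": reasoning_system_prompt},
    {"role": "user", "content": "Write an idiomatic Rust function that reverses a String."},
]

# Qwen2.5-based checkpoints ship a ChatML chat template, so this yields the
# <|im_start|>/<|im_end|> framing expected by the model.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)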

Use Cases

Recommended Uses

  • Automatic Rust Code Generation: Quickly produce Rust functions, structs, modules, and boilerplate code from textual prompts.
  • Agent-based Rust Development: Integrate into automated coding systems for faster backend, systems, or tooling workflows.
  • Rust Code Refactoring: Automate the optimization and enhancement of Rust code for idiomaticity and performance.
  • Generating CLI Tools: Accelerate the creation of command-line applications.
  • Implementing API Endpoints: Speed up backend development by generating route handlers and data models.
  • Writing Unit Tests: Generate test cases for Rust functions and modules.

Limitations

  • Focused on Rust: Limited use outside the Rust ecosystem.
  • Complex Logic/Lifetimes: May require manual adjustments for highly complex asynchronous patterns, intricate lifetime management, or extensive unsafe code blocks.
  • Build Configuration: May not fully automate Cargo.toml management or complex build scripts.

How to Use

Inference Example

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "tesslate/Tessa-Rust-T1"  # hypothetical repo id; adjust to the released checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).to("cuda") # Assumes CUDA availability

prompt = """<|im_start|>user
Create a Rust function using the `rayon` crate to parallelize summing a vector of integers.
Function signature: `fn parallel_sum(data: &[i32]) -> i32`
<|im_end|>
<|im_start|>assistant
<|im_start|>think
""" 

inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
# Adjust generation parameters as needed
outputs = model.generate(**inputs, max_new_tokens=500, do_sample=True, temperature=0.6, top_p=0.9)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
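
When the reasoning system prompt is used, the model is instructed to wrap its final answer between <|begin_of_solution|> and <|end_of_solution|> markers. A small helper along these lines (a convenience sketch, not part of any official API) can strip the thought section from the decoded output; whether the markers survive decoding depends on whether the tokenizer treats them as special tokens.

def extract_solution(generated_text: str) -> str:
    """Return the text between the solution markers, or the full text if they are absent."""
    start_tag, end_tag = "<|begin_of_solution|>", "<|end_of_solution|>"
    if start_tag in generated_text and end_tag in generated_text:
        start = generated_text.index(start_tag) + len(start_tag)
        end = generated_text.index(end_tag, start)
        return generated_text[start:end].strip()
    return generated_text.strip()

# Continuing from the example above:
text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(extract_solution(text))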

Performance and Evaluation

  • Strengths:

    • Strong idiomatic Rust code generation.
    • Excellent integration capabilities with agent-based systems.
    • Understanding of common Rust patterns and standard library usage.
  • Weaknesses:

    • Complex Rust logic (e.g., advanced generics, macros, intricate lifetimes, unsafe code) may require manual post-processing or refinement.
    • May hallucinate non-existent crate features or incorrect API usage for less common libraries.

Technical Specifications

  • Architecture: Transformer-based LLM
  • Base Model: Qwen2.5-Coder-7B-Instruct
  • Precision: bf16 mixed precision (quantization options like q8 might be available depending on final model release)
  • Hardware Requirements: Recommended 12GB+ VRAM for bf16 (may vary with quantization)
  • Software Dependencies:
    • Hugging Face Transformers (transformers>=4.34)
    • PyTorch (torch>=2.0)
    • Accelerate (accelerate) for optimized loading/inference
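
As a rough sketch of a bf16 + Accelerate loading setup (the repo id is the same hypothetical one as above, and device_map="auto" is a suggestion rather than a requirement):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "tesslate/Tessa-Rust-T1"  # hypothetical repo id, as above
tokenizer = AutoTokenizer.from_pretrained(model_name)

# device_map="auto" requires `accelerate`; torch_dtype=torch.bfloat16 matches the
# bf16 precision listed above and roughly halves VRAM usage compared to fp32.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)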

Citation

@misc{tesslate_Tessa-Rust-T1,
  title={Tessa-Rust-T1: A Rust-Focused Code Generation Model},
  author={tesslate},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/tesslate/Tessa-7B}
}

Contact & Community

GGUF Instructions

Tesslate/Tessa-Rust-T1-7B-Q8_0-GGUF

This model was converted to GGUF format from Tesslate/Tessa-Rust-T1-7B using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model.

Use with llama.cpp

Install llama.cpp through brew (works on Mac and Linux)

brew install llama.cpp

Invoke the llama.cpp server or the CLI.

CLI:

llama-cli --hf-repo smirki/Tessa-Rust-T1-7B-Q8_0-GGUF --hf-file Tessa-Rust-T1-7b-q8_0.gguf -p "The meaning to life and the universe is"

Server:

llama-server --hf-repo smirki/Tessa-Rust-T1-7B-Q8_0-GGUF --hf-file Tessa-Rust-T1-7b-q8_0.gguf -c 2048

Note: You can also use this checkpoint directly through the usage steps listed in the llama.cpp repo.

Step 1: Clone llama.cpp from GitHub.

git clone https://github.com/ggerganov/llama.cpp

Step 2: Move into the llama.cpp folder and build it with the LLAMA_CURL=1 flag along with any other hardware-specific flags (e.g., LLAMA_CUDA=1 for Nvidia GPUs on Linux).

cd llama.cpp && LLAMA_CURL=1 make

Step 3: Run inference through the main binary.

./llama-cli --hf-repo smirki/Tessa-Rust-T1-7B-Q8_0-GGUF --hf-file Tessa-Rust-T1-7b-q8_0.gguf -p "The meaning to life and the universe is"

or

./llama-server --hf-repo smirki/Tessa-Rust-T1-7B-Q8_0-GGUF --hf-file Tessa-Rust-T1-7b-q8_0.gguf -c 2048
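
Once llama-server is running, it exposes an OpenAI-compatible chat completions endpoint (on port 8080 by default); below is a minimal sketch of querying it from Python, with the host, port, and prompt as placeholder assumptions.

import requests

# llama-server serves an OpenAI-compatible API; the default port is 8080
# unless overridden with --port.
response = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "messages": [
            {"role": "user", "content": "Write a Rust function that reverses a String."}
        ],
        "max_tokens": 512,
        "temperature": 0.6,
    },
)
print(response.json()["choices"][0]["message"]["content"])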