# qwen3-32b-verilog-lora

This is a LoRA (Low-Rank Adaptation) adapter for Qwen/Qwen3-32B, fine-tuned for Verilog code generation.
## Training Details
- Base Model: Qwen/Qwen3-32B
- Training Algorithm: GRPO (Group Relative Policy Optimization)
- LoRA Rank: 32
- LoRA Alpha: 32
- Target Modules: o_proj, k_proj, up_proj, v_proj, gate_proj, q_proj, down_proj
- Task: Verilog hardware description language code generation
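The adapter hyperparameters listed above map onto a PEFT `LoraConfig` roughly as follows. This is a sketch for orientation, not the original training script; the `task_type` value is an assumption.

```python
from peft import LoraConfig

# LoRA settings as listed in Training Details above.
# task_type="CAUSAL_LM" is assumed (standard for decoder-only LMs).
lora_config = LoraConfig(
    r=32,
    lora_alpha=32,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)
```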
## Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load base model and tokenizer
base_model_name = "Qwen/Qwen3-32B"
tokenizer = AutoTokenizer.from_pretrained(base_model_name)
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    torch_dtype="auto",
    device_map="auto",
)

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "sonyashijin/qwen3-32b-verilog-lora")

# Generate Verilog code
prompt = "Create a 4-bit D flip-flop with enable and asynchronous reset:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,  # cap generated tokens rather than total sequence length
    do_sample=True,      # sampling must be enabled for temperature to take effect
    temperature=0.7,
)
generated_code = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_code)
```
## Training Configuration
- Data: Custom Verilog training dataset
- Batch Size: 64
- Learning Rate: 3e-5
- KL Loss Coefficient: 0.001
- Max Prompt Length: 1200 tokens
- Max Response Length: 1200 tokens
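If the run used TRL's GRPO implementation, the hyperparameters above would correspond to a `GRPOConfig` along these lines. This is a hedged sketch: the use of TRL, the `output_dir` path, and treating the batch size as per-device are assumptions, not details confirmed by this card.

```python
from trl import GRPOConfig

# Hyperparameters from Training Configuration above, expressed in TRL's
# GRPOConfig parameter names. output_dir is a hypothetical placeholder.
training_args = GRPOConfig(
    output_dir="qwen3-32b-verilog-grpo",  # hypothetical path
    learning_rate=3e-5,
    beta=0.001,                  # KL loss coefficient
    max_prompt_length=1200,      # tokens
    max_completion_length=1200,  # max response length, tokens
    per_device_train_batch_size=64,  # assumed per-device; card says "Batch Size: 64"
)
```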
## Files

- `adapter_config.json`: LoRA adapter configuration
- `adapter_model.safetensors`: LoRA adapter weights (SafeTensors format)
## Citation

If you use this model, please cite the VERL (Verification Enhanced Reinforcement Learning) framework.

```bibtex
@misc{verl2024,
  title={VERL: Verification Enhanced Reinforcement Learning for Verilog Code Generation},
  author={Your Name},
  year={2024},
  url={https://huggingface.co/sonyashijin/qwen3-32b-verilog-lora}
}
```