@@ -1,25 +1,78 @@
-# Qwen3-32B Verilog LoRA Adapter
-This repository contains a LoRA adapter trained on the Qwen3-32B model for Verilog code generation. The adapter was trained using GRPO (Generalized Reinforcement Learning with Policy Optimization) on the VerilogEval dataset.
-## Model Details
 - **Base Model**: Qwen/Qwen3-32B
-- **Task**: Verilog Code Generation
-- **Training Method**: GRPO
-- **Framework**: vLLM
-## Usage with vLLM
-```bash
-python -m vllm.entrypoints.openai.api_server \
-    --model "Qwen/Qwen3-32B" \
-    --enable-lora \
-    --lora-modules verilog-sft="sonyashijin/qwen3-32b-verilog-lora" \
-    --max-loras 1 \
-    --max-lora-rank 32
 ```
-## License
-Apache 2.0

+---
+library_name: peft
+base_model: Qwen/Qwen3-32B
+tags:
+- verilog
+- code-generation
+- lora
+- qwen3
+- verl
+- grpo
+license: apache-2.0
+---
+# qwen3-32b-verilog-lora
+This is a LoRA (Low-Rank Adaptation) adapter for **Qwen/Qwen3-32B** fine-tuned for **Verilog code generation**.
+## Training Details
 - **Base Model**: Qwen/Qwen3-32B
+- **Training Algorithm**: GRPO (Group Relative Policy Optimization)
+- **LoRA Rank**: 32
+- **LoRA Alpha**: 32
+- **Target Modules**: o_proj, k_proj, up_proj, v_proj, gate_proj, q_proj, down_proj
+- **Task**: Verilog hardware description language code generation
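To make the rank and alpha values above concrete, here is a minimal sketch of the two quantities they control in the standard LoRA formulation: the scaling applied to the low-rank update (`alpha / r`) and the number of trainable parameters added per adapted linear layer. The 4096 x 4096 layer shape is a made-up example for illustration, not a dimension taken from Qwen3-32B, and the helper names are hypothetical.

```python
# LoRA hyperparameters as listed in the Training Details above.
LORA_RANK = 32
LORA_ALPHA = 32


def lora_scaling(alpha: int, rank: int) -> float:
    """Scaling factor applied to the low-rank update B @ A (standard LoRA)."""
    return alpha / rank


def lora_params_per_layer(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for one adapted linear layer.

    A has shape (rank, d_in) and B has shape (d_out, rank).
    """
    return rank * (d_in + d_out)


scaling = lora_scaling(LORA_ALPHA, LORA_RANK)
# Illustrative layer shape only; check the base model's config for real sizes.
example_params = lora_params_per_layer(4096, 4096, LORA_RANK)

print(f"scaling = {scaling}")
print(f"trainable params for a 4096x4096 layer = {example_params}")
```

With rank equal to alpha, the update is applied at unit scale, so the learned `B @ A` product is added to the frozen weight without extra amplification.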
+## Usage
+```python
+from transformers import AutoTokenizer, AutoModelForCausalLM
+from peft import PeftModel
+# Load base model and tokenizer
+base_model_name = "Qwen/Qwen3-32B"
+tokenizer = AutoTokenizer.from_pretrained(base_model_name)
+base_model = AutoModelForCausalLM.from_pretrained(
+    base_model_name,
+    torch_dtype="auto",
+    device_map="auto"
+)
+# Load LoRA adapter
+model = PeftModel.from_pretrained(base_model, "sonyashijin/qwen3-32b-verilog-lora")
+# Generate Verilog code
+prompt = "Create a 4-bit D flip-flop with enable and asynchronous reset:"
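+# Move inputs to the same device as the model
+inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+# do_sample=True is needed for temperature to take effect
+outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7)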
+generated_code = tokenizer.decode(outputs[0], skip_special_tokens=True)
+print(generated_code)
 ```
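The decoded output from the snippet above contains the prompt and possibly surrounding prose, so a small post-processing step can help isolate the Verilog itself. The following is a rough sketch, assuming each module starts at the beginning of a line. The helper name `extract_verilog_modules` and the sample string are hypothetical, and the regex will miss unusual formatting.

```python
import re

# Match from a line that starts with `module` up to the nearest `endmodule`.
# Anchoring at line start avoids matching the word "module" inside prose.
MODULE_RE = re.compile(
    r"^[ \t]*(module\b.*?\bendmodule\b)",
    re.DOTALL | re.MULTILINE,
)


def extract_verilog_modules(text: str) -> list[str]:
    """Return every `module ... endmodule` block found in the text."""
    return MODULE_RE.findall(text)


# Hypothetical model output, used only to exercise the helper.
sample_output = (
    "Create a module for me:\n"
    "module dff(input clk, input d, output reg q);\n"
    "  always @(posedge clk) q <= d;\n"
    "endmodule\n"
    "Hope this helps."
)

modules = extract_verilog_modules(sample_output)
for block in modules:
    print(block)
```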
+## Training Configuration
+- **Data**: Custom Verilog training dataset
+- **Batch Size**: 64
+- **Learning Rate**: 3e-5
+- **KL Loss Coefficient**: 0.001
+- **Max Prompt Length**: 1200 tokens
+- **Max Response Length**: 1200 tokens
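For readers unfamiliar with GRPO, its core idea is to sample a group of responses per prompt and score each one relative to the rest of its group. The sketch below illustrates that normalization step only. The reward values, the group size, and the `eps` constant are hypothetical, and the use of the population standard deviation is an assumption. This card does not state how rewards were computed.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Normalize each reward against the mean and std of its own group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption for this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four responses sampled from the same prompt,
# e.g. 1.0 if the generated Verilog passed a check and 0.0 otherwise.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Responses that beat their group average get a positive advantage and the rest get a negative one. The KL loss coefficient listed above weights a separate penalty that keeps the policy close to the reference model.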
+## Files
+- `adapter_config.json`: LoRA adapter configuration
+- `adapter_model.safetensors`: LoRA adapter weights (safetensors format)
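As a quick sanity check, the fields in `adapter_config.json` can be compared against the values documented on this card. The sketch below is illustrative: the dictionary is built by hand from the Training Details rather than read from the actual file, and the helper `find_mismatches` is hypothetical. The keys `r`, `lora_alpha`, `target_modules`, and `base_model_name_or_path` are standard PEFT config fields.

```python
import json

# Values documented on this card.
EXPECTED = {
    "base_model_name_or_path": "Qwen/Qwen3-32B",
    "r": 32,
    "lora_alpha": 32,
    "target_modules": sorted(
        ["o_proj", "k_proj", "up_proj", "v_proj", "gate_proj", "q_proj", "down_proj"]
    ),
}


def find_mismatches(config: dict) -> list[str]:
    """Return the names of fields that differ from the documented values."""
    mismatches = []
    for key, expected in EXPECTED.items():
        actual = config.get(key)
        if key == "target_modules" and actual is not None:
            actual = sorted(actual)  # order in the file is not significant
        if actual != expected:
            mismatches.append(key)
    return mismatches


# Hand-written stand-in for the file contents; in practice, load it with
# json.load(open("adapter_config.json")).
config_text = json.dumps(
    {
        "peft_type": "LORA",
        "base_model_name_or_path": "Qwen/Qwen3-32B",
        "r": 32,
        "lora_alpha": 32,
        "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj",
                           "gate_proj", "up_proj", "down_proj"],
    }
)
mismatches = find_mismatches(json.loads(config_text))
print(mismatches)
```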
+## Citation
+If you use this model, please cite this repository. It was trained with the verl framework (Volcano Engine Reinforcement Learning for LLMs).
+```bibtex
+@misc{verl2024,
+  title={VERL: Verification Enhanced Reinforcement Learning for Verilog Code Generation},
+  author={Your Name},
+  year={2024},
+  url={https://huggingface.co/sonyashijin/qwen3-32b-verilog-lora}
+}
+```