---
license: apache-2.0
base_model: Qwen/Qwen2.5-7B-Instruct
library_name: transformers
tags:
- translation
- chinese
- indonesian
- qwen
- lora
- fine-tuned
- traditional-chinese
- news
model-index:
- name: Royal_ZhTW-ID_finetuned_101
  results: []
language:
- zh
- id
pipeline_tag: text2text-generation
---

# Qwen2.5-7B Traditional Chinese ↔ Indonesian Translation Model

This model is a fine-tuned version of [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct), optimized specifically for Traditional Chinese ↔ Indonesian translation.

## Model Description

This model specializes in translating between Traditional Chinese and Indonesian and was trained on a Taiwan news corpus. It is particularly effective for news articles, formal documents, and general text in this language pair.

### Key Features

- 🌏 **Bidirectional Translation**: Traditional Chinese ↔ Indonesian
- πŸ“° **News Domain Optimized**: Trained on a Taiwan news corpus
- ⚑ **Efficient Fine-tuning**: Uses LoRA (Low-Rank Adaptation) for faster training
- 🎯 **Specialized Vocabulary**: Enhanced for Taiwan-specific terms and their Indonesian equivalents

## Training Details

### Base Model

- **Base Model**: Qwen/Qwen2.5-7B-Instruct
- **Model Type**: Causal language model with translation capabilities

### Fine-tuning Configuration

- **Method**: LoRA (Low-Rank Adaptation)
- **LoRA Rank**: 8
- **LoRA Alpha**: 32
- **Learning Rate**: 2e-4
- **Training Epochs**: 3
- **Max Samples**: 1,000 (initial validation)
- **Template**: Qwen conversation format

### Dataset

- **Source**: Taiwan news in Traditional Chinese with Indonesian translations
- **Editor**: Chang, Yo Han
- **Domain**: News articles and formal text
- **Language Pair**: Traditional Chinese (zh-TW) ↔ Indonesian (id)
- **Note**: The dataset is proprietary and not publicly available on Hugging Face

## Usage

### Installation

```bash
pip install transformers torch peft
```

### Basic Usage

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-7B-Instruct",
    torch_dtype=torch.float16,
    device_map="auto"
)

# Load the LoRA adapter
model = PeftModel.from_pretrained(
    base_model,
    "roylin1003/Royal_ZhTW-ID_finetuned_101"
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")

# Translation function
def translate_text(text, source_lang="zh", target_lang="id"):
    if source_lang == "zh" and target_lang == "id":
        prompt = f"θ«‹ε°‡δ»₯δΈ‹δΈ­ζ–‡ηΏ»θ­―ζˆε°ε°Όζ–‡οΌš{text}"
    elif source_lang == "id" and target_lang == "zh":
        prompt = f"Terjemahkan teks bahasa Indonesia berikut ke bahasa Tionghoa: {text}"
    else:
        raise ValueError(f"Unsupported language pair: {source_lang} -> {target_lang}")

    messages = [
        {"role": "user", "content": prompt}
    ]
    chat_text = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True
    )

    model_inputs = tokenizer([chat_text], return_tensors="pt").to(model.device)

    generated_ids = model.generate(
        **model_inputs,
        max_new_tokens=512,
        do_sample=True,
        temperature=0.7,
        pad_token_id=tokenizer.eos_token_id
    )

    # Strip the prompt tokens so only the generated continuation is decoded
    generated_ids = [
        output_ids[len(input_ids):]
        for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
    ]

    response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
    return response

# Example usage
chinese_text = "ε°η£ηš„η§‘ζŠ€η”’ζ₯­η™Όε±•θΏ…ι€ŸοΌŒη‰Ήεˆ₯ζ˜―εœ¨εŠε°Žι«”ι ˜εŸŸγ€‚"
indonesian_translation = translate_text(chinese_text, "zh", "id")
print(f"Chinese: {chinese_text}")
print(f"Indonesian: {indonesian_translation}")

indonesian_text = "Indonesia adalah negara kepulauan terbesar di dunia."
chinese_translation = translate_text(indonesian_text, "id", "zh")
print(f"Indonesian: {indonesian_text}")
print(f"Chinese: {chinese_translation}")
```
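### Merging the Adapter (Optional)

For deployment, you may want to fold the LoRA weights into the base model so inference no longer requires the PEFT wrapper. Below is a minimal sketch using PEFT's `merge_and_unload()`; the output directory name `./Royal_ZhTW-ID_merged` is an illustrative placeholder, not an official artifact:

```python
# Merge the LoRA adapter into the base weights (sketch; directory name is arbitrary)
merged_model = model.merge_and_unload()  # returns a plain transformers model
merged_model.save_pretrained("./Royal_ZhTW-ID_merged")
tokenizer.save_pretrained("./Royal_ZhTW-ID_merged")

# The merged checkpoint can then be loaded directly, without peft:
# model = AutoModelForCausalLM.from_pretrained(
#     "./Royal_ZhTW-ID_merged", torch_dtype=torch.float16, device_map="auto"
# )
```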
### Advanced Usage with Custom Parameters

```python
def translate_with_options(text, source_lang="zh", target_lang="id",
                           temperature=0.7, max_tokens=512):
    # ... (same prompt construction and tokenization as translate_text above)

    generated_ids = model.generate(
        **model_inputs,
        max_new_tokens=max_tokens,
        do_sample=True,
        temperature=temperature,
        top_p=0.9,
        repetition_penalty=1.1,
        pad_token_id=tokenizer.eos_token_id
    )

    # ... (same decoding as translate_text above)
    return response
```

## Model Performance

### Training Metrics

- **Training Loss**: Converged after 3 epochs
- **Learning Rate**: 2e-4 with linear decay
- **Batch Size**: Tuned to available GPU memory

### Evaluation

The model was trained on a curated dataset of Taiwan news articles with Indonesian translations. Quantitative evaluation is ongoing.

## Limitations and Considerations

### Known Limitations

- **Domain Specificity**: Optimized for news and formal text; may underperform on casual conversation
- **Training Data Size**: Initial training used only 1,000 samples for quick validation
- **Cultural Context**: May require additional fine-tuning for region-specific terminology

### Recommended Use Cases

- πŸ“° News article translation
- πŸ“„ Formal document translation
- 🏒 Business communication between Taiwan and Indonesia
- πŸ“š Educational content translation

### Not Recommended For

- Real-time conversation (use specialized conversational models)
- Medical or legal documents (these require domain-specific models)
- Creative writing (may lack stylistic nuance)

## Training Infrastructure

### Hardware Requirements

- **Minimum**: GPU with 16 GB VRAM
- **Recommended**: GPU with 24 GB+ VRAM for optimal performance
- **Training Time**: Approximately 2-3 hours on modern GPUs

### Software Dependencies

```
transformers>=4.36.0
torch>=2.0.0
peft>=0.7.0
datasets>=2.15.0
```

## Citation

If you use this model in your research or applications, please cite:

```bibtex
@misc{Royal_ZhTW-ID_finetuned_101,
  title={Qwen2.5-7B Traditional Chinese-Indonesian Translation Model},
  author={Roy Lin},
  year={2024},
  howpublished={\url{https://huggingface.co/roylin1003/Royal_ZhTW-ID_finetuned_101}},
  note={Fine-tuned on a Taiwan news corpus edited by Chang, Yo Han}
}
```

## Acknowledgments

- **Base Model**: Thanks to the Qwen team for the excellent Qwen2.5-7B-Instruct model
- **Dataset**: Taiwan news corpus with Indonesian translations, edited by Chang, Yo Han
- **Framework**: Built with the Hugging Face Transformers and PEFT libraries

## License

This model is released under the Apache 2.0 License, consistent with the base Qwen2.5-7B-Instruct model.

## Contact

For questions, issues, or collaborations, please open an issue in this repository or contact [your contact information].

---

**Model Version**: 1.0
**Last Updated**: [Current Date]
**Status**: Initial Release - Validation Phase

---