---
language: vi
tags:
  - intent-classification
  - smart-home
  - vietnamese
  - phobert
license: mit
datasets:
  - custom-vn-slu-augmented
metrics:
  - accuracy
  - f1
model-index:
  - name: PhoBERT Intent Classifier for Vietnamese Smart Home
    results:
      - task:
          type: text-classification
          name: Intent Classification
        dataset:
          name: VN-SLU Augmented Dataset
          type: custom
        metrics:
          - type: accuracy
            value: 98.3
            name: Accuracy
          - type: f1
            value: 97.72
            name: F1 Score (Weighted)
          - type: f1
            value: 71.9
            name: F1 Score (Macro)
widget:
  - text: bật đèn phòng khách
  - text: tắt quạt phòng ngủ lúc 10 giờ tối
  - text: kiểm tra tình trạng điều hòa
  - text: tăng độ sáng đèn bàn
  - text: mở cửa chính
---

PhoBERT Fine-tuned for Vietnamese Smart Home Intent Classification

This model is a fine-tuned version of vinai/phobert-base, trained specifically for intent classification of Vietnamese smart home commands.

Model Description

  • Base Model: vinai/phobert-base
  • Task: Intent Classification for Smart Home Commands
  • Language: Vietnamese
  • Training Data: VN-SLU Augmented Dataset (4,000 training samples)
  • Number of Intent Classes: 13

Intended Uses & Limitations

Intended Uses

  • Classifying user intents in Vietnamese smart home voice commands
  • Integration with voice assistants for home automation
  • Research in Vietnamese NLP for IoT applications

Limitations

  • Optimized specifically for the smart home domain
  • May not generalize well to other domains
  • Trained on Vietnamese-language data only

Performance

Based on evaluation on 1,000 test samples:

  • Accuracy: 98.3%
  • F1 Score (Weighted): 97.72%
  • F1 Score (Macro): 71.90%
  • Eval Loss: 0.0834

The gap between the weighted and macro F1 scores indicates that performance is lower on the less frequent intent classes.
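
For reference, a minimal sketch of how these metrics are typically computed with scikit-learn; y_true and y_pred are assumed here to be the encoded test labels and the model's predicted class indices:

from sklearn.metrics import accuracy_score, f1_score

accuracy = accuracy_score(y_true, y_pred)                    # fraction of exact matches
f1_weighted = f1_score(y_true, y_pred, average="weighted")   # per-class F1 weighted by support
f1_macro = f1_score(y_true, y_pred, average="macro")         # unweighted mean over the 13 classes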

Training Details

Training Configuration

  • Learning Rate: 2e-5
  • Batch Size: 16
  • Number of Epochs: 3
  • Warmup Ratio: 0.1
  • Weight Decay: 0.01
  • Max Length: 128
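
This configuration maps directly onto the Hugging Face Trainer API. A minimal sketch, assuming the model and tokenizer are loaded as in the How to Use section and that train_dataset and eval_dataset are hypothetical pre-tokenized datasets (max_length=128 is applied at tokenization time):

from transformers import TrainingArguments, Trainer

training_args = TrainingArguments(
    output_dir="phobert-intent-classifier",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    num_train_epochs=3,
    warmup_ratio=0.1,
    weight_decay=0.01,
)

trainer = Trainer(
    model=model,                  # PhoBERT sequence classifier with 13 labels
    args=training_args,
    train_dataset=train_dataset,  # hypothetical tokenized training split
    eval_dataset=eval_dataset,    # hypothetical tokenized evaluation split
    tokenizer=tokenizer,
)
trainer.train()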

Hardware

  • Trained on: NVIDIA GPU
  • Training Time: ~79 seconds
  • Deployment Target: Raspberry Pi 5

Intent Classes

The model can classify the following 13 intents:

  1. bật thiết bị (turn on device)
  2. tắt thiết bị (turn off device)
  3. mở thiết bị (open device)
  4. đóng thiết bị (close device)
  5. tăng độ sáng của thiết bị (increase device brightness)
  6. giảm độ sáng của thiết bị (decrease device brightness)
  7. kiểm tra tình trạng thiết bị (check device status)
  8. điều chỉnh nhiệt độ (adjust temperature)
  9. hẹn giờ (set timer)
  10. kích hoạt cảnh (activate scene)
  11. tắt tất cả thiết bị (turn off all devices)
  12. mở khóa (unlock)
  13. khóa (lock)

How to Use

Using Transformers Library

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
import pickle

# Load model and tokenizer
model_name = "ntgiaky/phobert-intent-classifier-smart-home"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

# Load the label encoder used during training (intent_encoder.pkl)
with open('intent_encoder.pkl', 'rb') as f:
    label_encoder = pickle.load(f)

# Predict intent
def predict_intent(text):
    # Tokenize
    inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=128)
    
    # Predict
    with torch.no_grad():
        outputs = model(**inputs)
        predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
        predicted_class = torch.argmax(predictions, dim=-1)
    
    # Decode label
    intent = label_encoder.inverse_transform(predicted_class.cpu().numpy())[0]
    confidence = predictions[0][predicted_class].item()
    
    return intent, confidence

# Example usage
text = "bật đèn phòng khách"
intent, confidence = predict_intent(text)
print(f"Intent: {intent}, Confidence: {confidence:.2f}")

Using Pipeline

from transformers import pipeline

# Load pipeline
classifier = pipeline(
    "text-classification",
    model="ntgiaky/phobert-intent-classifier-smart-home",
    device=0  # Use -1 for CPU
)

# Predict
result = classifier("tắt quạt phòng ngủ")
print(result)
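
The pipeline returns whatever label names are stored in the model config. If those entries are generic (e.g. LABEL_3), the index can be mapped back to an intent name with the same intent_encoder.pkl used above; a hedged sketch:

import pickle

with open("intent_encoder.pkl", "rb") as f:
    label_encoder = pickle.load(f)

top = result[0]                               # e.g. {"label": "LABEL_3", "score": 0.99}
label = top["label"]
if label.startswith("LABEL_"):                # generic config labels -> decode via the encoder
    label = label_encoder.inverse_transform([int(label.split("_")[-1])])[0]
print(label, top["score"])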

Integration Example

# For Raspberry Pi deployment: export to ONNX once, then run inference with ONNX Runtime
import numpy as np
import onnxruntime as ort
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "ntgiaky/phobert-intent-classifier-smart-home"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, return_dict=False).eval()

# Convert to ONNX first (one-time); torch.onnx.export is one standard route
dummy = tokenizer("bật đèn phòng khách", return_tensors="pt")
torch.onnx.export(
    model, (dummy["input_ids"], dummy["attention_mask"]), "model.onnx",
    input_names=["input_ids", "attention_mask"], output_names=["logits"],
    dynamic_axes={"input_ids": {0: "batch", 1: "seq"}, "attention_mask": {0: "batch", 1: "seq"}},
    opset_version=14,
)

# Then use ONNX Runtime for inference (PyTorch is not needed on the Pi at this point)
session = ort.InferenceSession("model.onnx")
enc = tokenizer("tắt quạt phòng ngủ", return_tensors="np")
logits = session.run(None, {k: enc[k].astype("int64") for k in ["input_ids", "attention_mask"]})[0]
intent_id = int(np.argmax(logits, axis=-1)[0])  # decode with the label encoder as shown above

Dataset

This model was trained on an augmented version of the VN-SLU dataset, which includes:

  • Original recordings from 240 Vietnamese speakers
  • Augmented samples using various techniques
  • Smart-home-specific vocabulary and commands

Citation

If you use this model, please cite:

@misc{phobert-smart-home-2025,
  author = {Trần Quang Huy and Nguyễn Trần Gia Kỳ},
  title = {PhoBERT Fine-tuned for Vietnamese Smart Home Intent Classification},
  year = {2025},
  publisher = {Hugging Face},
  journal = {Hugging Face Model Hub},
  howpublished = {\url{https://huggingface.co/ntgiaky/phobert-intent-classifier-smart-home}}
}

Authors

  • Trần Quang Huy
  • Nguyễn Trần Gia Kỳ
  • Advisor: Dr. Đoàn Duy

License

This model is released under the MIT License.

Contact

For questions or issues, please open an issue on the model repository or contact the authors through the university.