+---
+language: vi
+tags:
+- ner
+- named-entity-recognition
+- slot-filling
+- smart-home
+- vietnamese
+- phobert
+- token-classification
+license: mit
+datasets:
+- custom-vn-slu-augmented
+metrics:
+- accuracy
+- f1
+- precision
+- recall
+model-index:
+- name: PhoBERT NER for Vietnamese Smart Home Slot Filling
+  results:
+  - task:
+      type: token-classification
+      name: Named Entity Recognition
+    dataset:
+      name: VN-SLU Augmented Dataset
+      type: custom
+    metrics:
+    - type: accuracy
+      value: 96.64
+      name: Accuracy
+    - type: f1
+      value: 86.55
+      name: F1 Score (Weighted)
+    - type: f1
+      value: 67.04
+      name: F1 Score (Macro)
+widget:
+- text: "bật đèn phòng khách"
+- text: "tắt quạt phòng ngủ lúc 10 giờ tối"
+- text: "điều chỉnh nhiệt độ điều hòa 25 độ"
+- text: "mở cửa garage sau 5 phút"
+---
+# PhoBERT Fine-tuned for Vietnamese Smart Home NER/Slot Filling
+This model is a fine-tuned version of [vinai/phobert-base](https://huggingface.co/vinai/phobert-base) for Named Entity Recognition (NER) in Vietnamese smart home commands. It extracts slot values such as devices, locations, times, and numeric values from user commands.
+## Model Description
+- **Base Model**: vinai/phobert-base
+- **Task**: Token Classification / Slot Filling for Smart Home Commands
+- **Language**: Vietnamese
+- **Training Data**: VN-SLU Augmented Dataset (4,000 training samples)
+- **Number of Slot Labels**: 13 (6 entity types in BIO format, plus `O`)
+## Intended Uses & Limitations
+### Intended Uses
+- Extracting entities from Vietnamese smart home voice commands
+- Slot filling for voice assistant systems
+- Integration with intent classification for a complete NLU pipeline
+- Research in Vietnamese NLP for IoT applications
+### Limitations
+- Optimized specifically for the smart home domain; may not generalize well to other domains
+- Trained on Vietnamese only
+- Performs best when paired with the corresponding intent classifier
+## Entity Types (Slot Labels)
+The model predicts 13 labels in BIO format: a `B-`/`I-` pair for each of six entity types, plus `O`:
+1. `B-device` / `I-device` - Device names (e.g., "đèn", "quạt", "điều hòa")
+2. `B-living_space` / `I-living_space` - Room/location names (e.g., "phòng khách", "phòng ngủ")
+3. `B-time_at` / `I-time_at` - Specific times (e.g., "10 giờ tối", "7 giờ sáng")
+4. `B-duration` / `I-duration` - Time durations (e.g., "5 phút", "2 giờ")
+5. `B-target_number` / `I-target_number` - Target values (e.g., "25 độ", "50%")
+6. `B-changing_value` / `I-changing_value` - Change amounts (e.g., "tăng 10%")
+7. `O` - Outside/No entity
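The count of 13 follows from the BIO scheme: each of the six entity types gets a `B-` and an `I-` tag, and there is a single `O` tag. The sketch below only illustrates that arithmetic. The label ordering and the helper name `build_bio_labels` are assumptions; the actual id-to-label mapping is whatever ships with the model.

```python
# Entity types as listed in this model card.
ENTITY_TYPES = [
    "device",
    "living_space",
    "time_at",
    "duration",
    "target_number",
    "changing_value",
]


def build_bio_labels(entity_types):
    """Return 'O' plus a B-/I- pair for every entity type.

    The ordering is illustrative only (hypothetical helper).
    """
    labels = ["O"]
    for entity_type in entity_types:
        labels.append(f"B-{entity_type}")
        labels.append(f"I-{entity_type}")
    return labels


labels = build_bio_labels(ENTITY_TYPES)
print(len(labels))
```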
+## How to Use
+### Using Transformers Library
+```python
+from transformers import AutoTokenizer, AutoModelForTokenClassification
+import torch
+import json
+# Load model and tokenizer
+model_name = "ntgiaky/phobert-ner-smart-home"
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+model = AutoModelForTokenClassification.from_pretrained(model_name)
+# Load label mappings (expects label_mappings.json in the working directory)
+with open('label_mappings.json', 'r', encoding='utf-8') as f:
+    label_mappings = json.load(f)
+id2label = {int(k): v for k, v in label_mappings['id2label'].items()}
+def extract_entities(text):
+    # Tokenize
+    inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True)
+    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
+    # Predict
+    with torch.no_grad():
+        outputs = model(**inputs)
+        predictions = torch.argmax(outputs.logits, dim=2)
+    # Extract entities
+    entities = []
+    current_entity = None
+    current_tokens = []
+    for token, pred_id in zip(tokens, predictions[0]):
+        label = id2label[pred_id.item()]
+        if label.startswith('B-'):
+            # Save previous entity if exists
+            if current_entity:
+                entities.append({
+                    'type': current_entity,
+                    'text': tokenizer.convert_tokens_to_string(current_tokens)
+                })
+            # Start new entity
+            current_entity = label[2:]
+            current_tokens = [token]
+        elif label.startswith('I-') and current_entity == label[2:]:
+            # Continue current entity
+            current_tokens.append(token)
+        else:
+            # End current entity
+            if current_entity:
+                entities.append({
+                    'type': current_entity,
+                    'text': tokenizer.convert_tokens_to_string(current_tokens)
+                })
+            current_entity = None
+            current_tokens = []
+    # Don't forget last entity
+    if current_entity:
+        entities.append({
+            'type': current_entity,
+            'text': tokenizer.convert_tokens_to_string(current_tokens)
+        })
+    return entities
+# Example usage
+text = "bật đèn phòng khách lúc 7 giờ tối"
+entities = extract_entities(text)
+print(f"Input: {text}")
+print(f"Entities: {entities}")
+```
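The BIO grouping step inside `extract_entities` can be exercised without downloading the model. The sketch below is a standalone version of that loop over pre-labeled tokens. `decode_bio` is a hypothetical helper name, the labels in the example are hand-assigned rather than model output, and joining tokens with spaces is a simplification that ignores the tokenizer's subword markers.

```python
def decode_bio(tokens, labels):
    """Group (token, BIO label) pairs into entity spans."""
    entities = []
    current_type, current_tokens = None, []

    def flush():
        # Emit the entity collected so far, if there is one.
        if current_type is not None:
            entities.append(
                {"type": current_type, "text": " ".join(current_tokens)}
            )

    for token, label in zip(tokens, labels):
        if label.startswith("B-"):
            # A B- tag always closes the previous entity and opens a new one.
            flush()
            current_type, current_tokens = label[2:], [token]
        elif label.startswith("I-") and current_type == label[2:]:
            # An I- tag of the same type extends the open entity.
            current_tokens.append(token)
        else:
            # 'O' or a mismatched I- tag closes the open entity.
            flush()
            current_type, current_tokens = None, []
    flush()
    return entities


# Hand-labeled example (not model output).
tokens = ["bật", "đèn", "phòng", "khách"]
tags = ["O", "B-device", "B-living_space", "I-living_space"]
decoded = decode_bio(tokens, tags)
print(decoded)
```

Note that, as in the original loop, an `I-` tag that does not match the open entity is discarded rather than starting a new span.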
+### Using Pipeline
+```python
+from transformers import pipeline
+# Load NER pipeline
+ner = pipeline(
+    "token-classification",
+    model="ntgiaky/phobert-ner-smart-home",
+    aggregation_strategy="simple"
+)
+# Extract entities
+result = ner("tắt quạt phòng ngủ sau 10 phút")
+print(result)
+```
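With `aggregation_strategy="simple"`, the pipeline returns a list of dictionaries with keys such as `entity_group`, `word`, and `score`. A small sketch for reshaping that output into the `type`/`text` form used in the examples in this card follows. The helper name `to_slots` and the `0.5` score threshold are illustrative assumptions, and the sample list is hand-written rather than real model output.

```python
def to_slots(pipeline_output, min_score=0.5):
    """Convert aggregated pipeline results into {'type', 'text'} dicts.

    Hypothetical helper; `min_score` is an arbitrary illustrative cutoff.
    """
    slots = []
    for item in pipeline_output:
        # Skip low-confidence groups.
        if item["score"] < min_score:
            continue
        slots.append(
            {"type": item["entity_group"], "text": item["word"].strip()}
        )
    return slots


# Hand-written sample in the shape the pipeline returns (not real output).
sample = [
    {"entity_group": "device", "score": 0.98, "word": "quạt"},
    {"entity_group": "living_space", "score": 0.95, "word": "phòng ngủ"},
    {"entity_group": "duration", "score": 0.30, "word": "10 phút"},
]
slots = to_slots(sample)
print(slots)
```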
+## Integration with Intent Classification
+For a complete NLU pipeline:
+```python
+from transformers import pipeline
+# Load both models
+intent_classifier = pipeline("text-classification", model="ntgiaky/phobert-intent-classifier-smart-home")
+ner = pipeline("token-classification", model="ntgiaky/phobert-ner-smart-home", aggregation_strategy="simple")
+def process_command(text):
+    # Get intent
+    intent_result = intent_classifier(text)
+    intent = intent_result[0]['label']
+    # Get entities
+    entities = ner(text)
+    # Combine results
+    return {
+        'text': text,
+        'intent': intent,
+        'entities': entities
+    }
+# Example
+command = "điều chỉnh nhiệt độ điều hòa 25 độ"
+result = process_command(command)
+print(result)
+```
+## Example Outputs
+```python
+# Input: "bật đèn phòng khách"
+# Entities: [
+#   {'type': 'device', 'text': 'đèn'},
+#   {'type': 'living_space', 'text': 'phòng khách'}
+# ]
+# Input: "tắt quạt phòng ngủ lúc 10 giờ tối"
+# Entities: [
+#   {'type': 'device', 'text': 'quạt'},
+#   {'type': 'living_space', 'text': 'phòng ngủ'},
+#   {'type': 'time_at', 'text': '10 giờ tối'}
+# ]
+# Input: "điều chỉnh nhiệt độ điều hòa 25 độ"
+# Entities: [
+#   {'type': 'device', 'text': 'điều hòa'},
+#   {'type': 'target_number', 'text': '25 độ'}
+# ]
+```
+## Citation
+If you use this model, please cite:
+```bibtex
+@misc{phobert-ner-smart-home-2025,
+  author = {Trần Quang Huy and Nguyễn Trần Gia Kỳ},
+  title = {PhoBERT Fine-tuned for Vietnamese Smart Home NER},
+  year = {2025},
+  publisher = {Hugging Face},
+  journal = {Hugging Face Model Hub},
+  howpublished = {\url{https://huggingface.co/ntgiaky/phobert-ner-smart-home}}
+}
+```
+## Authors
+- **Trần Quang Huy**
+- **Nguyễn Trần Gia Kỳ**
+- **Advisor**: Dr. Đoàn Duy
+## License
+This model is released under the MIT License.