# Indo-RoBERTa NLI

*Part of the AMR-Fact Checking Indonesia collection (7 items).*
This model is fine-tuned on the IndoNLI dataset for natural language inference in Indonesian.
Benchmarking across the dataset splits shows that this model achieved the best validation-set performance among the fine-tuned variants we compared.
This model was fine-tuned from `indo-roberta-base` for 4 epochs on the IndoNLI training set, with a classification head added for the NLI task.
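A fine-tuning setup like the one described could be configured roughly as follows. Apart from the 4 epochs stated above, every value here is an assumption for illustration, not the card's actual recipe:

```python
from transformers import TrainingArguments

# Sketch of a possible fine-tuning configuration;
# only num_train_epochs comes from the model card.
training_args = TrainingArguments(
    output_dir="indo-roberta-nli",
    num_train_epochs=4,               # stated above
    per_device_train_batch_size=16,   # assumption
    learning_rate=2e-5,               # assumption
)
```

These arguments would then be passed to a `Trainer` together with the model, tokenizer, and the IndoNLI train/validation splits.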
This model was trained on the IndoNLI dataset, which contains 10k sentence pairs as a benchmark for natural language inference (NLI) in Indonesian.
The dataset is split into a train set, a validation set, and two test sets (`test_lay` and `test_expert`).
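Each IndoNLI example pairs a premise with a hypothesis under a three-way label scheme. As a sketch, a single record might look like this (the field names and label order are assumptions based on common NLI conventions, not guaranteed to match the released files exactly):

```python
# Hypothetical IndoNLI-style record (field names assumed)
example = {
    "premise": "Seorang wanita sedang makan di restoran.",        # "A woman is eating at a restaurant."
    "hypothesis": "Seorang wanita sedang berada di luar ruangan.",  # "A woman is outdoors."
    "label": 2,  # 0 = entailment, 1 = neutral, 2 = contradiction (assumed order)
}

id2label = {0: "entailment", 1: "neutral", 2: "contradiction"}
print(id2label[example["label"]])  # the gold label for this pair
```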
## Usage

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the fine-tuned model and its tokenizer
tokenizer = AutoTokenizer.from_pretrained("fabhiansan/indo-roberta-nli")
model = AutoModelForSequenceClassification.from_pretrained("fabhiansan/indo-roberta-nli")

# Prepare the input: a premise-hypothesis pair
premise = "Seorang wanita sedang makan di restoran."          # "A woman is eating at a restaurant."
hypothesis = "Seorang wanita sedang berada di luar ruangan."  # "A woman is outdoors."

# Tokenize the pair as a single sequence
inputs = tokenizer(premise, hypothesis, return_tensors="pt")

# Run inference without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)
predictions = outputs.logits.argmax(dim=1)

# Map the predicted class id to its label
id2label = {0: "entailment", 1: "neutral", 2: "contradiction"}
predicted_label = id2label[predictions.item()]
print(f"Predicted label: {predicted_label}")
```
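Beyond the argmax label, applying a softmax to the logits yields a per-class confidence score, which is useful for thresholding uncertain predictions. A minimal, library-free sketch (the logit values below are made up for illustration):

```python
import math

def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for one premise-hypothesis pair
logits = [0.3, 1.2, 2.9]
probs = softmax(logits)

id2label = {0: "entailment", 1: "neutral", 2: "contradiction"}
best = max(range(len(probs)), key=lambda i: probs[i])
print(id2label[best], round(probs[best], 3))
```

In the real pipeline you would pass `outputs.logits[0].tolist()` in place of the made-up `logits` list.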
## Citation
If you use this model, please cite the IndoNLI paper:
```bibtex
@inproceedings{mahendra-etal-2021-indonli,
title = {IndoNLI: A Natural Language Inference Dataset for Indonesian},
author = {Mahendra, Rahmad and Aji, Alham Fikri and Louvan, Samuel and Rahman, Fahrurrozi and Vania, Clara},
booktitle = {Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing},
year = {2021},
publisher = {Association for Computational Linguistics},
}
```