
roberta-zh fine-tuned on human-annotated self-chat data from conversational models. It performs 2-class classification for multi-turn dialogue specificity detection, i.e. judging whether a response is specific to its dialogue context. Usage example:

NOTE: the model should only be applied to data whose distribution is similar to its training data.

import torch
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained('thu-coai/roberta-zh-specific')
model = BertForSequenceClassification.from_pretrained('thu-coai/roberta-zh-specific', num_labels=2)
model.eval()  # switch to evaluation mode (disables dropout)

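# two (context, response) pairs; turns within a context are separated by "\t"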
context = [
    "你大爱的冷门古诗词是什么?\t一枝红艳露凝香,云雨巫山枉断肠",
    "你大爱的冷门古诗词是什么?\t一枝红艳露凝香,云雨巫山枉断肠",
]

response = [
    "我也很喜欢,我觉得这句的意境很美",
    "我也很喜欢",
]

# contexts and responses are encoded together as sentence pairs
model_input = tokenizer(context, response, return_tensors='pt', padding=True, truncation=True)
with torch.no_grad():
    model_output = model(**model_input, return_dict=True)
logits = model_output.logits
preds_all = torch.argmax(logits, dim=-1).cpu()
print(preds_all)  # 1 = specific response, 0 = not specific
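
If you need graded scores rather than hard labels, you can apply a softmax to the logits and read off the probability of the positive class as a specificity score. A minimal sketch, reusing the `logits` computed above (the variable names below are illustrative, not part of the model's API):

import torch.nn.functional as F

# class probabilities per (context, response) pair; index 1 is the "specific" class
probs = F.softmax(logits, dim=-1)
specificity_scores = probs[:, 1].cpu().tolist()
print(specificity_scores)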
