---
language:
- en
license: mit
tags:
- text-classification
- int8
- Intel® Neural Compressor
- neural-compressor
- PostTrainingStatic
datasets:
- nyu-mll/glue
metrics:
- f1
model-index:
- name: roberta-base-mrpc-int8-static
  results:
  - task:
      type: text-classification
      name: Text Classification
    dataset:
      name: GLUE MRPC
      type: glue
      args: mrpc
    metrics:
    - type: f1
      value: 0.924693520140105
      name: F1
---
# INT8 roberta-base-mrpc

## Post-training static quantization

### PyTorch

This is an INT8 PyTorch model quantized with [Intel® Neural Compressor](https://github.com/intel/neural-compressor).

The original fp32 model comes from the fine-tuned model [roberta-base-mrpc](https://huggingface.co/Intel/roberta-base-mrpc).

The calibration dataloader is the train dataloader. The default calibration sampling size of 100 is not exactly divisible by the batch size of 8, so calibration runs for 13 full batches and the actual sampling size is 104.
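
The sampling-size arithmetic above can be checked with a short sketch. The helper `effective_calibration_samples` is a hypothetical name used only for illustration and is not part of Intel® Neural Compressor; it assumes calibration consumes whole batches until at least the requested number of samples has been seen.

```python
import math


def effective_calibration_samples(requested: int, batch_size: int) -> int:
    """Round the requested sample count up to a whole number of batches.

    Assumption: calibration always processes complete batches, so the
    effective sample count is the smallest multiple of batch_size that
    is greater than or equal to the requested count.
    """
    num_batches = math.ceil(requested / batch_size)
    return num_batches * batch_size


# Values from this model card: default sampling size 100, batch size 8.
print(effective_calibration_samples(100, 8))  # 104
```

With a batch size that divides 100 evenly (for example 10), the same helper returns 100 unchanged.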

#### Test result

|   |INT8|FP32|
|---|:---:|:---:|
| **Accuracy (eval-f1)** |0.9177|0.9138|
| **Model size (MB)**  |127|499|

#### Load with Intel® Neural Compressor:

```python
from optimum.intel import INCModelForSequenceClassification

model_id = "Intel/roberta-base-mrpc-int8-static"
int8_model = INCModelForSequenceClassification.from_pretrained(model_id)
```

### ONNX

This is an INT8 ONNX model quantized with [Intel® Neural Compressor](https://github.com/intel/neural-compressor).

The original fp32 model comes from the fine-tuned model [roberta-base-mrpc](https://huggingface.co/Intel/roberta-base-mrpc).

The calibration dataloader is the eval dataloader. The calibration sampling size is 100.

#### Test result

|   |INT8|FP32|
|---|:---:|:---:|
| **Accuracy (eval-f1)** |0.9100|0.9138|
| **Model size (MB)**  |294|476|
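
To put the two "Test result" tables side by side, the following sketch computes the size reduction and the relative change in eval-f1 for each format. The figures are copied from the tables in this card; the `RESULTS` dictionary and the `summarize` helper are illustrative names, not part of any library.

```python
# Figures copied from the PyTorch and ONNX "Test result" tables in this card.
RESULTS = {
    "PyTorch": {"int8_f1": 0.9177, "fp32_f1": 0.9138, "int8_mb": 127, "fp32_mb": 499},
    "ONNX": {"int8_f1": 0.9100, "fp32_f1": 0.9138, "int8_mb": 294, "fp32_mb": 476},
}


def summarize(result):
    """Return (size reduction factor, relative eval-f1 change in percent)."""
    size_ratio = result["fp32_mb"] / result["int8_mb"]
    f1_change_pct = (result["int8_f1"] - result["fp32_f1"]) / result["fp32_f1"] * 100
    return size_ratio, f1_change_pct


for name, result in RESULTS.items():
    ratio, change = summarize(result)
    print(f"{name}: {ratio:.2f}x smaller, eval-f1 change {change:+.2f}%")
```

On these numbers the PyTorch INT8 model is roughly 3.9x smaller than its FP32 counterpart, while the ONNX INT8 model is roughly 1.6x smaller; in both cases eval-f1 moves by less than half a percent relative to FP32.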


#### Load ONNX model:

```python
from optimum.onnxruntime import ORTModelForSequenceClassification

model_id = "Intel/roberta-base-mrpc-int8-static"
model = ORTModelForSequenceClassification.from_pretrained(model_id)
```