Model Details of TVL_GeneralLayerClassifier

Base Model

This model is fine-tuned from google-bert/bert-base-chinese.

Model Architecture

  • Type: BERT-based text classification model
  • Hidden Size: 768
  • Number of Layers: 12
  • Number of Attention Heads: 12
  • Intermediate Size: 3072
  • Max Sequence Length: 512
  • Vocabulary Size: 21,128

Key Components

  1. Embeddings

    • Word Embeddings
    • Position Embeddings
    • Token Type Embeddings
    • Layer Normalization
  2. Encoder

    • 12 layers of:
      • Self-Attention Mechanism
      • Intermediate Dense Layer
      • Output Dense Layer
      • Layer Normalization
  3. Pooler

    • Dense layer for sentence representation
  4. Classifier

    • Output layer with 4 classes

Training Hyperparameters

The model was trained using the following hyperparameters:

  • Learning rate: 1e-05
  • Batch size: 32
  • Number of epochs: 10
  • Optimizer: Adam
  • Loss function: torch.nn.BCEWithLogitsLoss()
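
To illustrate what the loss function listed above computes for a multi-label target, here is a minimal pure-Python sketch of binary cross-entropy on raw logits, averaged over labels (which matches the default "mean" reduction of torch.nn.BCEWithLogitsLoss). The helper name bce_with_logits and the example logits and targets are hypothetical; they are not taken from the training code.

```python
import math


def bce_with_logits(logits, targets):
    """Mean binary cross-entropy over labels, computed from raw logits."""
    total = 0.0
    for x, y in zip(logits, targets):
        # Numerically stable form of
        # -[y * log(sigmoid(x)) + (1 - y) * log(1 - sigmoid(x))]
        total += max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))
    return total / len(logits)


# Hypothetical example: one sample, 4 labels, labels 0 and 2 are active.
logits = [2.0, -1.0, 0.5, -3.0]
targets = [1.0, 0.0, 1.0, 0.0]
loss = bce_with_logits(logits, targets)
print(round(loss, 4))
```

Each label is scored independently, which is why a sigmoid (rather than a softmax) is applied to the logits at inference time.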

Training Infrastructure

  • Hardware Type: NVIDIA Quadro RTX8000
  • Library: PyTorch
  • Training time: 2 hr 56 min

Model Parameters

  • Total parameters: ~102M (estimated)
  • All parameters are in 32-bit floating point (F32) format
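
The ~102M estimate can be reproduced from the architecture values listed earlier. The sketch below assumes the standard BERT layout used by BertForSequenceClassification (biases on every dense layer, a token type vocabulary of 2, and a single linear classifier on top of the pooler); these details are assumptions and are not stated in this card.

```python
# Values taken from the Model Architecture list.
VOCAB = 21128
HIDDEN = 768
LAYERS = 12
INTERMEDIATE = 3072
MAX_POSITIONS = 512
NUM_LABELS = 4
TYPE_VOCAB = 2  # assumption: standard BERT segment (token type) vocabulary


def linear(n_in, n_out):
    """Parameters of a dense layer: weight matrix plus bias vector."""
    return n_in * n_out + n_out


def layer_norm(n):
    """Parameters of a LayerNorm: scale and shift vectors."""
    return 2 * n


embeddings = (VOCAB * HIDDEN + MAX_POSITIONS * HIDDEN
              + TYPE_VOCAB * HIDDEN + layer_norm(HIDDEN))
encoder_layer = (
    4 * linear(HIDDEN, HIDDEN)      # query, key, value, attention output
    + layer_norm(HIDDEN)
    + linear(HIDDEN, INTERMEDIATE)  # intermediate dense layer
    + linear(INTERMEDIATE, HIDDEN)  # output dense layer
    + layer_norm(HIDDEN)
)
pooler = linear(HIDDEN, HIDDEN)
classifier = linear(HIDDEN, NUM_LABELS)

total = embeddings + LAYERS * encoder_layer + pooler + classifier
print(f"{total:,}")  # 102,270,724
```

At 4 bytes per F32 parameter, this corresponds to roughly 409 MB of weights.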

Input Processing

  • Uses BERT tokenization
  • Supports sequences up to 512 tokens

Output

  • Multi-label classification over 4 labels (each label is predicted independently)

Performance Metrics

  • Accuracy score: 0.952902
  • F1 score (Micro): 0.968717
  • F1 score (Macro): 0.970818
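
For readers unfamiliar with how these three metrics differ in a multi-label setting, here is a small pure-Python sketch. It assumes that "Accuracy" means exact-match (subset) accuracy, where a sample counts as correct only if all 4 labels match; the card does not state which definition was used. The function names and the toy data are hypothetical.

```python
def f1(tp, fp, fn):
    """F1 score from true-positive, false-positive, false-negative counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def multilabel_scores(y_true, y_pred):
    """Return (exact-match accuracy, micro F1, macro F1) for 0/1 label rows."""
    n_labels = len(y_true[0])
    tp, fp, fn = [0] * n_labels, [0] * n_labels, [0] * n_labels
    for true_row, pred_row in zip(y_true, y_pred):
        for j, (t, p) in enumerate(zip(true_row, pred_row)):
            if t == 1 and p == 1:
                tp[j] += 1
            elif t == 0 and p == 1:
                fp[j] += 1
            elif t == 1 and p == 0:
                fn[j] += 1
    exact = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    micro = f1(sum(tp), sum(fp), sum(fn))  # pool counts across labels
    macro = sum(f1(tp[j], fp[j], fn[j]) for j in range(n_labels)) / n_labels
    return exact, micro, macro


# Toy data: 3 samples, 4 labels (not from the actual dataset).
y_true = [[1, 0, 0, 1], [0, 1, 0, 0], [1, 1, 0, 0]]
y_pred = [[1, 0, 0, 1], [0, 1, 1, 0], [1, 0, 0, 0]]
exact, micro, macro = multilabel_scores(y_true, y_pred)
print(exact, micro, macro)
```

Micro F1 pools the counts over all labels, so frequent labels dominate; macro F1 averages the per-label scores, so every label carries equal weight.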

Training Dataset

This model was trained on the scfengv/TVL-general-layer-dataset.

Testing Dataset

Usage

import torch
from transformers import BertForSequenceClassification, BertTokenizer

# Load the fine-tuned classifier and its tokenizer from the Hugging Face Hub
model = BertForSequenceClassification.from_pretrained("scfengv/TVL_GeneralLayerClassifier")
tokenizer = BertTokenizer.from_pretrained("scfengv/TVL_GeneralLayerClassifier")
model.eval()  # inference mode (disables dropout)

# Prepare your text (see the training dataset for example inputs)
text = "Your text here"
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=512)

# Make a prediction: sigmoid gives an independent probability for each of the 4 labels
with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.sigmoid(outputs.logits)

# Print the per-label probabilities, shape (1, 4)
print(predictions)
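
The snippet above prints raw per-label probabilities. One common way to turn them into predicted labels is to apply a threshold to each probability, sketched below. The 0.5 threshold, the decode_labels helper, and the placeholder label names are all assumptions; this card does not list the actual label names or a recommended threshold.

```python
def decode_labels(probabilities, label_names, threshold=0.5):
    """Return the names of all labels whose probability meets the threshold."""
    return [name for name, p in zip(label_names, probabilities) if p >= threshold]


# Hypothetical placeholder names; replace with the dataset's real label names.
LABEL_NAMES = ["label_0", "label_1", "label_2", "label_3"]

# Example probabilities, e.g. obtained via predictions[0].tolist()
probs = [0.91, 0.07, 0.62, 0.33]
print(decode_labels(probs, LABEL_NAMES))  # ['label_0', 'label_2']
```

Because the task is multi-label, zero, one, or several labels may be returned for a single input.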

Additional Notes

  • This model is specifically designed for TVL general layer classification tasks.

  • It is fine-tuned from the Chinese BERT model (google-bert/bert-base-chinese), so it is intended for Chinese text.

Evaluation

Testing Data

Results (validation)

  • Accuracy: 0.952902
  • F1 Score (Micro): 0.968717
  • F1 Score (Macro): 0.970818