Model Card for sitloboi2012/donut-finetune-rvl-cdip

This model card describes a baseline model for document classification on RVL-CDIP with Donut. The model was trained on a small subset of RVL-CDIP (specifically, 100 images from the dataset).

Model Details

The model uses Donut, implemented with the VisionEncoderDecoder architecture from Transformers, as the backbone for an end-to-end document classification task.
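
A minimal inference sketch is given here to illustrate how a Donut checkpoint is typically run through Transformers. Several details are assumptions rather than facts confirmed by this card: that the repository ships processor files alongside the weights, that the task prompt is `<s_rvlcdip>` (the prompt used by Naver's RVL-CDIP checkpoint), and that the decoded output wraps the label in `<s_class>...</s_class>`. The helper names `extract_class` and `classify` are hypothetical.

```python
import re


def extract_class(sequence: str) -> str:
    """Pull the label out of a decoded Donut sequence.

    Assumed format: '<s_rvlcdip><s_class><invoice/></s_class></s_rvlcdip>'.
    Returns an empty string when no class tag is present.
    """
    match = re.search(r"<s_class>(.*?)</s_class>", sequence)
    if match is None:
        return ""
    # Labels may be emitted as special tokens such as '<invoice/>'; strip the markup.
    return re.sub(r"[<>/]", "", match.group(1)).strip()


def classify(
    image_path: str,
    model_id: str = "sitloboi2012/donut-finetune-rvl-cdip",
    task_prompt: str = "<s_rvlcdip>",  # assumption: same prompt as the Naver checkpoint
) -> str:
    # Heavy imports are kept local so extract_class works without them installed.
    import torch
    from PIL import Image
    from transformers import DonutProcessor, VisionEncoderDecoderModel

    processor = DonutProcessor.from_pretrained(model_id)
    model = VisionEncoderDecoderModel.from_pretrained(model_id)
    model.eval()

    # Encode the page image and the task prompt that starts decoding.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(image, return_tensors="pt").pixel_values
    decoder_input_ids = processor.tokenizer(
        task_prompt, add_special_tokens=False, return_tensors="pt"
    ).input_ids

    with torch.no_grad():
        outputs = model.generate(
            pixel_values,
            decoder_input_ids=decoder_input_ids,
            max_length=model.decoder.config.max_position_embeddings,
            pad_token_id=processor.tokenizer.pad_token_id,
            eos_token_id=processor.tokenizer.eos_token_id,
            use_cache=True,
        )

    # Special tokens are kept on purpose: the class label may itself be one.
    sequence = processor.batch_decode(outputs)[0]
    return extract_class(sequence)
```

Calling `classify("page.png")` on a scanned page would return the predicted label as a plain string, provided the assumptions above hold for this checkpoint. With only 100 training images, predictions should be treated as a baseline rather than a reliable classifier.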

Downstream Use

This model can be used as a starting point for fine-tuning on document classification tasks in other domains, such as food or financial documents. For further downstream fine-tuning, please refer to the original model from Naver.
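
As a rough illustration of what fine-tuning for a new domain involves, the sketch below builds Donut-style target sequences for a custom label set. It follows the convention used in Naver's Donut training code, where each class becomes a special token wrapped in `<s_class>...</s_class>`. The function names, the `<s_rvlcdip>` task token, and the `</s>` end-of-sequence token are assumptions for illustration, not details confirmed by this card.

```python
def normalize_label(label: str) -> str:
    """Turn a human-readable label into a token-friendly name."""
    return label.strip().lower().replace(" ", "_")


def build_target(label: str, task_token: str = "<s_rvlcdip>", eos_token: str = "</s>") -> str:
    """Build the decoder target string for one training example (assumed format)."""
    return f"{task_token}<s_class><{normalize_label(label)}/></s_class>{eos_token}"


def special_tokens_for(labels) -> list:
    """List the extra tokens the tokenizer needs for a given label set."""
    tokens = ["<s_class>", "</s_class>"]
    tokens += [f"<{name}/>" for name in sorted({normalize_label(l) for l in labels})]
    return tokens


def register_tokens(processor, model, labels) -> None:
    """Add the label tokens to a loaded DonutProcessor / VisionEncoderDecoderModel pair."""
    processor.tokenizer.add_special_tokens(
        {"additional_special_tokens": special_tokens_for(labels)}
    )
    # The decoder embedding matrix must grow to cover the new tokens.
    model.decoder.resize_token_embeddings(len(processor.tokenizer))
```

For example, `build_target("Financial Document")` yields a target containing the `<financial_document/>` class token. After `register_tokens` is applied, these strings can be tokenized as decoder labels in a standard sequence-to-sequence training loop.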

Datasets used to train sitloboi2012/donut-finetune-rvl-cdip