parthesh111 committed on
Commit 342b33d · verified · 1 Parent(s): c3efc38

Update README.md

Files changed (1)
  1. README.md +11 -20
README.md CHANGED
@@ -24,17 +24,16 @@ This model is a fine-tuned version of `microsoft/layoutlmv3-base` designed for t
24
  ### Model Description
25
 
26
  * **Developed by:** Parthesh Ingale
27
- * **Funded by \[optional]:** Academic Research
28
- * **Shared by \[optional]:** [parthesh111](https://huggingface.co/parthesh111)
29
  * **Model type:** Token Classification (NER)
30
  * **Language(s) (NLP):** English
31
  * **License:** Apache-2.0
32
- * **Finetuned from model \[optional]:** `microsoft/layoutlmv3-base`
33
 
34
- ### Model Sources \[optional]
35
 
36
  * **Repository:** [https://huggingface.co/parthesh111/layoutlmv3-finetune-bioes-new](https://huggingface.co/parthesh111/layoutlmv3-finetune-bioes-new)
37
- * **Paper \[optional]:** N/A
38
 
39
  ## Uses
40
 
@@ -43,7 +42,7 @@ This model is a fine-tuned version of `microsoft/layoutlmv3-base` designed for t
43
  * Extract named entities from medical lab reports (scanned images).
44
  * Automate structured data extraction from semi-structured medical documents.
45
 
46
- ### Downstream Use \[optional]
47
 
48
  * Preprocessing step in EHR (Electronic Health Records).
49
  * PII-aware document processing.
@@ -81,7 +80,7 @@ import numpy as np
81
  import os
82
  from huggingface_hub import login
83
 
84
- # Login to Hugging Face using environment variable token
85
  HF_TOKEN = os.environ.get("HF_TOKEN")
86
  if not HF_TOKEN:
87
      st.error("Hugging Face token not found. Please set 'HF_TOKEN' as an environment variable.")
@@ -320,7 +319,7 @@ st.markdown("""
320
 
321
  ### Training Procedure
322
 
323
- #### Preprocessing \[optional]
324
 
325
  * Images were preprocessed using PaddleOCR.
326
  * Bounding boxes normalized to 1000-scale.
@@ -330,10 +329,10 @@ st.markdown("""
330
 
331
  * **Training regime:** fp16 mixed precision
332
  * **Epochs:** 20
333
- * **Batch size:** 8
334
  * **Learning rate:** 5e-5
335
 
336
- #### Speeds, Sizes, Times \[optional]
337
 
338
  * **Checkpoint size:** \~435 MB
339
  * **Training time:** \~2 hours on RTX 3060
@@ -367,7 +366,7 @@ LayoutLMv3 with token classification head using OCR input (image, text, and layo
367
 
368
  * PyTorch, Hugging Face Transformers, PaddleOCR, Streamlit
369
 
370
- ## Citation \[optional]
371
 
372
  **BibTeX:**
373
 
@@ -379,18 +378,10 @@ LayoutLMv3 with token classification head using OCR input (image, text, and layo
379
  howpublished = {\url{https://huggingface.co/parthesh111/layoutlmv3-finetune-bioes-new}},
380
  }
381
  ```
382
- ## Glossary \[optional]
383
 
384
  * **BIOES:** Beginning, Inside, Outside, End, Single tagging scheme used for NER.
385
 
386
- ## More Information \[optional]
387
-
388
- For demo, Streamlit app, or usage questions, contact below.
389
-
390
- ## Model Card Authors \[optional]
391
-
392
- * Parthesh Ingale
393
-
394
  ## Model Card Contact
395
 
396
  * **GitHub/HF:** [parthesh111](https://huggingface.co/parthesh111)
 
24
  ### Model Description
25
 
26
  * **Developed by:** Parthesh Ingale
27
+ * **Shared by:** [parthesh111](https://huggingface.co/parthesh111)
 
28
  * **Model type:** Token Classification (NER)
29
  * **Language(s) (NLP):** English
30
  * **License:** Apache-2.0
31
+ * **Finetuned from model:** `microsoft/layoutlmv3-base`
32
 
33
+ ### Model Sources
34
 
35
  * **Repository:** [https://huggingface.co/parthesh111/layoutlmv3-finetune-bioes-new](https://huggingface.co/parthesh111/layoutlmv3-finetune-bioes-new)
36
+ * **Paper:** N/A
37
 
38
  ## Uses
39
 
 
42
  * Extract named entities from medical lab reports (scanned images).
43
  * Automate structured data extraction from semi-structured medical documents.
44
 
45
+ ### Downstream Use
46
 
47
  * Preprocessing step in EHR (Electronic Health Records).
48
  * PII-aware document processing.
 
80
  import os
81
  from huggingface_hub import login
82
 
83
+ # Login to Hugging Face using the environment variable token
84
  HF_TOKEN = os.environ.get("HF_TOKEN")
85
  if not HF_TOKEN:
86
      st.error("Hugging Face token not found. Please set 'HF_TOKEN' as an environment variable.")
 
319
 
320
  ### Training Procedure
321
 
322
+ #### Preprocessing
323
 
324
  * Images were preprocessed using PaddleOCR.
325
  * Bounding boxes normalized to 1000-scale.
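
The "1000-scale" normalization above can be sketched as follows. This is a minimal illustration, not the author's preprocessing code: the helper names `quad_to_box` and `normalize_box` are hypothetical, and it assumes the OCR step yields four-corner polygons in pixel coordinates along with the page width and height.

```python
def quad_to_box(quad):
    """Collapse a four-corner polygon [[x, y], ...] into an axis-aligned (x0, y0, x1, y1) box."""
    xs = [point[0] for point in quad]
    ys = [point[1] for point in quad]
    return (min(xs), min(ys), max(xs), max(ys))


def normalize_box(box, width, height):
    """Scale a pixel-space box to the 0-1000 integer grid that LayoutLM-style models expect."""
    x0, y0, x1, y1 = box

    def scale(value, size):
        # Clamp so OCR boxes that spill past the page edge stay inside [0, 1000].
        return max(0, min(1000, int(1000 * value / size)))

    return [scale(x0, width), scale(y0, height), scale(x1, width), scale(y1, height)]


# Hypothetical OCR polygon on a 1000 x 500 pixel page.
pixel_box = quad_to_box([[100, 50], [300, 52], [298, 150], [102, 148]])
print(normalize_box(pixel_box, width=1000, height=500))
```

Clamping is a design choice here: it keeps slightly out-of-bounds OCR boxes usable instead of failing on them.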
 
329
 
330
  * **Training regime:** fp16 mixed precision
331
  * **Epochs:** 20
332
+ * **Batch size:** 1
333
  * **Learning rate:** 5e-5
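
For reference, the hyperparameters listed above could be expressed as keyword arguments for the Transformers `TrainingArguments` class, as in the sketch below. The training script itself is not part of this commit, so this mapping is an assumption; the dataset size of 400 pages and the `optimizer_steps` helper are purely hypothetical and only illustrate how the batch-size change from 8 to 1 affects the number of optimizer steps.

```python
import math

# Keyword names match TrainingArguments parameters; values come from the model card.
training_kwargs = {
    "num_train_epochs": 20,
    "per_device_train_batch_size": 1,
    "learning_rate": 5e-5,
    "fp16": True,  # fp16 mixed precision needs a CUDA-capable GPU
}


def optimizer_steps(num_samples, batch_size, epochs):
    """Total optimizer steps on a single device without gradient accumulation."""
    return math.ceil(num_samples / batch_size) * epochs


# Hypothetical dataset of 400 annotated pages.
print(optimizer_steps(400, batch_size=8, epochs=20))  # previous card value
print(optimizer_steps(400, batch_size=1, epochs=20))  # value after this commit
```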
334
 
335
+ #### Speeds, Sizes, Times
336
 
337
  * **Checkpoint size:** \~435 MB
338
  * **Training time:** \~2 hours on RTX 3060
 
366
 
367
  * PyTorch, Hugging Face Transformers, PaddleOCR, Streamlit
368
 
369
+ ## Citation
370
 
371
  **BibTeX:**
372
 
 
378
  howpublished = {\url{https://huggingface.co/parthesh111/layoutlmv3-finetune-bioes-new}},
379
  }
380
  ```
381
+ ## Glossary
382
 
383
  * **BIOES:** Beginning, Inside, Outside, End, Single tagging scheme used for NER.
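
To make the tagging scheme concrete, the sketch below converts a plain BIO tag sequence into BIOES. It is an illustrative helper, not code from this repository; the function name `bio_to_bioes` and the entity labels `NAME` and `DATE` are hypothetical.

```python
def bio_to_bioes(tags):
    """Convert BIO tags to BIOES: single-token entities become S-, final tokens become E-."""
    bioes = []
    for index, tag in enumerate(tags):
        if tag == "O":
            bioes.append(tag)
            continue
        prefix, label = tag.split("-", 1)
        next_tag = tags[index + 1] if index + 1 < len(tags) else "O"
        # The entity continues only if the next tag is an I- tag with the same label.
        continues = next_tag == f"I-{label}"
        if prefix == "B":
            bioes.append(f"B-{label}" if continues else f"S-{label}")
        else:
            bioes.append(f"I-{label}" if continues else f"E-{label}")
    return bioes


print(bio_to_bioes(["B-NAME", "I-NAME", "I-NAME", "O", "B-DATE"]))
```

The explicit End and Single tags mark entity boundaries directly, so each multi-token span has an unambiguous last token.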
384
 
385
  ## Model Card Contact
386
 
387
  * **GitHub/HF:** [parthesh111](https://huggingface.co/parthesh111)