|
--- |
|
library_name: transformers |
|
tags: |
|
- text-summarization |
|
- text-generation |
|
- clinical-report-summarization |
|
- document-summarization |
|
license: mit |
|
language: |
|
- en |
|
- fr |
|
- pt |
|
- es |
|
metrics: |
|
- bertscore |
|
- rouge |
|
base_model: |
|
- Qwen/Qwen2.5-0.5B-Instruct |
|
pipeline_tag: text-generation |
|
--- |
|
|
|
# Model Details |
|
|
|
> **Update, 13th July 2025**: Added a [video review on YouTube](https://youtu.be/uOAiUvLghuE)
|
|
|
This model is a fine-tuned version of `Qwen/Qwen2.5-0.5B-Instruct` on the [MultiClinSum](https://zenodo.org/records/15463353) training data for the [BioASQ-2025](http://bioasq.org/) Workshop / [CLEF 2025](https://clef2025.clef-initiative.eu/).
|
|
|
This model serves as a baseline for the `distil` version:
|
|
|
https://huggingface.co/nicolay-r/qwen25-05b-multiclinsum-distil |
|
|
|
### Video Overview |
|
|
|
<div align="center"> |
|
|
|
[▶️ Video review on YouTube](https://youtu.be/uOAiUvLghuE)
|
|
|
</div> |
|
|
|
### Model Description |
|
|
|
- **Model type:** Decoder-only language model

- **Language(s) (NLP):** Languages supported by Qwen2.5; fine-tuned on summaries written in `en`, `fr`, `pt`, `es`

- **License:** MIT

- **Fine-tuned from model:** https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct
|
|
|
### Model Sources
|
[](https://colab.research.google.com/drive/1TXGaz39o73nBucEQw12gbad7Tw11j2Ol?usp=sharing) |
|
|
|
- **Repository:** https://github.com/nicolay-r/distil-tuning-llm |
|
- **Paper:** **TBA** |
|
- **Demo:** https://colab.research.google.com/drive/1TXGaz39o73nBucEQw12gbad7Tw11j2Ol?usp=sharing |
|
|
|
## Usage |
|
|
|
We use [bulk-chain](https://github.com/nicolay-r/bulk-chain) for inference, with a Qwen2 provider built on top of the `transformers` **pipelines API**.
|
|
|
**Provider** (save locally as `huggingface_qwen.py`): https://github.com/nicolay-r/nlp-thirdgate/blob/9e46629792e9a53871710884f7b9e2fe42666aa7/llm/transformers_qwen2.py
|
|
|
```python
from bulk_chain.api import iter_content
from bulk_chain.core.utils import dynamic_init

content_it = iter_content(
    # Chain schema: a single prompt that maps the input text onto a "summary" field.
    schema={"schema": [
        {"prompt": "Summarize: {input}", "out": "summary"}]
    },
    llm=dynamic_init(
        class_filepath="huggingface_qwen.py",
        class_name="Qwen2")(
            api_token="YOUR_HF_API_KEY_GOES_HERE",
            model_name="nicolay-r/qwen25-05b-multiclinsum-standard",
            temp=0.1,
            use_bf16=True,
            max_new_tokens=512,   # adjust to your summary length budget
            device="cuda"         # or "cpu"
    ),
    infer_mode="batch",
    batch_size=4,
    return_mode="record",
    # INPUT TEXTS:
    input_dicts_it=[
        {"input": "A patient 62 years old with ..."}
    ],
)

for record in content_it:
    # Each record is a dictionary that includes the generated summary.
    print(record["summary"])
```
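
Alternatively, since this is a standard `transformers` checkpoint, it can be loaded directly with the `transformers` chat API. The snippet below is a minimal sketch: the `Summarize: ...` prompt and the generation settings mirror the provider call above and are assumptions, not the exact fine-tuning template.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "nicolay-r/qwen25-05b-multiclinsum-standard"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

# Build a chat-formatted prompt; "Summarize: ..." mirrors the schema used above (assumption).
messages = [{"role": "user", "content": "Summarize: A patient 62 years old with ..."}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

inputs = tokenizer([text], return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.1)

# Decode only the newly generated tokens (the summary).
summary = tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(summary)
```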
|
|
|
## Training Details |
|
|
|
### Training Data |
|
|
|
* **MultiClinSum** |
|
* We use the [following script](https://github.com/nicolay-r/distill-tuning-llm/blob/main/resources/download_dataset.sh) to download the datasets (a direct Zenodo download sketch is shown after this list).
|
* **Web**: https://temu.bsc.es/multiclinsum |
|
* **Data**: https://zenodo.org/records/15463353 |
|
* **BioASQ**: http://bioasq.org/ |
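
If the shell script is not convenient, the same files can be fetched directly from the Zenodo record through its public REST API. This is a minimal sketch assuming the standard Zenodo record layout (`files[*].key`, `files[*].links.self`); the download script above remains the reference path.

```python
# Fetch all files attached to Zenodo record 15463353 (MultiClinSum).
from pathlib import Path
import requests

record = requests.get("https://zenodo.org/api/records/15463353", timeout=60).json()
out_dir = Path("multiclinsum")
out_dir.mkdir(exist_ok=True)

for f in record["files"]:
    target = out_dir / f["key"]
    print(f"Downloading {f['key']} ...")
    with requests.get(f["links"]["self"], stream=True, timeout=60) as resp:
        resp.raise_for_status()
        with open(target, "wb") as fh:
            for chunk in resp.iter_content(chunk_size=1 << 20):
                fh.write(chunk)
```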
|
|
|
### Training Procedure |
|
|
|
The training procedure involves: |
|
1. Preparing the `rationale` data for summary distillation.
|
2. Launching the **fine-tuning** process.
|
|
|
**Fine-tuning:** Follow this script to fine-tune on the [`MultiClinSum` dataset](https://zenodo.org/records/15463353) using a Google Colab A100 (40GB VRAM) + 80GB RAM:
|
* https://github.com/nicolay-r/distil-tuning-llm/blob/master/distil_ft_qwen25_05b_A100-40GB_80GB_std.sh |
|
|
|
#### Preprocessing
|
|
|
Refer to the following script for the fine-tuning pre-processing (a hedged sketch of the resulting record format follows the link):
|
* https://github.com/nicolay-r/distil-tuning-llm/blob/master/resources/make_dataset_mult.py |
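
As a rough illustration of what this step produces, the sketch below turns (report, reference summary) pairs into instruction-style JSONL records for supervised fine-tuning. Field names and prompt wording are hypothetical; `make_dataset_mult.py` is the authoritative implementation.

```python
import json

def build_records(pairs):
    """pairs: iterable of (full_text, reference_summary) tuples."""
    for full_text, summary in pairs:
        # Chat-style record: user asks for a summary, assistant replies with the reference.
        yield {
            "messages": [
                {"role": "user", "content": f"Summarize: {full_text}"},
                {"role": "assistant", "content": summary},
            ]
        }

pairs = [("A patient 62 years old with ...", "A 62-year-old patient presented with ...")]
with open("train.jsonl", "w", encoding="utf-8") as f:
    for rec in build_records(pairs):
        f.write(json.dumps(rec, ensure_ascii=False) + "\n")
```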
|
|
|
#### Training Hyperparameters |
|
|
|
We refer to the original parameters here: |
|
* https://github.com/QwenLM/Qwen2.5-VL/tree/main/qwen-vl-finetune |
|
We use the following fine-tuning script:
|
* https://github.com/nicolay-r/distil-tuning-llm/blob/master/distil_ft_qwen25_05b_A100-40GB_80GB_std.sh |
|
|
|
|
|
#### Speeds, Sizes, Times
|
|
|
Fine-tuning for `3` epochs takes around `1 hour` on a Google Colab A100.
|
|
|
## Evaluation |
|
|
|
|
|
#### Testing Data |
|
|
|
We use an evaluation split of 20 documents, held out from a small portion of the available training data, covering all languages: `en`, `fr`, `pt`, `es`.
|
|
|
#### Metrics |
|
|
|
In this evaluation we use only the `rouge` score.
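
To reproduce this metric on your own outputs, one option is the `evaluate` wrapper around `rouge_score` (assumed installed via `pip install evaluate rouge_score`); this mirrors the metric choice only, not the exact evaluation script from the repository.

```python
import evaluate

rouge = evaluate.load("rouge")
predictions = ["A 62-year-old patient presented with ..."]     # model outputs
references = ["A patient 62 years old was admitted with ..."]  # gold summaries
scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # rouge1 / rouge2 / rougeL / rougeLsum F-measures
```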
|
|
|
### Results |
|
|
|
We launch 3 individual fine-tuning runs for the `distil` and `standard` versions to showcase the variation in results across multiple runs.
|
|
|
> **Figure**: the results obtained for this model correspond to the `standard` version 🟠
|
|
|
 |
|
|
|
|
|
|
#### Hardware |
|
|
|
We experiment with model fine-tuning and inference using the Google Colab notebook service and related resources:
|
* Fine-tuning: A100 (40GB) |
|
* Inference: T4 (16GB) |
|
|
|
Follow the Google Colab notebook in the repository:
|
* https://github.com/nicolay-r/distil-tuning-llm |
|
|
|
#### Software |
|
|
|
The official repository for this model card:
|
* https://github.com/nicolay-r/distil-tuning-llm |
|
|
|
## Citation
|
|
|
**BibTeX:** |
|
|
|
> **TO BE ADDED** |
|
|
|
|
|
## Model Card Authors |
|
|
|
Nicolay Rusnachenko |
|
|