Claris RF-Channel: DehazeFormer with Reference Frame Input
This repository provides a PyTorch implementation of a DehazeFormer-based model. The model uses a transformer backbone and takes a reference frame concatenated with the input image along the channel dimension to enhance visibility and remove smoke and haze artifacts in surgical endoscopic scenes.
This model version, also referred to as mct-diffusion-overlay-p40-v1-rf-channel.pth, was trained with a combination of synthetic overlays and diffusion-generated smoke-haze image pairs.
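As a rough illustration of what concatenation along the channel dimension means, the sketch below stacks two random tensors that stand in for the hazy frame and its reference frame. The tensor shapes and variable names are assumptions chosen for the example; the actual concatenation presumably happens inside the model code in dehazeformer.py, which may differ in detail.

```python
import torch

# Hypothetical stand-ins for a hazy frame and its reference frame:
# batches of RGB images shaped (batch, channels, height, width).
input_img = torch.rand(1, 3, 256, 256)
ref_img = torch.rand(1, 3, 256, 256)

# Channel concatenation stacks the two frames along dim=1,
# so a pair of 3-channel images becomes one 6-channel tensor.
stacked = torch.cat([input_img, ref_img], dim=1)
print(stacked.shape)  # torch.Size([1, 6, 256, 256])
```

Because the spatial size is unchanged, both frames must share the same height and width before they are stacked this way.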
Features
- Transformer-based architecture for image enhancement.
- Supports reference frame input via channel or spatial concatenation.
- Hugging Face Transformers-compatible interface.
- Example inference script included.
File Structure
claris_rf_channel/
├── dehazeformer.py
├── inference_example.py
├── pytorch_model.bin
├── config.json
├── sample_img.png
└── ref_img.png
Quick Start
Install Requirements
pip install torch torchvision transformers timm pillow
Inference Example
You can run the provided inference script to dehaze the sample image:
python inference_example.py
This will save the output as output_img_rfchannel.png.
Or use the model in your own code by loading it as follows:
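import torch
from transformers import AutoModel

# Pick a GPU when available, otherwise fall back to CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load model
model = AutoModel.from_pretrained("vopeai/claris-RF-channel", trust_remote_code=True)
model.to(device)
model.eval()

# Inference (input_img is the hazy frame and ref_img its reference frame, prepared beforehand)
with torch.no_grad():
    output = model(input_img, ref_img)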
The model accepts either a Pillow image or a tensor as input.
For more details, see the code files in this repository.