arxiv:2508.01928

IAUNet: Instance-Aware U-Net

Published on Aug 3
· Submitted by YaroslavPrytula on Aug 7

Abstract

IAUNet, a query-based U-Net architecture with a lightweight convolutional Pixel decoder and Transformer decoder, outperforms state-of-the-art models in biomedical instance segmentation.

AI-generated summary

Instance segmentation is critical in biomedical imaging to accurately distinguish individual objects like cells, which often overlap and vary in size. Recent query-based methods, where object queries guide segmentation, have shown strong performance. While U-Net has been a go-to architecture in medical image segmentation, its potential in query-based approaches remains largely unexplored. In this work, we present IAUNet, a novel query-based U-Net architecture. The core design features a full U-Net architecture, enhanced by a novel lightweight convolutional Pixel decoder, making the model more efficient and reducing the number of parameters. Additionally, we propose a Transformer decoder that refines object-specific features across multiple scales. Finally, we introduce the 2025 Revvity Full Cell Segmentation Dataset, a unique resource with detailed annotations of overlapping cell cytoplasm in brightfield images, setting a new benchmark for biomedical instance segmentation. Experiments on multiple public datasets and our own show that IAUNet outperforms most state-of-the-art fully convolutional, transformer-based, query-based, and cell segmentation-specific models, setting a strong baseline for cell instance segmentation tasks. Code is available at https://github.com/SlavkoPrytula/IAUNet.
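For readers unfamiliar with the query-based mechanism the abstract refers to, the core pattern can be sketched in a few lines of PyTorch: learnable object queries are refined by attention over image features, and each refined query is matched against per-pixel mask features to produce one instance mask. This is a minimal illustrative sketch of the general pattern, not the authors' implementation; the class name, dimensions, and single attention layer are assumptions for brevity.

```python
# Minimal sketch of query-based instance segmentation (illustrative only):
# N learnable object queries attend to image features; each refined query
# is turned into an instance mask by a dot product with per-pixel features.
import torch
import torch.nn as nn

class QueryMaskHead(nn.Module):  # hypothetical name, not from the paper
    def __init__(self, num_queries=100, dim=256):
        super().__init__()
        self.queries = nn.Embedding(num_queries, dim)   # learnable queries q
        self.attn = nn.MultiheadAttention(dim, num_heads=8, batch_first=True)
        self.mask_proj = nn.Linear(dim, dim)

    def forward(self, feats, mask_feats):
        # feats:      (B, H*W, dim) flattened decoder features
        # mask_feats: (B, dim, H, W) high-resolution mask features
        B = feats.size(0)
        q = self.queries.weight.unsqueeze(0).expand(B, -1, -1)  # (B, N, dim)
        q, _ = self.attn(q, feats, feats)               # refine queries
        embed = self.mask_proj(q)                       # (B, N, dim)
        # Per-query mask logits: (B, N, dim) x (B, dim, H, W) -> (B, N, H, W)
        return torch.einsum("bnd,bdhw->bnhw", embed, mask_feats)
```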

Community

Paper author Paper submitter

IAUNet: Instance-Aware U-Net (CVPRW 2025)

In this work, we present:

  • A novel query-based U-Net model: IAUNet: Instance-Aware U-Net ⭐️
  • A new cell instance segmentation dataset: Revvity-25 🔥

🔗 GitHub: https://github.com/SlavkoPrytula/IAUNet
🌐 Project page: https://slavkoprytula.github.io/IAUNet/

If you find this work useful, consider giving it a ⭐️ on GitHub to support further open-source research!

Paper author Paper submitter
[Figure: IAUNet_v2-main_v2.png]

Model overview. The IAUNet architecture, highlighting the Pixel and Transformer decoder stages. Given an input image I, the encoder extracts multi-scale features that serve as skip connections for the Pixel decoder. At each decoder block, we add skip connections X_s to the main features X and inject normalized coordinate features for CoordConv. Stacked depth-wise convolutions with an SE block refine spatial information, generating mask features X_m. The Transformer decoder then processes learnable queries q through three Transformer blocks per layer, iteratively refining them with X_m. A deep supervision loss is applied after each Transformer block using the updated queries q_hat and high-resolution mask features.
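Based only on the caption above, one Pixel decoder block might look roughly like the sketch below (skip addition, CoordConv-style coordinate injection, stacked depth-wise convolutions, an SE block). Layer counts, channel sizes, and names are assumptions, not the authors' code; see the GitHub repository for the actual implementation.

```python
# Hedged sketch of one Pixel decoder block as described in the caption.
# Assumes x and x_s are already at the same resolution (upsampling omitted).
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    def __init__(self, ch, r=8):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(ch, ch // r), nn.ReLU(inplace=True),
            nn.Linear(ch // r, ch), nn.Sigmoid())

    def forward(self, x):
        w = self.fc(x.mean(dim=(2, 3)))          # squeeze: global average pool
        return x * w[:, :, None, None]           # excite: channel re-weighting

class PixelDecoderBlock(nn.Module):  # hypothetical name, not from the paper
    def __init__(self, ch):
        super().__init__()
        self.reduce = nn.Conv2d(ch + 2, ch, 1)   # fold in 2 coordinate channels
        self.dw = nn.Sequential(                 # stacked depth-wise convs
            nn.Conv2d(ch, ch, 3, padding=1, groups=ch), nn.GELU(),
            nn.Conv2d(ch, ch, 3, padding=1, groups=ch), nn.GELU())
        self.se = SEBlock(ch)

    def forward(self, x, x_s):
        x = x + x_s                              # add skip connection X_s
        B, _, H, W = x.shape
        ys = torch.linspace(-1, 1, H, device=x.device)
        xs = torch.linspace(-1, 1, W, device=x.device)
        gy, gx = torch.meshgrid(ys, xs, indexing="ij")
        coords = torch.stack([gy, gx]).expand(B, -1, -1, -1)  # (B, 2, H, W)
        x = self.reduce(torch.cat([x, coords], dim=1))  # CoordConv injection
        return self.se(self.dw(x))               # refined mask features X_m
```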

Paper author Paper submitter

[Figure: coco_dataset_images_set_3.png]

Revvity-25. One of our key contributions in this paper is a novel cell instance segmentation dataset named Revvity-25. It includes 110 high-resolution 1080 × 1080 brightfield images, each containing on average 27 manually labeled and expert-validated cancer cells, for a total of 2,937 annotated cells. To our knowledge, this is the first dataset with accurate and detailed annotations of cell borders and overlaps: each cell is annotated with an average of 60 polygon points, reaching up to 400 points for more complex structures. The Revvity-25 dataset provides a unique resource that opens new possibilities for testing and benchmarking models for modal and amodal semantic and instance segmentation.
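If the annotations ship in COCO instance-segmentation format (the screenshot filename above suggests this, but it is not confirmed here), the polygon annotations can be decoded into binary masks with pycocotools. The file path below is a placeholder, not an actual path from the release.

```python
# Hedged sketch: decoding Revvity-25 polygon annotations into binary masks,
# assuming a COCO-format JSON file (path below is a placeholder).
from pycocotools.coco import COCO
from pycocotools import mask as mask_utils

coco = COCO("revvity25/annotations.json")            # placeholder path
img_info = coco.loadImgs(coco.getImgIds()[0])[0]     # e.g. a 1080x1080 image
anns = coco.loadAnns(coco.getAnnIds(imgIds=img_info["id"]))

for ann in anns:
    # Each cell polygon averages ~60 points (up to ~400 for complex shapes).
    rles = mask_utils.frPyObjects(
        ann["segmentation"], img_info["height"], img_info["width"])
    binary_mask = mask_utils.decode(mask_utils.merge(rles))  # (H, W) uint8
```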
