[CVPR 2025] GFS-VL: Generalized Few-shot 3D Point Cloud Segmentation with Vision-Language Model

Overview

GFS-VL is a novel framework proposed in our CVPR 2025 paper: Generalized Few-shot 3D Point Cloud Segmentation with Vision-Language Model.

Our approach combines two complementary data sources:

Dense but noisy pseudo-labels from 3D vision-language models
Precise yet sparse few-shot samples

It draws on the strengths of each source to perform generalized few-shot 3D point cloud segmentation (GFS-PCS).

Released Model Weights

This repository contains the following pre-trained weights:

PTv3 Backbones: Our pre-trained Point Transformer V3 (PTv3) backbones
GFS-VL Models: Weights for the complete GFS-VL few-shot segmentation framework
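
As a minimal sketch of fetching these weights, the snippet below pulls the whole model repository with `snapshot_download` from the `huggingface_hub` package. The repository id `ZhaochongAn/GFS_VL` is the one shown on this model page; the target folder `checkpoints/GFS_VL` and the helper names `download_weights` and `list_files` are assumptions made for illustration. The file layout inside the repository is not described here, so the sketch simply lists whatever was downloaded.

```python
from pathlib import Path

# Repository id as shown on this model page.
REPO_ID = "ZhaochongAn/GFS_VL"


def download_weights(local_dir: str = "checkpoints/GFS_VL") -> Path:
    """Download every file in the model repository and return the local folder.

    `local_dir` is a hypothetical default; pick any writable path.
    """
    # Imported lazily so this module loads even without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    path = snapshot_download(repo_id=REPO_ID, local_dir=local_dir)
    return Path(path)


def list_files(folder: Path) -> list[str]:
    """Return sorted relative paths of all non-hidden files under `folder`."""
    files = []
    for p in folder.rglob("*"):
        rel = p.relative_to(folder)
        # Skip hidden entries such as the hub's local metadata folder.
        if p.is_file() and not any(part.startswith(".") for part in rel.parts):
            files.append(rel.as_posix())
    return sorted(files)


if __name__ == "__main__":
    folder = download_weights()
    for name in list_files(folder):
        print(name)
```

Loading the downloaded checkpoints into a model depends on the training code, which lives in the GitHub repository rather than on this page.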

Usage

For detailed usage instructions, model implementation, and training code, please refer to our GitHub repository.

Benchmarks

We introduce two new challenging GFS-PCS benchmarks with diverse novel classes for comprehensive generalization evaluation. These benchmarks lay a solid foundation for real-world GFS-PCS advancements.

The benchmark datasets can be downloaded from our Hugging Face dataset repository.

Citation

If you find our work useful, please consider citing our paper:

@inproceedings{an2025generalized,
  title={Generalized Few-shot 3D Point Cloud Segmentation with Vision-Language Model},
  author={An, Zhaochong and Sun, Guolei and Liu, Yun and Li, Runjia and Han, Junlin and Konukoglu, Ender and Belongie, Serge},
  booktitle={CVPR},
  year={2025}
}

Paper

Generalized Few-shot 3D Point Cloud Segmentation with Vision-Language Model (arXiv:2503.16282, published Mar 20, 2025)