arxiv:2506.23542

Consistent Time-of-Flight Depth Denoising via Graph-Informed Geometric Attention

Published on Jun 30

Submitted by wanhaoliu on Jul 1


Authors: Weida Wang, Jin Zeng

Abstract

A novel ToF depth denoising network uses motion-invariant graph fusion and adaptive filters to improve temporal stability and spatial sharpness, achieving state-of-the-art performance.

AI-generated summary

Depth images captured by Time-of-Flight (ToF) sensors are prone to noise and require denoising for reliable downstream applications. Previous works either focus on single-frame processing or perform multi-frame processing without considering depth variations at corresponding pixels across frames, leading to undesirable temporal inconsistency and spatial ambiguity. In this paper, we propose a novel ToF depth denoising network that leverages motion-invariant graph fusion to simultaneously enhance temporal stability and spatial sharpness. Specifically, despite depth shifts across frames, graph structures exhibit temporal self-similarity, enabling cross-frame geometric attention for graph fusion. Then, by incorporating an image smoothness prior on the fused graph and a data fidelity term derived from the ToF noise distribution, we formulate a maximum a posteriori (MAP) problem for ToF denoising. Finally, the solution is unrolled into iterative filters whose weights are adaptively learned from the graph-informed geometric attention, producing a high-performance yet interpretable network. Experimental results demonstrate that the proposed scheme achieves state-of-the-art performance in terms of accuracy and consistency on the synthetic DVToF dataset and exhibits robust generalization on the real Kinectv2 dataset. Source code will be released at https://github.com/davidweidawang/GIGA-ToF.
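To make the "MAP problem unrolled into iterative filters" idea concrete, here is a minimal sketch, not the paper's network. It assumes a plain quadratic (Gaussian) data fidelity term instead of the ToF-specific noise model, a 1-D chain graph built from the noisy signal itself rather than a fused cross-frame graph, and fixed hand-set weights rather than learned ones. The names `chain_laplacian` and `unrolled_glr_denoise` and the parameters `sigma`, `mu`, and `steps` are hypothetical.

```python
import numpy as np

def chain_laplacian(signal, sigma=0.1):
    """Graph Laplacian of a 1-D chain; edges across large jumps get small weights."""
    diffs = np.diff(signal)
    w = np.exp(-(diffs ** 2) / (2.0 * sigma ** 2))  # weight of edge (i, i+1)
    n = signal.size
    W = np.zeros((n, n))
    idx = np.arange(n - 1)
    W[idx, idx + 1] = w
    W[idx + 1, idx] = w
    return np.diag(W.sum(axis=1)) - W

def unrolled_glr_denoise(y, L, mu=2.0, steps=50):
    """Gradient descent on ||x - y||^2 + mu * x^T L x.

    Each iteration is one graph filter applied to the current estimate,
    which is the sense in which the solver is "unrolled" (assumption:
    Gaussian fidelity, fixed mu, fixed step size).
    """
    lam_max = np.linalg.eigvalsh(L)[-1]          # largest Laplacian eigenvalue
    alpha = 1.0 / (2.0 * (1.0 + mu * lam_max))   # 1 / Lipschitz constant of the gradient
    x = y.copy()
    for _ in range(steps):
        grad = 2.0 * (x - y) + 2.0 * mu * (L @ x)
        x = x - alpha * grad
    return x

# Small demo on a noisy depth "step" (synthetic values, for illustration only).
rng = np.random.default_rng(0)
clean = np.concatenate([np.full(32, 1.0), np.full(32, 2.0)])
noisy = clean + 0.05 * rng.standard_normal(clean.size)
denoised = unrolled_glr_denoise(noisy, chain_laplacian(noisy, sigma=0.2))
```

Because the edge weight across the depth discontinuity is close to zero, the smoothing acts within each flat region and leaves the step largely intact, which is the intuition behind pairing a graph smoothness prior with spatial sharpness.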


Community

wanhaoliu

Paper submitter 2 days ago

weidawang

Paper author 2 days ago

edited 2 days ago

⭐️ Highlights
🚀 Cross-frame graph fusion for ToF denoising
Fuses motion-invariant graph structures across frames to achieve both temporal consistency and spatial sharpness.
🧠 Graph-informed geometric attention (GIGA)
Learns graph edges via attention from geometric features, enabling accurate cross-frame correspondence.
🔬 Interpretable and robust design
Unrolls MAP optimization with graph Laplacian regularization into a network, achieving high denoising accuracy and generalization to real ToF data.
📈 State-of-the-art performance
Outperforms prior works by at least 37.9% in MAE and 13.2% in TEPE on DVToF; robust on real Kinect v2 data without fine-tuning.
This paper has been accepted by ICCV 2025 :)
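As a rough illustration of the cross-frame graph fusion and attention highlights above, the sketch below uses standard scaled dot-product attention as a soft correspondence between nodes of two frames, then carries the other frame's graph over and blends it with the current one. This is an assumption-laden toy, not the GIGA module itself: the feature arrays, the blending rule `(1 - beta) * W_cur + beta * A @ W_ref @ A.T`, and the names `attention_edges`, `fuse_graphs`, `temperature`, and `beta` are all hypothetical.

```python
import numpy as np

def attention_edges(feat_cur, feat_ref, temperature=1.0):
    """Row-stochastic attention from current-frame nodes to reference-frame nodes.

    feat_cur: (n, d) per-node geometric features of the current frame (hypothetical).
    feat_ref: (m, d) per-node geometric features of another frame (hypothetical).
    """
    d = feat_cur.shape[1]
    scores = feat_cur @ feat_ref.T / (temperature * np.sqrt(d))
    scores -= scores.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    w = np.exp(scores)
    return w / w.sum(axis=1, keepdims=True)

def fuse_graphs(W_cur, W_ref, A, beta=0.5):
    """Blend the current adjacency with the reference adjacency mapped through A.

    A @ W_ref @ A.T expresses the reference frame's edges between the
    current frame's nodes; it stays symmetric when W_ref is symmetric.
    """
    return (1.0 - beta) * W_cur + beta * (A @ W_ref @ A.T)

# Demo with random features and a 4-node path graph (illustrative values only).
rng = np.random.default_rng(1)
feat_cur = rng.standard_normal((4, 8))
feat_ref = feat_cur + 0.01 * rng.standard_normal((4, 8))
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
A = attention_edges(feat_cur, feat_ref, temperature=0.1)
W_fused = fuse_graphs(W, W, A)
```

The point of the toy is that only the graph (edge weights) is transferred between frames, never the depth values, so a depth shift between frames does not leak into the current frame's estimate.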