We propose VoxelFlow, a novel framework that effectively combines 2D mask semantics with 3D geometric features. Our core insight is to leverage the implicit semantic and spatial information embedded in 2D masks, using them to replace rigid 3D priors.
SGS-3D: High-Fidelity 3D Instance Segmentation via Reliable Semantic Mask Splitting and Growing
Chaolei Wang, Yang Luo, Jing Du, Siyu Chen, and 2 more authors
We propose splitting and growing reliable semantic masks for high-fidelity 3D instance segmentation (SGS-3D), a novel "split-then-grow" framework that first purifies and splits ambiguous lifted masks using geometric primitives and then grows them into complete instances within the scene.
Vireo: Leveraging Depth and Language for Open-Vocabulary Domain-Generalized Semantic Segmentation
Siyu Chen*, Ting Han*, Chaolei Wang, Chengzhen Fu, and 4 more authors
In Conference on Neural Information Processing Systems, 2025
We introduce Vireo, a novel single-stage framework for OV-DGSS that unifies the strengths of OVSS and DGSS for the first time. Vireo builds upon frozen Visual Foundation Models (VFMs) and incorporates scene geometry via Depth VFMs to extract domain-invariant structural features.
Towards A New Era of Geo-Foundation Models: Expert-Guided Multimodal Alignment and Geospatial Context Awareness
Ting Han, Huan Chen, Chaolei Wang, Yilan Ren, and 1 more author
In Proceedings of the ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, 2025
This work develops CLIP4Geo, a unified GeoFM that integrates satellite imagery, LiDAR point clouds, geo-tagged points of interest (POIs), and textual descriptions into a shared representation space.
CityInsight: Incorporating dual-condition based diffusion model into building footprint segmentation from remote sensing imagery
Ting Han, Jin Ma, Chaolei Wang, Yang Luo, and 4 more authors
IEEE Transactions on Geoscience and Remote Sensing, 2025
We propose a framework named CityInsight for analyzing urban building morphology from remote sensing imagery. First, we establish a semantic segmentation network, the dual-condition diffusion network (DC-Net), based on a diffusion model to accurately identify building footprints from remote sensing images. Second, we use uncertainty attention and condition attention to generate spatial and semantic priors. Finally, we design a condition injection module to incorporate spatial and semantic information into the diffusion learning.
Individual tree segmentation via contrastive learning and semantic priors in point clouds
Jin Ma, Ting Han, Chaolei Wang, Xiaohai Zhang, and 3 more authors
We propose an effective individual tree segmentation method capable of accurately extracting single trees in urban and forest scenes. The proposed approach consists of two primary steps: (1) we design Semantic-Driven Instance Clustering to combine the semantic prior with the instance embeddings; (2) we introduce Online Semantic Clustering to capture potential intra-class semantic discriminability, improving the instance representation within the same semantic class.
PhyDAWS: Physically-inspired data augmentation with weather simulation for domain-generalized point cloud segmentation
Jing Du, John Zelek, Chaolei Wang, Ting Han, and 3 more authors
International Journal of Applied Earth Observation and Geoinformation, 2025
Submitted to JAG
Progressive Camera-LiDAR Adaptation for Scene Flow Estimation
Ting Han, Yang Luo, Siyu Chen, Xiangyi Xie, and 3 more authors
International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, 2025
3D scene flow aims to recover the dense geometry and 3D motion of dynamic scenes. This paper explores the transformation and adaptation of the 2D-3D feature space in the joint estimation of optical flow and scene flow. Our key insight is to fully leverage the unique characteristics of each modality and maximize their inter-modality complementarity. To achieve this, we propose a novel architecture, named PAFlow, which consists of Camera-LiDAR Adaptation and Spatial Characteristics Adaptation.
2024
Individual tree segmentation of terrestrial tropical mangrove forest point clouds based on multiple constrains at tree tops
Wangjun Liu, Yiping Chen, Chaolei Wang, Wuming Zhang, and 1 more author
To address these issues, we propose an individual tree segmentation algorithm applicable to complex mangrove scenes. Method: This study combines deep learning and traditional algorithms in a high-precision individual tree segmentation framework for TLS point clouds in complex mangrove scenes. The framework first employs the deep learning network RandLA-Net for ground filtering and wood-leaf separation. Subsequently, mangrove main stems are segmented using a connected component segmentation method. Finally, individual tree segmentation is achieved through the multiple tree tops constraint module. Results: To assess the accuracy of the algorithm, we use three measures: completeness, correctness, and accuracy. We also conduct a comparative analysis with two classical algorithms.