Publications | Chaolei Wang

An up-to-date list is available on Google Scholar.

2025

SGS-3D: High-Fidelity 3D Instance Segmentation via Reliable Semantic Mask Splitting and Growing

Chaolei Wang, Yang Luo, Jing Du, Siyu Chen, and 2 more authors

In Conference on Artificial Intelligence (AAAI), 2025

We propose splitting and growing reliable semantic mask for high-fidelity 3D instance segmentation (SGS-3D), a novel "split-then-grow" framework that first purifies and splits ambiguous lifted masks using geometric primitives, and then grows them into complete instances within the scene.

VoxelFlow: 2D Semantic Mask-Guided Voxel Flow for Open-Vocabulary 3D Instance Segmentation

Chaolei Wang, Huan Chen, Jin Ma, Ting Han, and 1 more author

In International Conference on Cyberworlds, 2025

Best Paper Honorable Mention

We propose VoxelFlow, a novel framework that effectively combines 2D mask semantics with 3D geometric features. Our core insight is to leverage the implicit semantic and spatial information embedded in 2D masks, using them to replace rigid 3D priors.

Vireo: Leveraging Depth and Language for Open-Vocabulary Domain-Generalized Semantic Segmentation

Siyu Chen*, Ting Han*, Chaolei Wang, Chengzhen Fu, and 4 more authors

In Conference on Neural Information Processing Systems (NeurIPS), 2025

We introduce Vireo, a novel single-stage framework for OV-DGSS that unifies the strengths of OVSS and DGSS for the first time. Vireo builds upon frozen Visual Foundation Models (VFMs) and incorporates scene geometry via Depth VFMs to extract domain-invariant structural features.

Towards A New Era of Geo-Foundation Models: Expert-Guided Multimodal Alignment and Geospatial Context Awareness

Ting Han, Huan Chen, Chaolei Wang, Yilan Ren, and 1 more author

In Proceedings of the ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, 2025

This work develops CLIP4Geo, a unified GeoFM that integrates satellite imagery, LiDAR point clouds, geo-tagged points of interest (POIs), and textual descriptions into a shared representation space.

CityInsight: Incorporating dual-condition based diffusion model into building footprint segmentation from remote sensing imagery

Ting Han, Jin Ma, Chaolei Wang, Yang Luo, and 4 more authors

IEEE Transactions on Geoscience and Remote Sensing, 2025

We propose a framework named CityInsight for analyzing urban building morphology from remote sensing imagery. First, we establish a semantic segmentation network, the dual-condition diffusion network (DC-Net), based on a diffusion model to accurately identify building footprints from remote sensing images. Second, we use uncertainty attention and condition attention to generate spatial and semantic priors. Finally, we design a condition injection module to incorporate spatial and semantic information into the diffusion learning.

Individual tree segmentation via contrastive learning and semantic priors in point clouds

Jin Ma, Ting Han, Chaolei Wang, Xiaohai Zhang, and 3 more authors

Urban Forestry & Urban Greening, 2025

We propose an effective individual tree segmentation method capable of accurately extracting single trees in urban and forest scenes. The proposed approach consists of two primary steps: (1) We design the Semantic-Driven Instance Clustering to combine the semantic prior with the instance embeddings. (2) We introduce the Online Semantic Clustering for intra-class potential semantic discriminability, improving the instance representation within the same semantic class.

PhyDAWS: Physically-inspired data augmentation with weather simulation for domain-generalized point cloud segmentation

Jing Du, John Zelek, Chaolei Wang, Ting Han, and 3 more authors

International Journal of Applied Earth Observation and Geoinformation, 2025

Submitted to JAG

Progressive Camera-LiDAR Adaptation for Scene Flow Estimation

Ting Han, Yang Luo, Siyu Chen, Xiangyi Xie, and 3 more authors

International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, 2025

3D scene flow aims to recover the dense geometry and 3D motion of dynamic scenes. This paper explores the transformation and adaptation of the 2D-3D feature space in the joint estimation of optical flow and scene flow. Our key insight is to fully leverage the unique characteristics of each modality and maximize their inter-modality complementarity. To achieve this, we propose a novel architecture, named PAFlow, which consists of Camera-LiDAR Adaptation and Spatial Characteristics Adaptation.

2024

Individual tree segmentation of terrestrial tropical mangrove forest point clouds based on multiple constrains at tree tops

Wangjun Liu, Yiping Chen, Chaolei Wang, Wuming Zhang, and 1 more author

National Remote Sensing Bulletin, 2024

To address these issues, we propose an individual tree segmentation algorithm applicable to complex mangrove scenes. Method: This study combines deep learning and traditional algorithms in a high-precision individual tree segmentation framework for TLS point clouds in complex mangrove scenes. The framework first employs the deep learning network RandLA-Net for ground filtering and wood-leaf separation. Mangrove main stems are then segmented using a connected component segmentation method. Finally, individual tree segmentation is achieved through the multiple tree tops constraint module. Results: To assess the accuracy of the algorithm, we use three measures: completeness, correctness, and accuracy. We also conduct a comparative analysis with two classical algorithms.