I am broadly interested in computer vision and graphics, especially 3D vision. My past work focused on shape analysis, particularly 3D shape matching. My recent research explores 3D generation for Embodied AI, with the long-term goal of building intelligent agents that perceive, generate, and interact with the 3D world.
If you would like to collaborate, please feel free to contact me.
3D Generation for Embodied AI and Robotic Simulation: A Survey
Embodied AI and robotic systems increasingly depend on scalable, diverse, and physically grounded 3D content for simulation-based training and real-world deployment. While 3D generative modeling has advanced rapidly, embodied applications impose requirements far beyond visual realism: generated objects must carry kinematic structure and material properties, scenes must support interaction and task execution, and the resulting content must bridge the gap between simulation and reality. This survey reviews 3D generation for embodied AI and organizes the literature around three roles that 3D generation plays in embodied systems. In Data Generator, 3D generation produces simulation-ready objects and assets, including articulated, physically grounded, and deformable content for downstream interaction; in Simulation Environments, it constructs interactive and task-oriented worlds, spanning structure-aware, controllable, and agentic scene generation; and in Sim2Real Bridge, it supports digital twin reconstruction, data augmentation, and synthetic demonstrations for downstream robot learning and real-world transfer. We also show that the field is shifting from visual realism toward interaction readiness, and we identify the main bottlenecks, including limited physical annotations, the gap between geometric quality and physical validity, fragmented evaluation, and the persistent sim-to-real divide, that must be addressed for 3D generation to become a dependable foundation for embodied intelligence.
@article{ye2026survey,author={Ye, Tianwei and Mao, Yifan and Liao, Minwen and Liu, Jian and Guo, Chunchao and Du, Dazhao and Shou, Quanxin and Zhu, Fangqi and Guo, Song},title={3D Generation for Embodied AI and Robotic Simulation: A Survey},journal={arxiv},year={2026}}
arXiv 2026
SGMatch: Semantic-Guided Non-Rigid Shape Matching with Flow Regularization
Establishing accurate point-to-point correspondences between non-rigid 3D shapes remains a critical challenge, particularly under non-isometric deformations and topological noise. Existing functional map pipelines suffer from ambiguities that geometric descriptors alone cannot resolve, and from spatial inconsistencies inherent in projecting truncated spectral bases to dense pointwise correspondences. In this paper, we introduce SGMatch, a learning-based framework for semantic-guided non-rigid shape matching. Specifically, we design a Semantic-Guided Local Cross-Attention module that integrates semantic features from vision foundation models into geometric descriptors while preserving local structural continuity. Furthermore, we introduce a regularization objective based on conditional flow matching, which supervises a time-varying velocity field to encourage spatial smoothness of the recovered correspondences. Experimental results on multiple benchmarks demonstrate that SGMatch achieves competitive performance in near-isometric settings and consistent improvements under non-isometric deformations and topological noise.
@article{ye2026sgmatch,author={Ye, Tianwei and Mei, Xiaoguang and Xia, Yifan and Fan, Fan and Huang, Jun and Ma, Jiayi},title={SGMatch: Semantic-Guided Non-Rigid Shape Matching with Flow Regularization},journal={arxiv},year={2026}}
AAAI 2026
DcMatch: Unsupervised Multi-Shape Matching with Dual-Level Consistency
Establishing point-to-point correspondences across multiple 3D shapes is a fundamental problem in computer vision and graphics. In this paper, we introduce DcMatch, a novel unsupervised learning framework for non-rigid multi-shape matching. Unlike existing methods that learn a canonical embedding from a single shape, our approach leverages a shape graph attention network to capture the underlying manifold structure of the entire shape collection. This enables the construction of a more expressive and robust shared latent space, leading to more consistent shape-to-universe correspondences via a universe predictor. Simultaneously, we represent these correspondences in both the spatial and spectral domains and enforce their alignment in the shared universe space through a novel cycle consistency loss. This dual-level consistency fosters more accurate and coherent mappings. Extensive experiments on several challenging benchmarks demonstrate that our method consistently outperforms previous state-of-the-art approaches across diverse multi-shape matching scenarios. Code is available at https://github.com/YeTianwei/DcMatch.
@inproceedings{ye2025dcmatch,author={Ye, Tianwei and Ma, Yong and Mei, Xiaoguang},title={DcMatch: Unsupervised Multi-Shape Matching with Dual-Level Consistency},booktitle={AAAI},year={2026}}