Hi there 👋
My name is Shengli Zhou, and I am a postgraduate student majoring in Computer Science at The Chinese University of Hong Kong (CUHK). Prior to my master’s studies, I completed my undergraduate program at Southern University of Science and Technology (SUSTech), where I was awarded an Honors Bachelor of Engineering degree. I am actively seeking full-time job opportunities commencing in Fall 2027.
- 🔭 My personal website: fz-zsl.github.io
- 🌱 I’m currently working on multimodal learning for 3D vision-language tasks and short drama generation.
- 📫 How to reach me: zhousl2004@outlook.com
- 🔥 Venture forth to embrace the boundless unknown (去看看未见的广袤!)
Education
Postgraduate: Sep. 2026 - Jul. 2027 (Expected)
The Chinese University of Hong Kong (CUHK)
Undergraduate: Sep. 2022 - Jul. 2026
Southern University of Science and Technology (SUSTech)
Department of Computer Science and Engineering (CSE) & Zhiren College
Major: Computer Science and Technology (GPA: 3.98 / 4.00, rank: 1 / 167)
Member of Turing Class (designated for elite CS students at SUSTech)
Selected Publications
Scalable Object Relation Encoding for Better 3D Spatial Reasoning in Large Language Models
- Shengli Zhou, Minghang Zheng, Feng Zheng, Yang Liu✉
- Accepted by CVPR 2026 Main Conference: Project Page / Paper (CVF) / Paper (arXiv) / Code / 机器之心
- Presentation: CVPR Poster Page / Poster / Video / Slides
- This paper proposes QuatRoPE, a linearly scalable 3D positional embedding method that computes pairwise object spatial relations via quaternion rotations in Transformer attention layers, together with the Isolated Gated RoPE Extension (IGRE), which minimizes QuatRoPE’s interference with the LLM’s original language RoPE. It also introduces the ASR benchmark for evaluating pure 3D spatial reasoning. Extensive experiments show that the proposed methods consistently improve the 3D spatial reasoning performance of LLMs on multiple 3D VL benchmarks and on the ASR benchmark, outperforming strong baselines.
CAPruner: Conceptual-Adjacent Scene Graph Pruner for Enhancing 3D Spatial Reasoning of Large Language Models
- Shengli Zhou, Xiangchen Wang, Guanhua Chen✉, Feng Zheng✉
- Accepted by ACL 2026 Main Conference: Project Page / Paper (ACL Anthology) / Paper (arXiv) / Code
- Presentation: ACL Conference Page & Video / Poster / Slides
- Existing scene graph pruning methods for 3D vision-language tasks often discard task-critical relations, harming spatial reasoning. To address this issue, we propose CAPruner, which combines semantic relevance and spatial proximity to estimate relation importance under a specific task context and is trained without expensive relation-level annotations. Experiments show that it preserves key spatial relations and significantly boosts LLM performance on 3D-VL tasks.
Learn 3D VQA Better with Active Selection and Reannotation
- Shengli Zhou, Yang Liu, Feng Zheng✉
- Accepted by ACM MM 2025: Paper (ACM Digital Library) / Paper (arXiv) / Code / WeChat Official Account
- To address the scarcity of annotations and the negative impact of inevitable annotation errors in 3D Visual Question Answering, we propose a multi-turn interactive active learning strategy that combines semantic variance-based data selection with interactive oracle reannotation, improving answer quality and reducing training costs.
Click here to see the full list.
Research & Visiting Experience
Feb. 2025 - Jul. 2025
Wangxuan Institute of Computer Technology (WICT)
Multimedia Information Processing Lab (MIPL)
Supervisor: Prof. Yang Liu
May 2024 - Jul. 2024
National University of Singapore (NUS)
School of Computing (SoC) Summer Workshop (SWS) 2024
Visual Computing | Lecturer: Prof. Terence Sim | Poster
Selected Projects
When Active Learning and Data Augmentation Meet at Object Detection
- Real-time object detection for autonomous driving is limited by high labeling costs and limited dataset diversity. This project optimizes the RT-DETR-v2 model to address both.
- Integrated active learning (AL) and data augmentation to improve detection accuracy while minimizing labeling effort, evaluating AL strategies (random, entropy-based, information gain) and augmentation levels (e.g., motion blur, noise).
- Achieved 75.4% mAP (6% improvement) on KITTI. The results identify entropy-based AL as the best-performing strategy and medium-strength augmentation as the best balance between accuracy and real-world alignment, enabling efficient data utilization. GitHub
Masked-Unmasked Face Recognition
- Face recognition struggles with brightness, angle, expression, and occlusion variations.
- This project uses a non-deep learning approach to develop a classifier inspired by human recognition. It focuses on selecting high-quality local features to improve accuracy and reliability in face recognition. The classifier adaptively selects features based on current conditions, excluding less relevant local features to enhance recognition accuracy.
- The resulting mask detector achieves 100% accuracy on the Georgia Tech Dataset and maintains high accuracy on harder datasets. Poster
Click here to see the full list.
Honors & Awards
- China National Scholarship for Undergraduates (2024 & 2025)
- The Grand Prize of School Motto “Qiushi” (Truth) Scholarship, SUSTech (1 out of 5000)
- The Nomination Prize of “Student of the Year”, SUSTech (6 out of 5000)
- Top 10 Outstanding Undergraduate Graduates, College of Engineering, SUSTech
- Champion of the China Collegiate Programming Contest - Guangdong Provincial Collegiate Programming Contest
- First Prize in the 15th Chinese Mathematics Competitions
News Reports
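- [May 2026] Outstanding Graduates of the Class of 2026 | Shengli Zhou: Settling into Code, Growing through Research (26届优秀毕业生风采|周圣力:在代码间沉淀,于科研中成长)
- [Nov. 2025] They Won the School Motto Scholarship | Shengli Zhou: The Sea of Learning Is Boundless, the Quest Never Ends (他们是校训奖 | 周圣力:学海无涯,求索不息)
- [Oct. 2025] School Motto Scholarship Results Announced: Congratulations to These 28 Students! (校训奖学金评选结果出炉,祝贺这28位同学!)
- [Nov. 2024] Congratulations! SUSTech’s 2024 Outstanding Undergraduate Groups and Individuals (祝贺!南科大2024年本科生先进集体及优秀个人)
- [Oct. 2024] Excellent! They Won the School Motto Scholarship (优秀!他们获得校训奖学金)
- [May 2023] The 2023 (20th) Guangdong Provincial Collegiate Programming Contest Was Successfully Held (2023年(第二十届)广东省大学生程序设计竞赛成功举办)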