One paper has been accepted by CVPR 2026!
Scalable Object Relation Encoding for Better 3D Spatial Reasoning in Large Language Models
- Shengli Zhou, Minghang Zheng, Feng Zheng✉, Yang Liu✉
- Accepted by CVPR 2026
- This paper proposes QuatRoPE, a linearly scalable 3D positional embedding method that computes pairwise object spatial relations via quaternion rotations in Transformer attention layers, together with the Isolated Gated RoPE Extension (IGRE), which minimizes QuatRoPE's interference with the LLM's original language RoPE. It also introduces the ASR benchmark for evaluating pure 3D spatial reasoning. Extensive experiments show that the proposed methods consistently improve the 3D spatial reasoning performance of LLMs on multiple 3D VL benchmarks and on the ASR benchmark, outperforming strong baselines.
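
To give some intuition for how rotations applied per object can yield pairwise relations inside attention, here is a minimal sketch of the general rotary idea with quaternions. It is not the paper's implementation: `position_quat`, its `freq` parameter, and the 3D toy vectors are hypothetical choices made only for illustration. The sketch checks one identity that holds for any unit quaternions: rotating the query by `r_i` and the key by `r_j` gives the same dot product as rotating the key alone by the relative rotation `conj(r_i) * r_j`.

```python
import math

def quat_mul(a, b):
    """Hamilton product of two quaternions stored as (w, x, y, z)."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return (
        w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
        w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
        w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
        w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2,
    )

def quat_conj(q):
    """Conjugate, which is the inverse for a unit quaternion."""
    w, x, y, z = q
    return (w, -x, -y, -z)

def axis_angle(axis, angle):
    """Unit quaternion for a rotation of `angle` radians about `axis`."""
    n = math.sqrt(sum(c * c for c in axis))
    s = math.sin(angle / 2) / n
    return (math.cos(angle / 2), axis[0] * s, axis[1] * s, axis[2] * s)

def rotate(q, v):
    """Rotate the 3-vector v by the unit quaternion q via q * v * conj(q)."""
    _, x, y, z = quat_mul(quat_mul(q, (0.0, *v)), quat_conj(q))
    return (x, y, z)

def position_quat(pos, freq=0.5):
    """Hypothetical mapping from a 3D position to a unit quaternion:
    one rotation per coordinate axis, composed in x, y, z order."""
    qx = axis_angle((1, 0, 0), freq * pos[0])
    qy = axis_angle((0, 1, 0), freq * pos[1])
    qz = axis_angle((0, 0, 1), freq * pos[2])
    return quat_mul(qz, quat_mul(qy, qx))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Toy query/key features and object positions (made-up values).
query, key = (0.3, -1.2, 0.7), (1.0, 0.4, -0.5)
r_i = position_quat((1.0, 2.0, 0.5))
r_j = position_quat((-0.5, 0.0, 3.0))

# Each vector is rotated once, using only its own object's position.
score_direct = dot(rotate(r_i, query), rotate(r_j, key))

# The same score written in terms of the relative rotation only.
r_rel = quat_mul(quat_conj(r_i), r_j)
score_relative = dot(query, rotate(r_rel, key))

assert math.isclose(score_direct, score_relative, abs_tol=1e-9)
```

Because each object is rotated once and the pairwise term only appears in the dot product, the encoding work grows with the number of objects rather than the number of pairs, which is one plausible reading of "linearly scalable" above. Note that rotations about different axes do not commute, so with this toy mapping the relative rotation is not simply a function of the displacement between the two positions.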
- Author: Shengli Zhou
- Created at: 2026-02-23 13:50:00
- Updated at: 2026-02-25 09:39:17
- Link: https://fz-zsl.github.io/CVPR2026/
- License: This work is licensed under CC BY-NC-SA 4.0.