Qilin Huang

News

[Aug 2025] I am thrilled to be joining the GRASP Laboratory at the University of Pennsylvania as a visiting research student for the Fall 2025 semester.
I will be working on a project in 3D physics under the supervision of Prof. Lingjie Liu, collaborating directly with Ph.D. student Long Le.

About Me

I am a final-year undergraduate student in Computer Science at the Southern University of Science and Technology (SUSTech), currently advised by Prof. Feng Zheng.

My research goal is to enable machines to perceive, reason about, and interact with the 3D world as humans do. I am actively seeking a Ph.D. position for Fall 2026 and am passionate about building the next generation of intelligent systems.

Research Interests

Structured 3D Visual Reasoning: Building interpretable models like Neural Module Networks (NMNs) for complex tasks such as 3D Visual Question Answering (VQA).
Physics-Informed 3D AI: Integrating physical principles into deep learning models to enable realistic and predictive understanding of the 3D world.
Multimodal Learning: Fusing vision, language, and other modalities to create robust and generalizable foundation models for 3D understanding.

Publications

HCNQA: Enhancing 3D Visual Question-Answering with Hierarchical Concentration Narrowing Supervision

Shengli Zhou, Jianuo Zhu, Qilin Huang, Fangjing Wang, Yanfu Zhang, Feng Zheng

International Conference on Artificial Neural Networks (ICANN), 2025

(paper, code)

This work introduces HCNQA, a method to enhance spatial reasoning in 3D-VQA. By integrating a Hierarchical Concentration Narrowing (HCN) module, we guide the model's attention to suppress shortcuts and improve performance on the ScanQA benchmark.

Research Experience

Learning Variational Physical Representations from Visual Features

Under the supervision of Prof. Lingjie Liu, in collaboration with Long Le.

[Upcoming] Visiting Researcher @ GRASP Lab, UPenn (Fall 2025)

Planned research on variational, more expressive latent representations of physical properties, moving beyond deterministic point estimates to better capture the material uncertainty inherent in visual data.

Neural Module Network for 3D-VQA

Advisor: Prof. Feng Zheng (Feb. 2025 – Present)

(paper, code)

Leading a project to develop a Neural Module Network (NMN) for 3D-VQA. My work involves designing a framework to parse natural language questions into executable programs, aiming to improve model interpretability, structured reasoning, and generalization.

Enhancing Spatial Reasoning in 3D Scene Understanding with LLMs

Advisor: Prof. Feng Zheng (Jan. 2024 – May 2024)

(paper, code)

Investigated and implemented a pipeline to fuse textual spatial embeddings from a 3D grounding model (EDA) into an LLM (LEO), providing critical insights into mitigating spatial information loss in multimodal models.

Education

Southern University of Science and Technology (SUSTech), Shenzhen, China
- Bachelor of Science in Computer Science, Sep. 2022 – Jun. 2026 (Expected)
- GPA: 3.89/4.00 (Rank: 14/162)
National University of Singapore (NUS), Singapore
- Summer Workshop in Computer Vision, May 2024 – Jul. 2024
- Grade: A+