About

I am Yuqiang Yang, a Research and Development Engineer at the Embodied AI Center, Shanghai AI Laboratory, working with Dr. Tai Wang and Dr. Jiangmiao Pang. I have full-stack experience in robotic foundation models, covering data processing, training infrastructure, high-speed model inference, and simulation/real-world evaluation. My current research interest is to build robust and generalizable robotic systems through systematic system design and holistic optimization, enabling embodied AGI to better serve real-world human needs.

My research experience spans three closely connected areas: robot manipulation, navigation, and mobile manipulation. In robot manipulation, I work on pre-training and adapting robotic foundation models with heterogeneous robot and multimodal data (e.g., the COCO and LLaVA series) and large-scale robot datasets (e.g., OXE, AgiBot World), and evaluate them on mainstream simulation benchmarks (e.g., RoboTwin, SimplerEnv, CALVIN) and in real-world environments. In navigation, I explore dual-system navigation frameworks, where a VLM-based System-2 model performs high-level reasoning and decision-making, while a diffusion-based System-1 model enables smooth and efficient obstacle avoidance across different embodiments, such as Unitree G1, Go2, TurtleBot, and Galaxea. In mobile manipulation, I investigate whole-body planning and control for mobile manipulators, enabling efficient pick-and-place tasks while avoiding collisions in cluttered and dynamic environments with chairs, tables, and shelves.

Our team is dedicated to building Embodied AGI systems and empowering both academia and industry through open-source initiatives. We have contributed many research projects and codebases on GitHub. If you are interested in joining us or collaborating with us, feel free to contact us.

Education and Training

South China University of Technology
Master, Robotics
Supervisor: Prof. Chenguang Yang
GPA: 3.82/4.0 (ranked first)
Sep. 2022 - Jun. 2025





South China University of Technology
Bachelor of Engineering, Automation
School of Automation Science and Engineering
GPA: 3.94/4.0 (ranked first)
Sep. 2018 - Jun. 2022





FastLab of Zhejiang University
Visiting student
Whole-body planning and control for multicopters
Supervisor: Prof. Fei Gao
Oct. 2023 - Nov. 2023




Research Experiences

InternVLA-A1: Unifying Understanding, Generation and Action for Robotic Manipulation

2026 Technical report

[Project Page] [Paper] [Code] [Data] [Model]

We present InternVLA-A1, a unified VLA model that coordinates scene understanding, visual foresight generation, and action execution with a Mixture-of-Transformers architecture. Trained on heterogeneous robot, simulation, and human-video data, InternVLA-A1 brings world-model-style dynamics prediction into robotic manipulation and shows strong performance on both static and highly dynamic tasks.

LatentPilot: Scene-Aware Vision-and-Language Navigation by Dreaming Ahead with Latent Visual Reasoning

2026 arXiv

[Project Page] [Paper] [Code]

We propose LatentPilot, a future-aware VLN paradigm that learns action-conditioned visual dynamics from future observations during training while requiring only current observations at inference. Its recurrent latent visual tokens allow the agent to dream ahead, reason over likely scene changes, and improve navigation decisions in simulation and real-world tests.

OVExp: Open Vocabulary Exploration for Object-Oriented Navigation

2026 ICRA

[Paper] [HTML]

We introduce OVExp, a learning-based framework for open-vocabulary object-oriented exploration. OVExp builds VLM-aware top-down scene representations, aligns text or image goals in the same feature space, and predicts goal-conditioned target locations with a lightweight decoder, enabling efficient generalization to unseen objects, image goals, and novel scenes.

Ground Slow, Move Fast: A Dual-System Foundation Model for Generalizable Vision-Language Navigation

2026 ICLR

[Project Page] [Paper] [Code] [Zhihu]

We propose DualVLN, a dual-system foundation model for Vision-Language Navigation, which includes: (i) System 2, a large foundation VLM that performs slow but robust reasoning and produces explicit pixel goals; and (ii) System 1, a lightweight diffusion policy that generates smooth and safe trajectories in real time.
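
To make the division of labor concrete, here is a minimal, hypothetical sketch of such a dual-system loop (illustrative class and method names, not the actual DualVLN implementation): System 2 is queried at a low rate to refresh the pixel goal, while System 1 runs at every control step.

```python
# Hypothetical sketch of a dual-system navigation loop; all interfaces are
# placeholders, not the released DualVLN code.
import numpy as np

class System2VLM:
    """Slow reasoning: grounds the instruction into an explicit pixel goal."""
    def predict_pixel_goal(self, rgb, instruction):
        # A real VLM would reason over the image; here we just return the center.
        return np.array([rgb.shape[1] // 2, rgb.shape[0] // 2])

class System1Policy:
    """Fast execution: a lightweight policy that outputs a short trajectory."""
    def sample_trajectory(self, rgb, pixel_goal, horizon=8):
        # Placeholder for a diffusion-policy rollout (denoised waypoints).
        return np.zeros((horizon, 2))

def dual_system_loop(camera, robot, instruction, steps=1000, slow_every=10):
    sys2, sys1 = System2VLM(), System1Policy()
    pixel_goal = None
    for t in range(steps):
        rgb = camera.read()
        if t % slow_every == 0:                  # System 2: low-frequency reasoning
            pixel_goal = sys2.predict_pixel_goal(rgb, instruction)
        traj = sys1.sample_trajectory(rgb, pixel_goal)  # System 1: every step
        robot.execute(traj[0])                   # send the first waypoint
```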

LoGoPlanner: Localization Grounded Navigation Policy with Metric-aware Visual Geometry

2026 ICRA

[Project Page] [Paper] [Code] [RedNote]

We introduce LoGoPlanner, a localization-grounded, end-to-end navigation framework: (i) fine-tuning a long-horizon visual-geometry backbone to ground predictions with absolute metric scale, thereby providing implicit state estimation for accurate localization; (ii) reconstructing surrounding scene geometry from historical observations to supply dense, fine-grained environmental awareness for reliable obstacle avoidance; and (iii) conditioning the policy on implicit geometry bootstrapped by the aforementioned auxiliary tasks, thereby reducing error propagation.

InternVLA-N1: An Open Dual-System Vision-Language Navigation Foundation Model with Learned Latent Plans

2025 Technical report

[Project Page] [Paper] [Code] [Zhihu]

We introduce InternVLA-N1, the first open dual-system vision-language navigation foundation model. Unlike previous navigation foundation models that can only take short-term actions from a limited discrete space, InternVLA-N1 decouples the task into pixel-goal planning with System 2 and agile execution with System 1.

StreamVLN: Streaming Vision-and-Language Navigation via SlowFast Context Modeling

2026 ICRA

[Project Page] [Paper] [Code] [Zhihu]

We propose StreamVLN, a streaming VLN framework that employs a hybrid slow-fast context modeling strategy to support multi-modal reasoning over interleaved vision, language, and action inputs. StreamVLN can understand complex human instructions and complete VLN tasks in various indoor and outdoor scenarios.
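
As a rough illustration of the slow-fast idea (my own simplified sketch, not the StreamVLN code), one can keep a dense sliding window of recent observations alongside a sparsely sampled long-term memory:

```python
# Toy sketch of slow-fast context management; window sizes are illustrative.
from collections import deque

class SlowFastContext:
    def __init__(self, fast_len=8, slow_stride=16, slow_cap=32):
        self.fast = deque(maxlen=fast_len)   # dense, recent observations
        self.slow = []                       # sparse, long-horizon memory
        self.stride, self.cap, self.t = slow_stride, slow_cap, 0

    def add(self, frame_tokens):
        self.fast.append(frame_tokens)
        if self.t % self.stride == 0:        # sample sparsely into slow memory
            self.slow.append(frame_tokens)
            self.slow = self.slow[-self.cap:]
        self.t += 1

    def context(self):
        # Interleave for the model: long-term memory first, then recent frames.
        return list(self.slow) + list(self.fast)
```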

NavDP: Learning Sim-to-Real Navigation Diffusion Policy with Privileged Information Guidance

2026 ICRA

[Project Page] [Paper] [Code] [Zhihu]

We propose NavDP, an end-to-end framework trained solely in simulation that combines diffusion-based trajectory generation and a critic function for trajectory selection (conditioned on local observation tokens from a shared policy transformer); it can zero-shot transfer to different robot embodiments in diverse real-world environments.
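
The generate-then-select pattern can be sketched as follows (hypothetical module interfaces and shapes, not the actual NavDP implementation): sample several candidate trajectories from the diffusion head, score them with the critic, and execute the best one.

```python
# Illustrative sketch of diffusion sampling plus critic-based selection;
# `diffusion_head` and `critic` are hypothetical callables.
import torch

def select_trajectory(diffusion_head, critic, obs_tokens, num_samples=16):
    # obs_tokens: (1, T, D) local observation tokens from a shared transformer
    candidates = diffusion_head.sample(obs_tokens, num_samples)          # (K, H, 2)
    scores = critic(obs_tokens.expand(num_samples, -1, -1), candidates)  # (K,)
    return candidates[torch.argmax(scores)]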

ESDF Map Planning Trajectory
Real-World Experiment

RAMPAGE: Towards Whole-body, Real-Time and Agile Motion Planning in Dynamic Cluttered Environments for Mobile Manipulators

2024 IEEE Transactions on Industrial Electronics

[Video] [PDF]

We proposed a hierarchical topology-guided search with AL-DDP-based optimization to solve whole-body kinodynamic planning in dynamic environments, achieving real-time planning (≈30 ms) and an ≈80% success rate with accurate collision detection via an ESDF map and sphere decomposition.
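
For intuition, the sphere-decomposition collision cost can be sketched roughly as below (the grid indexing and hinge penalty are simplified assumptions, not the paper's exact formulation):

```python
# Simplified sketch of collision checking via sphere decomposition against an ESDF.
import numpy as np

def collision_cost(esdf, resolution, sphere_centers, sphere_radii, margin=0.05):
    """Sum hinge penalties for body spheres closer to obstacles than `margin`.

    esdf: 3D array of signed distances (meters) on a voxel grid.
    sphere_centers: (N, 3) world-frame centers of the decomposed body spheres.
    """
    cost = 0.0
    for c, r in zip(sphere_centers, sphere_radii):
        idx = np.floor(c / resolution).astype(int)
        dist = esdf[tuple(idx)]            # distance from sphere center to obstacles
        clearance = dist - r - margin
        if clearance < 0:
            cost += clearance ** 2         # quadratic hinge penalty
    return cost
```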

Door Opening Demo

Learn to Coordinate: a Whole-Body Learning from Demonstration Framework for Differential Drive Mobile Manipulators

2023 IEEE Conference on Systems, Man, and Cybernetics

[Video1, Video2] [PDF]

We developed a Gaussian Process-based learning framework with weighted least-norm (WLN) inverse kinematics for few-shot whole-body skill learning, which enabled coordinated door opening with disturbance rejection and simplified human guidance via admittance control.
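
A minimal example of the underlying idea, encoding a demonstrated whole-body trajectory with Gaussian Process regression so it can be queried at any phase, might look like this (synthetic data and scikit-learn, not the original framework):

```python
# Toy sketch: fit a GP from phase to whole-body configuration; data is synthetic.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# phase in [0, 1] -> stacked base + arm configuration (e.g., 9-D for a
# differential-drive base plus a 7-DoF arm)
phase = np.linspace(0, 1, 50)[:, None]
demo_q = np.sin(2 * np.pi * phase) * np.ones((50, 9))   # placeholder demonstration

gp = GaussianProcessRegressor(kernel=RBF(0.1) + WhiteKernel(1e-3))
gp.fit(phase, demo_q)

q_mean, q_std = gp.predict(np.array([[0.37]]), return_std=True)  # query a phase
```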

Project Experiences

H1 in Hospital, H1 in Office, H1 Outdoors, H1 on Stairs

Locomotion in complex terrain through reinforcement learning in Isaac Lab

[Video]

We used the PPO algorithm with a fine-tuned reward function to train Unitree H1's locomotion via curriculum learning, and realized robust Sim-to-Sim transfer of RL policies to diverse photo-realistic environments in NVIDIA Isaac Sim.
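
The terrain curriculum can be illustrated with a toy update rule (the thresholds and progress metric are assumptions, not the exact Isaac Lab configuration): environments are promoted to harder terrain when the robot covers most of its commanded distance, and demoted otherwise.

```python
# Toy terrain-curriculum update; thresholds and metric are illustrative.
import numpy as np

def update_terrain_levels(levels, episode_distance, cmd_distance,
                          max_level=9, promote=0.8, demote=0.4):
    ratio = episode_distance / np.maximum(cmd_distance, 1e-6)
    levels = np.where(ratio > promote, levels + 1, levels)   # harder terrain
    levels = np.where(ratio < demote, levels - 1, levels)    # easier terrain
    return np.clip(levels, 0, max_level)
```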

Risk-aware contingency motion planning under uncertainties for Automated Valet Parking (AVP)

DJI Automotive

We proposed Voronoi-based safe corridors and SplineGrid optimization for lateral bypass, plus iLQR for risk-aware longitudinal speed planning, and handled prediction multimodality via tree-branch iLQR to ensure safe trajectories under perception/prediction uncertainties.
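
The way multimodal predictions enter the cost can be sketched as a trunk-and-branch expectation (purely illustrative; the actual DJI planner is not reproduced here): the first few controls are shared across all predicted modes, and each mode's rollout cost is weighted by its probability.

```python
# Schematic risk-aware cost over prediction branches; all names are placeholders.
import numpy as np

def expected_branch_cost(trunk_u, branch_us, branch_preds, branch_probs, rollout_cost):
    """trunk_u: controls shared by all modes over the first part of the horizon.
    branch_us[k]: mode-specific controls after the trunk.
    branch_preds[k]: predicted obstacle trajectory for mode k.
    rollout_cost(u, pred) -> scalar cost of applying controls u under prediction pred."""
    total = 0.0
    for u_k, pred_k, p_k in zip(branch_us, branch_preds, branch_probs):
        u_full = np.concatenate([trunk_u, u_k])
        total += p_k * rollout_cost(u_full, pred_k)   # expectation over modes
    return total
```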

Multicopter in Forest

Low-cost and efficient localization, mapping, planning, and control for multicopters on embedded systems

[Video1, Video2, Video3]

Application Innovation Laboratory, Huawei; FastLab, Zhejiang University | 2023.2 - 2024.3

We enhanced VINS-Fusion with learning-based features and QR optimization for better accuracy and robustness on low-performance chips, and implemented fast occupancy grid map (OGM) updating with incremental inflation and a robot-centric ESDF for efficient MPCC-based collision-aware control.
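
The incremental inflation idea can be sketched as follows (a 2-D grid and circular footprint are simplifying assumptions): only the neighborhood of newly occupied cells is re-inflated, instead of re-processing the whole map.

```python
# Simplified sketch of incremental obstacle inflation on an occupancy grid.
import numpy as np

def inflate_new_obstacles(occ, inflated, new_occupied_cells, radius_cells):
    """occ / inflated: 2D uint8 grids; new_occupied_cells: list of (i, j) indices."""
    h, w = occ.shape
    r = radius_cells
    for (i, j) in new_occupied_cells:
        occ[i, j] = 1
        i0, i1 = max(0, i - r), min(h, i + r + 1)
        j0, j1 = max(0, j - r), min(w, j + r + 1)
        ii, jj = np.mgrid[i0:i1, j0:j1]
        mask = (ii - i) ** 2 + (jj - j) ** 2 <= r * r   # circular footprint
        inflated[i0:i1, j0:j1][mask] = 1                # touch only the local patch
    return occ, inflated
```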

SE3 Narrow Gap Crossing

SE(3) planning and control for a multicopter to cross narrow gaps

[Video]

FastLab, Zhejiang University | 2023.10 - 2023.12

We built safe flight corridors (SFC) and used MINCO for spatial-temporal trajectory optimization via L-BFGS, and achieved accurate narrow gap crossing by calibrating thrust mapping and tuning controllers for large-attitude tracking.

Pedestrian Following Demo

Pedestrian following and collision avoidance with spatial-temporal optimization for a differential-drive car

[Video]

Application Innovation Laboratory, Huawei; FastLab, Zhejiang University | 2023.2 - 2023.11

We developed a multi-level hybrid A* search for EKF-predicted pedestrian following, plus MINCO optimization for smooth trajectories, and ensured collision avoidance via lidar filtering and MPC control to handle kinematic constraints and communication delay.
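
The pedestrian prediction step is essentially a constant-velocity Kalman prediction, roughly like the sketch below (the process-noise model is an illustrative placeholder):

```python
# Minimal constant-velocity Kalman prediction for the followed pedestrian.
import numpy as np

def predict_pedestrian(x, P, dt, q=0.5):
    """x = [px, py, vx, vy], P = 4x4 covariance; returns predicted (x, P)."""
    F = np.array([[1, 0, dt, 0],
                  [0, 1, 0, dt],
                  [0, 0, 1,  0],
                  [0, 0, 0,  1]], dtype=float)
    Q = q * np.diag([dt**3 / 3, dt**3 / 3, dt, dt])   # simple process noise
    return F @ x, F @ P @ F.T + Q
```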

Self-balanced Racing Car

Self-balancing racing car with wireless charging capability

[Video]

School of Automation Science and Engineering, SCUT | 2020.01 - 2020.08

We designed adaptive PD wireless charging (≈30 W) for super-capacitors and tuned cascade controllers for stable track navigation, which helped us win 5th place nationally with a 23.8 s race time and successful passage through circles, slopes, and crossroads.
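
The cascade structure can be illustrated with a toy controller (gains are placeholders, not the tuned competition values): an outer speed loop commands a pitch setpoint, and an inner balance loop turns the pitch error into motor effort.

```python
# Schematic cascade controller for a self-balancing car; gains are placeholders.
class PD:
    def __init__(self, kp, kd):
        self.kp, self.kd, self.prev = kp, kd, 0.0
    def step(self, err, dt):
        d = (err - self.prev) / dt
        self.prev = err
        return self.kp * err + self.kd * d

speed_loop = PD(kp=0.8, kd=0.05)      # speed error -> desired pitch
balance_loop = PD(kp=25.0, kd=1.2)    # pitch error -> motor command

def control(v_ref, v_meas, pitch_meas, dt=0.005):
    pitch_ref = speed_loop.step(v_ref - v_meas, dt)
    return balance_loop.step(pitch_ref - pitch_meas, dt)
```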

Whole-body Pick-and-Place

Whole-body pick-and-place for a mobile manipulator

[Video]

Application Innovation Laboratory, Huawei | 2022.10 - 2022.11

We trained GG-CNN for 6-D object perception and used OSQP to solve a dynamically weighted QP for whole-body coordination, which enabled smooth pick-and-place by balancing manipulability, energy, and trajectory-tracking performance.
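
A condensed sketch of such a weighted whole-body QP, solved with OSQP's Python interface, is shown below (the dimensions, weights, and box-constraint formulation are illustrative assumptions, not the deployed controller):

```python
# Sketch of a weighted whole-body differential-IK QP solved with OSQP.
import numpy as np
import scipy.sparse as sp
import osqp

def solve_wholebody_qp(J, v_des, W_task, R_reg, qd_max):
    """J: (6, n) whole-body Jacobian; v_des: (6,) desired end-effector twist;
    W_task: (6, 6) task weights (can be re-weighted online);
    R_reg: (n, n) effort regularization; qd_max: (n,) velocity limits."""
    n = J.shape[1]
    P = sp.csc_matrix(2 * (J.T @ W_task @ J + R_reg))   # OSQP's 0.5*x'Px + q'x form
    q = -2 * J.T @ W_task @ v_des
    A = sp.eye(n, format="csc")                          # box constraints on velocities
    solver = osqp.OSQP()
    solver.setup(P=P, q=q, A=A, l=-qd_max, u=qd_max, verbose=False)
    return solver.solve().x                              # optimal joint/base velocities
```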

Honors & Awards

Internships

DJI Automotive
Intern, Planning and Control (PnC)
Decision and motion planning for autonomous vehicles
Supervisor: Dr. Yifan Tang, Dr. Zhepei Wang
Apr. 2024 - Nov. 2024





Huawei Technologies Co., Ltd.
Intern, Application Innovation Laboratory
Planning and Control for various robots
Supervisor: Dr. Chen Chen, Dr. Zehui Meng
Jun. 2022 - Apr. 2024





China-Singapore International Joint Research Institute
Intern, Robot Perception and Computer Vision Group
Multi-sensor calibration and 3D detection
Supervisor: Dr. Mingxing Wen
Jan. 2021 - Mar. 2021




Skills

Programming:

Python, MATLAB, C/C++, PyTorch, PyBullet, AirSim, Embedded Systems, Distributed Training, High-speed Model Inference

Robotics:

Whole-body Control, Perception and Mapping, Motion Planning, Convex Optimization, Trajectory Optimization, Admittance/Impedance Control, Gravity Compensation, Teleoperation

Embodied AI:

Data Preparation, Model Training and Evaluation, Sim2Real Transfer, Embodied Foundation Models, Vision-Language-Action Models, Multimodal Data Processing, Robot Benchmarking and Deployment

System Integration:

Robot System Design, Training Infrastructure, Simulation Evaluation, Real-world Evaluation, Model Deployment, Cross-embodiment Generalization

Robot Hardware Platforms:

Unitree G1, Unitree Go2, TurtleBot 4, Galaxea R1, Ark X5, Lift, AgileX, RealMan, AgiBot G2, Multicopter, Diablo, Franka, UR, Mobile Manipulator, Robotiq 2F-85, Vicon, Touch X, ATI sensors, STM32