
The 15th IEEE Image, Video, and Multidimensional Signal Processing Workshop (IVMSP 2026) recently took place at the Tsinghua University Shenzhen International Graduate School.

Dr. Long Chen, CTO of Foundation Models at Jiangxing Intelligence, delivered an invited keynote titled "Toward Physical AI: Challenges and Opportunities in Real-World Industrial Systems." Speaking to faculty, students, and industry leaders, he shared a core insight:

The AI race is shifting from competing on model capabilities to competing on physical system capabilities.

This perspective is grounded in eight years of systematic research and hands-on deployment in industrial settings by the Jiangxing Intelligence team. What follows are the key takeaways from his talk.

Dr. Long Chen — CTO of Foundation Models, Jiangxing Intelligence

Why Physical AI's Moment Is Now

Dr. Chen outlined five foundational layers that have come together in China, creating fertile ground for industrial physical AI to take root:

Application Layer: China's industrial robot fleet is more than 8 times the size of the United States' and has grown roughly 12-fold over the past decade. Higher scenario density means task and data feedback loops form faster.

Model Layer: Open-source and industry-specific models — including DeepSeek and Kimi — are catching up rapidly and becoming increasingly optimized for industrial deployment.

Infrastructure Layer: China operates over 4 million 5G base stations, accounting for more than 60% of the global total. This provides the connectivity backbone for on-site access, distributed inference, and multi-device coordination.

Chip Layer: While access to high-end training chips remains constrained, this limitation is actually pushing the industry toward more efficient model architectures, cloud-edge collaboration, and joint software-hardware optimization.

Energy Layer: China's power generation and installed capacity provide the long-term energy foundation needed for AI to move from cloud training to on-site deployment.

"When you stack these five layers together," Chen said, "physical AI in China isn't just a vision — it has real soil to grow in."

Building on this foundation, he identified three structural opportunities: supply-side maturity, closed-loop on-site learning, and efficiency-driven adoption. Jiangxing Intelligence's role, he explained, is to weave these advantages together into industrial physical AI systems that are deployable, replicable, and continuously improving.

The Full-Stack Approach to Industrial Physical AI

Industrial problems can't be solved with a single model. In his talk, Dr. Chen walked through Jiangxing Intelligence's fully in-house JX-Phi physical AI technology stack:

Layer 1: JX-Phi Brain

This is the system's decision-making core, and it has two components. The JX-Phi Foundation Model includes S-VLM (Spatial Vision-Language Model) for perception and understanding, and LT-VLA (Long-Horizon Vision-Language-Action Model) for perception and action. The JX-Phi Harness orchestrates one-brain-multi-body coordination and industrial procedure execution, so that task allocation, state synchronization, and anomaly response work seamlessly across multiple terminals.

"The hard part about industrial AI is this: many critical anomalies don't happen often, but when they do, they must be detected accurately and handled reliably," Chen explained. "We can't afford to wait for problems to occur in the field before we learn from them."

Layer 2: JX-Phi World

This is the data and infrastructure layer, built around two core capabilities: trustworthy on-site learning and world model simulation. It comprises two engines — AutoEdge and AutoWorld. AutoEdge handles real industrial data collection, cloud training, distributed inference, and model deployment. AutoWorld powers world model simulation and data generation, creating long-tail scenarios, risk situations, extreme operating conditions, and complex task workflows. Working together, these two engines turn real-world model feedback, operational experience, and virtual simulation into a flywheel for continuous model improvement.

"Physical AI isn't a standalone model," Chen emphasized. "It's an intelligent system that actually works in the field."

Four Key Technologies That Truly "Bring the Brain On-Site"

Dr. Chen also shared four key technical breakthroughs from Jiangxing Intelligence:

1. Making Industrial Scenes "Continuously Computable"

Real industrial sites are dynamic: people move, equipment creates occlusions, robots perform tasks, and networks fluctuate. A static 3D model quickly falls out of sync with reality. Using technologies such as TrackerSplat and SizeGS, Jiangxing Intelligence keeps 3D scenes continuously updated. SizeGS specifically addresses the compression and transmission of 3D content under weak network conditions, the core capability behind the CAGS paper recently accepted at SIGGRAPH 2026.

2. Letting AI "Make Mistakes" in Simulation First

In power, energy, and chemical industries, the cost of trial and error in the field is prohibitively high. Using world models, Jiangxing Intelligence has built a closed-loop training pipeline: real-world data feeds into models, world models generate long-tail faults, extreme conditions, and complex workflows, and robot policies go through trial, evaluation, and iteration in simulation before being deployed to real sites.
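The loop described above can be sketched as a toy program. This is only an illustration under loud assumptions: the policy is reduced to a single "skill" number, the world model is a random perturbation of logged conditions, and the names Scenario, generate_scenarios, evaluate, train_in_simulation, and the 0.95 deployment gate are all hypothetical and do not come from the talk.

```python
import random
from dataclasses import dataclass


@dataclass
class Scenario:
    name: str
    severity: float  # hypothetical scale: 0.0 (routine) to 1.0 (extreme)


def generate_scenarios(field_logs, n, rng):
    """Stand-in for a world model: turn logged conditions into harsher variants."""
    scenarios = []
    for i in range(n):
        base = rng.choice(field_logs)
        severity = min(1.0, base.severity + rng.uniform(0.2, 0.6))
        scenarios.append(Scenario(f"{base.name}-variant-{i}", severity))
    return scenarios


def evaluate(skill, scenarios):
    """Fraction of scenarios handled; the toy policy succeeds when skill >= severity."""
    handled = sum(1 for s in scenarios if skill >= s.severity)
    return handled / len(scenarios)


def train_in_simulation(field_logs, gate=0.95, max_rounds=50, seed=0):
    """Trial, evaluate, and iterate in simulation; release only past the gate."""
    rng = random.Random(seed)
    skill = 0.3   # toy scalar standing in for a robot policy
    score = 0.0
    for round_no in range(1, max_rounds + 1):
        scenarios = generate_scenarios(field_logs, 200, rng)  # simulated long tail
        score = evaluate(skill, scenarios)                    # evaluation
        if score >= gate:
            return {"deploy": True, "rounds": round_no, "score": score}
        skill = min(1.0, skill + 0.1)                         # toy "iteration" step
    return {"deploy": False, "rounds": max_rounds, "score": score}


logs = [Scenario("overheat", 0.2), Scenario("occlusion", 0.4)]
print(train_in_simulation(logs))
```

The point of the sketch is the ordering: the policy only reaches the deploy branch after it clears the gate on generated hard cases, so failures happen in simulation rather than on site.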

3. Multimodality: From "Spotting Anomalies" to "Understanding Anomalies"

Take photovoltaic defect detection as an example — the same type of defect can look very different depending on weather, equipment type, and shooting angle. By fusing infrared, visible light, 3D spatial information, and equipment status data, the model doesn't just "see" anomalies — it understands where they occur, why they happen, how risky they are, and what to do about them. The shift is fundamental: from defect recognition to defect comprehension.

4. One Brain, Multiple Bodies: Turning Plans into Action

In large industrial spaces, robots come in different types, with different capabilities and perspectives. Through VLA models and the one-brain-multi-body system, the central brain handles global planning (path planning, for instance), while on-site robots handle physical execution — navigation, meter reading, data transmission. The key insight: physical AI must do more than understand the scene. It must turn planning into action, action into feedback, and feedback into the next decision.
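The split between central planning and on-site execution can be illustrated with a small sketch. It assumes a much simpler setup than any real system: tasks are plain (name, required skill) pairs, every execution succeeds, and the classes Robot and CentralBrain and the least-loaded assignment rule are hypothetical, chosen only to show the plan, act, feedback cycle.

```python
from dataclasses import dataclass, field


@dataclass
class Robot:
    name: str
    skills: set
    queue: list = field(default_factory=list)

    def execute(self):
        """Run queued tasks and return one feedback record per task."""
        reports = [{"robot": self.name, "task": t, "ok": True} for t in self.queue]
        self.queue.clear()
        return reports


class CentralBrain:
    """Global planner: gives each task to the least-loaded capable robot."""

    def __init__(self, robots):
        self.robots = robots
        self.feedback = []

    def plan(self, tasks):
        unassigned = []
        for task, skill in tasks:
            capable = [r for r in self.robots if skill in r.skills]
            if not capable:
                unassigned.append(task)  # no robot can do this task
                continue
            min(capable, key=lambda r: len(r.queue)).queue.append(task)
        return unassigned

    def step(self, tasks):
        """Plan, let robots act, collect feedback for the next decision."""
        unassigned = self.plan(tasks)
        for robot in self.robots:
            self.feedback.extend(robot.execute())
        return unassigned


brain = CentralBrain([
    Robot("dog-1", {"navigate", "read_meter"}),
    Robot("drone-1", {"navigate", "aerial_scan"}),
])
leftover = brain.step([
    ("read gauge A", "read_meter"),
    ("scan roof", "aerial_scan"),
    ("weld joint", "weld"),
])
print(leftover)        # tasks nobody could take
print(brain.feedback)  # execution reports returned to the planner
```

In this toy version the feedback list and the leftover tasks are what the planner would consult before the next round, which mirrors the planning, action, feedback, decision chain described in the talk.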

Field Validation: Two Flagship Deployments

New Energy: Physical AI O&M System for Wind Farms

Jiangxing Intelligence's system delivers three capabilities at new energy sites: full-coverage station inspection, all-weather operation, and cluster-level replication. A full-station inspection takes just 2 days, compared with 30+ days for traditional manual inspection at the same scale. "The value of physical AI isn't just higher recognition accuracy," Chen noted. "It's the complete reengineering of the entire O&M workflow."
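As a quick check on what the inspection figures imply, the short calculation below takes 30 days as a lower bound for the manual process (the article says "30+") and compares it with the 2-day figure. The variable names are illustrative only.

```python
manual_days = 30  # lower bound: the article reports "30+ days"
system_days = 2

speedup = manual_days / system_days
time_saved = 1 - system_days / manual_days

print(f"at least {speedup:.0f}x faster")
print(f"at least {time_saved:.0%} less elapsed time")
```

Because 30 days is a floor, both numbers are conservative: a roughly 15-fold speed-up, or about 93% less elapsed time per full-station inspection.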

Power Grid: Physical AI Intelligent Inspection System for Substations

In the high-safety environment of substations, Jiangxing Intelligence has built a collaborative architecture of "central brain + embodied brain + controllable terminals," integrating cameras, drones, robot dogs, and more. Today, the system covers over 10,000 high-density inspection points per station, and a single inspection run lasts 4+ hours. Core algorithm accuracy reaches 99%, and overall operational accuracy averages 96%, improving both efficiency and safety.

Bringing the Brain to the Field

Concluding his talk, Dr. Chen shared:

"The end goal of physical AI isn't to make models better at answering questions. It's to put intelligent systems on the industrial frontline — reliably executing tasks, consistently creating value."

The next wave of AI's industrial value won't happen only on screens. It will happen in physical spaces, with real equipment, real tasks, and real productivity.
