Peiwen Zhang*, Yufan Deng*, Shangkun Sun*, Juncheng Ma, Duomin Wang†, Jonas Du, Zilin Pan, Ye Huang, Hao Liang, Songyan Huang, Ruihua Zhang, Enze Xie†, Ming-Yu Liu, Daquan Zhou†‡
Peking University · NVIDIA
*Equal Contribution †Co-Project Lead ‡Corresponding Author
PhysisForcing is a training-time, plug-and-play framework that makes robotic video generation physically plausible. It focuses supervision on interaction-critical regions and aligns generation at two levels (a pixel-level trajectory loss and a semantic-level relational loss) on an intermediate DiT feature. It drops into existing video backbones (Wan, Cosmos) with zero extra inference cost. It ranks first on R-Bench, PAI-Bench, and EZS-Bench, and, used as a world model, it lifts the closed-loop success rate of the WorldArena action planner from 16.0% to 24.0%.
PF-Cosmos = Cosmos3-Nano + PhysisForcing | PF-Wan = Wan + PhysisForcing
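To make the two-level alignment described above more concrete, here is a minimal NumPy sketch of how a region-masked trajectory term and a relational term could be added to a base diffusion loss. It is not the paper's formulation. The function names, tensor shapes, the cosine-similarity form of the relational term, and the weights `w_traj` and `w_rel` are all assumptions made for illustration. A real implementation would operate on DiT features inside a training framework.

```python
import numpy as np


def masked_trajectory_loss(pred_tracks, ref_tracks, mask):
    """Mask-weighted mean squared error between two sets of point tracks.

    pred_tracks, ref_tracks: arrays of shape (T, N, 2), the (x, y) positions
        of N points over T frames (hypothetical layout).
    mask: array of shape (T, N) with weights in [0, 1]; a weight of 1 marks
        a point assumed to lie in an interaction-critical region.
    """
    sq_err = np.sum((pred_tracks - ref_tracks) ** 2, axis=-1)  # (T, N)
    return float(np.sum(mask * sq_err) / max(np.sum(mask), 1e-8))


def relational_loss(student_feats, teacher_feats):
    """Mean squared difference between pairwise cosine-similarity matrices.

    student_feats: (N, D_s) and teacher_feats: (N, D_t). The feature widths
    may differ because only the N x N similarity structure is compared.
    """
    def cosine_gram(x):
        x = x / (np.linalg.norm(x, axis=-1, keepdims=True) + 1e-8)
        return x @ x.T

    diff = cosine_gram(student_feats) - cosine_gram(teacher_feats)
    return float(np.mean(diff ** 2))


def total_loss(diffusion_loss, traj_loss, rel_loss, w_traj=0.1, w_rel=0.1):
    """Weighted sum of the base loss and the two auxiliary terms.

    The default weights are placeholders, not values from the paper.
    """
    return diffusion_loss + w_traj * traj_loss + w_rel * rel_loss


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pred = rng.normal(size=(8, 16, 2))
    ref = rng.normal(size=(8, 16, 2))
    mask = (rng.random(size=(8, 16)) > 0.5).astype(float)
    student = rng.normal(size=(16, 32))
    teacher = rng.normal(size=(16, 64))
    print(total_loss(1.0,
                     masked_trajectory_loss(pred, ref, mask),
                     relational_loss(student, teacher)))
```

Both auxiliary terms in this sketch are only needed during training, which is consistent with the zero extra inference cost stated above.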
[2026.06.29] 🚀 Code and model weights will be open-sourced within one week (expected: 2026.07.06).
[2026.06.29] 🔥 We release the arXiv paper and project page of PhysisForcing.
- Inference code & checkpoints for PF-Wan & PF-Cosmos
- Training code & auxiliary model checkpoints
The video below is a compressed preview. Full HD demos and the complete set of side-by-side qualitative comparisons (playable videos across embodiments and tasks) are best viewed on the Project Page.
demo.mp4
All scores are normalized percentages (higher is better, only the Avg. column shown). Bold = PhysisForcing variants (PF-Wan14B/PF-Wan5B on Wan, PF-Cosmos on Cosmos3-Nano). The first three are embodied video-generation benchmarks; the rightmost is the WorldArena Action Planner world-model evaluation (IDM closed-loop success rate).
Full per-metric tables and ablations are in the paper.
git clone https://github.com/DAGroup-PKU/PhysisForcing.git
cd PhysisForcing
conda create -n physisforcing python=3.10 -y
conda activate physisforcing
pip install torch==2.5.1 torchvision==0.20.1
pip install -r requirements.txt
🚧 Coming soon.
🚧 Coming soon.
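After running the installation commands above, a short script can confirm that the pinned versions were installed. The sketch below uses only the standard library. The pins are copied from the `pip install` line, while the `PINS` table and the `check_pins` helper are illustrative names, not part of the repository.

```python
from importlib import metadata

# Version pins copied from the install commands above.
PINS = {"torch": "2.5.1", "torchvision": "0.20.1"}


def check_pins(pins):
    """Return {package: (expected, found)} for every pin that is not met.

    `found` is None when the package is not installed at all.
    """
    problems = {}
    for name, expected in pins.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None
        # Wheels from the PyTorch index can carry a local tag such as
        # "+cu121", so compare only the part before the "+".
        if found is None or found.split("+")[0] != expected:
            problems[name] = (expected, found)
    return problems


if __name__ == "__main__":
    mismatches = check_pins(PINS)
    for name, (expected, found) in mismatches.items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
    if not mismatches:
        print("All pinned packages match.")
```

Run it inside the activated `physisforcing` environment so that it inspects the right interpreter.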
Our work builds on many excellent projects: Wan, Cosmos, CoTracker3, Depth-Anything-2, V-JEPA, and R-Bench / RoVid-X.
If you find PhysisForcing useful, please consider giving a ⭐ and citing:
@article{zhang2026physisforcing,
title={PhysisForcing: Physics Reinforced World Simulator for Robotic Manipulation},
author={Zhang, Peiwen and Deng, Yufan and Sun, Shangkun and Ma, Juncheng and
Wang, Duomin and Du, Jonas and Pan, Zilin and Huang, Ye and Liang, Hao and
Huang, Songyan and Zhang, Ruihua and Xie, Enze and Liu, Ming-Yu and Zhou, Daquan},
journal={arXiv preprint arXiv:2606.28128},
year={2026}
}
This project is released under the MIT License.
