Computer Science > Computer Vision and Pattern Recognition

arXiv:2606.26740 (cs)

[Submitted on 25 Jun 2026]

Title: LiveEdit: Towards Real-Time Diffusion-Based Streaming Video Editing

Authors: Xinyu Wang, Chongbo Zhao, Fangneng Zhan, Yue Ma

Abstract: Streaming video editing has made rapid progress, yet practical deployment is still limited by two core issues: maintaining stable backgrounds and non-edited regions over time, and achieving the low latency required for real-time interactive scenarios. Meanwhile, recent streaming video generation methods are mostly developed for synthesis and cannot be directly applied to editing due to the strict preservation requirement and region-specific control. In this work, we present a novel streaming video editing framework that performs causal, frame-by-frame editing with strong content preservation and real-time responsiveness. Our key design is a three-stage distillation pipeline that progressively transfers editing capability from a powerful bidirectional foundation model to an efficient unidirectional streaming editor, enabling stable long-horizon edits without sacrificing visual fidelity. To further support real-time deployment, we introduce an AR-oriented mask cache that reuses region-related computation across frames, substantially reducing redundant processing and accelerating inference. Finally, we establish a dedicated benchmark for streaming video editing. Extensive evaluations demonstrate that our method achieves state-of-the-art visual quality among streaming baselines while drastically boosting inference speed to 12.66 FPS, making it suitable for interactive and augmented reality applications.
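The abstract does not specify how the mask cache works, so the following is only a toy sketch of the general idea of reusing computation outside an edit mask across frames. It assumes, purely for illustration, that a frame is a flat list of patches, that patches outside the mask are unchanged between frames, and that per-patch work is independent. The names `MaskedRegionCache`, `compute_patch`, and `process_frame` are hypothetical and are not taken from the paper.

```python
from typing import Callable, Dict, List, Sequence


class MaskedRegionCache:
    """Toy per-patch cache (hypothetical, not the paper's mechanism).

    Patches inside the edit mask are recomputed every frame; patches
    outside it are computed once and then served from the cache.
    """

    def __init__(self, compute_patch: Callable[[int], int]) -> None:
        self.compute_patch = compute_patch
        self.cache: Dict[int, int] = {}
        self.recomputed = 0  # counts calls to compute_patch

    def process_frame(self, patches: Sequence[int],
                      edit_mask: Sequence[bool]) -> List[int]:
        out: List[int] = []
        for idx, (patch, in_edit) in enumerate(zip(patches, edit_mask)):
            if in_edit:
                # Edited region: always recompute, drop any stale entry.
                value = self.compute_patch(patch)
                self.recomputed += 1
                self.cache.pop(idx, None)
            elif idx in self.cache:
                # Non-edited region seen before: reuse cached result.
                value = self.cache[idx]
            else:
                # Non-edited region seen for the first time.
                value = self.compute_patch(patch)
                self.recomputed += 1
                self.cache[idx] = value
            out.append(value)
        return out


if __name__ == "__main__":
    cache = MaskedRegionCache(lambda p: p * 2)
    mask = [False, True, True, False]
    cache.process_frame([1, 2, 3, 4], mask)
    cache.process_frame([1, 5, 6, 4], mask)
    print(cache.recomputed)  # 6 patch computations instead of 8
```

In this toy setting the saving grows with the share of patches outside the mask; a real streaming editor would also have to handle background motion and mask changes, which the sketch ignores.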

Comments:	Accepted by ECCV 2026, Project page: this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2606.26740 [cs.CV]
	(or arXiv:2606.26740v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2606.26740

Submission history

From: Xinyu Wang
[v1] Thu, 25 Jun 2026 08:24:03 UTC (33,462 KB)
