LongDWM: Cross-Granularity Distillation for Building a Long-Term Driving World Model
Xiaodong Wang¹,², Zhirong Wu¹, Peixi Peng¹,²
¹ Peking University
² Peng Cheng Laboratory
1. Long Video Prediction Comparison with Vista
Videos in this section are 15 seconds long, at 8 Hz and 480×720 resolution.
[Videos: 6 side-by-side pairs, Ours vs. Vista]
2. Short Video Prediction Comparison with Vista
Videos in this section are 3 seconds long, at 8 Hz and 480×720 resolution.
[Videos: 8 side-by-side pairs, Ours vs. Vista]
3. Trajectory Controllability Comparison with Vista
Videos in this section are 2.5 seconds long, at 10 Hz and 480×720 resolution.
[Videos: 4 clips labelled Ours and 4 clips labelled Vista]
4. Additional Results of Trajectory Controllability
Videos in this section are 2.5 seconds long, at 10 Hz and 480×720 resolution.
[Videos: 4 groups, each with four clips labelled Normal, Greater, Smaller, and Opposite]
5. Longer Video Prediction Comparison with Vista
Videos in this section are 90 seconds long, at 8 Hz and 480×720 resolution.
[Videos: 3 clips labelled Ours and 3 clips labelled Vista]