Imagine Before You Predict: Interleaved Latent Visual Reasoning for Video Event Prediction
Paper • 2606.05769 • Published • 5
Computer Vision
RIVER: A Real-Time Interaction Benchmark for Video LLMs