Abstract: This paper explores both the effectiveness (specifically in improving video consistency) and the computational burden of Contrastive Language-Image Pre-Training (CLIP) embeddings in video ...