Abstract: Pre-training a vision-language model and then fine-tuning it on downstream tasks has become a popular paradigm. However, pre-trained vision-language models with the Transformer architecture ...