We are excited to release Pusa V1.0, a groundbreaking paradigm that leverages vectorized timestep adaptation (VTA) to enable fine-grained temporal control within a unified video diffusion framework.
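The core mechanism is easy to sketch: instead of conditioning the whole video on a single scalar diffusion timestep, VTA assigns each frame its own timestep, so different frames can sit at different noise levels within one forward pass. The toy denoiser, tensor shapes, and helper names below are illustrative assumptions rather than Pusa's actual architecture; only the per-frame timestep vector reflects the stated idea.

```python
import torch
import torch.nn as nn

class TinyVideoDenoiser(nn.Module):
    """Stand-in video denoiser that accepts one timestep *per frame*.
    Purely illustrative; real backbones are DiTs or 3D U-Nets."""
    def __init__(self, channels: int = 4, num_timesteps: int = 1000):
        super().__init__()
        self.t_embed = nn.Embedding(num_timesteps, channels)
        self.conv = nn.Conv3d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x, t):
        # x: (B, C, F, H, W) latents; t: (B, F) -- a timestep for each frame.
        emb = self.t_embed(t)                        # (B, F, C)
        emb = emb.permute(0, 2, 1)[..., None, None]  # (B, C, F, 1, 1)
        return self.conv(x + emb)                    # predicted noise

model = TinyVideoDenoiser()
B, C, F, H, W = 1, 4, 16, 32, 32
latents = torch.randn(B, C, F, H, W)

# Conventional diffusion: a single scalar timestep shared by every frame.
t_shared = torch.full((B, F), 999)

# Vectorized timesteps: e.g. hold the first frame clean (t = 0) as an
# image condition while the remaining frames are still fully noised.
t_vector = torch.cat([torch.zeros(B, 1, dtype=torch.long),
                      torch.full((B, F - 1), 999)], dim=1)

noise_pred = model(latents, t_vector)  # per-frame noise levels, one pass
```

Setting the first frame's timestep to 0, for instance, treats it as a clean conditioning image, which is one way a single vectorized-timestep model can cover image-to-video alongside plain text-to-video within a unified framework.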
Abstract: Zero-shot text-to-video diffusion models extend pre-trained image diffusion models to the video domain without any additional training. Recently, prevailing techniques ...
VideoGuide 🚀 enhances the temporal quality of video diffusion models without additional training or fine-tuning by leveraging a pretrained model as a guide. During inference, it uses a guiding model to ...
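The general shape of this kind of inference-time guidance — blending a guiding model's denoised estimate into the sampling model's trajectory during the early, structure-setting steps — can be sketched as follows. The DDIM-style update, blend weight, step cutoff, and model interfaces here are assumptions for illustration, not VideoGuide's exact formulation.

```python
import torch

def ddim_step(x_t, x0_pred, alpha_t, alpha_prev):
    """Deterministic DDIM update from a predicted clean sample."""
    eps = (x_t - alpha_t.sqrt() * x0_pred) / (1 - alpha_t).sqrt()
    return alpha_prev.sqrt() * x0_pred + (1 - alpha_prev).sqrt() * eps

@torch.no_grad()
def guided_sampling(base_model, guide_model, latents, alphas,
                    guide_steps=10, blend=0.5):
    """Blend the guide model's denoised estimate into the base model's
    trajectory for the first `guide_steps` steps, then let the base
    model finish alone. Models map (latents, alpha) -> clean estimate."""
    for i in range(len(alphas) - 1):
        a_t, a_prev = alphas[i], alphas[i + 1]
        x0 = base_model(latents, a_t)
        if i < guide_steps:
            # Early steps fix global structure and motion, so pulling
            # toward the guide here helps temporal consistency most.
            x0 = (1 - blend) * x0 + blend * guide_model(latents, a_t)
        latents = ddim_step(latents, x0, a_t, a_prev)
    return latents

# Toy usage with stand-in "models" (real ones would be full denoisers).
base = lambda x, a: 0.9 * x
guide = lambda x, a: 0.9 * x
alphas = torch.linspace(0.01, 0.99, 50)  # cumulative alphas, noise -> clean
video = guided_sampling(base, guide, torch.randn(1, 4, 16, 32, 32), alphas)
```

Because the blend only runs for the first few steps, the extra cost is a handful of additional forward passes through the guide, and no weights are updated anywhere.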
Abstract: Large-scale text-to-video diffusion models have shown outstanding capabilities. However, their direct application to video stylization is hindered by the limited availability of ...