  • November 23, 2023
  • Hiba Moideen
Stability AI Unveils AI-Driven Video Generation with Stable Video Diffusion

While attention remains fixed on OpenAI's internal turmoil, other AI startups are pressing ahead with their product roadmaps. Stability AI has announced Stable Video Diffusion, an AI model that generates videos by animating existing images.

The model extends Stability's existing Stable Diffusion text-to-image model and joins the small pool of video-generating models available either as open source or commercially.

Stable Video Diffusion is currently in a "research preview" phase, accessible to those who agree to specific terms of use outlining its intended applications. However, considering the historical trajectory of AI research previews, there is a concern that the model could find its way onto the dark web, potentially leading to misuse due to the absence of a built-in content filter.

The Stable Video Diffusion release comprises two models: SVD and SVD-XT. SVD transforms still images into 576x1024 videos with 14 frames, while SVD-XT, which uses the same architecture, raises the frame count to 25. Both models can generate videos at frame rates from 3 to 30 frames per second.
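To put those numbers in perspective, clip length is simply the frame count divided by the frame rate. The short Python sketch below works through that arithmetic for the 14-frame SVD model at the slowest and fastest frame rates cited above. The function name `clip_duration_seconds` is a hypothetical helper introduced here for illustration and is not part of any Stability AI release.

```python
def clip_duration_seconds(num_frames: int, fps: float) -> float:
    """Return the playback length, in seconds, of num_frames shown at fps."""
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    if fps <= 0:
        raise ValueError("fps must be positive")
    return num_frames / fps


# SVD's 14 frames at the lowest and highest frame rates mentioned in the article.
for fps in (3, 30):
    seconds = clip_duration_seconds(14, fps)
    print(f"14 frames at {fps} fps: {seconds:.2f} s")
```

Under these assumptions, a 14-frame clip lasts just under five seconds at 3 frames per second and under half a second at 30, so the models produce very short clips at any supported frame rate.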

While Stability AI's blog showcases cherry-picked samples that appear competitive with outputs from Meta, Google, Runway, and Pika Labs, the company is open about the models' limitations. They may produce videos with little or no motion or only very slow camera pans, cannot be controlled by text, cannot render text legibly, and do not consistently generate faces and people accurately.

Stability AI also emphasizes that the models are extensible and could be adapted to use cases such as generating 360-degree views of objects.

SVD and SVD-XT were initially trained on a dataset of millions of videos and then fine-tuned on a smaller set of hundreds of thousands to around a million clips. The origin of this data raises potential copyright questions, with implications for the legal and ethical usage rights of Stability and its users.

Despite these considerations, Stability AI has ambitious plans for Stable Video Diffusion and aims to introduce additional models that build on SVD and SVD-XT. The startup also envisions a "text-to-video" tool that brings text prompting to the models on the web, with commercial applications in advertising, education, entertainment, and beyond.

Stability AI, which recently raised $25 million through a convertible note, faces financial challenges and executive turnover, yet maintains a vision of evolving its AI capabilities and expanding its market impact.