Thanks to artificial intelligence, the era of video production has begun: The results are impressive

We have already seen artificial intelligence models that generate video, but Stability AI, the AI startup behind Stable Diffusion, is now entering the field and raising the quality considerably. Its new Stable Video Diffusion model, which stands out from its counterparts in quality and realism, lets users create a video from a single image.

Artificial intelligence now also produces videos

Stable Video Diffusion comes as two models: SVD and SVD-XT. The first, SVD, converts a still image into a 576×1024 video of 14 frames. SVD-XT uses the same architecture but raises the frame count to 25. Both can generate video at frame rates between 3 and 30 frames per second. The output appears to be on par with or better than that of Meta's latest video generation model, as well as that of Google and the AI startups Runway and Pika Labs.
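The frame counts and frame rates above determine how long a generated clip lasts: duration is simply the number of frames divided by the frame rate. The following is a minimal sketch of that arithmetic, not part of any Stability AI tool; the function name `clip_duration_seconds` is hypothetical and used here only for illustration.

```python
def clip_duration_seconds(num_frames: int, fps: float) -> float:
    """Return the playback length, in seconds, of num_frames frames shown at fps."""
    if num_frames <= 0 or fps <= 0:
        raise ValueError("num_frames and fps must be positive")
    return num_frames / fps


# 14 is the SVD frame count reported in the article; 3 and 30 are the
# two ends of the frame-rate range reported for both models.
SVD_FRAMES = 14

for fps in (3, 30):
    seconds = clip_duration_seconds(SVD_FRAMES, fps)
    print(f"SVD: {SVD_FRAMES} frames at {fps} fps -> {seconds:.2f} s")
```

The same function applies to SVD-XT by passing its higher frame count. Either way, the clips last only a few seconds at most, and well under a second at the fastest frame rate.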

For now, Stable Video Diffusion is available for research purposes only, not for real-world or commercial applications. Stability AI notes that prospective users can sign up for a waiting list for access, and it sees potential applications for the tool in advertising, education, entertainment and many other industries.

There are shortcomings

The examples shown so far appear to be of relatively high quality and comparable to competing systems. However, the model has some shortcomings, the company writes: it produces relatively short videos (less than 4 seconds), falls short of perfect photorealism, cannot do camera movement other than slow pans, cannot be controlled with text prompts, cannot produce legible text, and may not render people and faces properly.

On the training side, Stability AI says the model was trained on a dataset of millions of videos and then fine-tuned on a smaller dataset of a few hundred thousand to a million videos. Stability AI emphasizes that it used only publicly available videos, for research purposes.

Just as text-to-image AI tools developed rapidly and reached a photorealistic level, video-generating AI is likely to produce far more realistic content before long. All of this comes with risks: deepfakes, copyright infringement and other misuse. It is therefore essential that development proceeds within clear limits.
