Stability AI, one of the leading developers of generative AI alongside OpenAI, made a major announcement today: a new model called “Stable Video Diffusion”. Built on “Stable Diffusion”, the company’s text-to-image model, the new model first turns text into images and then turns those images into video.
Stable Video Diffusion is still in its early stages, so it is not available to everyone. However, some of Stability AI’s individual and commercial license holders can already try the new model. Meanwhile, the examples shared so far suggest that the technology is not bad at all.
Here are some sample videos produced with Stable Video Diffusion:
According to Stability AI, the new model comes in two variants: SVD and SVD-XT. SVD generates 14-frame videos at a resolution of 576×1024 pixels. SVD-XT, on the other hand, generates 25 frames per clip. With both variants, video can be generated at frame rates between 3 and 30 fps.
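To put those numbers in perspective, the length of a clip is simply its frame count divided by its frame rate. The short Python sketch below works this out for the 14-frame case. The function and constant names are made up for illustration and are not part of any Stability AI software; the 3 to 30 fps limits are taken from the figures quoted above.

```python
# Illustrative helper only; not part of any Stability AI library.
MIN_FPS = 3   # lowest frame rate quoted in the announcement
MAX_FPS = 30  # highest frame rate quoted in the announcement


def clip_duration_seconds(num_frames: int, fps: float) -> float:
    """Return the playback length, in seconds, of a clip with num_frames frames."""
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    if not MIN_FPS <= fps <= MAX_FPS:
        raise ValueError(f"fps must be between {MIN_FPS} and {MAX_FPS}")
    return num_frames / fps


if __name__ == "__main__":
    # Compare the slowest, a middle, and the fastest frame rate for 14 frames.
    for fps in (3, 7, 30):
        seconds = clip_duration_seconds(14, fps)
        print(f"14 frames at {fps} fps: {seconds:.2f} s")
```

In other words, a 14-frame clip lasts about 4.7 seconds at 3 fps and less than half a second at 30 fps, so the videos the model produces are very short either way.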
Stable Video Diffusion was trained on millions of videos. After training the model on this huge data set, the company retrained it in a second stage using approximately one million videos. This second round of training fine-tuned the output produced by Stable Video Diffusion. However, the company has not said exactly where it obtained the videos in its data sets. It says only that the training data was taken from royalty-free and public databases, and it is unclear how the data was collected.
According to Stability AI, the new model is ultimately intended for commercial use. The company says the technology will make work easier in sectors such as advertising, education and entertainment. However, the problems that individual use may cause should not be overlooked. After all, we all know the consequences of deepfake technology…
For this reason, Stable Video Diffusion comes with some restrictions meant to reduce the risks of individual use. According to the company, the model does not allow generated content to be rearranged. A person’s face may also fail to match the one described in the text. In addition, the model has trouble with videos that are largely static or that rely on slow camera pans. It is currently unclear whether these measures will be enough to protect consumers.