AI video generation is becoming a key node in the development of the AI industry, but the opportunity is not limited to generation alone. As the industry moves from AI generation to AI workflows, one-stop platforms that combine AI video generation, editing, and UGC creation are expected to become its core development direction.
Zhixin Finance APP learned that Cinda Securities has released a research report stating that AI video generation is becoming a key node in the development of the AI industry, and that the opportunity extends beyond generation itself. As the industry moves from AI generation to AI workflows, one-stop AI video generation, editing, and UGC creation is expected to become its core development direction.
The report first explains why AI + video merits study now. Technology and products are iterating so quickly that most existing market reports are already out of date, and they often lack hands-on product testing and side-by-side comparison using the same prompts. AI video generation has become a key node in the development of the AI industry. Video combines text, voice, and images, and training is difficult mainly because video data is insufficient in quantity and quality, algorithm architectures still need optimization, and generated content often fails to follow physical laws. Even so, as AI + video technology and products improve, many industries are expected to benefit, including film, advertising, video editing, video streaming platforms, UGC creation platforms, and comprehensive short-video platforms. AI + video is now at a key moment in its development.
The mainstream technology path for AI video generation has moved from early GAN + VAE approaches through Transformers and diffusion models to the DiT architecture (Transformer + Diffusion) used by Sora, and each iteration has brought a leap in video quality. VAEs introduce latent-variable inference and GANs generate sharp, realistic images, so combining VAE and GAN allows automatic data generation together with high-quality image generation. Transformers have strong advantages in parallel processing, long-sequence data, and multi-head attention, and their performance can be improved through pre-training and fine-tuning. Diffusion models are highly interpretable and can generate high-quality images and videos. WALT, a large video model developed jointly by Fei-Fei Li's team and Google, encodes images and videos into a shared latent space. The DiT architecture used by Sora has a Transformer process patches of image data in the latent space while modeling the diffusion process, which allows it to generate longer and higher-quality images and videos.
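To make the DiT idea more concrete, the following is a minimal sketch in plain Python of two generic steps described in the public literature on diffusion Transformers: cutting a latent grid into patch tokens, and the standard forward-diffusion noising step. It is not Sora's implementation, which has not been published in this report. The function names `patchify` and `add_noise`, the 4x4 toy latent, and the `alpha_bar` value are all illustrative assumptions.

```python
import math
import random


def patchify(latent, patch):
    """Split a 2D latent grid (list of rows) into flattened patch tokens.

    Hypothetical helper: a DiT-style model feeds such tokens to a Transformer.
    """
    height, width = len(latent), len(latent[0])
    if height % patch or width % patch:
        raise ValueError("latent size must be divisible by the patch size")
    tokens = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            tokens.append(
                [latent[top + i][left + j] for i in range(patch) for j in range(patch)]
            )
    return tokens


def add_noise(tokens, alpha_bar, rng):
    """Standard forward diffusion: x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * eps.

    alpha_bar near 1 keeps the signal; alpha_bar near 0 leaves almost pure noise.
    """
    signal = math.sqrt(alpha_bar)
    noise = math.sqrt(1.0 - alpha_bar)
    return [[signal * v + noise * rng.gauss(0.0, 1.0) for v in tok] for tok in tokens]


# Toy 4x4 "latent" standing in for an encoded video frame (assumed values).
latent = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
tokens = patchify(latent, patch=2)          # four tokens of four values each
noisy = add_noise(tokens, alpha_bar=0.5, rng=random.Random(0))
```

During training, a model of this kind would be asked to predict the added noise from the noisy tokens; generation then runs the process in reverse, starting from noise.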
Cinda Securities believes that domestic AI + video products are priced lower per video than overseas products. Runway Gen-3 Alpha and Kuaishou Kling are currently in the first echelon of global AI video generation, performing well across dimensions such as video resolution, generation speed, adherence of objects to physical rules, prompt understanding, and video length. The report surveys the core participants in the domestic and overseas AI video generation markets, including overseas players Luma AI (Dream Machine), Runway (Gen 1-2 and Gen-3 Alpha), Pika, and Sora, and domestic players Kuaishou Kling, Meitu, PixVerse, Jianying Jimeng, Zhipu Qingying, Tsinghua Vidu, and Qihuoshan Etna, focusing on their financing history, product iteration, core functions, and a comparative analysis of actual output. According to the report's calculations, the price of generating a single video with mainstream AI + video products is: Luma AI 0.16 USD (1.17 rmb), Pika 0.05 USD (0.364 rmb), Runway 0.48 USD (3.49 rmb), Kuaishou Kling 0.5 rmb, ByteDance Jianying Jimeng 0.04 rmb, Aishi Technology's PixVerse V2 0.02 USD (0.174 rmb), and Meitu WHEE 0.32 rmb. Overall, domestic AI + video products are more cost-effective for generating a single video.
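The price comparison can be checked with a few lines of arithmetic. The sketch below uses only the rmb figures quoted above; the domestic/overseas grouping follows the report's own list of participants, and the names `PRICES_RMB`, `mean_price`, and `cheapest` are illustrative.

```python
# Per-video prices in rmb, copied from the figures quoted in the report.
# The region labels follow the report's domestic/overseas grouping.
PRICES_RMB = {
    "Luma AI": ("overseas", 1.17),
    "Pika": ("overseas", 0.364),
    "Runway": ("overseas", 3.49),
    "Kuaishou Kling": ("domestic", 0.5),
    "ByteDance Jianying Jimeng": ("domestic", 0.04),
    "PixVerse V2": ("domestic", 0.174),
    "Meitu WHEE": ("domestic", 0.32),
}


def mean_price(region):
    """Average per-video price in rmb for one region."""
    values = [price for reg, price in PRICES_RMB.values() if reg == region]
    return sum(values) / len(values)


def cheapest():
    """Return (product, price) for the lowest quoted per-video price."""
    name = min(PRICES_RMB, key=lambda n: PRICES_RMB[n][1])
    return name, PRICES_RMB[name][1]


if __name__ == "__main__":
    for region in ("domestic", "overseas"):
        print(f"{region}: {mean_price(region):.3f} rmb per video on average")
    print("cheapest:", cheapest())
```

On these figures the domestic average is about 0.26 rmb per video against about 1.67 rmb for the overseas products. The gap is an average rather than a pairwise rule: Pika at 0.364 rmb is cheaper than Kuaishou Kling at 0.5 rmb.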
The opportunity is not limited to video generation. As the industry moves from AI generation to AI workflows, one-stop AI video generation, editing, and UGC creation is expected to become its core development direction. Today most AI + video output is used for creative content, with little direct ToB commercialization. The report gives two reasons. First, character consistency, achievable clip length, and picture quality do not yet meet the standard for immediate commercial use. Second, mainstream AI video tools are still competing on generation itself, and most are single-function products: detailed functions such as accurate prompt generation, clip editing, subtitles, script generation, transitions, and background music have not yet been integrated. As a result, producing a commercially usable video currently means combining several different tools, a tedious process that can run into compatibility problems between tools and inconveniences users. The report therefore recommends continued attention to companies that can offer one-stop video generation plus editing and related functions, understand user pain points, polish product details, and genuinely apply the technology to production work, entertainment, and other uses, since these companies have the greatest potential for commercialization and monetization.
With the arrival of the AI + video era, which types of companies are likely to have commercial potential? According to Cinda Securities, they include: 1) one-stop platform companies, such as Adobe and Meitu; 2) leading AI + video technology service providers that have shifted to a product model, such as Runway and SenseTime; 3) video editing companies, such as Kuaishou; 4) advertising and marketing companies, such as Yidian Tianxia, BlueFocus Intelligent Communications Group, Guangdong Insight Brand Marketing Group, and Leo Group Co.,Ltd.; 5) UGC community companies, such as Bilibili; 6) video data companies, such as Beijing Jetsen Technology, Zhejiang Huace Film & TV, Visual China Group, and TVZone Media; 7) IP companies, such as Shanghai Film, China Lit, Zhejiang Jinke Tom Culture Industry, Col Group Co.,Ltd., and Guomai Culture; 8) companies exploring AI video workflows and other creative directions, such as Bona Film Group, Super Telecom, and Ningmeng Film and Television; 9) other companies worth watching, including Maoyan Entertainment, Beijing Enlight Media, Mango Excellent Media, and Wanda Film Holding.
Risk factors: development of the underlying large AI models may fall short of expectations; AI video technology may iterate more slowly than expected; and the paid penetration rate of AI video products may not rise as expected.