This domestic AI "director" is seriously skilled: object motion and camera movement decoupled in short videos | CityU HK & Kuaishou & Tianjin University

Latest update time:2024-02-11
Xifeng, reporting from Aofei Temple
Qubit | Official account QbitAI

Kuaishou has moved into AI video, co-developing an intelligent "director".

Direct-a-Video successfully decouples object motion and camera movement in AI-generated videos, greatly improving flexibility and controllability!

Don't believe it? Take a look at some of the results.

Camera movement in the short videos follows the director's instructions exactly, with precise control over horizontal (X-axis), vertical (Y-axis), and zoom motion:

The AI director can also pull off a flashy move, panning the camera in a mixed horizontal-and-vertical direction:

Mixed horizontal-plus-zoom motion is also supported:

In addition, the director can require each "actor" in the video to move along a drawn bounding box:

This combines camera movement and actor movement in a single clip.

For example, while the big bear walks through space, the camera pans horizontally and vertically, producing the overall motion of the video:

Of course, the big bear can also be moved from one place to another by drawing a box with an arrow:

You can even control the movement paths of multiple "actors" at the same time:

This is Direct-a-Video, a text-to-video generation framework jointly proposed by researchers from City University of Hong Kong, Kuaishou Technology, and Tianjin University.

How is it done?

Specifically, Direct-a-Video is split into two stages —

camera movement control is learned in the training phase; object motion control is implemented in the inference phase.

To implement camera movement control, the researchers adopt the pre-trained ZeroScope text-to-video model as the base model and introduce a new trainable temporal self-attention layer (the camera module). Pan and zoom parameters are mapped by Fourier encoding and an MLP into embeddings, which are injected into this layer.
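
A rough sketch of this parameter-embedding step is shown below. The frequency count, embedding dimension, and the toy random-weight MLP are all illustrative assumptions, not values from the paper — in the real model the MLP is trained and the embedding feeds the temporal self-attention layer:

```python
import numpy as np

def fourier_encode(params, num_freqs=8):
    """Fourier-encode (pan_x, pan_y, zoom); num_freqs is an illustrative guess."""
    freqs = 2.0 ** np.arange(num_freqs)                    # frequency bands 1, 2, 4, ...
    x = np.asarray(params, dtype=float)[:, None] * freqs   # (3, num_freqs)
    return np.concatenate([np.sin(x), np.cos(x)]).ravel()  # (3 * 2 * num_freqs,)

def mlp_embed(feat, dim=320, seed=0):
    """Toy 2-layer MLP with random weights, standing in for the trained projection."""
    rng = np.random.default_rng(seed)
    w1 = rng.normal(0.0, 0.02, (feat.size, dim))
    w2 = rng.normal(0.0, 0.02, (dim, dim))
    h = np.maximum(feat @ w1, 0.0)                         # ReLU hidden layer
    return h @ w2                                          # (dim,) embedding to inject

emb = mlp_embed(fourier_encode([0.5, -0.2, 1.1]))
print(emb.shape)  # (320,)
```

Fourier encoding spreads each scalar camera parameter across multiple frequencies, which makes small differences in pan/zoom easier for the attention layer to distinguish than a raw 3-vector would be.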

The training strategy is self-supervised learning with data augmentation: the camera module is learned on limited data, without any manual motion labeling.

Data augmentation, in plain terms, means adding slightly modified versions of existing data, or synthesizing new data from it, to increase the amount of training data:
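
For intuition, one common way to synthesize labeled camera motion from ordinary footage — a hedged sketch; the paper's actual augmentation pipeline may differ — is to slide a fixed-size crop window across each frame, so the cropped clip appears to pan by a known amount:

```python
import numpy as np

def simulate_pan(frames, dx, dy, crop):
    """Create a pseudo camera pan by sliding a crop window across each frame.
    frames: (T, H, W) array; dx, dy: per-frame shift in pixels, which doubles
    as a free self-supervised motion label (no manual annotation needed)."""
    out = []
    for t in range(frames.shape[0]):
        x0 = int(round(t * dx))
        y0 = int(round(t * dy))
        out.append(frames[t, y0:y0 + crop, x0:x0 + crop])
    return np.stack(out)  # cropped clip whose content appears to pan

video = np.arange(8 * 64 * 64, dtype=np.float32).reshape(8, 64, 64)
clip = simulate_pan(video, dx=2.0, dy=1.0, crop=32)
print(clip.shape)  # (8, 32, 32)
```

Because the shift (dx, dy) is chosen by the augmentation itself, every synthesized clip comes with its camera parameters for free — which is what makes the training self-supervised.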

After self-supervised training, this module can parse camera motion parameters, enabling quantitative control.

For object motion control, no additional datasets or training are required. The user simply draws bounding boxes for the first and last frames, plus an intermediate trajectory, to define the object's motion.
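
A minimal sketch of turning those two boxes into a per-frame box sequence — assuming a straight-line trajectory for simplicity; the user-drawn intermediate path generalizes this:

```python
import numpy as np

def interpolate_boxes(box_start, box_end, num_frames):
    """Linearly interpolate a per-frame box between the user-drawn first and
    last boxes. Boxes are (x0, y0, x1, y1) in normalized image coordinates."""
    b0 = np.asarray(box_start, dtype=float)
    b1 = np.asarray(box_end, dtype=float)
    ts = np.linspace(0.0, 1.0, num_frames)[:, None]  # interpolation weights
    return (1 - ts) * b0 + ts * b1                   # (num_frames, 4)

boxes = interpolate_boxes([0.1, 0.4, 0.3, 0.6], [0.6, 0.4, 0.8, 0.6], 5)
print(boxes[2])  # middle frame's box is the midpoint of the two drawn boxes
```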

Put simply, pixel-wise self-attention amplification and suppression are applied directly at inference: the self-attention distribution of each object in each frame is modulated in stages over time, so that the object is generated at the positions specified by the user's sequence of boxes, achieving control over its motion trajectory.
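
The amplification/suppression idea can be sketched on a toy attention map. The scale factors and token count here are illustrative assumptions, not the paper's values:

```python
import numpy as np

def modulate_attention(attn, box_mask, boost=2.0, suppress=0.1):
    """Amplify attention toward tokens inside the user's box and suppress it
    elsewhere, then renormalize so each row is still a distribution.
    attn: (N, N) self-attention over N spatial tokens; box_mask: (N,) bool."""
    scale = np.where(box_mask, boost, suppress)   # amplify inside, damp outside
    out = attn * scale[None, :]                   # rescale attended-to tokens
    return out / out.sum(axis=-1, keepdims=True)  # renormalize rows

N = 16
attn = np.full((N, N), 1.0 / N)                   # start from uniform attention
mask = np.zeros(N, dtype=bool)
mask[:4] = True                                   # object box covers 4 tokens
new_attn = modulate_attention(attn, mask)
print(new_attn[0, :4].sum())  # attention mass is now concentrated inside the box
```

Sliding the box mask frame by frame — for example along an interpolated box sequence — biases generation so the object appears at each frame's box position, without any retraining.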

It is worth noting that camera movement control and object motion control are independent of each other, allowing separate or joint control.

How effective is Direct-a-Video?

The researchers verified the effectiveness of the method by comparing Direct-a-Video against multiple baselines.

Camera movement control evaluation

The comparison results between Direct-a-Video and AnimateDiff and VideoComposer are as follows:

Direct-a-Video outperforms the baselines in both generation quality and camera-movement control accuracy:

Object motion control evaluation

Direct-a-Video was compared with VideoComposer and Peekaboo to verify its control capability in multi-object motion scenes.

It outperforms VideoComposer in both generation quality and object-motion control accuracy:

Netizens who saw the results were impressed:

Beyond Runway, there is now another option.

PS:

Runway Gen-2's Motion Brush lets you paint over whatever you want to move, with adjustable parameters to control the direction of motion:

Reference links:
[1]https://x.com/dreamingtulpa/status/1756246867711561897?s=20

[2]https://arxiv.org/abs/2402.03162

-over-


