Sora's training data suspected exposed, netizens: UE5 was definitely used
Bai Jiao | Reporting from Aofei Temple
Qubit | Official account QbitAI
Good news, good news: new genuine Sora videos are here! Don't miss them!
(No more anxious waiting, and no more straining your eyes to tell real Sora clips from fakes.)
In just the past few hours, OpenAI staff, including the two Sora leads Bill Peebles and Tim Brooks, have been flooding the timeline with new clips.
(Okay, okay, we know you two are good friends.)
Not only do the new clips show off previously unseen multi-angle shots and new features, but most importantly, every video still looks stunning.
For example, diving to explore a shipwreck from a GoPro point of view.
Or the video below, whose look departs a bit from the photorealistic style of earlier clips.
What's more, Sora rendered the same scene from several different camera angles.
Its prompt: an elaborate diorama depicting a tranquil scene from Japan's Edo period. Traditional wooden building. A lone warrior, clad in intricate armor, walks slowly through the town.
Another eye-catcher is a little white dragon with big eyes, long eyelashes, and puffs of cold mist coming from its mouth, shown below:
Someone tried the same prompt on DALL·E 3, with the following result:
Yes, quite similar!
But the Sora white dragon has made one particular voice grow louder and louder, namely:
"Man, I can tell at a glance that this thing has the look of Unreal Engine about it!"
Still, this batch of videos left netizens marveling: why does each wave of Sora videos look better than the last?
Good grief, all anyone can do now is count down the days (and the meals) until Sora's open beta!
Some netizens are so eager that they have already staked out a listing for Sora's API on Product Hunt, the well-known product discovery platform.
Everything is ready; all that's missing is the east wind.
New official videos "leaked" again
First, let's take a look at the new Sora clips. The most amazing one this time is a glass turtle crawling along a beach at sunset.
However, some attentive netizens noticed: "I only see three legs..." "The two front legs look more like a sea turtle's flippers."
For comparison, the same prompt on Midjourney looks like this.
Multi-angle shots are another highlight of this batch.
For example, BASE jumping over Hawaii.
Prompt: a man BASE jumping over tropical hawaii waters. His pet macaw flies alongside him
There's even an F1 driver's perspective.
Sora also showed off a new editing-like capability: seamless transitions.
We already knew the model can be prompted with text, image, or video input.
Now it turns out it can also gradually interpolate between two input videos: two unrelated Sora clips end up blended into a single new video with a seamless transition (see the sketch below for the generic idea).
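OpenAI has not said how this transition is actually computed. A common way to think about it is as interpolation between the two clips in a model's latent space. The sketch below only illustrates that generic idea with made-up array shapes; nothing here is Sora's real representation or pipeline.

```python
import numpy as np

def blend_latents(latents_a: np.ndarray, latents_b: np.ndarray, n_frames: int) -> np.ndarray:
    """Hypothetical sketch: linearly interpolate, frame by frame, between the
    latent codes of two clips so that clip A gradually morphs into clip B.
    Shapes (n_frames, latent_dim) are assumed for illustration only."""
    # Interpolation weight runs from 0 (all A) to 1 (all B) across the transition.
    alphas = np.linspace(0.0, 1.0, n_frames)[:, None]
    return (1.0 - alphas) * latents_a + alphas * latents_b

# Toy usage: two random "clips" of 48 frames, 128-dim latent per frame.
rng = np.random.default_rng(0)
a = rng.normal(size=(48, 128))
b = rng.normal(size=(48, 128))
blended = blend_latents(a, b, n_frames=48)  # a decoder would turn this back into frames
print(blended.shape)  # (48, 128)
```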
Ahem, but why are there butterflies underwater??
Arbitrary aspect ratios can be generated too, as this new batch of videos also shows.
However, since all of these videos were posted by Sora team members, some netizens feel that until someone outside OpenAI gets to test it, Sora is just vaporware.
And among these examples, a few are considered flops...
Prompt: a dark neon rainforest aglow with fantastical fauna and animals
Netizens asked: why does it come out in a vector-animation style? There is nothing in the prompt suggesting that.
This is the worst example of Sora I've ever seen
"I'm no expert, but this definitely uses UE5"
Meanwhile, the focus of discussion around Sora-generated videos has gradually shifted from "this doesn't obey the laws of physics" to something deeper:
where the training data behind it came from.
The current mainstream folk theory (doge) is:
This was definitely trained with a 3D engine / UE5!
NVIDIA scientist Jim Fan, an old friend familiar to readers, speculated on day one that although Sora probably does not call UE5 explicitly at generation time, text-video pairs generated with UE5 were very likely treated as synthetic data and mixed into the training set.
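To make that speculation concrete: "mixed into the training set" would simply mean engine-rendered clips, whose captions are known by construction, get sampled alongside ordinary web video during training. Here is a toy sketch of such a mixed sampler; every path, caption, and ratio below is invented for illustration and is not OpenAI's actual pipeline.

```python
import random

# Hypothetical records: (caption, clip_path, source). None of these paths or
# captions are real; they only show what a mixed corpus could look like.
web_clips = [
    ("a man walks a dog on a beach at sunset", "web/000123.mp4", "web"),
]
synthetic_clips = [
    ("orbit shot of a samurai walking through an Edo-period street", "ue5/render_0001.mp4", "synthetic"),
]

def sample_training_example(synthetic_ratio: float = 0.3):
    """Draw one caption/clip pair, taking engine-rendered synthetic data with
    probability `synthetic_ratio` and real web video otherwise."""
    pool = synthetic_clips if random.random() < synthetic_ratio else web_clips
    return random.choice(pool)

for _ in range(3):
    print(sample_training_example())
```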
A former Google employee also offered a sharp online take on Sora's new videos:
I really think Sora's results require a combination of a 3D engine + generative AI to achieve this level of consistency and video quality.
So it turns out that more data and compute are needed...
Nor is this just the view of Jim Fan and a few others. As early as the first wave of Sora videos, this argument surfaced immediately, and it was already quite loud.
Here's another example.
A Twitter user who works in data science and ML listed the "evidence" for his position.
The example he pointed to was the video of strolling through streets lined with cherry blossoms.
His accompanying text read: "The people moving in the video look very much like the way humans move in UE5 demos. In real life, people walking around and shopping don't all move at a constant speed."
Others questioned this line of argument: after all, there are billions (perhaps more) of hours of video on the internet, YouTube included. Why use Unreal Engine and add to the workload?
So someone threw the "car driving" clip at the tweeter above, saying it didn't look like it was made with a 3D engine!
The guy gamely began his analysis:
"I'm no expert... but it feels like in UE the dust a moving car kicks up comes only from the rear wheels, whereas in reality the front wheels kick up dust too."
Of course, plenty of people agreed with him and chimed in:
It may not necessarily be UE5... but the fact is that digital-twin simulation may well be more effective and more efficient.
It also allows for higher-quality data sampling with less IRL data.
Some people even laid out their own guess at Sora's pipeline on Twitter.
After this discussion spread widely, many people scoffed at the possibility that Sora is the product of "UE5 + AIGC".
"Hmph! Let me put it here, synthetic data is a cheat code for visual machine learning!!"
At the same time, this discussion led some people to glimpse a near-future possibility.
Namely, future generation will not be produced by simulating real physics directly, but by training a model that imitates a physics simulation (i.e., the real world).
Well... let's just say, who can claim that's impossible?
One More Thing
After the new Sora videos dropped, netizens rushed to quiz the CEO of Runway, another leading player in AI video generation:
"Any plans to release a new version in the next few months? One with quality close to Sora~"
Runway's CEO replied coolly:
better
Reference link:
[1]https://twitter.com/minchoi/status/1761367515777695965
-over-