What? Sora's hit short film “Balloon Man” is “fake” too???
The latest revelation from the team of artists behind it has made waves: it turns out that the footage was not entirely generated by AI, and many of the visual effects had to be done by people in post-production.
Netizens now feel that they took OpenAI at its word while OpenAI was playing tricks behind the scenes:
“They were vague, hoping the audience would assume the short film was generated entirely by AI. Isn't that a bit dishonest?”
“This is not a video generated by artificial intelligence, but a video that uses some AI technology.”
Some netizens put it bluntly: this is misleading marketing!
Let’s take a closer look at what exactly is going on.
The workflow behind Sora's blockbuster short
Although OpenAI said from the start that short films like “Balloon Man” were produced by teams of artists, and that Sora had been opened only to artists, it never explained how the films were actually made.
Now Shy Kids, the team of artists behind “Balloon Man”, has revealed the details itself, including:
How to achieve video clip consistency
How they process the video footage generated by Sora
Limitations and post-processing of videos generated by Sora
Video consistency
The consistency of the protagonist's image in “Balloon Man” is astonishing.
But according to Patrick Cederberg (Lao Pa), who handles post-production on the Shy Kids team, achieving this kind of consistency takes more than writing prompts.
Sora does not provide tools for keeping a subject consistent across shots. In other words, even with an identical prompt, two runs will produce different results.
What the team did was describe the protagonist in as much detail as possible.
“Describing the character's clothing and the type of balloon is how we worked around the consistency problem, because Sora does not yet have the right features in place for this kind of control.”
Even so, the team still ran into all sorts of problems when generating footage with Sora.
For example, the prompt clearly stated that the balloon was yellow, yet the balloon could still come out red in the generated clip.
Video material processing
Beyond consistency, Lao Pa mentioned that Sora gives users some keyframe-like control over where actions fall on the timeline. However, this temporal control is not precise and does not guarantee the desired result.
Consider a shot that moves from the character's jeans all the way up to the balloon head.
People have to crop and pan the frame in post-production to get it, because Sora will not render such a move on its own: it always tends to keep the balloon head in focus.
Lao Pa also said the team ran into problems when writing prompts:
“OpenAI didn't consider how real filmmakers think before letting artists try Sora.”
Put simply, Sora has a limited understanding of cinematography terms such as tracking and panning. Lao Pa believes Sora is inferior to Runway in this regard.
It is worth mentioning that although Sora natively supports generating 1080p video, the footage Lao Pa and his teammates actually generated was all 480p. They then upscaled it in post-production with tools such as Topaz.
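To give a rough sense of how far the footage had to be upscaled, here is a minimal sketch of the arithmetic. The frame sizes (854x480 and 1920x1080) are assumptions based on common 16:9 dimensions, since the source only says “480p” and “1080p”, and the function names are made up for illustration. It says nothing about how Topaz works internally.

```python
def linear_upscale_factor(src_height: int, dst_height: int) -> float:
    """Factor by which each frame dimension grows during upscaling."""
    if src_height <= 0 or dst_height <= 0:
        raise ValueError("heights must be positive")
    return dst_height / src_height


def pixel_count_ratio(src: tuple[int, int], dst: tuple[int, int]) -> float:
    """Number of output pixels per source pixel (width * height ratio)."""
    return (dst[0] * dst[1]) / (src[0] * src[1])


# Assumed 16:9 frame sizes; the article only names "480p" and "1080p".
SRC_480P = (854, 480)
DST_1080P = (1920, 1080)

print(linear_upscale_factor(SRC_480P[1], DST_1080P[1]))
print(round(pixel_count_ratio(SRC_480P, DST_1080P), 2))
```

Under these assumptions each dimension grows by 2.25x, so the upscaler has to produce roughly five output pixels for every pixel Sora generated.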
As for generation speed, Lao Pa recalls that each render took about 10 to 20 minutes.
Video post-production
Next comes the part that drew the strongest reaction from netizens: post-production. As mentioned earlier, Sora cannot by itself keep things consistent across different clips.
Besides balloons that do not match the description, Sora also likes to add strange faces to the balloons on its own, perhaps because of its training data.
It may also generate a dummy head for the protagonist that is not needed at all.
Sora was also adamant that the balloon should have a string.
In short, all of these shots had to go into After Effects (AE) for post-processing.
In addition, although Shy Kids found that keywords such as “35mm film” are very useful and make Sora's output more consistent, the artists still had to color grade the final film and add grain and flicker so that the whole piece looks harmonious and unified.
Lao Pa also mentioned an interesting detail: Sora likes slow motion.
“I don't know why, but a lot of the shots look like they are playing at 0.5x or 0.75x speed.
So we had to speed up a lot of the footage so it didn't look like one big slow-motion project.”
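The retiming involved comes down to simple arithmetic: a clip that appears to play at 0.5x has to be sped up by 1 / 0.5 = 2x to look normal, and it gets shorter as a result. The sketch below illustrates this. The function names and the 10-second clip length are hypothetical, and this is not the team's actual After Effects workflow.

```python
def speed_up_factor(apparent_speed: float) -> float:
    """Playback multiplier needed to bring a slow-looking clip back to 1x."""
    if not 0 < apparent_speed <= 1:
        raise ValueError("apparent_speed must be in the range (0, 1]")
    return 1.0 / apparent_speed


def retimed_duration(duration_s: float, apparent_speed: float) -> float:
    """Clip length in seconds after speeding it up to normal motion."""
    return duration_s * apparent_speed


# Hypothetical 10-second clip, checked against both speeds quoted above.
for speed in (0.5, 0.75):
    factor = speed_up_factor(speed)
    length = retimed_duration(10.0, speed)
    print(f"{speed}x footage: speed up {factor:.2f}x, 10 s becomes {length:.1f} s")
```

One consequence is that every retimed clip yields less usable screen time than its generated length suggests.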
So how much of the footage generated by Sora ended up in the film? Lao Pa, who admits he is “very bad at mathematics”, estimates the ratio of generated footage to used footage at roughly 300:1.
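Taken at face value, a 300:1 ratio means that every second in the final cut stands for about 300 seconds of generated material. The sketch below works through that, assuming the ratio is measured by duration (the source does not say whether it counts seconds or clips) and using a purely hypothetical 60-second cut, since the source does not give the film's running time.

```python
def generated_seconds(final_cut_seconds: float, shoot_ratio: float = 300.0) -> float:
    """Estimated seconds of generated footage behind a cut of the given length."""
    if final_cut_seconds < 0 or shoot_ratio <= 0:
        raise ValueError("inputs must be non-negative and the ratio positive")
    return final_cut_seconds * shoot_ratio


# Hypothetical final cut length; not a figure from the source.
HYPOTHETICAL_CUT_S = 60.0

total_s = generated_seconds(HYPOTHETICAL_CUT_S)
print(f"{total_s:.0f} s of generated footage, about {total_s / 3600:.1f} hours")
```

Lao Pa's figure is an offhand estimate, so the result should be read as an order of magnitude and not as a measurement.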
As for audio, Sora cannot currently generate sound, so the team added the narration and music themselves.
Copyright
To avoid copyright infringement, OpenAI has placed some restrictions on Sora.
For example, if you write a prompt such as “35mm film, in a futuristic spaceship, a man approaches with a lightsaber”, Sora will refuse outright to generate it, because the scene is too similar to “Star Wars”.
Oh, and requests like an “Aronofsky-style shot” or a “Hitchcock zoom” are not allowed either.
Completed in 2 weeks by a team of 3 people
Whatever the uproar outside, the Shy Kids team itself was very satisfied with Sora's performance.
After all, it took the three of them only 1.5 to 2 weeks to make a short film as polished as “Balloon Man”.
The team believes that Sora still has plenty of room for improvement for professional film crews, but that for most people it is already impressive enough.
In Lao Pa’s own words:
“I think people should make Sora part of their workflow.
But if they don't want to have anything to do with AI, that's okay.”
Many netizens agree with this view and believe that video generation AI like Sora is a good complement to existing workflows.
Integrating such models into Adobe's software strikes them as a very good idea. But, as one netizen put it, “I'm tired of OpenAI's exquisite demo marketing.”
Other netizens are unhappy that so much human work sits behind these popular “AI-generated videos”: the artists put in hundreds of hours, yet the real value of that work is overshadowed by the AI.
So, what do you think about this?
Reference links:
(1) https://www.fxguide.com/fxfeatured/actually-using-sora/
(2) https://twitter.com/bilawalsidhu/status/1783544598259794046