OpenAI has announced innovative new software that can generate high-quality videos in response to a few simple text queries. This is a remarkable advance by the makers of ChatGPT and could take concerns about deepfakes and plagiarism of licensed content to a new level.
The technology, called Sora, uses its “deep understanding of language” to create clips up to one minute long with “compelling characters” and “multiple shots within a single generated video,” the company said on a webpage dedicated to the new technology.
“Sora can generate complex scenes with multiple characters, specific types of movement, and precise details of subjects and backgrounds,” OpenAI said. “The model understands not only what the user asks for in a prompt, but also how those things exist in the physical world.”
According to tech outlet Wired, the Sam Altman-led company provided a glimpse of Sora’s capabilities with some striking examples generated from prompts that could have been lifted from Hollywood scripts.
“Beautiful, snowy Tokyo city is bustling. The camera moves through the busy street, following several people enjoying the beautiful snowy weather and shopping at nearby stalls. Gorgeous sakura petals are flying through the wind along with snowflakes,” the prompt reads.
Sora transformed those three sentences into a vibrant 17-second video, well short of the tool’s one-minute limit. The clip showed a nondescript couple walking hand-in-hand down a snow-covered street lined with shops, with the Tokyo skyline in the distance.
The cherry blossoms were in full bloom while the sky was cloudy and snow was falling.
Although there were a few bugs, such as a dead-end sidewalk, overall it was a “surprising exercise in world-building,” Wired wrote.
“The current model has weaknesses. It may struggle with accurately simulating the physics of a complex scene, and may not understand specific instances of cause and effect,” OpenAI says.
“For example, if a person bites into a cookie, subsequent cookies may not have a bite mark.”
But another surprising example came from a prompt requesting “an animated scene of a short, fluffy monster kneeling next to a red candle” with “eyes wide open and mouth agape.”
The result was a mash-up of a Furby and a Gremlin: a cute creature worthy of Pixar’s Monsters, Inc. franchise. Sora rendered the character with deceptive ease, a job that would typically be time-consuming for experienced animators, raising concerns about the technology’s impact on the film industry.
A future enhancement will be the ability to generate video from still images, the company said.
“This would be another really great way to improve storytelling abilities,” Bill Peebles, a researcher on the project, told Wired.
“You can draw exactly what you have in your head and animate it to bring it to life.”
It wasn’t immediately clear when Sora would be made available to the public, or whether it would be free to use.
Representatives for OpenAI did not immediately respond to The Post’s request for comment.
For now, the software has been released to select creators and to security experts who will “red team” the product for safety issues.
Red teaming is the process by which a group poses as an enemy and attempts to physically or digitally infiltrate an organization.
Beyond the longer-term threat Sora’s generative power poses to Hollywood, there is a more immediate risk that its short-form videos could be used to spread misinformation, bias, and hate speech on popular social media platforms like Instagram Reels and TikTok.
The company has vowed to prevent its software from rendering violent scenes or deepfake pornography, such as the graphic images of a nude Taylor Swift that went viral last month.
And while OpenAI says Sora is not intended to appropriate the style of real people or famous artists, its use of “publicly available” content for AI training has already created legal headaches: the company faces copyright-infringement challenges from media companies, actors, and authors.
“Training data comes from content we have licensed as well as content that is publicly available,” the company said.
OpenAI said it is developing a tool that can identify whether a video was generated by Sora, a move aimed at allaying growing concerns about threats such as generative AI’s potential influence on the 2024 election.
The company, which has a “multiyear” $10 billion deal with Microsoft that expanded a partnership begun in 2019 with an initial $1 billion investment from the tech giant, assured that it has taken “some important safety measures” before making Sora available in OpenAI products.
Since the company released ChatGPT, which can convincingly imitate human writing, and DALL-E, which can fabricate realistic-looking images known as “deepfakes,” concerns about AI’s potential to interfere in elections have intensified.
Last May, Altman testified before Congress that he was “disturbed” by the ability of generative AI to undermine election integrity through “one-on-one interactive disinformation.”
The San Francisco-based company said it is working with the National Association of Secretaries of State, an organization focused on promoting effective democratic processes such as elections.
The company added that ChatGPT will direct users to CanIVote.org when asked certain election-related questions.
News of Sora’s future developments follows rival Meta’s move last year to enhance its image generation model Emu, adding two AI-based features that can edit and generate videos from text prompts.
Google and startups like Runway are also launching text-to-video AI projects.
With Post wires.