OpenAI, following in the footsteps of startups like Runway and tech giants like Google and Meta, is getting into video generation. OpenAI today unveiled Sora, a generative AI model that creates video from text.

Given a brief (or detailed) description or a still image, Sora can generate 1080p movie-like scenes with multiple characters, different types of motion and background details, OpenAI claims. Sora can also “extend” existing video clips, doing its best to fill in the missing details.

“Sora has a deep understanding of language, enabling it to accurately interpret prompts and generate compelling characters that express vibrant emotions,” OpenAI writes in a blog post. “The model understands not only what the user has asked for in the prompt, but also how those things exist in the physical world.”

Now, there’s a lot of bombast in OpenAI’s demo page for Sora, the above statement being an example. But the cherry-picked samples from the model do look rather impressive, at least compared to the other text-to-video technologies we’ve seen.

For starters, Sora can generate videos in a range of styles (e.g., photorealistic, animated, black and white) up to a minute long, far longer than most text-to-video models. And these videos maintain reasonable coherence in the sense that they don’t always succumb to what I like to call “AI weirdness,” like objects moving in physically impossible directions.

Check out this tour of an art gallery, all generated by Sora (ignore the graininess, compression from my video-GIF conversion tool):

OpenAI, for all its superlatives, acknowledges the model isn’t perfect. It “may struggle with accurately simulating the physics of a complex scene, and may not understand specific instances of cause and effect,” the company writes. For example, a person might take a bite out of a cookie, but afterward, the cookie may not have a bite mark. The model may also confuse spatial details of a prompt, for example, mixing up left and right, and may struggle with precise descriptions of events that take place over time, like following a specific camera trajectory.

OpenAI’s very much positioning Sora as a research preview, revealing little about what data was used to train the model (short of ~10,000 hours of “high-quality” video) and refraining from making Sora generally available. Its rationale is the potential for abuse; OpenAI correctly points out that bad actors could misuse a model like Sora in myriad ways.

OpenAI says it’s working with experts to probe the model for exploits and building tools to detect whether a video was generated by Sora. The company also says that, should it choose to build the model into a public-facing product, it’ll ensure that provenance metadata is included in the generated outputs.

“We’ll be engaging policymakers, educators and artists around the world to understand their concerns and to identify positive use cases for this new technology,” OpenAI writes. “Despite extensive research and testing, we cannot predict all of the beneficial ways people will use our technology, nor all the ways people will abuse it.”