The maker of ChatGPT on Thursday unveiled its next leap into generative artificial intelligence with a tool that instantly makes short videos in response to written commands.
San Francisco-based OpenAI’s new text-to-video generator, called Sora, isn’t the first of its kind. Google, Meta and the startup Runway are among the other companies to have demonstrated similar technology.
But the high quality of videos displayed by OpenAI — some after CEO Sam Altman asked social media users to send in ideas for written prompts — astounded observers while also raising fears about the ethical and societal implications.
“An instructional cooking session for homemade gnocchi hosted by a grandmother social media influencer set in a rustic Tuscan country kitchen with cinematic lighting,” was a prompt suggested on X by a freelance photographer from New Hampshire. Altman responded a short time later with a realistic video that depicted what the prompt described.
The tool isn’t yet publicly available, and OpenAI has revealed limited information about how it was built. The company, which has been sued by some authors and The New York Times over its use of copyrighted works of writing to train ChatGPT, also hasn’t disclosed what imagery and video sources were used to train Sora. (OpenAI pays an undisclosed fee to The Associated Press to license its text news archive.)
OpenAI’s website displays several videos generated using the tool, including one that shows a group of woolly mammoths trotting through a mountain setting and another that depicts two pirate ships in a cup of coffee.
“Photorealistic closeup video of two pirate ships battling each other as they sail inside a cup of coffee,” reads the prompt for the latter video.
OpenAI said in a blog post that it’s engaging with artists, policymakers and others before releasing the new tool to the public.
“We are working with red teamers — domain experts in areas like misinformation, hateful content and bias — who will be adversarially testing the model,” the company said.
“We’re also building tools to help detect misleading content such as a detection classifier that can tell when a video was generated by Sora.”