OpenAI just released Sora 2, their text-to-video generation model that’s been in development for over a year since the original February 2024 announcement.
This isn’t a research preview anymore. It’s a full product launch with a standalone app, social feed, and a clear signal that OpenAI thinks AI-generated video is ready for everyday use.
## What Sora 2 can do
The core capability is simple: describe what you want to see, and Sora generates a video. But the improvements from version 1 to version 2 are substantial:
### Video quality and realism
- Better physics: Complex movements like gymnastics routines, dance sequences, and intricate anime battles look more natural
- Improved object permanence: Things stay consistent as the camera moves or objects interact
- Higher fidelity: More realistic lighting, textures, and motion
### Audio generation
This is the big new feature. Sora 2 doesn’t just create video; it generates synchronized audio to go with it:
- Dialogue: Characters can speak with lip-sync
- Sound effects: Footsteps, ambient noise, environmental sounds
- Music: Background scores that match the scene
### Length and control
- Videos up to 20 seconds (or longer with storyboard mode)
- Remix existing videos: Upload a video and have Sora modify it
- Storyboard mode: Chain multiple clips together for extended sequences (there’s a DIY stitching sketch after this list)
- Loop creation: Generate seamless video loops
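If you need to go past those limits without storyboard mode, you can also stitch downloaded clips together locally. Here’s a minimal Python sketch using ffmpeg’s concat demuxer; it assumes ffmpeg is installed and that the clips share the same codec, resolution, and frame rate (downloads generated with the same settings usually do, but that’s my assumption, not something the docs guarantee).

```python
# stitch_clips.py -- concatenate downloaded Sora clips into one longer video.
# Assumes ffmpeg is on PATH and all clips share codec/resolution/frame rate,
# so stream copy (-c copy) works without re-encoding.
import subprocess
import tempfile
from pathlib import Path

def stitch(clips: list[str], output: str = "stitched.mp4") -> None:
    # The concat demuxer reads a text file listing the inputs in order.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
        for clip in clips:
            f.write(f"file '{Path(clip).resolve()}'\n")
        list_path = f.name

    subprocess.run(
        ["ffmpeg", "-y", "-f", "concat", "-safe", "0",
         "-i", list_path, "-c", "copy", output],
        check=True,
    )

if __name__ == "__main__":
    stitch(["clip_01.mp4", "clip_02.mp4", "clip_03.mp4"])
```

If the clips don’t share identical encoding parameters, stream copy will fail and you’d need ffmpeg’s concat filter with a re-encode instead.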
## Where you can use it
### Sora.com
New standalone web platform where you create and explore videos. Three subscription tiers:
- Free: 500 credits per month, watermarked videos
- Plus ($20/month): 5,000 credits, some non-watermarked downloads
- Pro ($200/month): 10,000 credits, priority generation, higher resolution
### iOS app
Native mobile app for creating videos on your phone. An Android version is coming later.
### ChatGPT integration
Generate videos directly in ChatGPT Plus and Pro conversations.
## The Sora Feed
Here’s where it gets interesting. OpenAI built a social feed into Sora where users can:
- Discover videos other people have created
- Remix and extend existing creations
- Follow creators and see trending content
- Steer the algorithm based on your preferences
OpenAI’s philosophy document explains this as “creating a space to inspire creative participation, not passive consumption.”
They’re explicitly positioning it as different from TikTok or Instagram Reels. The feed is designed to show you what’s possible and encourage you to create, not just scroll.
My take: This is smart. Generative video is hard to understand in the abstract. Seeing what real people make helps you grasp the capabilities and limitations much faster than any feature list.
But it also means OpenAI is building a content platform, not just a tool. That comes with all the moderation, algorithmic, and social dynamics that make traditional platforms challenging.
## Safety and guardrails
The system card details the extensive safety work OpenAI did:
### Content restrictions
- No photorealistic people uploads: To prevent deepfakes, you can’t upload videos or images of real people
- Minor protection: Strict policies against generating content involving minors
- Violence and explicit content: Filters for graphic, sexual, or violent imagery
- Public figures: Restrictions on generating videos of identifiable celebrities or politicians
### Technical safeguards
- C2PA metadata: All videos include digital provenance information showing they were AI-generated (see the inspection sketch after this list)
- Watermarking: Free tier videos are watermarked; paid tiers can remove it but metadata remains
- Red teaming: Extensive testing with external experts to find edge cases and failure modes
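That provenance claim is something you can at least partially check yourself. The sketch below is not OpenAI’s tooling: it just shells out to exiftool (assumed to be installed) and searches the metadata dump for C2PA/JUMBF markers. It only tells you whether provenance metadata appears to be present; cryptographically verifying the manifest is a job for the official C2PA tools.

```python
# check_provenance.py -- quick look for C2PA provenance markers in a video.
# Assumes exiftool is installed and on PATH. This only checks whether
# C2PA/JUMBF-style metadata shows up at all; it does NOT verify signatures.
import subprocess
import sys

def has_c2pa_markers(path: str) -> bool:
    result = subprocess.run(
        ["exiftool", path], capture_output=True, text=True, check=True
    )
    dump = result.stdout.lower()
    return any(marker in dump for marker in ("c2pa", "jumbf", "contentauth"))

if __name__ == "__main__":
    video = sys.argv[1] if len(sys.argv) > 1 else "sora_clip.mp4"
    print("Provenance markers found" if has_c2pa_markers(video)
          else "No obvious provenance markers (they may still be present)")
```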
### Moderation approach
- Automated filters: Content screening before generation and after completion
- Human review: Flagged content reviewed by moderation teams
- User reporting: Community-driven flagging system
My take: The inability to upload photos of people is a blunt but necessary tool. Deepfakes are a real problem, and OpenAI is taking a conservative approach rather than trying to solve it with detection alone.
The trade-off is that legitimate use cases (animating family photos, creating personalized content) are blocked. That will frustrate some users, but it’s probably the right call for a public release.
## What’s impressive technically
Generating coherent video is much harder than generating images. Video has to maintain:
- Temporal consistency: Objects stay the same across frames
- Physical plausibility: Movement follows real-world physics
- Camera motion: Perspective shifts need to make sense
- Audio synchronization: Sound has to match what’s happening visually
Sora 2 handles all of this reasonably well based on the example videos OpenAI shared. Not perfectly, but well enough to be useful.
The fact that it generates synchronized audio alongside video is particularly notable. Most text-to-video systems generate silent clips and leave audio as a separate step.
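Temporal consistency is also the easiest of these properties to poke at yourself. Here’s a rough sketch (assuming opencv-python and numpy are installed) that measures the mean absolute difference between consecutive frames of a clip; sharp spikes often line up with flicker, objects popping in and out, or hard cuts. It’s a crude proxy, not how OpenAI evaluates the model.

```python
# frame_consistency.py -- crude temporal-consistency probe for a video clip.
# Computes mean absolute difference between consecutive grayscale frames;
# sudden spikes often correspond to flicker or objects popping in and out.
# Assumes opencv-python and numpy are installed.
import cv2
import numpy as np

def frame_diffs(path: str) -> list[float]:
    cap = cv2.VideoCapture(path)
    diffs, prev = [], None
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY).astype(np.float32)
        if prev is not None:
            diffs.append(float(np.mean(np.abs(gray - prev))))
        prev = gray
    cap.release()
    return diffs

if __name__ == "__main__":
    d = frame_diffs("sora_clip.mp4")
    if d:
        print(f"{len(d)} transitions, mean diff {np.mean(d):.2f}, "
              f"max diff {np.max(d):.2f}")
```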
## What’s still limited
From the documentation and examples:
- Length: The 20-second cap feels short for many creative projects
- Slow generation: Videos can take minutes to create
- Prompt sensitivity: Small wording changes can produce very different results
- Cost: Pro tier at $200/month is expensive for casual use
- Artifacts: Videos still have telltale signs of being AI-generated
## What this means for creators
Immediate use cases:
- Rapid prototyping: Visualize concepts quickly for pitches or storyboards
- Stock footage alternative: Generate specific scenes that are hard or expensive to shoot
- Social media content: Short-form video creation without filming
- Concept art: Visualize scenes for larger projects
Limitations:
- Quality isn’t production-ready for most professional video work
- Lack of control: Can’t specify exact camera angles, timing, or detailed actions
- Artifacts: AI-generated look is still visible in most outputs
My take: This is a tool for iteration and exploration, not for final output. Think of it like sketching versus finished illustration. It’s fast and flexible for testing ideas, but you probably won’t ship the raw Sora output for serious projects.
That said, the pace of improvement is striking. Compare this to what was possible two years ago, and it’s clear the gap is closing quickly.
## The bigger questions
What happens to traditional video production? Stock footage companies, small video agencies, and certain types of commercial work face real disruption.
Who owns AI-generated content? The legal framework is still unclear. OpenAI grants commercial rights, but derivative work and training data questions remain.
How do we verify authenticity? When anyone can generate realistic video, how do we know what’s real? C2PA metadata helps but isn’t a complete solution.
What about consent and likeness? Even with restrictions on uploading photos, models are trained on video data. Who consented to being in that training set?
## What comes next
OpenAI positioned this as the beginning, not the endpoint:
- Longer videos: Storyboard mode is the first step toward extended content
- Better control: More precise editing and direction capabilities
- API access: Developers will eventually be able to build on Sora
- Integration: Expect Sora to show up in more OpenAI products
The speed of improvement is what strikes me most. From research preview to shipping product in under two years. From silent videos to synchronized audio. From limited beta to public iOS app.
This isn’t the future of video. It’s the present. And it’s available now if you’re willing to pay $20/month.
Try it yourself: Head to sora.com to create your first video, or download the iOS app. Read the full system card for technical details and safety measures.