Meet Sora: OpenAI’s Latest Marvel Turning Text Into Video — What You Need to Know

This text-to-video model is definitely the one we’ve been waiting for!

Mrinal Walia
6 min read · Feb 16, 2024
Prompt: A young man at his 20s is sitting on a piece of cloud in the sky, reading a book.

Sora is an AI model capable of generating realistic and imaginative scenes from text instructions.

Sora has a wide range of potential applications: it could be used to create hyper-personalized ads and interactive stories for social media, among other things.

Announced on February 15, 2024, Sora is currently being shared with only a small group of early testers while OpenAI works to understand its potential dangers.

All videos on this page were generated directly by Sora, without modification, from the prompts shown.

Its capabilities

With Sora, you can create videos that are up to a minute in length, with exceptional visual quality and attention to detail.

Prompt: Extreme close up of a 24 year old woman’s eye blinking, standing in Marrakech during magic hour, cinematic film shot in 70mm, depth of field, vivid colors, cinematic

Plus, Sora sticks closely to your prompt, so your video turns out just the way you described it.

This amazing tool can help you create the most vivid and breathtaking scenes you can imagine!

Prompt: Drone view of waves crashing against the rugged cliffs along Big Sur’s garay point beach. The crashing blue waters create white-tipped waves, while the golden light of the setting sun illuminates the rocky shore. A small island with a lighthouse sits in the distance, and green shrubbery covers the cliff’s edge. The steep drop from the road down to the beach is a dramatic feat, with the cliff’s edges jutting out over the sea. This is a view that captures the raw beauty of the coast and the rugged landscape of the Pacific Coast Highway.

With Sora, you can add several characters, capture different motions, and even add minute details to both the subjects and background.

Believe it or not, Sora not only follows your prompts but also understands how these elements interact in the real world. It’s like having your own personal artist right at your fingertips!

Sora can bring your ideas to life with emotionally engaging characters and multiple shots that keep your viewers hooked.

Prompt: Animated scene features a close-up of a short fluffy monster kneeling beside a melting red candle. The art style is 3D and realistic, with a focus on lighting and texture. The mood of the painting is one of wonder and curiosity, as the monster gazes at the flame with wide eyes and open mouth. Its pose and expression convey a sense of innocence and playfulness, as if it is exploring the world around it for the first time. The use of warm colors and dramatic lighting further enhances the cozy atmosphere of the image.

And the best part?

It does all this while maintaining a consistent visual style that’s sure to leave a lasting impression on your audience.

The model is not perfect and has some limitations, too.

It can struggle to simulate the physics of complex scenes accurately and may not understand specific instances of cause and effect. For example, a person might take a bite out of a cookie, yet the cookie shows no bite mark afterward. It can also confuse left and right and have trouble following a specific camera path over time.

Prompt: Step-printing scene of a person running, cinematic film shot in 35mm. Weakness: Sora sometimes creates physically implausible motion.

Its safety

Before making Sora available in its products, OpenAI says it is taking several key safety steps.

It is working with red teamers, domain experts in fields such as misinformation, hateful content, and bias, who will be rigorously testing the model.

Inside an OpenAI product, a text classifier will check and reject text prompts that violate usage policies, such as requests for extreme violence, sexual content, hateful imagery, celebrity likenesses, or others’ intellectual property.

OpenAI has also built image classifiers that review the frames of every generated video to help ensure it adheres to usage policies before it is shown to the user.
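To make the two checks above concrete, here is a minimal Python sketch of a two-stage gate: first the prompt is screened, then every generated frame is screened. Everything in it is an assumption for illustration only. The names `BLOCKED_TERMS`, `prompt_allowed`, `video_allowed`, and `moderate` are hypothetical, the naive substring match stands in for a real text classifier, and none of this is OpenAI’s actual implementation or API.

```python
from typing import Callable, Iterable, Optional

# Hypothetical placeholder terms; the real policy rules are not published in this form.
BLOCKED_TERMS = frozenset({"extreme violence", "hateful imagery"})


def prompt_allowed(prompt: str, blocked_terms: frozenset = BLOCKED_TERMS) -> bool:
    """Stage 1: reject the prompt if it contains a blocked term (naive substring match)."""
    lowered = prompt.lower()
    return not any(term in lowered for term in blocked_terms)


def video_allowed(frames: Iterable[object], frame_is_safe: Callable[[object], bool]) -> bool:
    """Stage 2: accept the video only if the classifier accepts every frame."""
    return all(frame_is_safe(frame) for frame in frames)


def moderate(
    prompt: str,
    generate: Callable[[str], Iterable[object]],
    frame_is_safe: Callable[[object], bool],
) -> Optional[list]:
    """Run both stages; return the frames, or None if either stage rejects."""
    if not prompt_allowed(prompt):
        return None  # blocked before any video is generated
    frames = list(generate(prompt))
    return frames if video_allowed(frames, frame_is_safe) else None


# Stand-ins for a real generator and a real frame classifier.
fake_generate = lambda prompt: ["frame-1", "frame-2"]
fake_classifier = lambda frame: frame != "bad-frame"

print(moderate("A dog running on a beach", fake_generate, fake_classifier))
print(moderate("A scene of extreme violence", fake_generate, fake_classifier))
```

The point of the design is that the cheap text check runs first, so a rejected prompt never reaches the generator, while the frame check acts as a second line of defense for prompts that slip through.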

Furthermore, OpenAI plans to engage policymakers, educators, and artists around the world to understand their concerns and identify positive uses for this technology.

Despite thorough testing, OpenAI cannot predict every way people will use, or abuse, the technology. Learning from real-world use is crucial for developing safer AI systems over time.

Its research techniques

Just like GPT models, Sora is built on a transformer architecture, which offers exceptional scalability.

Prompt: A movie trailer featuring the adventures of the 30 year old space man wearing a red wool knitted motorcycle helmet, blue sky, salt desert, cinematic style, shot on 35mm film, vivid colors.

Sora can generate whole videos in one go or add to existing videos to lengthen them.

By processing many frames simultaneously, it can tackle the difficult task of ensuring continuity for subjects, even when they temporarily leave the frame.

Prompt: The camera rotates around a large stack of vintage televisions all showing different programs — 1950s sci-fi movies, horror movies, news, static, a 1970s sitcom, etc, set inside a large New York museum gallery.

Sora can create videos from just text instructions or take a still image and animate its contents with precision and attention to detail.

Moreover, the model can extend an existing video or interpolate missing frames, enhancing its versatility.
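If “interpolating missing frames” sounds abstract, here is a toy Python sketch of the general idea: given two known frames, compute the in-between ones. To be clear, this is plain linear blending between pixel values, not Sora’s method, which generates the missing frames with a learned model. The names `blend` and `interpolate_missing` and the tiny 2x2 grayscale frames are my own assumptions for illustration.

```python
# Toy illustration only: plain linear blending between two known frames.
# This is NOT how Sora works; it only shows what "filling in missing frames" means.

Frame = list[list[float]]  # a tiny grayscale frame: rows of pixel intensities


def blend(frame_a: Frame, frame_b: Frame, t: float) -> Frame:
    """Mix two frames pixel by pixel; t=0 returns frame_a, t=1 returns frame_b."""
    return [
        [(1 - t) * a + t * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(frame_a, frame_b)
    ]


def interpolate_missing(frame_a: Frame, frame_b: Frame, missing: int) -> list[Frame]:
    """Create `missing` evenly spaced in-between frames (the two endpoints are excluded)."""
    return [blend(frame_a, frame_b, i / (missing + 1)) for i in range(1, missing + 1)]


first = [[0.0, 0.0], [0.0, 0.0]]  # an all-black frame
last = [[1.0, 1.0], [1.0, 1.0]]   # an all-white frame

for frame in interpolate_missing(first, last, 3):
    print(frame)
```

Linear blending only cross-fades between the two frames, which is why it looks ghostly on real footage with motion. A generative model like Sora has to synthesize plausible motion instead.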

Prompt: A Samoyed and a Golden Retriever dog are playfully romping through a futuristic neon city at night. The neon lights emitted from the nearby buildings glistens off of their fur.

Ending notes

I am truly amazed by Sora’s capabilities!

Sora lays the groundwork for models that can understand and simulate the complexities of the real world, a capability OpenAI believes will be an important milestone on the way to AGI.

It’s incredible to see how scaling video models can pave the way for the development of advanced simulators capable of replicating both physical and digital worlds.

With the ability to simulate objects, animals, and people, the possibilities are endless!

I’m excited to see where this technology will take us in the future.


