Beyond Sora: The AI Video Landscape in 2024

Al
12/15/2024 • 12 min read

Photography didn't kill painting - it birthed impressionism. AI won't kill filmmaking - it'll birth something new. But great films need more than great tools - they need artists with something to say. The release of OpenAI's Sora has sparked both wonder and worry about AI's role in visual creation, but this moment is bigger than Sora and what it can do.

Watch the full video here

This revolution is obviously not about replacing artists but about further democratizing the ability to create. Headlines are focusing on Sora, but a vibrant ecosystem of tools - from open-source innovations to professional platforms - is already enabling artists to push creative boundaries at every scale. Let's explore how we got here, where we're going, and most importantly, how you can be (and why we need you to be!) part of shaping what comes next.

Generated With Sora

What Happened

OpenAI's Sora launch definitely made some noise last week, finally delivering on their February teaser that had people speculating all year. While sign-ups are currently open (though they've been toggling availability based on capacity), the bigger story is about how we got here and where we're heading.

Disclaimer: This isn't an exhaustive history - Jürgen Schmidhuber probably invented most of this in the '90s.

2021: The Spark

  • OpenAI's DALL·E showed us AI could create compelling images from text.
    • They didn't open source it (thanks, OPENai), but they did release part of it: CLIP, a model that connects images and text. It learned by looking at pairs of images and their associated text descriptions from the internet.
  • The open-source community, pushed forward by a few super-motivated people, made experiments like VQGAN+CLIP and Disco Diffusion that sparked an AI art community
    • Platforms like Google Colab and Discord helped bring artists, researchers, and tinkerers together to push boundaries.

2022: Stable Diffusion Changes Everything

  • Stability AI, a hot new startup with a colorful CEO, launched Stable Diffusion, the first widely used AI image generation model - a few times, in fact, and some launches were wonky (like when Runway leaked Stable Diffusion 1.5: Stability said to take it down… but then changed their mind)
    • This kicked off major debates about the ethics of training on images scraped from the internet and the ability to make deepfakes, and put Stability AI right in the middle of them - those discussions (and court cases) have yet to be resolved
  • Midjourney exploded, proving the commercial potential of productizing open-source models and making them better and easier to use - they would hit $200M in yearly revenue with just 40 employees the next year
  • The community started demanding more control: early tools like ControlNet emerged to guide structure or use other images to influence style, and platforms like CivitAI & Hugging Face hosted custom models of people, objects, and styles… many of them explicit, but we'll leave that alone
  • Towards the end of 2022, people (like me and my team working on NounsAI, who made one of the first dedicated AI video interfaces ever!) started realizing you could make AI VIDEO by having the model morph from one image to another with a technique called 'interpolation'
One of the first videos made on NounsAI
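The interpolation trick behind those early videos can be sketched in a few lines: take the latent vectors for two images, walk between them with spherical interpolation (slerp, which keeps intermediate points "on-distribution" better than a straight line), and decode each step into a frame. This is a minimal illustration using random stand-in latents, not any specific model's implementation - a real pipeline would decode each vector with a diffusion model's image decoder.

```python
import numpy as np

def slerp(v0, v1, t):
    """Spherical linear interpolation between two latent vectors."""
    v0n = v0 / np.linalg.norm(v0)
    v1n = v1 / np.linalg.norm(v1)
    dot = np.clip(np.dot(v0n, v1n), -1.0, 1.0)
    theta = np.arccos(dot)  # angle between the two latents
    if np.isclose(theta, 0.0):
        return (1 - t) * v0 + t * v1  # nearly parallel: fall back to lerp
    return (np.sin((1 - t) * theta) * v0 + np.sin(t * theta) * v1) / np.sin(theta)

# Stand-ins for the latents of a start and end image
start, end = np.random.randn(512), np.random.randn(512)

# Walk between them; each intermediate latent would be decoded into a frame
frames = [slerp(start, end, t) for t in np.linspace(0, 1, 24)]
```

Stitch the decoded frames together at, say, 12-24 fps and you get the morphing look of those first interpolation videos.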

2023: From Images to Motion

  • Runway released Gen-1, one of the first usable video generation models (closed source)
  • AnimateDiff emerged as one of the first major open-source video breakthroughs, stabilizing that interpolation
  • Kosinkadink, Matteo Spinelli, the Banodoco community led by Pom, and others started rapidly pushing boundaries, especially once these tools were supported in ComfyUI - more info in Pom's thread
  • Stability AI released a big new model, SDXL, but criticism was mounting (CSAM was found in their training dataset), and so was VC pressure to monetize

2024: The Rise of AI Video

AI Generated Videos from Hunyuan

And that's just video - we haven't even touched the explosion in AI audio, voice generation, 3D generation, or character animation. But those are stories for another time.

My Take

The A24 team who made Everything Everywhere All at Once showed exactly what's possible when you combine great tools with actual vision. Their budget was ~20x smaller than Avatar 2's, but using AI to help with editing they got some effects that were almost as good. That said, they didn't win Best Picture (over Avatar) because of their editing (though they won that too); they won because they also had the best Director, Original Screenplay, Actress, and Supporting Actor & Actress. AI just amplified their creative vision.


I admire Spencer Sterling, an artist turned AI researcher who, despite being self-taught, created one of the first AI video models. But he didn't stop there: now he's combining AI-generated visuals with lasers, all synced to music, for awesome live experiences. Why? He thinks it's cool and is passionate about it. Meanwhile, in China, creators are completely rethinking video storytelling. Their vertical short dramas started at under $15K per series and now average around $100K per series to produce, yet are out-earning traditional movies. They're redesigning narrative structure for phone viewing, using techniques like the '3-7-21 rule' to keep viewers hooked every 21 seconds.

Personally, I prefer slightly longer, more useful content, but I don't think studios that just use AI to cut costs will last long.
I'm optimistic that indie films & studios will explode.
Suddenly you don't need a massive budget to tell your story. We've got a whole spectrum now, from free open-source tools to premium platforms. We're seeing the same democratization in visual effects, editing, and animation that we saw with more affordable cameras & YouTube as a free platform. This could lead to completely new forms of art we haven't even imagined yet; but we need artists with actual vision to make something worth watching.

David Deutsch laid out ‘The Principle of Optimism’ in The Beginning of Infinity:

"All Evils Are Caused By Insufficient Knowledge"

What You Can Do Today

Premium Options ($30-200/month):

  • Adobe is rolling out AI video features across their suite (if you're already paying for Creative Cloud, keep an eye out)
  • Runway's been crushing it with features like Expand Video to change aspect ratios and their new keyframing prototype that lets you explore possibilities on a visual canvas
  • Luma Labs, Krea, and Hailuo each have their own strengths for different types of videos
  • Kling AI's motion brush gives you direct control over how things move

Pay Per Generation:
Want to try before committing to a subscription? Hugging Face, Replicate, Tost, and Fal.ai all host Hunyuan (probably the best open-source model right now) and other models. You get some free credits, then pay per generation - great for experimenting or occasional use.
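As a rough sketch of what pay-per-generation looks like in practice, here's a hypothetical call through Replicate's Python client. The model slug, the parameter names (`num_frames`, `fps`), and the `build_input` helper are all illustrative assumptions, not the documented Hunyuan API - check the model's page on the hosting platform for its real inputs.

```python
import os

def build_input(prompt, num_frames=73, fps=24):
    # Assemble the request payload; these parameter names are
    # illustrative assumptions, not the model's documented inputs.
    return {"prompt": prompt, "num_frames": num_frames, "fps": fps}

payload = build_input("a paper boat drifting down a rainy street, cinematic")

# Only hit the network if an API token is configured.
if os.environ.get("REPLICATE_API_TOKEN"):
    import replicate  # pip install replicate
    # Hypothetical model slug - verify the actual one on replicate.com
    output = replicate.run("tencent/hunyuan-video", input=payload)
    print(output)
```

Each run deducts from your credits, which is why these platforms are a good fit for experimenting before you commit to a subscription or local hardware.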

DIY (Free + Hardware):
Got a decent NVIDIA GPU or M1+ Mac? You can run these models yourself:

  • Pinokio: One-click installer for most AI tools (seriously, it's that easy)
  • ComfyUI: If you're really down to dive into the deep end, the new desktop app makes it way more approachable, and the node-based interface gives you total control once you learn it. If you're going in, join Banodoco's Discord - Pom shares updates twice a week and people are great about answering questions. It's an awesome place.

Don't let the technical stuff scare you - these are getting easier to use every day. Start with what matches your budget and comfort level, then work your way up as needed. The tools will keep evolving but having a real vision for what you want to create matters more than which one you pick.

A final note

Silicon Valley built the tools. Artists will define what they become. From A24-level productions to indie creators, the barriers keep dropping. Every week brings new options - some free, some paid, all getting easier to use.
But vision can't be generated. Neither can taste, or soul, or the drive to say something real. The next cultural shifts will come from people who bring both technical understanding and artistic instinct to the table.
Keep exploring. Support Artists. Make something worth watching.

I’m Al with AlxAI,

December 15th 2024

And that’s how I see it.