Skip to content
Cover van The Evolution of Video Models — with Dino.
THE SEAM · AFLEVERING

The Evolution of Video Models — with Dino.

Dino the creative dinosaur reluctantly acknowledges the lineage from Veo 2 to Omni. The article's four-lineage frame survives.

S0 · E611:35Synthetische AI-stemmen
  • Rutger
  • Dino
  • Saar
11:35 · RUTGER · DINO · SAAR · SYNTHETISCHE AI-STEMMEN
The Evolution of Video Models — with Dino.
00:00 / 11:35

Op deze site: The Evolution of Video Models.

TRANSCRIPT
Rutger
This is The Seam. I'm Rutger, and I've got the two people most likely to disagree with me in the room. To my left, the only art director in Rotterdam who still owns the good pencil — Dino.
Dino
Hm.
Rutger
And, all the way from a cancelled shoot, the Netherlands' own — Saar.
Saar
Hi! Quick question before we start — is this the one that goes on the radio, or the other radio?
Rutger
[laughs] It's a podcast, Saar.
Saar
So… the other radio. [to Dino] And you. You're the one who frightens me a little. Like a museum that talks.
Dino
I am a working professional. The museum is free on Wednesdays.
Rutger
[chuckles] Okay — before we start.
Rutger
Quick setup, in case you haven't read the piece. I wrote about the evolution of AI video models — Veo, Google's generative video tool — and the argument's a lineage. You can line up four generations and shoot the same eight-second scene through each one: Veo 2, the clean silent clip; Veo 3, sound born with the picture; Veo 3.1, in cheaper Fast and Lite gears; and Omni, which holds a reference and a face across cuts. Each step really was a step — and the strange part is the brief got shorter as the picture got better.
Saar
Ooh — I heard "shorter" and "better." That sounds like a good day at any shoot.
Rutger
It is. Dino — you read it. And I can already tell you've got an opinion.
Dino
I have read the title. That was sufficient.
Rutger
[chuckles] So no, then.
Dino
[cutting in] It's a fad.
Rutger
You haven't read it yet.
Dino
I don't have to read it. I can smell it. Every few years a new little machine arrives, and a man with a microphone tells me it changes everything. The fax. The website. The dancing app. They all changed everything, and yet here I am — still the only one in the building who knows what a baseline grid is.
Rutger
[a soft laugh] The dancing app being —
Dino
TikTok. I call it what it is. People dancing. Don't put a hashtag on me, Rutger — I've never touched one and I'm not starting on the radio.
Saar
I have nine million on the dancing app.
Dino
[a beat] I rest my case. I don't know which case. But I rest it.
Rutger
[chuckles] Let me actually walk you through the lineage, Dino — because that's the part I think is real. Four steps. First one — Veo 2. It made a clean eight-second clip. Looked nice. Silent. No sound at all. You had to lay audio over it like a silent film.
Dino
[grunts] A silent film. So — a step backward into nineteen twenty. Marvellous. Progress.
Rutger
Hold that thought. Step two — Veo 3. Same idea, but now the sound comes out of the model with the picture. Footsteps land on the footstep. Lips move on the word. The audio's baked in, not bolted on.
Dino
Hm.
Rutger
That "hm" was lower than the last one.
Dino
Same hm. The acoustics in here are a crime. You have no soft furnishings, Rutger. A man who plays sound for a living and owns no rug. It's like a chef with no salt.
Saar
[laughs] I have a rug from a film. Sheepskin. They let me keep it because I cried on it. Method.
Rutger
[chuckles] Step three — and this is the practical one — Veo 3.1, in two flavours. A Fast version, cheaper and quicker, for when you need forty options by lunch. And a Lite version, lighter still, for roughing things out. Same family, different gears. You stop paying for the Rolls when you're just popping to the shop.
Dino
[grudging] That part is not stupid.
Rutger
Sorry — say again, for the recording.
Dino
I said the gears are not stupid. A good studio always had the big camera and the cheap one. You don't light a test Polaroid the way you light the cover. That's not new thinking, Rutger. That's nineteen seventy-four. You've reinvented the Polaroid, put a vowel on it, and called it Lite.
Rutger
[laughs] Honestly — fair. Step four is the one that got me. Omni. You don't just describe a shot anymore — you can hand it a reference, turn it, extend it, keep a face consistent across cuts. It holds the thread shot to shot instead of forgetting between clips.
Saar
[perking up] Wait — a face. It keeps the face. [pause] Whose face?
Dino
[exhales] Here we go.
Saar
No — listen. If the little machine keeps a face across the cuts … it could keep MY face. Across cuts. In Rotterdam. While I'm in Ibiza. Can it do the eyebrow? The — [does the eyebrow] — critics have written paragraphs about that eyebrow.
Dino
She has been awake the entire time and absorbed precisely one word. And it was "face."
Rutger
[chuckles] It's a fair worry, Saar — just aimed at the wrong floor of the building. We'll come back to your face, I promise. Dino — be honest with me. You've been calling it a fad for ten minutes. But you saw the four clips before we sat down. The Veo 2 one, then the last one. Side by side. What did you actually think?
Dino
[long pause] I thought the last one read from across the room.
Rutger
Meaning.
Dino
It's the only test I trust. You pin a thing to the wall, you walk to the far end of the studio, you turn around. If the eye knows where to go — contrast, where the light sits, what's sharp and what's soft — it reads. Most things fail. Students fail it. Award winners fail it. The first clip, the silent one — I could see the seams from the door. The last one … I walked to the door. I turned around. And the eye went where it should. The hierarchy was correct. [a beat] I was furious.
Rutger
So the lineage is real.
Dino
The lineage is real. I am not pleased about it, but I am not a liar. Each one was better than the last in a way I can point at — not just feel. That's not a fad. A fad doesn't improve in a straight line. A fad gets louder. This got *better.* Those are different things, and only one of them is allowed in my studio.
Rutger
That might be the most you've ever conceded.
Dino
I conceded a fact. I have conceded nothing about whether I care — note the difference. The grid is still the grid. Contrast is still contrast. The machine has learned every rule I've known since nineteen ninety-four. It has not learned why a single one of them is a rule. It can pass the test. It cannot tell you why the test exists. That's not a craftsman. That's a very confident parrot.
Rutger
Hold that thought right there — it's the best place to stop — while we hear from the people quietly paying for the studio with the bad acoustics.
Dino
[flatly] A scent. On the radio. They have found a way to advertise a smell to people who cannot smell it. [a beat] I almost admire it. Almost.
Saar
I leaned in. I always lean in for perfume. Old habit.
Rutger
[laughs] Okay — back to it. Dino, you just said the machine can't tell you why the grid is the grid. Saar — go.
Saar
[brightly, jumping in] Oh — so what you're saying is — the computer learned the recipe but it can't taste the soup.
Rutger
…Saar. That's actually it.
Dino
[immediately] I find that deeply annoying.
Rutger
[chuckles] Why?
Dino
Because I've said it ten ways and she said it once, with soup. Next.
Rutger
[laughs] That's the whole piece, though. It's not "is the video good." It crossed that line — Dino just admitted it from the doorway. It's "what's the thing a person still brings." You can hand the model the reference and the turn and the consistent face, and it'll give you something that reads across the room. What it won't do is know which of the forty options is the one. That's still taste. That's still you, at the far wall, turning around.
Dino
I will allow that. Grudgingly. On worse coffee than I was promised.
Saar
[leaning in] No, but — my face, though. If it keeps the face — and nobody can tell the kept face from my face — then why do they need the — … what do I DO?
Rutger
Okay. Saar — under the vanity of it, you keep landing the real question. It's not "do I stay famous." It's "what does the person bring when the tool can make the picture." And the answer's the same as Dino's wall. The model can make the face. It can't decide it's the right take. The little eyebrow thing — the model can copy it. It can't know it's the one worth keeping. That's a person's job. Probably yours, annoyingly.
Saar
So I'm the soup.
Dino
[before she finishes] … You're the soup, Saar. For once in this conversation, somebody is the soup, and it's you. Beautiful work.
Saar
Beautiful. I'll allow it.
Rutger
So — the lineage. Veo 2, the clean silent clip. Veo 3, sound born with the picture. Veo 3.1, Fast and Lite, the gears for cost. Omni, holds the reference and the face across the cuts. Four steps, and each one really was a step. Dino — last word. Fad or not?
Dino
Not a fad. I want that on the record — because I will deny it at the agency on Monday. It's a real lineage and it really got better. It still can't taste the soup, it still doesn't know why the grid is the grid, and I still own the only good pencil in Rotterdam. But it improved in a straight line a man can point at. So — not a fad. A tool. [a beat] I hate that it's a tool. A tool is a thing you can be replaced by. A fad you just outlive. I was so comfortable with fad.
Rutger
[laughs] I'll take it. For the record — none of this is a Google position. It's my read, on my site, synthetic voices. Dino's a character with a very real opinion about rugs. Saar is —
Saar
Saar is available. For the right project. And the soup.
Rutger
That's our button. The lineage is real, the picture got good, and the part you can't automate is knowing which one's the one. Goodnight.