OpenAI's Latest AI Model - Welcome Sora...πŸ‘‹

00:00:00: So, last week was a very exciting week.

00:00:03: By the way, welcome to the next episode of our podcast, because last week AI fought back

00:00:11: once again, you could say.

00:00:12: Yeah, I'm not sure if it's fighting back because it's basically battling us since the end of

00:00:16: 2022.

00:00:17: But you're right, last week, Sora was released, OpenAI's latest model, which

00:00:26: now also shows us that they absolutely dominate AI-generated video.

00:00:34: Because until Sora, it wasn't really a big thing, not too impressive.

00:00:38: But I guess Sora changed it all.

00:00:41: I think it did, yes.

00:00:42: And you already created a YouTube video about that.

00:00:45: That was last Friday, I think.

00:00:48: Yeah, right after it came out.

00:00:49: And then we thought, let's also talk about that a bit, because it affects us personally,

00:00:55: professionally, and also everybody on the planet.

00:00:58: What is your general opinion about this whole AI thing, or maybe specifically about Sora?

00:01:04: So how does this video or this text-to-video aspect, this option that now works

00:01:09: quite impressively, actually, change your view on AI, maybe?

00:01:14: Yeah, so I think we're getting fewer and fewer things that we can trust.

00:01:20: So to start with the downside right away, that's the negative impact

00:01:24: I immediately thought of when I saw it.

00:01:27: But at the same time, I also thought, well, that's really impressive.

00:01:31: If we look at it from a positive angle, it gives us a lot of new options.

00:01:37: For example, to kind of stick to an example from our bubble, it makes it easier to generate

00:01:43: B-roll or it essentially replaces or could replace the need for stock footage.

00:01:50: If you want to have some short clip in a video, some background video, that's something

00:01:56: Sora seems to be able to do just fine, so you no longer need to reach for expensive

00:02:04: and maybe not ideal stock footage.

00:02:07: So that's the positive side, I guess.

00:02:09: But at the same time, as I said, I also thought, wow, that opens up a lot of potentially bad

00:02:16: use cases.

00:02:18: And I'm rather on the negative side, I guess, to be honest, because for me, the development

00:02:25: is too fast.

00:02:26: It's really fast, yeah.

00:02:27: Because when I think about ChatGPT, which we heard about at the end of 2022, we were

00:02:33: quite late.

00:02:33: We were, but yeah, that was our bad.

00:02:37: And now we are a little bit more than one year later, and we can create videos,

00:02:41: one-minute videos only, but still, we, or rather OpenAI, can create these videos based on text prompts.

00:02:48: That's scary, in my opinion.

00:02:50: And yes, the trust side is one very important one that you mentioned.

00:02:54: Maybe I'm too old again, but I think about the work of other people, how it devalues

00:03:01: the work that people did.

00:03:03: Because if you needed B-roll, as you said, somebody had to record it, had to film it,

00:03:07: had to fly there, had to go there, had to plan it, had to turn the ideas he or she had

00:03:15: into reality by creating that scene.

00:03:17: And of course, as you said, now this can be done faster and in minutes, maybe.

00:03:21: We don't know how this evolves.

00:03:23: Maybe we can use it on our own in six months or something like that.

00:03:27: But this whole creativity that humans have and that helped humans to produce these amazing

00:03:33: videos, for example, now turns into simple computer-generated footage.

00:03:39: So I personally don't know how I should think about it at the moment.

00:03:44: Basically, for me it's more negative: impressive, but negative.

00:03:48: That's one of my points here to get started.

00:03:52: Still, I think this trust issue you mentioned is also a very big one because even now you

00:03:58: don't know what to believe on the Internet, whether it's text or images, and now you

00:04:03: can't trust videos anymore.

00:04:05: So basically, everything you see on the Internet has to be double-checked and you have to find

00:04:10: a way to double-check to make sure it really is the truth.

00:04:13: And I'm not sure how we or all these people using the Internet daily can find a way to

00:04:20: implement this double-check feature or whatever you want to call it.

00:04:25: Yeah, lots of good points.

00:04:27: I totally agree on the argument that it is a problem for all those people who created

00:04:36: the stock footage, for example.

00:04:39: And of course, stock footage is just one thing that could be replaced by Sora or by

00:04:45: text-to-video models like Sora.

00:04:48: It's absolutely valid.

00:04:50: I do think that at the same time – and I might be wrong here – but it looks like

00:04:55: these text-to-video models might not be able to replace Hollywood or movie studios in general

00:05:07: because, of course, it's one thing to replace some background footage, some clip where the

00:05:14: exact content might not be too important.

00:05:17: I mean, if you want, say, a dancing bunny, which was one of the examples OpenAI showed

00:05:22: on their website in a video generated by Sora.

00:05:26: If you need something like that and you want to have it as B-roll in your video, you might

00:05:31: not care too much about the exact details.

00:05:33: It might be fine if you have a prompt where you roughly describe the scene and that you

00:05:37: want a dancing bunny and some neon colors or something like this, but the details aren't

00:05:42: too important.

00:05:43: But of course, that would change if we were talking about a real movie.

00:05:46: That's, of course, way longer than a minute, and multiple clips would have

00:05:51: to be coherent and fit together.

00:05:53: And where we also have dialogue, which doesn't seem to be a thing with Sora at all.

00:05:59: So far we've only seen videos without people talking, without lip synchronization and

00:06:06: all these things.

00:06:07: So I think it's not the entire video industry that's in danger here, but, as it almost

00:06:16: always is with AI, I guess, some specific groups for whom this really could

00:06:23: be a big problem.

00:06:25: I got two points here.

00:06:27: The first one is that, of course, things have always changed.

00:06:32: And as you said, maybe certain jobs or certain tasks are replaced.

00:06:37: This has happened all the time throughout history, I guess.

00:06:40: Think about industrialization, where people probably thought, oh my God, I won't

00:06:45: have a job anymore.

00:06:46: And it turns out there is more work than ever that has to be done.

00:06:49: But the other point is the speed.

00:06:54: You mentioned it already, but you just said they can't do dialogues at the moment.

00:06:59: They can't replace Hollywood movies.

00:07:01: Hollywood movies are maybe also something different because you want to have the actors

00:07:04: and so on.

00:07:05: But I guess nobody knows where we are next year at the same point in time.

00:07:10: So next February.

00:07:12: And that's the maybe slightly scary thing that I see: at the moment, I would say

00:07:16: yes, this has some bad implications, but it can also be good.

00:07:20: But if it evolves that rapidly within the next 12 months, then we might talk about dialogues.

00:07:27: We might talk about 30-minute movies maybe.

00:07:30: And then things get really crazy, I guess.

00:07:32: Still I don't think that being scared or something like that helps in any way because

00:07:37: things will be as they are.

00:07:40: But yeah, I'm quite skeptical, as you see.

00:07:42: We also don't know how far they are already.

00:07:45: I'm not sure if OpenAI, for example, shows us all they can do because Sora just came

00:07:51: out of nothing, basically.

00:07:53: So maybe six months from now, the next thing comes out of nowhere.

00:07:57: So yeah, I don't know.

00:07:59: Yeah, it's of course possible, absolutely.

00:08:01: We don't know what we'll see in a year from now and what we'll see in two years and what

00:08:07: AI might be able to do then.

00:08:09: I totally agree.

00:08:10: I also agree that maybe if we're talking about big movies, it's a different thing there.

00:08:15: I would imagine it's also kind of hard to describe the complexity of a big movie in

00:08:23: a prompt, even if it could be a super long prompt, even if it could be an entire book.

00:08:28: I'm not sure.

00:08:29: But as I said, we don't know how things will turn out.

00:08:35: Of course, it's also worth noting that what we see with Sora, the videos we see, they

00:08:41: are not perfect.

00:08:43: They look pretty perfect, I have to say.

00:08:44: On first sight and even maybe the second time you look at one of those videos,

00:08:52: they look absolutely fine.

00:08:53: They really do.

00:08:55: Of course, there are all these people who like tell you, yeah, it's clearly AI generated,

00:08:59: but is it really that clear if you just take a brief look at it?

00:09:05: Sorry for interrupting, but especially the people who say that already know that this

00:09:10: is AI generated.

00:09:12: Exactly.

00:09:13: That's exactly the point I also wanted to make.

00:09:16: You can spot subtle errors like the hands of people, some movements which are a bit

00:09:23: strange.

00:09:24: Yeah, you can spot them because you know what you're looking at is AI generated and you're

00:09:30: looking for those hints.

00:09:32: But fast forward a year from now, let's say Sora or similar models are available to everyone

00:09:40: or to the majority or in five years, maybe we have models we can run on our own machines

00:09:45: or in our own data centers and we can achieve similar results.

00:09:49: And now let's think we have some bad actors.

00:09:52: We have some people who want to fake certain things, maybe with celebrities, maybe fake

00:09:59: some war videos, anything like that.

00:10:03: And you see that and you might not be prepared for the fact that it could be AI generated.

00:10:09: Of course, you know that there is a danger of that being the case, but we're coming from

00:10:14: a world where we take video for the truth, where we know if it's a video, it's very,

00:10:23: very likely not faked.

00:10:26: It might be staged or something like this, but it was recorded like that.

00:10:31: That is a relatively fair assumption.

00:10:33: It would take a lot of effort to really fake a video.

00:10:37: It's possible without AI, but it's basically a lot of work.

00:10:42: It's not easily done.

00:10:44: And we're coming from that world and adjusting to a different world where every video could

00:10:49: be fake and could be AI generated.

00:10:52: That'll be tricky.

00:10:54: There also is another side here, because we will, of course, probably have

00:11:00: the problem where we have to anticipate that videos might be AI generated and might not

00:11:08: show us the truth.

00:11:11: But at the same time, it could also mean that we don't trust any video sources anymore.

00:11:18: And of course, that means that if we have some media outlet, which we normally

00:11:23: would trust, let's say the New York Times or whatever.

00:11:29: And of course, there are reasons why you could mistrust them as well.

00:11:32: But let's say there are some media outlets whom we generally would trust.

00:11:37: And now they're showing a video of some, let's say, war crimes being committed by

00:11:42: whatever, anywhere in the world.

00:11:45: There will or there can be people in the future who say, yeah, that's fake.

00:11:50: That's not the reality.

00:11:52: So we have those two sides.

00:11:53: We have videos where we might see something horrible or might see some politicians say

00:11:59: something horrible and it's faked.

00:12:02: But it's hard to prove that it's fake.

00:12:04: But we also have the other side where we see something that happened but we have people

00:12:08: who can say, no, that didn't happen.

00:12:09: That is fake.

00:12:11: There might not really be a way of proving them wrong or it might be very difficult at

00:12:16: least because we're living in a world where video can't be trusted anymore.

00:12:20: And that's another big problem I see.

00:12:23: And that's really a problem because this means now we have another medium that people

00:12:28: can use to influence people regarding their agenda.

00:12:33: You know what I mean?

00:12:33: So if you think about the elections this year in the US, for example, I read that OpenAI

00:12:37: will do all they can to prevent any abuse of that technology.

00:12:41: And it's not publicly available yet.

00:12:43: So it might not matter for the elections this year, actually.

00:12:46: But considering this rapid development, and it's only February, who knows what happens in July,

00:12:51: for example.

00:12:51: Yeah, absolutely.

00:12:53: Who knows which people will have this technology, and it doesn't have to be OpenAI's technology.

00:12:57: Maybe other similar technologies will evolve in the next months.

00:13:02: So taking this into a global political context, it becomes really scary, I must say.

00:13:08: Absolutely.

00:13:08: Because your point is really valid.

00:13:09: No matter if you have a real source or a fake source, everybody can influence people or tell them,

00:13:16: no, don't trust this.

00:13:17: I know the truth.

00:13:18: This video has the truth, not this one.

00:13:20: But which one is fake in the end?

00:13:22: And this also brings me to my next point, actually.

00:13:25: How do we as humans, as tech people maybe, but also as ordinary people, deal

00:13:32: with that?

00:13:33: Because if I think about some friends of mine who are not in the tech industry, they are

00:13:38: not aware of all these things.

00:13:39: We read this on X, we see it on YouTube, whatever.

00:13:43: But many people have normal jobs, normal lives.

00:13:45: They have a mobile phone for WhatsApp, whatever.

00:13:48: But they don't read all these tech news.

00:13:50: They don't use AI at all in their daily lives.

00:13:53: So how should they be aware of the fact that the video they just watched might be fake?

00:13:58: I would guess that probably some techniques will evolve that can help with identifying AI-generated

00:14:07: videos.

00:14:08: But of course, as AI gets better, that might not really work.

00:14:13: I guess AI will always have the advantage here.

00:14:18: It's like the old problem.

00:14:19: The person who wants to validate something always has to react.

00:14:23: If you want to protect against something, you have to react.

00:14:26: And it's the other party, it's AI in this case, that's the actor, that's advancing and

00:14:32: innovating.

00:14:32: And therefore, it will really be hard.

00:14:36: And I don't know what the future will be there.

00:14:40: Maybe there are some technical hurdles which can't be overcome in certain areas,

00:14:47: so that creating super long videos is maybe still a long road before we get there.

00:14:54: So maybe that's the case.

00:14:56: But I wouldn't rely on that.

00:14:57: Maybe it becomes even more important who's spreading a video or who's publishing a video.

00:15:07: Maybe persons as trusted sources or publishers as trusted sources, maybe that becomes

00:15:15: more important, which of course means that people have to trust them in the first place.

00:15:19: But maybe that is one of the ways of establishing trust that you know, okay, if it's that outlet,

00:15:27: if it's that person that publishes a video, I can trust that person or that outlet.

00:15:33: But obviously, that's an open question and it's not the perfect

00:15:40: solution, I'd say.

00:15:41: And it also kind of runs contrary to the idea of the internet, right, that everybody can share their opinion.

00:15:49: And of course, if somebody has an opinion and wants to prove or make a point, then he

00:15:53: has to validate that point.

00:15:56: But this has become more difficult now with all this possible fake content that we have.

00:16:02: But still, we have to deal with that technology, I guess.

00:16:06: If we just deny it and say it won't be a big thing, maybe we are wrong.

00:16:10: But at the moment, it looks like AI won't be gone in a year.

00:16:14: It will be here and it will get bigger.

00:16:17: So we have to adapt, right?

00:16:19: And we have to find ways to adapt to this, as you said, by thinking about the sources

00:16:24: we trust, also being even more suspicious, maybe, if somebody tells you, hey, this is

00:16:29: the only truth and this is fake, so you have to do your homework, so to speak.

00:16:33: But blind trust was never a good idea in the past.

00:16:38: And it's an even worse idea in the future, I guess.

00:16:40: Absolutely.

00:16:40: So what this means in the end is that AI theoretically makes our lives easier, as you say, because

00:16:46: we can create B-roll easily, we can summarize text with ChatGPT, for example.

00:16:51: But in the end, it becomes more complex in other parts of our lives.

00:16:56: So once again, technology solves one problem but creates others.

00:17:01: So it's kind of funny, actually.

00:17:03: I guess it's always been like this, probably, but yeah, I totally agree.

00:17:09: And I'm not sure if it's a good or a bad thing that we have Sora and some other technologies.

00:17:16: Me neither.

00:17:16: So these are our thoughts after the Sora release last week, which also led to lots

00:17:23: of discussions between the two of us.

00:17:25: We would be happy to receive your feedback.

00:17:28: What do you think about AI, specifically about Sora in this case?

00:17:32: How does this impact your lives?

00:17:34: Maybe it doesn't have an impact on your life at all.

00:17:37: Maybe it's something you say, it's here, but I don't care.

00:17:40: And with that, we hope to see you in the next podcast episode.

00:17:44: Absolutely.

00:17:45: Bye.
