
If you haven't seen it, Sora is the text-to-video model from OpenAI, and this, if you weren't studying it, would look like a major feature film. The traditional approach for rendering video is that you create three-dimensional objects, then you have a rendering engine that renders those objects, and then you have a system that defines where the camera goes. That's how you get the visuals you use to generate a 2D movie like this. This doesn't do that. This was a trained model. So how would you train a model to do this without having a 3D space? The compute necessary to define each of those objects and place them in 3D space is practically impossible today. My guess is that OpenAI used a tool like Unreal Engine 5, generated tons and tons of video content, tagged it, labeled it, and they were then able to use that to train this model that can, for whatever reason that we don't understand, do this.

You're referring to synthetic training data?

Exactly.
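
For concreteness, here is a minimal, hypothetical sketch of the synthetic-data idea being speculated about: render short clips from programmatically generated scene descriptions and keep each description as the clip's label, so the (video, caption) pairs come "for free." Every name in it (SceneSpec, render_clip, build_dataset) is invented for illustration; the engine render call is a stub, and none of this is OpenAI's or Epic's actual tooling.

```python
# Hypothetical sketch of a synthetic video-dataset pipeline, assuming a game
# engine (e.g. Unreal Engine 5) can batch-render clips from scene descriptions.
from dataclasses import dataclass
from pathlib import Path
import json
import random


@dataclass
class SceneSpec:
    objects: list[str]   # e.g. ["red car", "wet street"]
    camera_path: str     # e.g. "slow dolly forward"
    lighting: str        # e.g. "overcast afternoon"

    def caption(self) -> str:
        # The label is essentially free: we already know everything in the scene.
        return f"{', '.join(self.objects)}; camera: {self.camera_path}; lighting: {self.lighting}"


def render_clip(spec: SceneSpec, out_path: Path) -> None:
    """Placeholder for a real engine render job; writes an empty stand-in file."""
    out_path.write_bytes(b"")


def build_dataset(n_clips: int, out_dir: Path) -> None:
    """Generate random scene specs, render each one, and record (video, caption) pairs."""
    out_dir.mkdir(parents=True, exist_ok=True)
    index = []
    for i in range(n_clips):
        spec = SceneSpec(
            objects=random.sample(["red car", "wet street", "neon signs", "pedestrians"], k=2),
            camera_path=random.choice(["slow dolly forward", "orbit left", "static wide shot"]),
            lighting=random.choice(["overcast afternoon", "night", "golden hour"]),
        )
        clip_path = out_dir / f"clip_{i:06d}.mp4"
        render_clip(spec, clip_path)
        index.append({"video": clip_path.name, "caption": spec.caption()})
    # The index file is what a text-to-video training job would consume.
    (out_dir / "index.json").write_text(json.dumps(index, indent=2))


if __name__ == "__main__":
    build_dataset(n_clips=10, out_dir=Path("synthetic_clips"))
```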