Adobe is heading into AI/machine learning full steam ahead. A private-room NAB demo showed picture search and dialogue search powered by IBM Watson, and now this collaboration with Stanford organizes takes, matches them to lines of dialogue, recognizes voices, faces, emotions, and camera framing, and then builds dialogue-focused edits in any style desired.
What makes this exciting is that the system “understands” film idiom – that we start with a wide shot, for example – and that there are different styles of editing.
You can also choose leisurely or fast pacing, emphasize a certain character, intensify emotions, or keep shot types (like wide or closeup) consistent. Such idioms are generally used to tell the story the way the director intended.
The system then assembles a cut very quickly: a 71-second cut in just three seconds!
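To get a feel for how idiom-driven assembly could work, here is a toy sketch in Python: each line of dialogue has several candidate takes, each idiom becomes a small cost function, and a dynamic-programming pass picks the cheapest sequence of takes. This is purely illustrative – the function names, idiom weights, and style dictionary are my own invention, not the Stanford system's actual method.

```python
# Toy sketch of idiom-based take selection (all names and weights invented).
WIDE, MEDIUM, CLOSEUP = "wide", "medium", "closeup"

def idiom_cost(prev_shot, shot, line_index, style):
    """Score one candidate take against a few hypothetical editing idioms."""
    cost = 0.0
    # Idiom: open the scene on a wide shot.
    if line_index == 0 and shot != WIDE:
        cost += style.get("start_wide", 1.0)
    # Idiom: keep shot types consistent (penalize cutting between types).
    if prev_shot is not None and shot != prev_shot:
        cost += style.get("consistency", 0.5)
    # Idiom: intensify emotion by favoring closeups later in the scene.
    if style.get("intensify", 0.0) and shot != CLOSEUP:
        cost += style["intensify"] * line_index
    return cost

def assemble_cut(candidates, style):
    """Pick one take per dialogue line, minimizing total idiom cost.

    candidates[i] is the list of shot types available for line i.
    Dynamic programming: track the cheapest path ending in each shot type.
    """
    best = {None: (0.0, [])}  # shot -> (total cost, chosen takes so far)
    for i, takes in enumerate(candidates):
        new_best = {}
        for shot in takes:
            options = [(c + idiom_cost(p, shot, i, style), path + [shot])
                       for p, (c, path) in best.items()]
            new_best[shot] = min(options, key=lambda t: t[0])
        best = new_best
    return min(best.values(), key=lambda t: t[0])[1]
```

For example, with a style that rewards opening wide and then staying consistent, `assemble_cut([["wide", "closeup"], ["medium", "closeup"], ["closeup", "wide"]], {"start_wide": 2.0, "consistency": 0.5})` settles on a wide opening followed by closeups. The real system optimizes over far richer signals (faces, voices, emotion, framing), but the structure – candidate takes scored by idiom costs – is the same kind of search.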
We’re moving in this direction, and I don’t know how long it will be between a “collaboration with Stanford” and a shipping product, but regardless, these types of tools will increasingly affect our workflows. For now, there are limits:
The system only works for dialogue, not action or other types of sequences. It also has no way to judge the quality of the performance, or the naturalism and emotional beats in a take. Editors, producers, and directors still have to examine all the video that was shot, so AI is not going to take those jobs away anytime soon. However, it looks like it’s about ready to replace the assistant editors who organize all the material, or at least do a good chunk of their work.
I added the emphasis because – as Terence Curren and I have been discussing on The Terence and Philip Show – these types of job losses are somewhat inevitable. It also bears noting that this isn’t true ‘creativity’ – but then, 90% of what we do isn’t.