On the night of the Supermeet 2011 Final Cut Pro X preview I was told that this was the “foundation for the next 10 years.” Well, as of last week, seven of the ten have elapsed. I do not, for one minute, think that Apple intended to convey a ten-year limit to Final Cut Pro X’s ongoing development, but maybe it’s smart to plan obsolescence: to limit the time an app continues to be developed before its suitability for the task is re-evaluated.
Final Cut Pro X was the outcome of a multi-year process within Apple to “reimagine” what an NLE should be in a post-tape and post-film world. I understand it was not smooth sailing during the process. Ultimately, though, some very clever thinking went into the design (as you will see in Bradley Olsen’s Off the Tracks documentary, premiering in LA this week).
In the seven years since FCP X was released we’ve had another sea change that will be as important as the digital transition away from tape and celluloid: Artificial Intelligence or, more accurately, Machine Learning.
Adobe have already built some of Adobe Sensei (their branding for applied Machine Learning in their ecosystem) into recent releases of Premiere Pro CC for seemingly mundane things. While we’ve been trying to work out whether or not “AI” is going to replace editors, Adobe have shown how it can help editors with everyday tasks, like color or audio matching.
Which raised the question: how much can be integrated before the app falls apart? I don’t mean specifically Premiere Pro CC, but in general, how much intelligent assistance can be built into an NLE in a way that integrates with existing structures?
Now, FCP X isn’t badly positioned for smart metadata extraction. Thanks to the inclusion of Content Auto Analysis in the first release, the basic plumbing for analyzing content and returning keyword ranges is there and could, I think, quite easily be repurposed for a smarter Content Auto Analysis driven by Machine Learning models running on the GPU via Core ML. For a start it would be fast enough to make it useful!
I also have no doubt that engineers at Adobe, Blackmagic Design and even Avid could make things work, but would it be the right approach?
As I get an increasingly clear vision of how Machine Learning is first going to be integrated, I see it as more of a human/machine partnership. Trained Machine Learning models embody a certain (limited) expertise. Adobe’s color matching toolset appears way better than anything I would be capable of, but probably not better than what a professional colorist could do.
But if I can get 95% of the way with a button click (and a machine trained for the task), how many people are going to do it manually? (I will add that Adobe has full manual override of the results.) If I can automate ducking music under dialog, would I bother doing it manually?
Take this a little further. Taking some of the tools working in a lab setting now, and extrapolating a little, I can see a future tool that has computer vision to recognize the content of shots, identifying people and tagging the b-roll; it has speech to text integrated, with full keyword, concept, product, and entity identification; and most of the more technical processes have been automated under an intelligent assistant built into the app (and likely built on OS foundation technology).
With this (as yet imaginary) toolset the editor could (literally) ask for all shots where person X spoke about subject D in a timeline. You would be able to ask the assistant for other shots containing the person, or the concept, or the word, that you’re looking for.
You would be able to ask for matching b-roll that has open space to the right (because you want to put something graphic there).
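To make those two requests a little more concrete, here is a minimal sketch in Python of the kind of metadata such an assistant might query. Everything in it is hypothetical: the `Shot` and `KeywordRange` types, the `open_space` field and both query functions are names invented for illustration, not part of FCP X, Premiere Pro CC or any real API. It also simplifies "person X spoke about subject D" to "person X is tagged in the shot and the subject is a keyword range on it".

```python
from dataclasses import dataclass, field


@dataclass
class KeywordRange:
    """A keyword applied to a time range within a shot (seconds). Hypothetical type."""
    keyword: str
    start: float
    end: float


@dataclass
class Shot:
    """Illustrative metadata an analysis pass might attach to a shot."""
    name: str
    is_broll: bool = False
    people: set = field(default_factory=set)      # people identified in the shot
    ranges: list = field(default_factory=list)    # KeywordRange items from speech/vision
    open_space: set = field(default_factory=set)  # frame regions with room, e.g. {"right"}


def spoke_about(shots, person, subject):
    """Return (shot name, start, end) for ranges tagged `subject` in shots featuring `person`."""
    return [
        (shot.name, r.start, r.end)
        for shot in shots
        if person in shot.people
        for r in shot.ranges
        if r.keyword == subject
    ]


def broll_with_open_space(shots, side):
    """Return the names of b-roll shots that have open space on the given side."""
    return [s.name for s in shots if s.is_broll and side in s.open_space]


# Made-up sample data standing in for an analyzed library.
SHOTS = [
    Shot("interview_01", people={"Alex"},
         ranges=[KeywordRange("budget", 12.0, 31.5), KeywordRange("schedule", 40.0, 55.0)]),
    Shot("interview_02", people={"Sam"},
         ranges=[KeywordRange("budget", 5.0, 20.0)]),
    Shot("harbor_wide", is_broll=True, open_space={"right"}),
    Shot("street_close", is_broll=True, open_space={"left"}),
]

print(spoke_about(SHOTS, "Alex", "budget"))
print(broll_with_open_space(SHOTS, "right"))
```

The point of the sketch is that the query itself is trivial once the metadata exists; the hard part, and the part Machine Learning would supply, is filling in those fields automatically.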
With one click or request, the color would be matched across your timeline to the color setting you chose (or copied from a stock image, movie, etc.). Audio would similarly have levels and tonality matched upon request.
There will be so much more that our built-in assistant editor will be able to help with that the editor will be free to focus on the creative. This will be a huge assist to most editors, and will be completely the opposite of what “Hollywood Professional Editors” want, and I’m OK with that. Theirs is an important niche market that requires very specialized tools, and Avid is the entrenched provider and will be so for the foreseeable future. It’s not where the millions of users of Premiere Pro CC and FCP X are.
The millions of users not working in “Hollywood” will be all over this next generation of intelligently assisted professional editing tools. (Of course I expect their consumer counterparts to go even further with the intelligent assistant concept into smart templates.)
But I don’t think any current NLE is ready to have all that retrofitted. So maybe ten years from an app is enough time before it, too, needs to be reimagined for a new generation. (Final Cut Pro classic was 12 years from launch to death notice.) Our industry is evolving ever faster. Why shouldn’t our tools?