The present and future of post production business and technology | Philip Hodgetts

CAT | Machine Learning

While researching the anecdotal history of some local property, I did what I’ve done previously: I asked Siri, in this case about actors’ dates of birth and death. In the past, this type of question would have pulled up the relevant IMDB or Wikipedia page, with Siri saying “I’ve found some links for you on the web” or similar.

It took several rounds before I realized that, while the pages were still being pulled up as before, Siri was parsing out the answer to the question I’d asked and giving it to me directly. I never had to glance down or open my phone.

Similarly, in Mail, there is now a predictive mailbox feature suggesting (usually accurately) which mailbox I might want to move the selected email to.

In Calendar, I find addresses being suggested for my events, based on whether I’ve been there or not, address book entries, or other information.

It’s clear to me that these are all improvements related directly to Apple’s increased use of Machine Learning across its software products.

As I’m trying to figure out how and where we might use Machine Learning (ML) in our software businesses, I thought I’d review all the uses I can find beyond the more general cognitive services (like speech to text, image recognition, keyword extraction, etc.) that I’ve already talked about and that, by themselves, are incredibly valuable and offer a near-immediate payoff.

I was a little shocked at the diversity of ways ML is being used. According to TechCrunch, over $10 billion in venture capital has already gone to 1,500 AI/ML startups in 70 countries, a figure predicted to rise to more than four times that in 2017!

Since I was compiling this list, I thought I’d share it with you, but it’s just a sampling. Even so, there are more than 40 applications described here, in addition to the Cognitive Services as standalone ML tools.


It’s relatively easy to get an overview of the current state of Artificial Intelligence (AI). It’s probably easier to understand the benefits of Machine Learning (ML), particularly ML that’s already applied to common tasks that we can benefit from now, because we’re fitting those new technologies within existing frameworks.

What is much harder to determine is how machine learning will be directly applied to post production processes, and what role AI will take in our collective production future.


In September 2010 Apple purchased Swedish facial recognition company Polar Rose, and today we learn they’ve purchased Israeli startup RealFace: “a cybersecurity and machine learning firm specializing in facial recognition technology”.

What is different between the two purchases is that the latest one is based on machine learning.

…the startup had developed a unique facial recognition technology that integrates artificial intelligence and “brings back human perception to digital processes”. RealFace’s software is said to use proprietary IP in the field of “frictionless face recognition” that allows for rapid learning from facial features.

Another step towards our software identifying and labelling people in our media.



AI and Production: Now

In the Overview I pointed out that most of what is being written up as Artificial Intelligence (AI) is really the work of Learning Machines/Machine Learning. We learnt that Learning Machines can improve your tax deduction, do the work of a paralegal, predict court results, analyze medical scans, and much more. It seems that every day I read of yet another application.

There are Learning Machines readily available to all comers, but there are ways to benefit from them without even using one.


Over the last couple of years I’ve become more and more interested in the ways that the research being done into Artificial Intelligence (AI) might be applied to production and post production. In this article I’ll be giving an overview of what AI is at this stage of development, and what technologies are being used. Later articles will cover immediate and future applications and implications.


It seems the smartest way to make money right now is to have a startup in speech recognition, machine learning, neural networks or another Artificial Intelligence related field.

TechCrunch reported late last week that Apple had acquired another machine learning company:

After buying Perceptio at the end of 2015 and Turi just a few months ago, Apple has now acquired an India/US-based machine learning team, Tuplejump.

Presumably to beef up its efforts in AI and machine learning across the company.

Not to be left behind, Google:

said that it’s acquired, a startup with tools for speech recognition and natural language understanding….

In addition to its developers tools, offers a conversational assistant app with more than 20 million users.

I would expect the purchase is meant to beef up the speech recognition in its AI assistant, Google Now.



Show and Tell: Neural Networks in Practice

Google have open sourced its Show and Tell model for automatically captioning images. This is an excellent example of how neural networks work: train the model with examples – in this case human-captioned images – and then let it loose on new images. From the VentureBeat article:

Google trains Show and Tell by letting it take a look at images and captions that people wrote for those images. Sometimes, if the model thinks it sees something going on in a new image that’s exactly like a previous image it has seen, it falls back on the caption for that previous image. But at other times, Show and Tell is able to come up with original captions. “Moreover,” Shallue wrote, “it learns how to express that knowledge in natural-sounding English phrases despite receiving no additional language training other than reading the human captions.”

As the article points out, there are many more players looking to do the same thing. Imagine how much easier life would be in editorial if all the B-roll came in organized like this.
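To make the "train with examples, then let it loose" idea concrete, here is a deliberately tiny Python sketch. It is not Google's Show and Tell model and uses no neural network at all: it only imitates the fallback behavior described in the quote, where a new image that closely resembles a training image reuses that image's caption. The three-number "features" and the captions are made up for illustration; a real system would learn its features from the pixels.

```python
import math

def train(examples):
    """'Training' in this toy is just remembering (features, caption) pairs."""
    return list(examples)

def caption(model, features):
    """Return the caption of the remembered example closest to `features`."""
    nearest = min(model, key=lambda example: math.dist(example[0], features))
    return nearest[1]

# Hypothetical hand-made features: (amount of sky, amount of water, people count).
training_set = [
    ((0.9, 0.1, 0.0), "a clear sky over a field"),
    ((0.4, 0.6, 0.0), "a boat on a lake"),
    ((0.2, 0.0, 3.0), "a group of people standing together"),
]

model = train(training_set)

# A "new image" the model has never seen, described with the same made-up features.
new_caption = caption(model, (0.5, 0.5, 0.0))
print(new_caption)
```

The new image sits closest to the lake example, so it inherits that caption. What makes the real thing interesting is the other half of the quote: a trained neural network can also produce captions that no human ever wrote.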

The extensive article by Steven Levy – The iBrain is Here – is a fascinating read on how Apple are using Machine Learning, neural networks and Artificial Intelligence across product lines. It’s well worth the time to read through, but this quote from Phil Schiller stood out:

“We use these techniques to do the things we have always wanted to do, better than we’ve been able to do,” says Schiller. “And on new things we haven’t been able to do. It’s a technique that will ultimately be a very Apple way of doing things as it evolves inside Apple and in the ways we make products.”

The ways this could all be aligned with editing? Speech-to-text; keyword extraction (just like Magic Keywords in Lumberjack System); sentiment extraction; image recognition; facial detection and recognition; speech-controlled editing (if anyone really wants that), and the list goes on.

I’d like to believe the Pro Apps Team are working on this.

UPDATE: Ruslan Salakhutdinov is Apple’s first Director of AI.
