CAT | Machine Learning
It’s relatively easy to get an overview of the current state of Artificial Intelligence (AI). It’s probably easier to understand the benefits of Machine Learning (ML), particularly ML that’s already applied to common tasks we can benefit from now, because we’re fitting those new technologies within existing frameworks.
What is much harder to determine is how machine learning will be directly applied to post production processes, and what role AI will take in our collective production future.
In September 2010 Apple purchased Swedish facial recognition company Polar Rose, and today we learn they’ve purchased Israeli startup RealFace: “a cybersecurity and machine learning firm specializing in facial recognition technology”.
The difference between the two purchases is that the latest one is based on machine learning.
…the startup had developed a unique facial recognition technology that integrates artificial intelligence and “brings back human perception to digital processes”. RealFace’s software is said to use proprietary IP in the field of “frictionless face recognition” that allows for rapid learning from facial features.
Another step towards our software identifying and labelling people in our media.
In the Overview I pointed out that most of what is being written up as Artificial Intelligence (AI) is really the work of Learning Machines/Machine Learning. We learnt that Learning Machines can improve your tax deduction, do the work of a paralegal, predict court results, analyze medical scans, and much more. It seems that every day I read of yet another application.
There are Learning Machines readily available to all comers, but there are ways to benefit from them without even using one.
Over the last couple of years I’ve become more and more interested in the ways that the research being done into Artificial Intelligence (AI) might be applied to production and post production. In this article I’ll be giving an overview of what AI is at this stage of development, and what technologies are being used. Later articles will cover immediate and future applications and implications.
Posted by Philip in Machine Learning
It seems the smartest way to make money right now is to have a startup in speech recognition, machine learning, neural networks or another Artificial Intelligence related field.
TechCrunch reported late last week that Apple had acquired another machine learning company:
Presumably to beef up its efforts in AI and machine learning across the company.
Not to be left behind, Google:
…said that it’s acquired API.ai, a startup with tools for speech recognition and natural language understanding….
In addition to its developers tools, Api.ai offers a conversational assistant app with more than 20 million users.
I would expect the purchase is meant to beef up the speech recognition in Google’s AI assistant, Google Now.
Google have open sourced its Show and Tell model for automatically captioning images. This is an excellent example of how neural networks work: train the model with examples – in this case human captioned images – and then let it loose on new images. From the Venture Beat article:
Google trains Show and Tell by letting it take a look at images and captions that people wrote for those images. Sometimes, if the model thinks it sees something going on in a new image that’s exactly like a previous image it has seen, it falls back on the caption for that previous image. But at other times, Show and Tell is able to come up with original captions. “Moreover,” Shallue wrote, “it learns how to express that knowledge in natural-sounding English phrases despite receiving no additional language training other than reading the human captions.”
As the article points out, there are many more players looking to do the same thing. Imagine how much easier life would be in editorial if all the B-roll came in organized like this.
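For readers who like to see the idea in code, here is a minimal sketch of the train-with-examples, then-apply-to-new-images pattern described in the Show and Tell post. It is not Google's model and it is not a neural network: it is a plain nearest-match lookup that only imitates the "fall back on the caption of a similar previous image" behaviour mentioned in the quote. The feature numbers, captions and function names are all made up for illustration.

```python
import math

# Hypothetical training data: each "image" is reduced to a made-up
# three-number feature vector, paired with a human-written caption.
TRAINING_EXAMPLES = [
    ((0.9, 0.1, 0.0), "a dog running on a beach"),
    ((0.1, 0.8, 0.1), "a plate of food on a table"),
    ((0.0, 0.2, 0.9), "a city street at night"),
]


def train(examples):
    """'Training' here simply memorises the captioned examples."""
    return list(examples)


def caption(model, features):
    """Reuse the caption of the closest previously seen image."""
    nearest = min(model, key=lambda example: math.dist(example[0], features))
    return nearest[1]


model = train(TRAINING_EXAMPLES)
new_image = (0.8, 0.2, 0.1)  # hypothetical features for an unseen image
print(caption(model, new_image))
```

A real captioning model goes further by learning to compose new sentences rather than only reusing old ones, which is the part of Show and Tell the quote calls "original captions".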
The extensive article by Steven Levy – The iBrain is Here – is a fascinating read on how Apple are using Machine Learning, neural networks and Artificial Intelligence across product lines. It’s well worth the time to read through, but this quote from Phil Schiller stood out:
“We use these techniques to do the things we have always wanted to do, better than we’ve been able to do,” says Schiller. “And on new things we haven’t been able to do. It’s a technique that will ultimately be a very Apple way of doing things as it evolves inside Apple and in the ways we make products.”
The ways this could all be aligned with editing? Speech-to-text; keyword extraction (just like Magic Keywords in Lumberjack System); sentiment extraction; image recognition; facial detection and recognition; speech controlled editing (if anyone really wants that), and the list goes on.
I’d like to believe the Pro Apps Team are working on this.
UPDATE: Ruslan Salakhutdinov is Apple’s first Director of AI.