Indirectly – by forming the Partnership on AI – Facebook, Amazon, Google, IBM and Microsoft have joined forces to promote “best practices” in Artificial Intelligence. I just hope that includes not wanting to wipe out humans!
Google have open sourced its Show and Tell model for automatically captioning images. This is an excellent example of how neural networks work: train the model with examples – in this case human-captioned images – and then let it loose on new images. From the Venture Beat article:
Google trains Show and Tell by letting it take a look at images and captions that people wrote for those images. Sometimes, if the model thinks it sees something going on in a new image that’s exactly like a previous image it has seen, it falls back on the caption for that previous image. But at other times, Show and Tell is able to come up with original captions. “Moreover,” Shallue wrote, “it learns how to express that knowledge in natural-sounding English phrases despite receiving no additional language training other than reading the human captions.”
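To make that behavior a little more concrete, here is a minimal, purely illustrative sketch of the “reuse a caption you’ve already seen” idea described above. It is not Google’s Show and Tell code (the real model is a trained encoder–decoder network); the feature vectors, captions and threshold are all invented for illustration.

```python
# Conceptual sketch only – not Show and Tell itself. It illustrates the
# "fall back on a caption you've already seen" behaviour using a toy
# nearest-neighbour lookup over invented image feature vectors.
import numpy as np

# "Training" examples: image feature vectors paired with human-written captions.
train_features = np.array([
    [0.9, 0.1, 0.0],   # e.g. a dog on a beach
    [0.1, 0.8, 0.2],   # e.g. a plate of food
])
train_captions = [
    "a dog running on the beach",
    "a plate of food on a table",
]

def caption(new_feature, threshold=0.15):
    """Caption a new image: reuse a seen caption if the image is very similar,
    otherwise hand off to the trained language model (a placeholder here)."""
    distances = np.linalg.norm(train_features - new_feature, axis=1)
    nearest = int(np.argmin(distances))
    if distances[nearest] < threshold:
        return train_captions[nearest]          # fall back on a seen caption
    return "<generate a new caption with the trained language model>"

print(caption(np.array([0.88, 0.12, 0.02])))    # close to the first example
print(caption(np.array([0.3, 0.3, 0.9])))       # nothing like it – generate
```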
Terry Curren pointed me to this example where IBM Watson (one of the Smart APIs I referred to a couple of weeks back) was tasked with determining whether or not an Artificial Intelligence could “cut” a movie trailer. This is the result, with a very interesting insight into how they did it at the end.
IBM Watson pulled the selects based on action and emotion, but an editor created the trailer from the selects. Still, being able to locate the highlights and determine emotion is a big step forward.
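Conceptually, that “pull the selects, leave the cut to an editor” workflow might look something like the sketch below: score every clip for action and emotion, rank them, and keep the highest-scoring few. The Clip structure, the weights and the scores are hypothetical stand-ins, not IBM’s actual pipeline.

```python
# Hypothetical sketch of pulling selects by action and emotion scores.
# The clips, scores and weights are invented; assembly stays with the editor.
from dataclasses import dataclass

@dataclass
class Clip:
    name: str
    action: float    # 0–1, e.g. from a visual-activity classifier
    emotion: float   # 0–1, e.g. from facial/sentiment analysis

def pull_selects(clips, keep=3, action_weight=0.5, emotion_weight=0.5):
    """Rank clips by a weighted action + emotion score and return the top few."""
    scored = sorted(
        clips,
        key=lambda c: action_weight * c.action + emotion_weight * c.emotion,
        reverse=True,
    )
    return scored[:keep]

footage = [
    Clip("hallway chase", action=0.9, emotion=0.4),
    Clip("quiet goodbye", action=0.1, emotion=0.95),
    Clip("establishing shot", action=0.2, emotion=0.2),
    Clip("lab reveal", action=0.6, emotion=0.7),
]

for clip in pull_selects(footage):
    print(clip.name)   # the editor builds the trailer from these selects
```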
The extensive article by Steven Levy – The iBrain is Here – is a fascinating read on how Apple are using Machine Learning, neural networks and Artificial Intelligence across product lines. It’s well worth the time to read through, but this quote from Phil Schiller stood out:
“We use these techniques to do the things we have always wanted to do, better than we’ve been able to do,” says Schiller. “And on new things we haven’t been able to do. It’s a technique that will ultimately be a very Apple way of doing things as it evolves inside Apple and in the ways we make products.”
How could this all be aligned with editing? Speech-to-text; keyword extraction (just like Magic Keywords in Lumberjack System); sentiment extraction; image recognition; facial detection and recognition; speech-controlled editing (if anyone really wants that); and the list goes on.
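As one small, hedged example of a single link in that chain, here is a toy keyword extractor that could run over a speech-to-text transcript to attach metadata to a clip. It is nothing more than stop-word filtering and word counting – a long way from Magic Keywords – and the transcript text is invented.

```python
# Toy keyword extraction over an (invented) speech-to-text transcript.
# Real systems use far smarter language models than frequency counting.
from collections import Counter
import re

STOP_WORDS = {
    "the", "a", "an", "and", "or", "to", "of", "in", "on", "we", "i",
    "it", "is", "was", "were", "that", "this", "for", "with", "at", "so",
}

def extract_keywords(transcript, top_n=5):
    """Return the most frequent non-stop-words in a transcript."""
    words = re.findall(r"[a-z']+", transcript.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS and len(w) > 2)
    return [word for word, _ in counts.most_common(top_n)]

transcript = (
    "We shot the glacier at dawn, and the glacier was calving while we "
    "were filming. The light on the ice was extraordinary."
)
print(extract_keywords(transcript))   # e.g. ['glacier', 'shot', 'dawn', ...]
```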
I’d like to believe the Pro Apps Team are working on this.
Buried in that same article – The iBrain is Here – about Apple’s use of Artificial Intelligence across a wide range of products and purposes was this gem:
Machine learning… It even knows what good filmmaking is, enabling Apple to quickly compile your snapshots and videos into a mini-movie at a touch of a button.
At one level this is certainly plausible, and likely true. After all, Greg and I spent a summer analyzing how I made documentary-style edits. It was a fascinating experience for me, working out why “that” was the right place to start b-roll over an interview.
I would then have to turn that analysis into a rule of thumb that Greg could program. This was the basis of the (now gone) First Cuts app. That work will resurface at some point; it’s too valuable not to.
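A toy illustration of what turning an editing instinct into a programmable rule can look like: “start b-roll a beat after the interview first mentions the thing the b-roll shows.” The one-second offset, the keyword matching and the timing data are invented stand-ins, not the actual First Cuts heuristics.

```python
# Hypothetical rule-of-thumb: cut to b-roll shortly after its subject is
# first mentioned in the interview. Data and offset are invented.
from dataclasses import dataclass

@dataclass
class Word:
    text: str
    start: float   # seconds into the interview

def broll_start(transcript_words, broll_keyword, offset=1.0):
    """Return the time to cut to b-roll, or None if the keyword never appears."""
    for word in transcript_words:
        if word.text.lower() == broll_keyword.lower():
            return word.start + offset   # start b-roll just after the mention
    return None

interview = [Word("we", 12.0), Word("visited", 12.3), Word("the", 12.6),
             Word("glacier", 12.8), Word("last", 13.4), Word("summer", 13.7)]

print(broll_start(interview, "glacier"))   # 13.8 – cut to glacier b-roll here
```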