“Impossible” just takes longer
In an industry that’s constantly evolving, “impossible” is a temporary state.
An explanation of the basics of AI & Machine Learning and their applications to Production.
APIs for Web Services improve over time, and everyone who uses them gets the improvement for free!
A 46-minute interview transcribed with only 15 of 4100 words needing correction. That’s better than 99.6% accurate.
Our industry is evolving ever faster. Why shouldn’t our tools? Maybe apps should have a best before date.
Where are the needs for automation in post?
It seems machine learning is powering more and better extraction of all kinds of metadata.
Engadget recently had an article on transferring facial movements from a person in one video to a different person in a different video. Unlike previous approaches, this latest development requires only a few minutes of the target person’s video and correctly handles shadows.
Combine that with other research that lets us literally “put words in people’s mouths”: type the words and have them generated in the voice of a person who never said them, completely synthesized and indistinguishable from that person actually saying it.
Transfer the facial movements, add synthesized words in that person’s voice, and it will take a forensic operation to determine whether the results are “genuine” or created.
This is the first time I’ve taken a deep look at a TV show and worked out what I think would be the perfect metadata workflow from shoot to edit bay. I chose to look at Pie Town’s House Hunters franchise because it is so built on an (obviously winning) formula, and I thought that might make it easier for automation or Artificial Intelligence approaches.
But first a disclaimer. I am in no way associated with Pie Town Productions. I know for certain they are not a Lumberjack System customer and am also pretty sure they – like the rest of Hollywood – build their post on Avid Media Composer (and apparently Media Central as well). This is purely a thought exercise built around a readily available example and our Lumberjack System’s capabilities.
In some ways I guess this is another example of Artificial Intelligence (by which we mean Machine Learning) taking work away from skilled technicians: human recall has been replaced with facial identification at the recent Royal Wedding in the UK, where Amazon’s facial recognition technology was used to identify guests arriving at the wedding.
Users of Sky News’ livestream were able to use a “Who’s Who Live” function:
As guests arrived at St. George’s Chapel at Windsor Castle, the function identified royals and other notable guests through on-screen captions, with interesting information about each celebrity and how they are connected to Prince Harry and Meghan Markle.
The function was made possible by Amazon Rekognition, a cloud-based technology that uses AI to recognize and analyze faces, as well as objects, scenes and activities in images and video. And Sky News isn’t the first to use it: C-SPAN utilizes Rekognition to tag people speaking on camera.
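To give a sense of what a call like this looks like, here is a minimal Python sketch using boto3’s Rekognition client and its celebrity recognition operation. It assumes AWS credentials are already configured and that a frame has been exported as a JPEG; it is an illustration of the API, not Sky News’ or C-SPAN’s actual pipeline.

```python
import boto3

# Assumes AWS credentials and a default region are already configured.
rekognition = boto3.client("rekognition")

# A single frame exported from the video feed, e.g. one frame per second.
with open("frame_0001.jpg", "rb") as f:
    frame_bytes = f.read()

# Ask Rekognition which known public figures appear in the frame.
response = rekognition.recognize_celebrities(Image={"Bytes": frame_bytes})

for celeb in response["CelebrityFaces"]:
    print(f"{celeb['Name']}: {celeb['MatchConfidence']:.1f}% match confidence")

print(f"{len(response['UnrecognizedFaces'])} face(s) not matched to a known celebrity")
```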
Rekognition is also being used by law enforcement.
Facial recognition and identification would obviously be useful for logging in reality and documentary production.
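As a sketch of how that might work for logging, the same Rekognition API supports custom face collections: index one reference photo per cast member, then search each extracted frame against the collection. The snippet below is an illustrative outline only; the collection name, file names, labels, and threshold are assumptions, and a real workflow would batch frames and write the matches into logging metadata rather than print them.

```python
import boto3

rekognition = boto3.client("rekognition")
COLLECTION_ID = "show-cast-faces"  # hypothetical collection name for this example

# One-time setup: create a collection and index one reference photo per cast member.
rekognition.create_collection(CollectionId=COLLECTION_ID)
for label, photo in [("buyer", "buyer.jpg"), ("agent", "agent.jpg")]:
    with open(photo, "rb") as f:
        rekognition.index_faces(
            CollectionId=COLLECTION_ID,
            Image={"Bytes": f.read()},
            ExternalImageId=label,  # the name we want back when this face is found
        )

# Logging pass: match the largest face in an extracted frame against the collection.
with open("frame_0001.jpg", "rb") as f:
    result = rekognition.search_faces_by_image(
        CollectionId=COLLECTION_ID,
        Image={"Bytes": f.read()},
        FaceMatchThreshold=80,
        MaxFaces=5,
    )

for match in result["FaceMatches"]:
    print(f"{match['Face']['ExternalImageId']}: {match['Similarity']:.1f}% similarity")
```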