CAT | Metadata
Metadata is one of the most useful tools we have, if we have the tools to use it! Aside from the obvious problems when no metadata is gathered during the shoot, or insufficient metadata is gathered, other issues arise because there are not always tools in the production chain that use the metadata that has been gathered!
A recent student film was used as a template by the Entertainment Technology Center at USC with the purpose of realizing the long-hoped-for promise of production metadata, with some fairly ambitious goals.
The results are interesting and important, particularly considering that this is what I would categorize as Technical metadata, rather than Content metadata.
Although my focus is very much on metadata for production, and in particular Content Metadata, there’s a whole other area of metadata for distribution, built around the EIDR ID and fleshed out largely by Rovi. But there’s another area where metadata will likely have to apply: distribution deliverables.
We were discussing metadata in Final Cut Pro X after dinner last night, as one does, and Greg challenged me to think about the difference between Roles and Keywords (Ranges).
I’d spent time thinking about how best to translate metadata from Lumberjack into FCP X before we gained organizational folders for Keyword Collections in an Event, and was mildly surprised we didn’t have anything we thought would map well to Roles.
And that was the last time I thought about it until last night. It took a minute or two, but then it hit me, and it was totally obvious why there was no place for Roles in a “logging and pre-editing” tool.
Keyword Ranges (and Collections) are for organizing Clips.
Roles are for organizing a Project (timeline), and I guess for exporting information to Producer’s Best Friend where we make good use of Role information.
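One way to picture the distinction is as a small data-model sketch. The Python below is an illustration only, not Final Cut Pro X's actual data model or API: the class names (`KeywordRange`, `Clip`, `TimelineItem`), their fields, and the sample clips are all hypothetical. The point is simply that a keyword attaches to a span of time inside a source clip, while a Role attaches to an item once it has been placed in a Project.

```python
from dataclasses import dataclass, field


@dataclass
class KeywordRange:
    """A keyword applied to a span of a source clip (times in seconds)."""
    keyword: str
    start: float
    end: float


@dataclass
class Clip:
    """A source clip in an Event, organized by its keyword ranges."""
    name: str
    keyword_ranges: list = field(default_factory=list)


@dataclass
class TimelineItem:
    """A clip placed in a Project; the Role describes its job in the edit."""
    clip: Clip
    role: str


def keyword_collection(clips, keyword):
    """Gather every (clip name, start, end) range tagged with a keyword."""
    return [
        (clip.name, kr.start, kr.end)
        for clip in clips
        for kr in clip.keyword_ranges
        if kr.keyword == keyword
    ]


def items_by_role(timeline):
    """Group timeline items by Role, e.g. for a per-Role report or export."""
    groups = {}
    for item in timeline:
        groups.setdefault(item.role, []).append(item.clip.name)
    return groups


# Hypothetical sample data.
interview = Clip("Interview A", [KeywordRange("childhood", 12.0, 47.5),
                                 KeywordRange("career", 60.0, 95.0)])
broll = Clip("Street B-roll", [KeywordRange("childhood", 0.0, 8.0)])

timeline = [TimelineItem(interview, "Dialogue"), TimelineItem(broll, "Video")]
```

In this sketch, `keyword_collection([interview, broll], "childhood")` answers a clip-organizing question (where is this subject covered?), while `items_by_role(timeline)` answers a Project question (what does the edit contain, by function?). Neither needs the other's metadata, which is why Roles have no obvious home in a logging tool.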
Just over 7 years ago I started identifying the types of metadata that would be useful in post production. One that particularly excited me was derived metadata: using a computer algorithm to derive useful information for use in post production. At the time the only example I could suggest was deriving location and type of location from GPS data.
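As a rough sketch of what deriving location and type of location from GPS data could look like, the Python below finds the nearest entry in a small lookup table. Everything here is an assumption made for illustration: the `KNOWN_PLACES` table, its approximate coordinates, the `derive_location` function, and the 1 km threshold are hypothetical, and a real system would query a proper places database rather than a hard-coded list.

```python
import math

# Hypothetical lookup table: (name, type of location, latitude, longitude).
# Coordinates are approximate and for illustration only.
KNOWN_PLACES = [
    ("Santa Monica Pier", "beach", 34.0094, -118.4973),
    ("Griffith Observatory", "landmark", 34.1184, -118.3004),
]


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def derive_location(lat, lon, max_km=1.0):
    """Derive (name, type) from GPS coordinates, or None if nothing is near."""
    name, kind, plat, plon = min(
        KNOWN_PLACES, key=lambda p: haversine_km(lat, lon, p[2], p[3])
    )
    if haversine_km(lat, lon, plat, plon) <= max_km:
        return name, kind
    return None
```

A clip whose camera recorded coordinates near the pier would come back tagged with both a place name and a type of location, with no one typing either in.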
I first wrote about derived metadata back at the end of January 2009. Derived metadata uses computer analysis to derive metadata from the video source. There are now technologies for speech-to-text, meaning extraction, facial detection, facial recognition, emotion detection, image recognition, and more. One company has been accumulating these somewhat diverse technologies: Apple.
One of the Final Cut Pro X features that really resonates with me is Keyword Ranges and, by extension, Keyword Collections. I realize now that this enchantment is because Keyword Ranges are a very pure embodiment of Content Metadata. I also realize now that I’d been simulating this approach in other software for as long as I can remember. In order to understand better, we’ll need to take a little trip to the past.
At the current stage of technology development, we are largely limited to adding Content Metadata manually. If we want people described, the scene described, or the action described, we need to add Keywords or Notes to achieve that. I don’t expect that to be the case in the future. Technologies from Clarifai and Google give us clues to the future.
I was saddened, but not really surprised, by this week’s announcement that Adobe were pulling Speech-to-Text transcription from Premiere Pro, Prelude and AME. As Al Mooney says in the blog post:
Today, after many years of work, we believe, and users confirm, that the Speech to Text implementation does not provide the experience expected by Premiere Pro users.
I am saddened to see this feature go. Even though the actual speech-to-text engine was somewhat hit or miss, there was real benefit in the ability to import a transcript (or script) and lock the media to the script. So it’s probably worth keeping the current version of Premiere (or one of the other apps) to keep the synching function, as the apps will continue to support the metadata if it’s in the file.
Coincidentally, we had a feature request recently, wanting a transcription-based workflow in Final Cut Pro X. When questioned on how he’d like it to work, he described (unintentionally) the workflow in Premiere Pro!
In fact, I’d almost implore Adobe to keep the ability to import a transcript and align it to the media using a speech analysis engine. That way the industry will have an alternative to Avid’s Script Sync auto-alignment tools (previously powered by Nexidia), which are currently unavailable in Media Composer. The ability to search hundreds of media files with transcripts – by word-based content – is extremely powerful for documentary filmmakers.
And yes, there is the Nexidia-powered Boris Soundbite, but there is one problem with this waveform-matching approach: there is no content metadata. Nor anything (like text) we can use to derive content metadata.
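To make the value of having text concrete, here is a minimal Python sketch of word-based search across timed transcripts, using a simple inverted index. It is not how Premiere Pro, Script Sync, or Soundbite work internally: the transcript format (a list of `(seconds, word)` pairs per media file), the file names, and the functions `build_index` and `search` are all hypothetical.

```python
from collections import defaultdict


def build_index(transcripts):
    """Map each lowercased word to the (file, seconds) places it is spoken.

    `transcripts` is a hypothetical structure:
    {media file name: [(seconds, word), ...]}.
    """
    index = defaultdict(list)
    for media_file, words in transcripts.items():
        for seconds, word in words:
            # Normalize case and trailing punctuation so "Sydney." matches.
            index[word.lower().strip(".,!?")].append((media_file, seconds))
    return index


def search(index, word):
    """Return every (file, seconds) hit for a word, sorted by file then time."""
    return sorted(index.get(word.lower(), []))


# Hypothetical sample transcripts.
transcripts = {
    "interview_01.mov": [(3.2, "We"), (3.5, "shot"), (3.9, "in"), (4.1, "Sydney.")],
    "interview_02.mov": [(10.0, "Sydney"), (10.6, "was"), (10.9, "hot.")],
}

index = build_index(transcripts)
hits = search(index, "sydney")
```

Because the words themselves are stored, the same index could feed further derived metadata (keywords, topics, names), which is exactly what a waveform-matching approach cannot offer.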
This week I sat down with Larry Jordan and Michael Horton and talked – what else – metadata: what it is, why we need it, how we get it, and how we use it. It was a good interview and probably my clearest explanation of what metadata is.
You can hear the interview here:
and read the transcript, courtesy of Take 1 Transcription, at:
Over the weekend, Chris Fenwick interviewed me for his FCPX Grill podcast about the importance of logging, metadata and Lumberjack System. The podcast is available now and it is a good conversation.
What makes it meta though is that Alex Gollner logged the conversation and put it up on his website: metadata (logging) about a conversation on metadata. It doesn’t get much more meta than that!