CAT | Technology
I was saddened, but not really surprised, by this week’s announcement that Adobe were pulling Speech-to-Text transcription from Premiere Pro, Prelude and AME. As Al Mooney says in the blog post:
Today, after many years of work, we believe, and users confirm, that the Speech to Text implementation does not provide the experience expected by Premiere Pro users.
I am saddened to see this feature go. Even though the actual speech-to-text engine was somewhat hit or miss, there was real benefit in the ability to import a transcript (or script) and lock the media to the script. So it’s probably worth keeping the current version of Premiere (or one of the other apps) to keep the syncing function, as the apps will continue to support the metadata if it’s in the file.
Coincidentally, we recently had a feature request for a transcription-based workflow in Final Cut Pro X. When asked how he’d like it to work, the requester described (unintentionally) the workflow in Premiere Pro!
In fact, I’d almost implore Adobe to keep the ability to import a transcript and align it to the media using a speech analysis engine. That way the industry would have an alternative to Avid’s ScriptSync auto-alignment tools (previously powered by Nexidia), which are currently unavailable in Media Composer. The ability to search hundreds of media files by their word-based content, via transcripts, is extremely powerful for documentary filmmakers.
And yes, there is the Nexidia-powered Boris Soundbite, but there is one problem with that waveform-matching approach: it produces no content metadata, nor anything (like text) from which content metadata could be derived.
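The core of the transcript-alignment idea is simpler than it sounds, and worth sketching. Assume a speech engine produces a rough, error-prone word list with timestamps; matching that against an accurate human transcript lets every matched transcript word inherit a timecode. This toy Python sketch (my illustration, not Adobe’s or Avid’s implementation) uses standard-library sequence matching to do the word alignment:

```python
import difflib

def align_transcript(transcript_words, stt_words):
    """Align an accurate transcript to imperfect speech-to-text output.

    transcript_words: list of words from the human-made transcript/script.
    stt_words: list of (word, seconds) pairs from a speech engine.
    Returns a list of (transcript_word, seconds_or_None) pairs.
    """
    matcher = difflib.SequenceMatcher(
        a=[w.lower() for w in transcript_words],
        b=[w.lower() for w, _ in stt_words])
    times = [None] * len(transcript_words)
    # Every run of words the engine got right anchors that stretch of
    # the transcript to the media's timeline.
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            times[block.a + k] = stt_words[block.b + k][1]
    return list(zip(transcript_words, times))

# The engine misheard "content metadata" as "better data", but the
# correctly recognised words still lock the script to the media.
script = "we need content metadata for every clip".split()
engine = [("we", 0.0), ("need", 0.4), ("better", 0.9),
          ("data", 1.3), ("for", 1.7), ("every", 1.9), ("clip", 2.3)]
print(align_transcript(script, engine))
```

Words the engine garbled come back untimed (`None`), but in practice they can be interpolated from their timed neighbours, which is all a “lock media to script” feature needs.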
This week I sat down with Larry Jordan and Michael Horton and talked – what else – metadata: what it is, why we need it, how we get it, and how we use it. It was a good interview and probably my clearest explanation of what metadata is.
You can hear the interview here:
and read the transcript, courtesy of Take 1 Transcription, at:
RedShark News reports that Disney Research have:
Researchers working for the Mouse have developed a groundbreaking program that delivers automated edits from multi-camera footage based on cinematic criteria.
When you read how they’ve achieved it, I think it’s impressive, and very, very clever.
The system works by approximating the 3D space of the cameras in relation to each other. The algorithm determines the “3D joint attention,” or the likely center of activity, through an on-the-fly analysis of the multiple camera views. Based on this information, the algorithm additionally takes into account a set of cinematic preferences, such as adherence to the 180 degree rule, avoidance of jump cuts, varying shot size and zoom, maintaining minimum and maximum shot lengths, and cutting on action. The result is a very passable, almost human edit.
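The cinematic-preference half of the system described above can be caricatured as a scoring pass: each second, every camera gets a score for how well it frames the estimated centre of attention, with penalties applied for breaking the rules. This is my own toy Python illustration (not Disney’s algorithm); the quality scores stand in for the 3D joint-attention analysis, and the penalty weight and shot-length limits are invented for the example:

```python
def choose_angles(quality, min_shot=2, max_shot=8, penalty=10.0):
    """Pick one camera per second from a rule-scored multicam shoot.

    quality[t][cam]: how well camera `cam` frames the estimated centre
    of attention at second t (higher is better).
    Returns the chosen camera index for each second.
    """
    edit, held = [], 0
    for scores in quality:
        current = edit[-1] if edit else None

        def rule_score(cam):
            s = scores[cam]
            if cam == current and held >= max_shot:
                s -= penalty   # shot held too long: force variety
            if cam != current and current is not None and held < min_shot:
                s -= penalty   # cutting away too soon reads as a jump cut
            return s

        best = max(range(len(scores)), key=rule_score)
        held = held + 1 if best == current else 1
        edit.append(best)
    return edit

# Camera 1 looks marginally better at second 2, but the minimum shot
# length holds camera 0; by second 3 a cut is allowed.
print(choose_angles([[1.0, 0.5], [0.4, 0.6], [0.3, 0.9]]))
```

Even this crude version shows why the research result is “almost human”: a handful of penalties over a decent attention estimate already avoids the most jarring mistakes a naive best-shot switcher would make.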
Perhaps it’s the very nature of research, but I’m not sure of the practical application. Maybe that’s the point of pure research.
Assuming the technology delivers, it’s rare that we want to take a multicam shoot and do a single, switched playback version. “Live switching” after the fact, if you will. At least in my experience, the edit not only needs to switch multicam angles, but to remove dross, tighten the presentation, add in additional b-roll, etc, etc.
More often than not, my angle cuts are more directed by the edit I want, than a desire to just pick the best shot at the time.
That said, this type of research is indicative of what can be done (and therefore almost certainly will be done): combine a good multicam edit with content metadata and perhaps you’d have a decent first pass that could be built on, finished and polished by a skilled editor. The point, as Larry Jordan puts it, is:
How do you save time every step of the production process, so that you’ve got the time that you need to make your films to your satisfaction?
Ultimately the commercial versions of these types of technologies should be seen as tools editors can use to make more time for their real job: finessing, polishing and finishing the project; bringing it the heart that makes the human connection in storytelling.
Once upon a time it was easy to differentiate between film and TV production: film was shot on film, TV was shot electronically. SAG looked after the interests of screen actors (film) while AFTRA looked after the interests of television actors. That the two actors’ unions have merged is indicative of the changes in production technology.
As is noted in an article at Digital Trends, there is almost no difference between the technologies used in both styles of production, so what are the differences? It comes down to two things, which are really the same thing.
Over on Indiegogo there’s a project for MOX – an open source mezzanine codec for (mostly) postproduction workflows and archiving. The obvious advantage over existing codecs like ProRes, DNxHD and Cineform is that MOX will be open source, so there is significantly reduced risk that the codec might go away in the future, or stop being supported.
Technically the project looks reasonable and feasible. There is a small, but significant, group of people who worry that support for the current codecs may go away in the future. There’s no real evidence for this, other than that Apple has deprecated old, inefficient and obsolete codecs by not bringing them forward to AVFoundation.
I have more concerns for the long term with an open source project. History shows that many projects start strong, but ultimately it comes down to a small group of people (or one in MOX’s case) doing all the work, and inevitably life’s circumstances intervene.
MOX is not a bad idea. I just doubt that it will gain and sustain the momentum it would need.
As Final Cut Pro X – and other modern video apps – are built on Frameworks from the core OS, those Frameworks sometimes provide clues to Apple’s thinking. One that we care a lot about is AVFoundation, which is the modern replacement for QuickTime at the application and OS level. We’ve seen this in the transition away from QuickTime Player 7, which is built on QuickTime (both QTKit and the older C API). Unfortunately AVFoundation has lacked many features that are essential for video workflows, so I watch the features added to AVFoundation as a way of understanding where video apps might go.
There has been a massive update to AVFoundation in Yosemite, and it appears we get reference movies back.
Greg and I are heading for Amsterdam tomorrow for IBC, but especially the Amsterdam Supermeet. This will be the first time we’ve had Lumberjack System at a Supermeet and we’re going to celebrate by giving away 200 of these cute little keyring pouches that can carry a credit card, business cards, or a USB memory stick attached to your keyring.
Two of the pouches will carry vouchers for two (2) Envoy Pro EX 240GB SSD drives courtesy of OWC/MacSales, which you’ll get instantly from OWC at the Supermeet.
Come check out Lumberjack System – if you edit with Final Cut Pro X, some easy logging in the field will pay huge dividends when preparing for post.
Visit our table at the #Amsterdam #SuperMeet. Save €5.00 on tickets. Buy using this link: https://amsterdam2014.eventbrite.com/?discount=lumbervip
Adobe have previewed their IBC video app presentations and have confirmed that they will be continually adding new features to the Creative Cloud.
FCP.co’s lead story today is good news for Apple and Final Cut Pro X – the BBC is adopting more than 1000 seats of Final Cut Pro X for news. To be fair, the BBC seems to be adopting both Final Cut Pro X and Premiere Pro across its production units, and some units remain on Media Composer, but News seems to be going Final Cut Pro X exclusively.
My Apple PR contacts tell me that Final Cut Pro X is also being used on “several popular daytime shows” as well.
In other good news, not reported (yet) on FCP.co, the French TF1 group have also adopted Final Cut Pro X. According to this Tweet both Premiere Pro CC and Final Cut Pro X were tested, with Final Cut Pro X getting the gig.
Of course, come IBC I’m sure Adobe will share some of their new partners as well, and no doubt, increased Creative Cloud subscribers.
A new show in which we discuss 4K. http://www.theterenceandphilipshow.com/?p=546