One of the joys of 2014 for me was to learn to sing – from very much a position of sucking at it. I still suck whenever I start learning a new song. What I realized is that we have to be prepared to suck at something before we can be good at it, or even learn it.
By “sucking” at something I mean, being very, very bad at it. I realize now I’ve been there many times.
There was a time when I had no idea what XML was; now if you search that term and my name you’ll find I’ve made my own contribution.
There was a time when I sucked at metadata – like XML I had no idea why it was important.
The thing is, I’ve sucked at so many things and yet, pushing through the sucky period, eventually we suck a little less, then barely at all, until we arrive at a point of knowing we don’t suck at that skill or knowledge any more.
Never be afraid to start off badly: it’s the only way to learn something new.
In early 2012 I went through a process of reducing production gear to a minimum, akin to trying to write a Haiku. I’m gearing up for a production trip to Australia to record interviews with my extended family during our quadrennial family reunion. It’s a new production Haiku with different solutions due to the inevitable march of technology, and the needs of this production.
I do not expect this project to ever reach broadcast, but there’s no reason not to have the best quality sound and picture I can, for these recordings should last into posterity. I am traveling alone, so it was important to not carry too much. Essentially I need a good multicam interview setup, with excellent audio quality. I will shoot b-roll around the family reunion and some of the family sites.
At the current stage of technology development, we are largely limited to adding Content Metadata manually. If we want people described, the scene described, or the action described, we need to add Keywords or Notes to achieve that. I don’t expect that to be the case in the future. Technology from Clarifai and Google gives us clues to the future.
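To make that concrete, here’s a minimal sketch of what a machine-generated Keyword workflow could look like. The `tag_image` function is a hypothetical stand-in for an image-recognition service such as Clarifai’s or Google’s; its name, signature and output format are assumptions for illustration, not a real SDK.

```python
# Hypothetical sketch of auto-generated content metadata.
# `tag_image` stands in for a real vision API, which would return
# concept labels with confidence scores for a frame of video.

def tag_image(frame_path):
    # Placeholder for a real image-recognition call.
    return [("beach", 0.97), ("family", 0.91), ("sunset", 0.84)]

def keywords_for(frame_path, threshold=0.9):
    """Keep only the tags the engine is confident about, ready to be
    attached as Keywords or Notes in an NLE's metadata fields."""
    return [tag for tag, score in tag_image(frame_path) if score >= threshold]

print(keywords_for("frame_0042.jpg"))  # → ['beach', 'family']
```

The interesting design question is where to set the confidence threshold: too low and you pollute your Keywords with noise, too high and the engine tells you nothing.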
Out of the blue, Apple announces Final Cut Pro X 10.1.4, which includes some key stability improvements. There is also a Pro Video Formats 2.0 software update, which provides native support for importing, editing, and exporting MXF files with Final Cut Pro X. While FCP X already supported import of MXF files from video cameras, this update extends the format support to a broader range of files and workflows.
- Option to export AVC-Intra MXF files
- Fixes issues with automatic library backups
- Fixes a problem where clips with certain frame rates from Canon and Sanyo cameras would not import properly
- Resolves issues that could interrupt long imports when App Nap is enabled
- Stabilization and Rolling Shutter reduction works correctly with 240fps video
Jon Chappell of Digital Rebellion has noted that the support for MXF is much wider than just Pro Apps. What is interesting is that the MXF components seem to be QuickTime based, rather than AV Foundation, probably for historic reasons.
I was saddened, but not really surprised, by this week’s announcement that Adobe were pulling Speech-to-Text transcription from Premiere Pro, Prelude and AME. As Al Mooney says in the blog post:
Today, after many years of work, we believe, and users confirm, that the Speech to Text implementation does not provide the experience expected by Premiere Pro users.
I am saddened to see this feature go. Even though the actual speech-to-text engine was somewhat hit or miss, there was real benefit in the ability to import a transcript (or script) and lock the media to the script. So it’s probably worth keeping the current version of Premiere (or one of the other apps) to keep the synching function, as the apps will continue to support the metadata if it’s in the file.
Coincidentally, we had a feature request recently, wanting a transcription-based workflow in Final Cut Pro X. When questioned on how he’d like it to work, the requester described (unintentionally) the workflow in Premiere Pro!
In fact, I’d almost implore Adobe to keep the ability to import a transcript and align it to the media, using a speech analysis engine. That way the industry would have an alternative to Avid’s Script Sync auto-alignment tools (previously powered by Nexidia), which are currently unavailable in Media Composer. The ability to search hundreds of media files with transcripts, by word-based content, is extremely powerful for documentary filmmakers.
And yes, there is the Nexidia-powered Boris Soundbite, but there is one problem with this waveform-matching approach: there is no content metadata. Nor anything (like text) we can use to derive content metadata.
A recent comment in an article on CNET.com caught my eye:
“If I owned a studio, I’d make movie theaters pay me,” says Dana Brunetti, producer of “House of Cards” and “The Social Network.”
Needless to say, I had to read the article. First note was that this comment was made in the context of a web-focused conference, so there may be an element of “playing to the audience”, but in essence the argument is that more online/web companies should follow Netflix (and Amazon, Google and Apple) into producing more original content.
With online and technology-based companies already threatening traditional distribution methods, the impact would be huge: “Once Silicon Valley can create content as well,” said Brunetti, “they’ll own it soup to nuts.”
I can’t argue with that. More original production means more jobs in the industry. (And yes, more clients for my day job’s business.)
What appeals to me is the push for “per program” content purchase, as long as the pricing issue is solved. A la carte purchases of limited programming should cost no more, over a month, than a full cable subscription.
This week I sat down with Larry Jordan and Michael Horton and talked – what else – metadata: what it is, why we need it, how we get it, and how we use it. It was a good interview and probably my clearest explanation of what metadata is.
You can hear the interview here:
and read the transcript, courtesy of Take 1 Transcription, at:
RedShark News reports that Disney Research have:
Researchers working for the Mouse have developed a groundbreaking program that delivers automated edits from multi-camera footage based on cinematic criteria.
When you read how they’ve achieved it, I think it’s impressive, and very, very clever.
The system works by approximating the 3D space of the cameras in relation to each other. The algorithm determines the “3D joint attention,” or the likely center of activity, through an on-the-fly analysis of the multiple camera views. Based on this information, the algorithm additionally takes into account a set of cinematic preferences, such as adherence to the 180 degree rule, avoidance of jump cuts, varying shot size and zoom, maintaining minimum and maximum shot lengths, and cutting on action. The result is a very passable, almost human edit.
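As an illustration only (this is a toy sketch, not Disney’s algorithm), a rule-based camera selector along those lines can be expressed in a few lines: pick the camera with the highest attention score, but respect minimum and maximum shot lengths so the cut avoids machine-gun cutting and stale shots. The per-second score table here is an assumed input.

```python
def auto_cut(scores, min_shot=2, max_shot=8):
    """Choose camera angles over time.

    scores[t][cam] is a (hypothetical) per-second estimate of how close
    camera `cam` is to the centre of attention at time t.
    Returns a list of (time, camera) cut points.
    """
    cuts = []
    current, held = None, 0
    for t, frame in enumerate(scores):
        ranked = sorted(range(len(frame)), key=lambda c: -frame[c])
        best = ranked[0]
        if current is None:
            # First shot: just take the strongest angle.
            current, held = best, 1
            cuts.append((t, current))
        elif held >= max_shot and len(ranked) > 1:
            # Shot has gone stale: force a cut, to the runner-up if needed.
            current = best if best != current else ranked[1]
            cuts.append((t, current))
            held = 1
        elif best != current and held >= min_shot:
            # A better angle exists and the shot is long enough to cut away.
            current, held = best, 1
            cuts.append((t, current))
        else:
            held += 1
    return cuts

# Two cameras, six seconds: attention moves from camera 0 to camera 1.
print(auto_cut([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]]))
# → [(0, 0), (3, 1)]
```

A real system would fold in the other constraints the researchers mention (the 180 degree rule, jump-cut avoidance, varying shot size) as additional scoring terms or filters on the candidate angles.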
Perhaps it’s the very nature of research, but I’m not sure of the practical application. Maybe that’s the point of pure research.
Assuming the technology delivers, it’s rare that we want to take a multicam shoot and do a single, switched playback version. “Live switching” after the fact, if you will. At least in my experience, the edit not only needs to switch multicam angles, but to remove dross, tighten the presentation, add in additional b-roll, etc, etc.
More often than not, my angle cuts are more directed by the edit I want, than a desire to just pick the best shot at the time.
That said, this type of research is indicative of what can be done (and therefore almost certainly will be done): combine a good multicam edit with content metadata and perhaps you’d have a decent first pass that could be built on, finished and polished by a skilled editor. The point, as Larry Jordan puts it, is:
How do you save time every step of the production process, so that you’ve got the time that you need to make your films to your satisfaction.
Ultimately, the commercial versions of these types of technologies should be seen as tools editors can use to make more time for their real job: finessing, polishing and finishing the project; bringing it the heart that makes the human connection in storytelling.
Variety just posted an article on how many people had watched online game play (of one game) in one week. 75,000 players and 6 million individual viewers who collectively watched 327 million minutes of gameplay. Watched. That’s about an hour per viewer on average.
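The per-viewer average is simple division over the numbers Variety quotes:

```python
total_minutes = 327_000_000   # minutes of gameplay watched in the week
viewers = 6_000_000           # individual viewers
print(total_minutes / viewers)  # → 54.5 minutes per viewer
```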
Six million people watching one game’s game play. That’s a decent network-sized audience these days. That’s one game for one week. Admittedly a release week for the game.
Watching game play has attracted a huge audience, with very low production costs. While it’s not traditional production, the time spent watching gamers play video games erodes the time available for other forms of entertainment, specifically films and television!
Once upon a time it was easy to differentiate between Film and TV production: film was shot on film, TV was shot electronically. SAG looked after the interests of Screen Actors (film) while AFTRA looked after the interests of Television actors. That the two actors unions have merged is indicative of the changes in production technology.
As is noted in an article at Digital Trends, there is almost no difference between the technologies used in both styles of production, so what are the differences? It comes down to two things, which are really the same thing.