Can a computer replace an editor?

Before we determine whether or not a computer is likely to replace an editor, we need to discuss exactly what the role of an editor is: the human being that drives the software (or hardware) that edits pictures and sound. What do they bring to the production process? Having determined that, perhaps we can consider what a piece of computer software might be capable of, now or in the future.

First off I think we need to rid ourselves of any concept that there is just one “editor” role even though there is only one term to cover a vast range of roles in post production. Editing an event video does not use the same skills and techniques as editing a major motion picture; documentary editing is different from episodic television; despite the expectation of similarity, documentary editing and reality television require very different approaches. There is a huge difference in the skills of an off-line editor (story) and an on-line editor (technical accuracy) even if the two roles are filled by the same person.

So let’s start with what I think will take a long time for any computer algorithm to be able to do. There’s no projection from current technology – in use or in the lab – that would lead to an expectation that an algorithm would be able to find the story in 40 hours of source and make an emotionally compelling, or even vaguely interesting, program of 45 minutes. That is almost certainly not going to happen in my lifetime. There’s a higher chance of an interactive storytelling environment à la Star Trek’s Holodeck (sans solid projection). Conceptually that type of environment is probably less than 30 years away, but that’s another story.

If a computer algorithm can’t find the story or make an emotionally compelling program, what can it do? Well, as we discovered earlier, not all editing is the same. There is a lot of fairly repetitive and rather assembly line work labeled as editing: news, corporate video, event videography are all quite routine and could conceivably be automated, if not completely at least in part. Then there is the possibility of new forms of media consumption that could be edited by software based on metadata.

In fact, all use of computer algorithms to edit relies on metadata – descriptions of the content that the software can understand. This is analogous to human logging and log notes in traditional editing. The more metadata software has about media, the more able it is to create some sort of edit. For now, most of that metadata will come from the logging process. (The editor may be out of a job, but the assistant remains employed!) That is the current situation but there’s reason to believe it could change in the future – more on that later in the piece.

If we really think about what it is we do as editors on these more routine jobs, we realize that there is a series of thought processes that we go through and underlying “algorithms” that determine why one shot goes into this context rather than another shot.

To put it at the most basic level, an example might arise while editing content from an interview. Two shots of the same person have audio content we want in sequence but the effect is a jump cut. [If two shots in sequence feature same person, same shot…] At this point we choose between putting another shot in there – say from another interview – or laying in b-roll to cover the jump cut. […then swap with alternate shot with same topic. If no shot with same topic available, then choose b-roll.]
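The bracketed rule above can be sketched in a few lines of Python. This is only an illustration of the rule as stated, not the code behind any real product: the `Clip` fields, the function names and the idea of a single `pool` of spare clips are all assumptions made for the example. In a real timeline the b-roll would be laid over the cut as a video overlay; here it is simply placed between the two shots.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass(frozen=True)
class Clip:
    # Hypothetical logging fields, chosen for this example only.
    name: str
    person: str          # on-screen subject; empty for b-roll
    framing: str         # e.g. "medium", "close-up"
    topic: str
    is_broll: bool = False


def is_jump_cut(a: Clip, b: Clip) -> bool:
    """Adjacent interview shots of the same person in the same framing."""
    if a.is_broll or b.is_broll:
        return False
    return a.person == b.person and a.framing == b.framing


def pick_bridge(nxt: Clip, pool: List[Clip], used: Set[str]) -> Optional[Clip]:
    """Prefer another interviewee on the same topic, else fall back to b-roll."""
    free = [c for c in pool if c.name not in used]
    for c in free:
        if not c.is_broll and c.topic == nxt.topic and c.person != nxt.person:
            return c
    for c in free:
        if c.is_broll:
            return c
    return None


def fix_jump_cuts(sequence: List[Clip], pool: List[Clip]) -> List[Clip]:
    """Return a new sequence with a bridging clip placed at each jump cut."""
    used = {c.name for c in sequence}
    result: List[Clip] = []
    for clip in sequence:
        if result and is_jump_cut(result[-1], clip):
            bridge = pick_bridge(clip, pool, used)
            if bridge is not None:
                used.add(bridge.name)
                result.append(bridge)
        result.append(clip)
    return result


ann1 = Clip("ann1", "Ann", "medium", "funding")
ann2 = Clip("ann2", "Ann", "medium", "funding")
bob1 = Clip("bob1", "Bob", "medium", "funding")
river = Clip("river", "", "wide", "funding", is_broll=True)

print([c.name for c in fix_jump_cuts([ann1, ann2], [bob1, river])])
# ['ann1', 'bob1', 'ann2']
```

What the sketch leaves out is the value judgment: it takes the first candidate that satisfies the rule, with no sense of whether that shot serves the story.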

That’s a rudimentary example and doesn’t take into account the value judgment that the human editor brings as to whether another interview conveys the story or emotion as well. Most editors are unfamiliar with their underlying thought processes and not analytical about why any given edit “works” – they just know it does but ultimately that judgment is based on something. Some learned skill, some thought process, something. With enough effort that process can be analyzed and in some far distant time and place, reproduced in software. Or it could except for that tricky emotional element – the thing that makes our storytelling interesting and worth watching.

The more emotion is involved in your storytelling output, the safer your job – or the longer it might be before it can be replaced. 🙂

Right now, the examples of computerized editing available – Magic iMovie and Muvee Auto Producer – use relatively unsophisticated techniques to build “edited” movies. Magic iMovie essentially adds transitions to avoid jump-cut problems and builds to a template; Muvee Auto Producer requires you to vet shots (thumbs up or down) then uses a style template and cues derived from audio to “edit” the program. This is not a threat to any professional or semi-professional editor with even the smallest amount of skill.

However, it is only a matter of time before some editing functions are automated. Event videography and corporate presentations are very adaptable to a slightly more sophisticated version of these baby step products. OK, a seriously more sophisticated version of these baby-step products, but the difference between slightly and seriously is about 3 years of development!

In the meantime, there are other uses for “automated” editing. For example, I developed a “proof of concept” piece for QuickTime Live! in February 2002 that used automated editing as a means of exploring the bulk of material shot for a documentary but not included in the edited piece. Not intended to be direct competition for the editor (particularly as that was me), it was intended as a means of creating short edited videos that were customized in answer to a plain language query of a database. The database contained metadata about the clips – extended logging information really. In addition to who, where and when, there are fields for keywords, a numeric value for relative usefulness of the clip, and a field for keywords used to search for b-roll. [If b-roll matches this, search for more than one clip in the search result, edit them together and lay b-roll over all the clips that use this b-roll.]

So, right now, computer editing can be taught rudimentary skills. This particular piece of software knows how to avoid jump cuts and cut to length based on the quality criteria. It is, in fact, a better editor than many who don’t know the basic grammar of video editing. Teaching the basic grammar is relatively easy. Teaching software to take some basic clips and cut into a news item or even basic template-based corporate video is only a matter of putting in some energy and effort.
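As a rough illustration of how a query against that kind of logging database could drive a cut to length, here is a minimal Python sketch. It is not the proof-of-concept code itself: the field names (`keywords`, `usefulness`, `duration`), the treatment of a query as a plain set of words, and the greedy selection are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class LoggedClip:
    # Hypothetical log fields; the real database schema is not described in detail.
    name: str
    keywords: Set[str]
    usefulness: int      # higher means more useful
    duration: float      # seconds


def assemble(clips: List[LoggedClip], query: Set[str], max_seconds: float) -> List[LoggedClip]:
    """Pick the most useful clips matching the query that fit the target length."""
    matches = [c for c in clips if c.keywords & query]
    ranked = sorted(matches, key=lambda c: c.usefulness, reverse=True)

    chosen: List[LoggedClip] = []
    total = 0.0
    for clip in ranked:
        if total + clip.duration <= max_seconds:
            chosen.append(clip)
            total += clip.duration

    # Put the survivors back in log order so the piece follows the source order.
    position = {c.name: i for i, c in enumerate(clips)}
    chosen.sort(key=lambda c: position[c.name])
    return chosen


log = [
    LoggedClip("A", {"river"}, 3, 20.0),
    LoggedClip("B", {"river", "dam"}, 5, 30.0),
    LoggedClip("C", {"city"}, 4, 10.0),
    LoggedClip("D", {"dam"}, 1, 25.0),
]

print([c.name for c in assemble(log, {"river", "dam"}, 55.0)])
# ['A', 'B']
```

The usefulness rating does the work a human editor does when trimming for time: the weakest matching material is the first to go.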

But making something that is emotionally compelling – not any time soon.

Here’s how I see it could pan out over the next couple of years. Basic editing skills from human-entered metadata – easy. Generating that metadata by having the computer recognize the images – possible now but extremely expensive. Having a computer edit an emotionally compelling piece – priceless.

It’s not unrealistic to expect, probably before the end of the decade, that a field tape could be fed into some future software system that recognizes the shots as wide, medium, close-up etc; identifies shots in specific locations and with specific people (based on having been shown examples of each) and transcribes the voice content and the text in signs and other places in the image. Software will recognize poor exposure, loss of contrast and loss of focus, eliminating shots that do not stand up technically. Nothing here is that difficult – it’s already being done to some degree in high end systems that are > $300,000 right now. From there it’s only a matter of time before the price comes down and the quality goes up.

Tie that together with a template base for common editing formats and variations and an editing algorithm that’s not that much further on than where we are now and it’s reasonable to expect to be able to input one or more source tapes into the system in the afternoon, and next morning come back to review several edited variations. A couple of mouse-clicks to choose the best of each variation and the project’s done, output to a DVD (or next generation optical disc), to a template-based website, or uploaded to the play-out server.

Nothing’s far fetched. Developing the basic algorithm was way too easy and it works for its design goals. Going another step is only a matter of time and investment. Such is the case with anything that is repetitive in nature: ultimately it can be reproduced in a “good enough” manner. It’s part of a trend I call the “templatorization” of the industry. But that’s another blog discussion. For now, editors who do truly creative, original work need not be worried, but if you’re hacking together video in an assembly-line fashion start thinking of that fall-back career.

Avid buys Pinnacle – the fallout

The acquisition of Pinnacle will greatly strengthen Avid’s Broadcast video offerings, the area of their business that has been strongest in recent years but will create challenges in integrating product lines and cultures. It is a move that brings further consolidation to the post production business.

Pinnacle has been in acquisition mode for most of the last five years acquiring, among others, Miro, Targa, Dazzle, Fast and Steinberg (sold on to Yamaha recently). It has a diverse range of products in four major product lines:

  • Broadcast Tools – Deko On Air graphics products (Character Generators) and MediaStream playout servers;
  • Consumer editing software and hardware – with 10 million customers;
  • Professional Editing – The Liquid product line acquired from Fast; and
  • Editing Hardware – Cinewave and T3000 based on the Targa acquisition.

Pinnacle has won nine Emmy Awards for its Broadcast product lines.

There will be conflicts and opportunities for Avid. It presents Avid with a new opportunity to create a consumer brand and Avid CEO David Krall has announced that a new consumer division will be formed analogous to the M-Audio consumer audio division acquired last year. M-Audio is the consumer parallel to Avid’s Digidesign professional division. The acquisition also consolidates Avid’s position supplying the Broadcast markets, making the company more of a "one stop shop" for a broadcast facility. There is definitely engineering work to be done on integrating the two technology lines, but there are no particular challenges there, and savings are to be made in streamlining sales and marketing. In broadcast there are only pluses for Avid.

Consumer

Bringing the Avid brand into the consumer market has a slight risk of diluting the Avid editing brand – if consumers edit on "Avid" what’s special about professional editors? However, by carefully managing product brands over company brand, as has been done with M-Audio, there should be an opportunity to bring some of those retail customers up to Xpress or Adrenaline products as their need grows, similar to the way Apple have a path for their iMovie customers to move up to Final Cut Express or Final Cut Pro.

Hardware

Avid and Pinnacle have had a long relationship on the hardware side – Targa supplied the first boards Avid used for video acquisition and the Meridien hardware was designed to Avid’s specifications but manufactured by Pinnacle as an OEM. Whether Avid has any use for the aging T3000 hardware product line (which, like Cinewave, is based on the Hub3 programmable architecture that was the primary driver of the Targa purchase) is debatable: Avid have embraced the CPU/GPU future for their products and are unlikely to change course again.

Cinewave

It almost certainly spells the end of Pinnacle’s only Mac product – Cinewave. Rumors were spreading independently of the Avid purchase that Cinewave was at the end of its product life, possibly spurred by changes coming in a future version of Final Cut Pro that no longer supported direct hardware effects. Regardless of whether or not there was any foundation in that rumor, Cinewave is an isolated product in that product group and based on relatively old technology. It is a tribute to the design flexibility and engineering team that essentially the same hardware is still in active production four years after release. Whether the product dies because it’s reached the end of its natural life, or because Avid could not be seen to be supporting the competing Final Cut Pro, it’s definitely at an end.

Liquid

There is, however, one part of the integration that simply does not fit: Pinnacle’s Liquid NLE software. Avid are acquiring an excellent engineering team – the former FAST team out of Germany – but the two NLEs have no commonality. Integrating features from one NLE into another is not trivial as code-bases are unlikely to have any compatibilities, and attempting to move Avid’s customer base toward any Liquid editor is unlikely to have any success at all.

Avid could simply let the product line die. The Liquid range has not exactly sold like hotcakes. This scenario would bring the best of the features and engineers into the Avid family and we’d see the results in 2-3 years as engineering teams merged.

They could, of course, leave Liquid alone – set it up as a division within the company and leave it be. Avid have done that with Digidesign, Softimage and M-Audio. No radical changes and slow integration of technologies where it makes sense. Liquid have probably taken few customers from Avid to date – few Composer customers have moved to Liquid. Instead, Liquid has acquired new NLE customers or people moving "up" from other NLEs. Liquid’s strongest customer bases are in small studios and in broadcast markets.

Although Avid have let Digidesign and M-Audio compete despite some overlap, it’s hard to imagine keeping a full product line that directly competes with the flagship products – on cheaper hardware at lower cost. Hard to imagine, but not impossible. It would be the most consistent behavior based on past acquisitions but one that would require a delicate balancing act to retain the new customers Pinnacle are bringing to the fold, without risking cutting into the more profitable Xpress, Media Composer and DS products.

Transaction

The transaction values Pinnacle at $462 million based on Avid’s closing price yesterday and will be handled by a combination of cash and shares. Avid will pay about $71 million in cash and issue 6.2 million new shares to the holders of Pinnacle stock, who will then make up about 15% of Avid’s shareholders. The transaction has been approved by the Boards of both companies but must still be approved by regulators and shareholders and is not expected to close until the 2nd or 3rd quarter of 2005.

The companies expect savings in regulatory costs, marketing and sales. We can expect little to change in the short term except, probably, some volatility in Avid’s stock price as people try to work out what it all means.

NAB is going to be interesting this year.

RSS, Vlogcasting and Distribution Opportunities [Edited]

Earlier I wrote about podcasting and its rapid uptake. Well, there’s every indication that video podcasting, of some sort, will follow. I think this is a tremendous opportunity for content creators because podcasting isn’t about broadcast but is in fact an opt-in subscription service. In any discussion of these subjects, and of blogs, it keeps echoing in my mind that RSS is really simple subscription management (and yes, conditional access is possible).

To draw some parallels with traditional media: blogs are the journalism and writing and RSS would be the publishing channel (the network). Blogs and podcasting are bypass technologies – they bypass traditional channels. If pressed for an explanation for the truly astounding growth of podcasting and blogging to a lesser degree, I would hypothesize that they are to some degree a reaction against the uniformity of voice of modern media, where one company owns a very large proportion of the radio stations in the US (and music promotion and billboards) and news media is limited to a half dozen large company sources with little bite and no diversity.

The “blogosphere” (hate that word but it’s in common use) broke the story of the faked service papers aired by CBS during the last Presidential election campaign, and even in the last few weeks has been instrumental in the firing of a high level CNN executive and revealing the “fake” White House journalist and his sordid past. Collectively at least this is real journalism – and more importantly, it’s investigative journalism of the sort that isn’t done by traditional news outlets.

Blogging is popular because it’s easy and inexpensive. Sign up for a free blog on a major service, or download open source blogging software like WordPress (which I use) and run it on your own server. In a few minutes your voice is out there to be found. In my mind it harkens back to the days of Wild West newspapers, where someone would set up a printing press and suddenly be a newspaper publisher. But unlike a newspaper, blogs have an irregular publishing schedule. You can bookmark your favorite blogs and check them (when you remember), or you can be notified by your RSS aggregator application when there’s a new post (the URL for the RSS feed for this blog is in the bottom right of the page if you want to add it to your favorites).

Podcasting is easy and inexpensive unless your podcast becomes popular – then the bandwidth expense becomes considerable. Podcasting is a superior replacement for webcasting or streaming that does not have to be in real time. It’s produced and automatically delivered for consumption when it suits the listener. Those are the key attributes that, in my opinion, contribute to its success. There’s no need for a listener to tune in at exactly the time it’s “broadcast” – listen or miss it – or even to remember to visit some archive and download. My own experience on DV Guys totally parallels this. DV Guys has a nearly-five-year history as a live show (Thursday 6-7 pm Pacific) but has always been more popular through its archives pages.

Shortly after the advent of podcasting we set up a podcast of the live show available by the next day. Since then DV Guys has enjoyed more listeners than at any other time in its life. People tended to forget to visit the archives site weekly, or every second week. Even visiting every couple of weeks was too much of a commitment for a show that, while entertaining and informative, wasn’t at the top of everyone’s “must do” list. But with a podcast DV Guys is ready, available whenever a listener has a few moments – at the gym, during a commute, while waiting for a meeting or at an airport. DV Guys, like most radio, is consumable anytime. Importantly, it puts the listener in control, not the creator, of the listening experience.

Podcasting audio has another advantage – it’s easy to create. Almost as easy as blogging but not quite so easy. There are, consequently, proportionally fewer podcasts than there are blogs, because of that higher entry requirement. Even then, most podcasts are simply guys (mostly guys) talking into a microphone from a prepared script, or a few people together talking around a microphone. More highly produced podcasts are rarer.

The simplicity of publishing a blog means that it can be published for as few as half a dozen people – in fact there are people looking to use blogs and wikis as part of a project management tool. Podcasting can reach thousands but in broadcast terms that’s a tiny niche market.

Here’s a new truism – almost all markets are niche markets. What these new publishing models do is aggregate enough people in a niche to make it a market. There’s a lot of money to be made in niches. Particularly in the US, where there is a multiplicity of cable channels, small niches in the entertainment industry can be aggregated, with appropriate low cost distribution channels, into profitable businesses. There are a lot of niches that are too small to have their needs met by even a niche cable network, so cable channels get subdivided, or there’s no content for small niches.

RSS, low cost production tools and P2P are your new distribution channel. This is the other side of the production revolution we’ve been experiencing over the last 10 years, when the cost of “broadcast quality” content has dropped from an equipment budget of $200,000 upward (for camera, recorder and edit suite with titling and DVE) to similar quality at well under $20,000 (and many people doing it for under $10,000). At the end, a computer (or computer-like) device will be one of the inputs to that big screen TV you covet and you’ll watch content fed via subscription when it suits. If it’s not news, it can come whenever, ready to be watched whenever.

The relative difficulty of producing watchable video content will further limit the numbers (as happened from blogging to podcasting) and the current state of video blogs will make experienced professionals cry. That should not stop you planning for your own network. Instead of “The Comedy Network”, the “Sci-Fi Network” etc, prepare for a world of the “The fingernail beauty network” or “Fiat maintenance network” or “Guys who like to turn wood on a lathe network”, et al. Content could be targeted geographically, or demographically. There are very profitable niches available. Two that I’ve been involved with, at the video production level, were for people who like to paint china plates (challenging to light) and basic metalwork skill training. There’s no need to fill 24 hours a day 7 days a week with this network model. When new content is produced, it’s ready for consumption when viewers want. We do something similar with our Pro Apps Hub where we publish content from our training programs piecemeal, as it’s produced, before we aggregate the disc version.

Note that I am not, basically, talking about computer-based viewing. My expectation is that software and hardware solutions will evolve into something usable as a home entertainment device. “TV” is a kick-back, put your feet up experience; video on the computer is a lean-forward, pay attention experience. While both could be used for the end target of this publication model, what I’m really talking about is content for that lean back experience.

Now, I don’t expect “Hollywood” (as a collective noun for big media) to embrace this model early, or even ever, but that doesn’t mean it’s not going to become viable. The most popular content will probably still go through the current distribution channels, however they evolve. It also doesn’t mean we’ll be restricted to small budget production either. It could (should) evolve models where the viewers were in much closer touch with the producers, without the gatekeeper model.

For example, the basic skill training video series I produced back in Australia was niche programming. There were, effectively, 75 direct customers in the small Australian market (smaller, I should point out, than California alone). No customer or central group had money for production but each one had $150 – $300 to buy a copy of a product. Since these were very simple productions requiring a small crew and being produced in a regional city, each project had about a 30% profit margin. If the same proportions applied to the US market, the budget would have doubled but the profit would have quadrupled or more.

Take another more current example. Star Trek Enterprise has been canned but the last season had 2.5 million viewers an episode with a budget of $1.6 million an episode. If each viewer paid 75c for the episode, delivered directly to their “TV storage device” (somewhere between a Media PC, Mac Mini or TiVo) then the producers would gross about $1.875 million, a profit of $275,000 an episode or roughly a 17% margin on what they were getting from the network. At 99c a show, that’s about 55% more revenue than was coming from the network. And the audience isn’t limited to just the US market – that same content can be delivered to Enterprise fans anywhere in the world. As the producer I could live on 13 episodes at $275,000 profit above and beyond previous costs (which presumably included some salary and profit). Moreover, producers wouldn’t be locked into rating cycles and the matching boom/bust production cycle.
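The per-viewer arithmetic is easy to check. This short Python sketch uses only the inputs quoted above (2.5 million viewers and a $1.6 million budget, treated as equal to what the network was paying) and ignores payment processing, delivery costs and any drop-off in paying viewers, none of which are quantified here. The results are therefore gross figures, and the function name is made up for the example.

```python
def direct_sale(viewers: int, price: float, budget: float):
    """Return (gross revenue, surplus over budget, surplus as a fraction of budget)."""
    revenue = viewers * price
    surplus = revenue - budget
    return revenue, surplus, surplus / budget


VIEWERS = 2_500_000
BUDGET = 1_600_000.0

for price in (0.75, 0.99):
    revenue, surplus, ratio = direct_sale(VIEWERS, price, BUDGET)
    print(f"${price:.2f}: revenue ${revenue:,.0f}, surplus ${surplus:,.0f} ({ratio:.1%})")
```

Any real per-transaction cost comes straight off those surplus figures, which is why the bandwidth question matters so much.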

It doesn’t matter if each high quality (HD if you want) episode takes 20 hours of download “in the background”. When it’s complete and ready to watch it appears in the play list as available.

Bandwidth would be a killer in that scenario – even with efficient MPEG-4 H.264 encoding, a decent SD image is going to require 1-1.5 Mbits/sec and HD is going to want 7 or 8 Mbits/sec. Assuming 45 minute episodes (sans commercials) that’s roughly 340-500 MB an episode per person, or in HD about 2.4-2.7 GBytes, per subscriber. Across 2+ million subscribers that’s going to eat my profit margin rather badly without another solution. There are technologies in place that could be adapted.
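For anyone who wants to rerun the numbers with different bitrates, here is a small Python sketch of the calculation. It counts video payload only at the stated bitrates, in decimal megabytes; audio, container overhead and any headroom would push real file sizes higher. The function names and the 2 million subscriber figure used in the loop are assumptions for the example.

```python
def episode_megabytes(mbit_per_sec: float, minutes: float) -> float:
    """Video payload of one episode in decimal megabytes (8 bits per byte)."""
    return mbit_per_sec * minutes * 60 / 8


def total_terabytes(mbit_per_sec: float, minutes: float, subscribers: int) -> float:
    """Total transfer in decimal terabytes if every subscriber downloads one copy."""
    return episode_megabytes(mbit_per_sec, minutes) * subscribers / 1_000_000


for label, rate in (("SD low", 1.0), ("SD high", 1.5), ("HD low", 7.0), ("HD high", 8.0)):
    per_episode = episode_megabytes(rate, 45)
    overall = total_terabytes(rate, 45, 2_000_000)
    print(f"{label}: {per_episode:,.1f} MB per episode, {overall:,.1f} TB for 2M subscribers")
```

Served from a single source, every one of those terabytes is paid for by the producer, which is the argument for spreading the load across the subscribers themselves with P2P.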

Assuming the bandwidth challenge is resolved, what’s left?

Two things mostly: the device that stores the bits ready for display on the TV and software to manage the conditional access (you only get the bits you bought) and playlists. Something like a video/movie version of the iTunes music store. We’ll need to wait for a big player like Apple or Sony to wield enough muscle for that, but in the meantime, we see the beginning with Ant but as a computer interface and without the simplicity and elegance of a Dish/Direct/TiVo user interface. But it will come.

Will you have your network business plan ready? I’m working on mine already.

Wired has another take. Videoblogging already has a home page and for a bit of thinking on the flip-side, how this might all work for the individual wanting to aggregate a personal channel, Robin Good has a blog article on Personal Media Aggregators in one of my favorite (i.e. challenging) blogs.

NAB, Rumors and business

Why does the Apple rumor mill get so frantic coming up to NAB? It’s not like we don’t all know to delay purchases until after NAB unless we can get a payback in the months between now and then. So what is it that makes us frantically review rumor sites and set the forums and email groups buzzing when ThinkSecret purports to leak (yet again) from within Apple?

Nobody can confirm or refute the rumors until Sunday April 17th, and in reality the rumors don’t do much more than supposedly “confirm” what can reasonably be inferred from existing public announcements (HDV support in FCP “next version” is an announced feature); known intentions to meet customer desire (heck there was even an obscure reference to Multicam in the FCP 4 manual suggesting it was, at one time, proposed for that version); or reasonable inference (CoreVideo technology in the OS would enhance FCP’s real time). New applications for sure – that’s called progress and until Apple have a full and complete set of professional tools in the Pro Apps product lineup then they’ll keep announcing new tools.

Since I am only guessing and have no knowledge, I won’t be publishing my guesses here or on DV Guys but ask me privately and I’ll make my guesses. Even though I think I’m as good at guessing as the next person I still expect to be surprised and impressed come NAB.

But that’s not the point – lots of opportunity for rumor mongering all over the place. It doesn’t do any good, it doesn’t influence business or buying decisions so why is there this intense speculation about what Apple might be going to announce? And why mostly Apple? Avid haven’t pre-announced their NAB releases. There’s the same level of secrecy going on but not the speculation.

Is this some bizarre desire to be “on the inside”? A sort of technological one-upmanship? It’s not like knowing there’s a new version of Final Cut Pro coming sometime (probably) in the next 2-3 months makes editing any easier today, or eases the pain of any “undocumented features” currently existing.

Until this last year or so I was keenly interested in listening to, and spreading, any rumors I could find, and yet now I find myself strangely uninterested. Curious, yes – I’ll go read the rumor and consider whether or not I think it’s reasonable – but I find myself not as interested in spreading the guesses and inference.

I wonder why that is? Is it finally maturity, or is it finally evidence that I am, officially, jaded? 🙂

Update March 1 – there’s just been a purported “leak” of Avid’s NAB announcements. While the leak is almost certainly bogus, this type of malicious leak can be very damaging. The supposed prices are way below what is reasonable for Avid (although if true, would be a real change of direction) and there are other key giveaways, for the educated reader, that this is not a real release. But now, whatever great announcements Avid had for NAB will be compared with a totally unrealistic, bogus release, setting up expectations that were never reachable.

At least that’s my take. If not and Avid do announce $5000 Unity and open interoperability with AJA and Decklink on April 16, then that paragraph will have never happened 😉

HDV – Is it something or is it nothing?

I’ve just added a comprehensive briefing paper to the Pro Apps Hub on HDV called, as the title of this post suggests, “Is it something or is it nothing?” Bottom line, it’s something all right and it’s going to be the final factor that drives production inexorably to HD.

Here’s the introductory paragraph:

“It’s hard not to be caught up in the HDV hype but is this 19/25 Mbit High Definition format going to take the world by storm, or does the heavy compression make it unworkable? This briefing paper takes a look at:

  • the format and how they fit an HD signal on a DV tape,
  • how it looks in practice,
  • how HDV can be edited,
  • distributing HDV, and
  • how it is likely to fit into, and change, the production and post-production industries. Particular attention is paid to working with HDV with Apple’s editing applications.”

You can access the briefing paper by downloading the free Pro Apps Hub software and following the link to download. The Pro Apps Hub is the most up to date, no time-wasting news for Apple’s Pro Apps users, daily productivity tips, briefing papers, the only index to the best of what’s free on the Internet – tutorials, articles, resources, forums so you don’t waste time with what isn’t great, and an online catalog. (Did I mention that I’m incredibly proud of what we’ve created with the Pro Apps Hub?)

Check out the HDV article and follow the link at the end of the article back here to comment. This entry will load directly in the Hub.