Categories
Artificial Intelligence Machine Learning The Technology of Production

AI/Machine Learning Basics on the Digital Production BuZZ

I don’t always cross post my appearances on Larry Jordan’s Digital Production BuZZ, but I thought I gave a particularly good explanation of the basics of AI and Machine Learning, and how they might apply in production, so I’m sharing this one.

https://www.digitalproductionbuzz.com/interview/philip-hodgetts-the-basics-of-ai-explained/#.W9iRBi2ZN1Q

Categories
Interesting Technology Machine Learning The Technology of Production

The Advantage of Web APIs

Web APIs (Application Programming Interfaces) allow us to send data to a remote service and get a result back. Machine learning tools and Cognitive Services like speech-to-text and image recognition are mostly online APIs. Trained machines can be integrated into apps, but in general these services operate through an API.
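To make "send data to a remote service and get a result back" concrete, here is a minimal sketch in Python. Everything specific in it is an assumption for illustration: the endpoint URL, the bearer-token header and the `{"transcript": "..."}` reply shape are hypothetical and do not describe any particular transcription service.

```python
import json
import urllib.request

# Hypothetical endpoint: not a real service, used only for illustration.
API_URL = "https://api.example.com/v1/transcribe"


def build_transcription_request(audio_bytes: bytes, api_key: str) -> urllib.request.Request:
    """Package audio as an HTTP POST for a (hypothetical) speech-to-text API."""
    return urllib.request.Request(
        API_URL,
        data=audio_bytes,
        headers={
            "Content-Type": "audio/mp4",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def parse_transcript(response_body: bytes) -> str:
    """Read the transcript from an assumed JSON reply: {"transcript": "..."}."""
    return json.loads(response_body.decode("utf-8"))["transcript"]
```

In a real app the request would be sent with `urllib.request.urlopen(request)` and the bytes of the reply passed to `parse_transcript`. The app never sees the model behind the URL, only the request and the reply.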

The big advantage is that they keep getting better, without the local developer getting involved.

Nearly two years ago I wrote of my experience with SpeedScriber*, which was the first of the machine learning based transcription apps on the market. At the time I was impressed that I could get the results of a 16 minute interview back in less than 16 minutes, including prep and upload time. Usually the overall time was around the run time of the file.

Upload time is the downside of web-based APIs and is significantly holding back image recognition on video. That is why high quality proxy files are created for audio to be transcribed, which reduces upload time.

My most recent example, sourced from a 36 minute WAV, took around one minute to convert to archival quality m4a, which reduced the file size from 419 MB to 71 MB. The upload is now 2’15”, more than five times faster than the 12-plus minutes needed for the original, which more than compensates for the small prep time for the m4a.

The result was emailed back to me 2’30” later. That’s 36 minutes of speech transcribed with about 98% accuracy, in 2.5 minutes. That’s more than 14x real time. The entire time from starting the conversion to having the finished transcript back was 5’45” for 36 minutes of interview.
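The arithmetic behind those figures can be checked with a short Python sketch. The variable names are mine, and the 12 minute original upload is treated as a lower bound because the text only says "more than 12 minutes".

```python
def to_seconds(minutes: int, seconds: int = 0) -> int:
    """Convert a minutes'seconds" duration to whole seconds."""
    return minutes * 60 + seconds


audio_length = to_seconds(36)          # 36 minute interview
convert_time = to_seconds(1)           # "around one minute" to make the m4a
upload_time = to_seconds(2, 15)        # m4a upload
transcribe_time = to_seconds(2, 30)    # wait for the emailed transcript
original_upload_time = to_seconds(12)  # "more than 12 minutes": lower bound

# Conversion + upload + transcription: 60 + 135 + 150 = 345 s, i.e. 5'45"
total_time = convert_time + upload_time + transcribe_time

# Transcription speed relative to real time: 2160 / 150 = 14.4
speed_vs_real_time = audio_length / transcribe_time

# File size reduction: 419 MB / 71 MB, a little under 6x
size_reduction = 419 / 71

# Upload speed-up: at least 720 / 135, a little over 5x
upload_speedup = original_upload_time / upload_time

print(divmod(total_time, 60), round(speed_vs_real_time, 1))
```

Note that the 5’45” total only adds up if the one minute of conversion is counted along with the upload and the transcription wait.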

These APIs keep getting faster and can run on much “heavier iron” than my local iMac which is no doubt part of the reason they are so fast, but that’s just another reason they’re good for developers. Plus, every time the speech-to-text algorithm gets improved, every app that calls on the API gets the improvement for free.

*I haven’t used SpeedScriber recently but I would expect that it has similarly benefited from improvements on the service side of the API they work with.

Categories
Adobe Apple Apple Pro Apps Interesting Technology Machine Learning Nature of Work The Business of Production The Technology of Production

Maybe 10 Years is Enough for Final Cut Pro X

On the night of the Supermeet 2011 Final Cut Pro X preview I was told that this was the “foundation for the next 10 years.” Well, as of last week, seven of the ten have elapsed. I do not, for one minute, think that Apple intended to convey a ten year limit to Final Cut Pro X’s ongoing development, but maybe it’s smart to plan obsolescence: to limit the time an app continues to be developed before its suitability for the task is re-evaluated.

Categories
Business Intelligent Assistance Software Machine Learning Nature of Work The Technology of Production

Looking Forward to 2018

As part of the regular year end activities The Digital Production BuZZ invited me, and a bunch of other people, to look forward to 2018 and predict what the major themes will be.

Here is a link to the full show -  
http://www.digitalproductionbuzz.com/2018/01/digital-production-buzz-january-4-2018/
Here is a link to the Transcript-  
http://www.digitalproductionbuzz.com/2018/01/transcript-digital-production-buzz-january-4-2018/
And here are the links (including the MP3 version) to my individual interview - 
http://www.digitalproductionbuzz.com/interview/hodgetts-smart-assistants-hdr-and-vr/
MP3: http://www.digitalproductionbuzz.com/BuZZ_Audio/Buzz_180104_Hodgetts.mp3

Categories
Artificial Intelligence The Technology of Production

The Terence and Philip Show Episode 78: AI and Editing

Recently a research team at Stanford University showed a demonstration project that purported to cut a piece of dialog in “different styles.” What does it mean and how does it relate to the work Philip Hodgetts and Dr Gregory Clarke did in 2008 with First Cuts? What will it mean for employment?

Hear what Terence Curren and I had to say on the topic.

 

Categories
Interesting Technology Random Thought The Technology of Production

Looking back on 2017 on the Digital Production BuZZ

I was honored to be invited – as one of many – to provide my thoughts on 2017: what technologies were important, what major changes happened.

Here is a link to the full show -  
http://www.digitalproductionbuzz.com/2017/12/digital-production-buzz-december-28-2017/
Here is a link to the Transcript -  
http://www.digitalproductionbuzz.com/2017/12/transcript-digital-production-buzz-december-28-2017/
Or if you want to go direct to my segment:
http://www.digitalproductionbuzz.com/interview/philip-hodgetts-2017-in-review/
MP3: 
http://www.digitalproductionbuzz.com/BuZZ_Audio/Buzz_171228_Hodgetts.mp3

Categories
Adobe Apple Augmented Reality Business Interesting Technology Lumberjack Machine Learning Metadata The Business of Production The Technology of Production

2017 – 2018 Introspection

If I were to summarize 2017 it would be: AI, HDR, VR, AR and Resolve. In case you missed any of those trends, they are Artificial Intelligence (really Machine Learning); High Dynamic Range; Virtual Reality (i.e. 360 or 180 degree video); Augmented Reality; and Blackmagic Design’s Resolve 14.

As Augmented Reality is composited in at the viewer’s device, I doubt there will be any direct effect on production and post production.

Virtual Reality has had a good year with direct support appearing in Premiere Pro CC and Final Cut Pro X. In both cases the NLE’s parent purchased third party technology and integrated it. Combined with the ready availability of 360 cameras, there’s no barrier to VR production.

Except the lack of demand. I expect VR will become a valuable tool for a range of projects like installations, telepresence and travel, and particularly in gaming, although that’s outside my purview.

What I don’t expect is a large scale uptake for narrative or general entertainment functions. Nor in most of the vast range of video production. It’s not a fad, like 3D, but will likely remain a niche in the production world. I should point out it’s very possible to make good money in niches!

Conversely I would not buy a new screen without it being HDR compatible – at least with one or two of the major HDR formats. High Dynamic Range video is as big a step forward as color. I believe it provides a fundamentally better viewing experience than simply upping the pixel count.

High Dynamic Range is supported across the most important editing software but suffers from two challenges: the proliferation of competing standards and studio monitoring.

The industry needs to consolidate to one standard, or sets will have to be programmed for all standards. None currently are. Ultimately this will change because it has to, but some earlier set purchasers will probably be screwed over!

HDR studio monitors remain extremely expensive, and hard to find. There’s also the problem of grading for both regular and high dynamic range screens.

I have no doubt that HDR is fundamental to the future of the “television” screen. It will further erode attendance in movie theaters as the home experience is a better image than the movie theater, and you get to control who arrives in your media room!

In 2017 Resolve fulfilled its long-growing promise of integrating a fully featured NLE into an excellent grading and DIT tool, one with a decent Digital Audio Workstation also integrated. Blackmagic Design are definitely fulfilling their vision of providing professional tools for lens-to-viewer workflows, while continuing to reduce the cost of entry.

When I hear that editors in major reality TV production companies don’t balk at Resolve, despite being Media Composer traditionalists, I do worry that Avid may be challenged in its core market. Not that any big ProdCo has switched yet, but I wouldn’t be surprised to see significant uptake of Resolve as an editing tool in 2018.

My only disappointment with Resolve is that, as of 14.1, there is no way to bridge timed metadata into Resolve. Not only does that mean we cannot provide Lumberjack support, but there is no transcript (or AI derived metadata) import either. It’s frustrating because version 14 included Smart Collections that could function like Keyword Collections.

In another direct attack on Avid’s core markets, both Resolve and Premiere Pro CC added support for bin locking and shared projects. Implemented slightly differently by each app, they both mimic the way Media Composer collaborates. Resolve adds a nice refinement: in-app team messaging.

The technology that will have the greatest effect on the future of production has only just begun to appear. While it is generally referred to as Artificial Intelligence, what most people mean, and experience, is some variation on Machine Learning. These types of systems can learn (by example or challenge) to expertly do one or two tasks. They have been applied to a wide range of tasks, as I’ve written about previously.

The “low hanging fruit” for AI integration into production apps is Cognitive Services, which are programming interfaces that help interpret the world. Speech-to-text, facial recognition, image content recognition, emotion detection and the like are going to appear in more and more software.

In 2017 we saw several apps that use these speech-to-text technologies to get transcripts into Premiere Pro CC, Media Composer and Final Cut Pro X. Naturally that’s an area that Greg and I are very interested in: after all we were first to bring transcripts into FCP X (via Lumberjack Lumberyard). What our experience with that taught us is that getting transcripts into an NLE that doesn’t have Script Sync wasn’t a great experience. Useful but not great.

Which is why we spent the year creating a better solution: Lumberjack Builder. Builder is still a work in progress, but it’s a new NLE. An NLE that edits video by editing text. While Builder is definitely an improvement on purely transcription apps, it won’t be the only application of Cognitive Services.

I expect we at Lumberjack System will have more to show later in the year, once Builder is complete. I also expect this is the year we’ll see visual search integrated into Premiere Pro CC. Imagine being able to search b-roll by having computer vision recognize the content. No keywording or indexing.

Beyond Cognitive Services we will see Machine Learning driving marketing – and even production – decisions. In 2018, the terms Artificial Intelligence, Machine Learning, Deep Learning, Neural Networks will start appearing in the most unexpected places. (While they describe slightly different things, all those terms fall under the Artificial Intelligence umbrella.)

I’m excited about 2018, particularly as we do more with our new intelligent assistants.

Categories
Apple The Technology of Production

Multicamera Broadcasting: Then and Now

My first exposure to a multi-camera live broadcast event was in an earlier career when the local TV station set up for a broadcast in the theater I worked for. There were two trucks in the alley way. Cables running everywhere, and two days to set up. Running it all was a team of 12 people.

Yesterday, we did a three camera live broadcast and I carried the broadcast gear in my shoulder bag and was set up in an hour.

That’s a pretty serious change!

Categories
Interesting Technology The Technology of Production

Music Videos and Narrative Scripts by Artificial Intelligence?

A couple of recent articles have pointed to Artificial Intelligence writing, or contributing to, a screenplay. A narrative script. I find this fascinating, even though my own area of interest in applied AI is in non-scripted.

There is no doubt that computer algorithms – up to true AI – will be involved in production’s future. Smart people will work out how to master them.

Categories
Assisted Editing Interesting Technology Item of Interest The Technology of Production

Which technological innovation will take your job?

We’re all aware that technology changes the workplace. Jobs disappear; sometimes to be replaced by other jobs that didn’t exist before. During the industrial revolution we were replacing manual labor with machines. The coming revolution is for white collar “knowledge” jobs. How soon will yours be among them?