I (along with many other people) have been beta testing SpeedScriber, an unreleased app that combines the power of an API for speech to text with a well thought out interface for correcting the machine transcription. Feed the SpeedScriber output to Lumberyard (part of Lumberjack System), extract Magic Keywords, and in a very short period of time (dependent largely on FCP X’s import speed for the XML) you have an organized, keyworded Event with a fully searchable Transcript in the Notes field.
One of the powerful ways Artificial Intelligence ‘learns’ is by using neural networks. Neural Networks are trained with a large number of examples where the result is known. The Neural Network adjusts until it gives the same result as the human ‘teacher’.
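To make "adjusts until it gives the same result as the teacher" a little more concrete, here is a minimal sketch in Python. It is not a full neural network: it is a single artificial neuron (a perceptron), the simplest case I could think of, and the function names, the learning rate and the tiny training set (the logical AND of two inputs) are all my own assumptions for illustration.

```python
# A single artificial neuron (perceptron), not a full neural network.
# The training data, names and learning rate are illustrative assumptions.

def predict(weights, bias, inputs):
    """Return 1 if the weighted sum of the inputs is positive, else 0."""
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if activation > 0 else 0


def train(examples, epochs=100, learning_rate=1.0):
    """Nudge the weights after every wrong answer until none are wrong."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for inputs, target in examples:
            error = target - predict(weights, bias, inputs)
            if error != 0:
                mistakes += 1
                # Move each weight toward the answer the 'teacher' gave.
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, inputs)]
                bias += learning_rate * error
        if mistakes == 0:
            break  # the neuron now agrees with every known example
    return weights, bias


# Known examples: the 'teacher' says the answer is 1 only when both inputs are 1.
EXAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train(EXAMPLES)
```

After training, `predict(weights, bias, (1, 1))` agrees with the teacher, as does every other example. Real networks stack many of these units and use more sophisticated update rules, but the loop of "guess, compare with the known result, adjust" is the same idea.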
However, there’s a trap. If that source material contains biases – such as modeling Police ‘stop and frisk’ – then whatever biases are in the learning material will be contained in the subsequent AI modeling. This is the subject of an article in Nature, There is a blind spot in AI research, and also of the praise for Cathy O’Neil’s book Weapons of Math Destruction, which brings up not only that issue but also the problem of “proxies”.
Proxies, in this context, are data sources that are used in AI programs that are not the actual data, but rather something that approximates the data: like using zip code as a proxy for income or ethnicity.
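Here is a rough sketch of how a proxy can carry a bias into a model that never sees the sensitive attribute. Everything in it is invented for illustration: the zip code labels, the groups "A" and "B", the past decisions and the 50% rule are toy assumptions, not real data or anyone's actual algorithm.

```python
from collections import defaultdict

# Invented toy history: (zip_code, group, approved). Not real data.
HISTORY = [
    ("zip_1", "A", 0), ("zip_1", "A", 0), ("zip_1", "A", 1), ("zip_1", "B", 0),
    ("zip_2", "B", 1), ("zip_2", "B", 1), ("zip_2", "B", 0), ("zip_2", "A", 1),
]


def learn_zip_rule(history):
    """'Train' on zip code only: approve a zip if most past cases were approved."""
    totals = defaultdict(int)
    approvals = defaultdict(int)
    for zip_code, _group, approved in history:  # group is never looked at
        totals[zip_code] += 1
        approvals[zip_code] += approved
    return {z: approvals[z] / totals[z] >= 0.5 for z in totals}


def approval_rate_by_group(rule, applicants):
    """Measure how the zip-only rule treats each group."""
    totals = defaultdict(int)
    approvals = defaultdict(int)
    for zip_code, group in applicants:
        totals[group] += 1
        approvals[group] += int(rule[zip_code])
    return {g: approvals[g] / totals[g] for g in totals}


rule = learn_zip_rule(HISTORY)
applicants = [(zip_code, group) for zip_code, group, _ in HISTORY]
rates = approval_rate_by_group(rule, applicants)
```

With this toy history the rule approves a quarter of group A and three quarters of group B, even though group was never an input. The zip code did the work on its behalf.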
Based on O’Neil’s book, I’d say the authors of the Nature article are too late. There are already institutionalized biases in very commonly used algorithms in finance, housing, policing and criminal policy.
Rather than take up more screen real estate with a new button, we repurposed an existing function in Lumberyard. Previously, any logged Keyword Range less than 5 seconds long was ignored. We figured anything that short was a mistake. Now it creates a Marker.
The Marker takes the Keyword as its name, and it is applied at the starting point of the logged range as a single frame Marker.
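The rule described above can be sketched in a few lines of Python. This is only an illustration of the logic, not Lumberyard's actual code: the `KeywordRange` and `Marker` types, the function name and the use of seconds for times are all hypothetical.

```python
from dataclasses import dataclass

# Threshold from the post: ranges shorter than this become Markers.
MIN_RANGE_SECONDS = 5.0


@dataclass
class KeywordRange:  # hypothetical type, for illustration only
    keyword: str
    start: float  # seconds
    end: float    # seconds


@dataclass
class Marker:  # hypothetical type, for illustration only
    name: str
    position: float  # seconds


def convert_short_ranges(ranges):
    """Split logged ranges into real Keyword Ranges and Markers."""
    kept, markers = [], []
    for r in ranges:
        if r.end - r.start < MIN_RANGE_SECONDS:
            # Too short to be a useful range: place a Marker named after
            # the Keyword at the point where the range started.
            markers.append(Marker(name=r.keyword, position=r.start))
        else:
            kept.append(r)
    return kept, markers


logged = [
    KeywordRange("Interview", 10.0, 70.0),
    KeywordRange("Great quote", 32.0, 33.5),
]
kept, markers = convert_short_ranges(logged)
```

Here "Interview" stays a Keyword Range, while the 1.5 second "Great quote" range turns into a Marker at 32 seconds.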
Tim Cook – Apple’s CEO – has said in a new interview with BuzzFeed that Augmented Reality (AR) will be more important than Virtual Reality (VR). Virtual Reality creates a new environment that is immersive for the viewer. Augmented Reality overlays computer generated data on the real world (as captured by a camera).
While VR is undoubtedly going to be a significant technology in the future, I think it will mostly enhance games, exhibits and remote presence rather than everyday activities. AR can overlay translated text over foreign signage. AR can create geotagged games like the recent Pokemon Go.
I can see how AR will become part of everyday life. I’m not sure I see the same for VR.
As you probably all know, I have two day jobs heading Intelligent Assistance Software and Lumberjack System. We’re very proud of the work we’ve done through both companies. We make a decent income from them for sure, but what makes us particularly happy is when our tools get people’s work done faster. They get to go home to their families earlier and production has less drudgery.
So it pleases us greatly when that gets recognized, as it did this trip.
In this latest episode of The Terence and Philip Show, Terry and I discuss metadata, my citizenship, smart APIs, Artificial Intelligence and more.
The extensive article by Steven Levy – The iBrain is Here – is a fascinating read on how Apple are using Machine Learning, neural networks and Artificial Intelligence across product lines. It’s well worth the time to read through, but this quote from Phil Schiller stood out:
“We use these techniques to do the things we have always wanted to do, better than we’ve been able to do,” says Schiller. “And on new things we haven’t be able to do. It’s a technique that will ultimately be a very Apple way of doing things as it evolves inside Apple and in the ways we make products.”
The ways this could all be aligned with editing? Speech-to-text; keyword extraction (just like Magic Keywords in Lumberjack System); sentiment extraction; image recognition; facial detection and recognition; speech controlled editing (if anyone really wants that), and the list goes on.
I’d like to believe the Pro Apps Team are working on this.
UPDATE: Ruslan Salakhutdinov is Apple’s first Director of AI.
It’s a competition piece, so if you’d all like to go to http://indi.com/7fqks and vote for Marlon Braccia, we’d appreciate it.
I edited it in FCP X, using significant amounts of speed change, chroma key, crop and blur on the background. Those in LA can see it in person, and learn how it was done in detail, at the August 24 meeting of LACPUG.
When I discovered I could do in two keystrokes what took 9 mouse clicks and keystrokes in Soundtrack Pro, I never looked back and now edit all my audio-only projects in FCP X.
I got together with Marcelo Lewin of DigitalMedia Pros and explained how I do it.
Most of the thinking – the little that’s done – around the effect of Artificial Intelligence and Robotics replacing jobs is somewhat negative, so it was almost a relief to read John Hagel’s perspective that we could use this transition as an opportunity to rethink the nature of work.