How to teach a Machine

In a follow-up Tweet to my comments about Resolve 16, the Tweeter suggested they could use the Neural Engine to “improve translation of XML from other apps,” which led me to spend a little time explaining (and oversimplifying) how we train Machines for Machine Learning.

There are three approaches: training sets, oppositional training, and a simple goal.

Most Machine Learning neural networks are trained using massive amounts of “qualified” data: examples that have been tagged with the desired results to train the machine, then more known examples to test it. This has been the approach used for facial recognition, speech-to-text, emotion extraction, color grading, etc. Most of the things we’ve known Machine Learning for so far have been trained using this type of data set.
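To make the training-set idea concrete, here’s a minimal sketch: a tiny perceptron (about the simplest possible neural network) learns from examples tagged with the desired result, then is scored on held-out known examples. All of the data points and parameters here are invented for illustration.

```python
# Training-set ("supervised") learning in miniature: tagged examples
# train the machine, then known examples test it.

def train(examples, epochs=20, lr=0.1):
    """examples: list of ((x1, x2), label) pairs with label in {0, 1}."""
    w1 = w2 = b = 0.0
    for _ in range(epochs):
        for (x1, x2), label in examples:
            pred = 1 if (w1 * x1 + w2 * x2 + b) > 0 else 0
            err = label - pred          # zero when the machine is right
            w1 += lr * err * x1         # nudge the weights toward the tag
            w2 += lr * err * x2
            b += lr * err
    return w1, w2, b

def predict(model, point):
    w1, w2, b = model
    x1, x2 = point
    return 1 if (w1 * x1 + w2 * x2 + b) > 0 else 0

# "Qualified" data: inputs tagged with the desired result.
train_set = [((0, 0), 0), ((0, 1), 0), ((1, 0), 1), ((1, 1), 1)]
test_set = [((0.7, 0.3), 1), ((-0.2, 0.6), 0)]  # known examples, held out

model = train(train_set)
accuracy = sum(predict(model, p) == y for p, y in test_set) / len(test_set)
```

The essential point is that the machine never sees a rule, only tagged examples, and the held-out examples are what tell you whether it actually learned anything.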

Oppositional training is where you have one machine – for example – trying to create a human face, and another machine determining whether what is presented is a human face. Because these are machines, they iterate very, very quickly, and some amazingly realistic human faces have resulted.
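A toy caricature of that loop, with the “face” reduced to a single number near a hidden target: one machine generates candidates, the other scores how “real” they look, and the generator keeps any change that raises its score. Every value here is invented for illustration.

```python
import random

TARGET = 10.0  # what the discriminator considers perfectly "real"

def discriminator(sample):
    """Score in (0, 1]: 1.0 means indistinguishable from 'real'."""
    return 1.0 / (1.0 + abs(sample - TARGET))

def train_generator(iterations=10_000, seed=0):
    rng = random.Random(seed)
    guess = 0.0
    for _ in range(iterations):
        candidate = guess + rng.uniform(-1.0, 1.0)
        # Keep the change only if it fools the discriminator better.
        if discriminator(candidate) > discriminator(guess):
            guess = candidate
    return guess

fake = train_generator()
```

The generator never sees the target directly, only the judge’s verdict, and after thousands of very fast iterations its output becomes hard to tell from the real thing – which is the whole trick behind those realistic generated faces.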

Then there’s the simple goal. The clearest example I’ve seen so far is a bipedal stick figure in a simulated environment given the challenge “stay upright and move forward.” After millions of iterations and experiments the goal was achieved, so the researchers made the challenge more complex by adding terrain.
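A sketch of the same idea in one dimension: there is no tagged data at all, only a reward for the goal. This toy “walker” has a single parameter, its stride; too long a stride and it falls over, and random mutations are kept only when they score better. The reward function and numbers are made up for illustration.

```python
import random

def reward(stride, steps=100):
    """Distance covered while upright: the walker falls if stride > 1.0."""
    if stride <= 0 or stride > 1.0:
        return 0.0           # fell over: no progress at all
    return stride * steps    # stayed upright: distance moved forward

def train(iterations=5_000, seed=0):
    rng = random.Random(seed)
    stride = 0.1
    best = reward(stride)
    for _ in range(iterations):
        candidate = stride + rng.uniform(-0.05, 0.05)
        score = reward(candidate)
        if score > best:     # keep mutations that serve the goal better
            stride, best = candidate, score
    return stride, best

stride, best = train()
```

Nobody ever tells the walker how to walk; the stride creeps toward the longest step that doesn’t cause a fall, purely because that’s what the goal rewards.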

Given those gross oversimplifications of some very clever technology, let’s examine the idea of improving XML translation. I’m not aware of any existing training set, nor how you’d go about creating one: it would need perfectly matched timelines in two different apps, and the XMLs that represent them. The matching timelines would have to be built (or corrected) manually so they were a perfect representation of the other app’s timeline. Not particularly practical.

Nor do I see how the request could be reframed as oppositional training, or boiled down to a simple goal.

Now, another Tweeter suggested using the Resolve Neural Engine to improve cut detection, and that’s an entirely reasonable and feasible goal, as we have a substantial body of timelines with EDLs representing the cuts.
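That body of EDL-labelled footage is exactly the “qualified” training set the first approach needs. A hedged sketch of why: if each frame is reduced to a (here, entirely made-up) score for how different it is from the previous frame, and the EDL tells us which frames truly start a new shot, then “training” can be as simple as searching for the decision threshold that best reproduces the EDL.

```python
# Invented per-frame difference scores for ten frames of footage.
frame_diffs = [0.02, 0.03, 0.91, 0.04, 0.05, 0.88, 0.03, 0.02, 0.95, 0.01]
edl_cuts = {2, 5, 8}  # frame indices the EDL marks as cuts

def accuracy(threshold):
    """Fraction of frames classified the same way the EDL classifies them."""
    predicted = {i for i, d in enumerate(frame_diffs) if d > threshold}
    correct = (len(predicted & edl_cuts)
               + (len(frame_diffs) - len(predicted | edl_cuts)))
    return correct / len(frame_diffs)

# "Training": pick the threshold that best matches the labelled cuts.
best_threshold = max((t / 100 for t in range(1, 100)), key=accuracy)
```

A real detector would learn far more than one threshold, but the shape of the problem is the same: labelled examples in, a decision rule out – which is why cut detection is feasible where XML translation isn’t.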

Ultimately, what we can do with Machine Learning will come down to how we train it, which is why I am not expecting a Machine-based editing tool for a very long time.