Armies of Expensive Lawyers, Replaced by Cheaper Software

This is a fascinating article on how software can detect patterns of behavior from thousands and thousands of pages of documents. This technology goes way beyond searching for keywords:

The sociological approach adds an inferential layer of analysis, mimicking the deductive powers of a human Sherlock Holmes. Engineers and linguists at Cataphora, an information-sifting company based in Silicon Valley, have their software mine documents for the activities and interactions of people — who did what when, and who talks to whom. The software seeks to visualize chains of events. It identifies discussions that might have taken place across e-mail, instant messages and telephone calls.

Then the computer pounces, so to speak, capturing “digital anomalies” that white-collar criminals often create in trying to hide their activities.

For example, it finds “call me” moments — those incidents when an employee decides to hide a particular action by having a private conversation. This usually involves switching media, perhaps from an e-mail conversation to instant messaging, telephone or even a face-to-face encounter.

“It doesn’t use keywords at all,” said Elizabeth Charnock, Cataphora’s founder. “But it’s a means of showing who leaked information, who’s influential in the organization or when a sensitive document like an S.E.C. filing is being edited an unusual number of times, or an unusual number of ways, by an unusual type or number of people.”
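
To make the “call me” idea concrete, here is a minimal sketch in Python of how a media switch might be flagged. This is not Cataphora's method, which the article does not describe in detail. The `Message` record, the `find_media_switches` function, the 30-minute window and the sample log are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Message:
    # Hypothetical record type; the field names are assumptions for this sketch.
    sender: str
    recipient: str
    medium: str  # e.g. "email", "im", "phone"
    sent_at: datetime


def find_media_switches(messages, window=timedelta(minutes=30)):
    """Return (earlier, later) pairs where the same two people
    change medium within `window` of their previous contact."""
    last_seen = {}  # frozenset of the two participants -> their most recent Message
    switches = []
    for msg in sorted(messages, key=lambda m: m.sent_at):
        pair = frozenset((msg.sender, msg.recipient))
        prev = last_seen.get(pair)
        if (
            prev is not None
            and prev.medium != msg.medium
            and msg.sent_at - prev.sent_at <= window
        ):
            switches.append((prev, msg))
        last_seen[pair] = msg
    return switches


# Invented sample data: one quick email-to-phone switch, one slow email-to-IM switch.
log = [
    Message("alice", "bob", "email", datetime(2011, 3, 4, 9, 0)),
    Message("bob", "alice", "email", datetime(2011, 3, 4, 9, 5)),
    Message("alice", "bob", "phone", datetime(2011, 3, 4, 9, 10)),
    Message("carol", "dave", "email", datetime(2011, 3, 4, 9, 0)),
    Message("carol", "dave", "im", datetime(2011, 3, 5, 9, 0)),
]
switches = find_media_switches(log)
for earlier, later in switches:
    print(f"{earlier.sender}/{earlier.recipient}: {earlier.medium} -> {later.medium}")
```

Only the email-to-phone switch between alice and bob is flagged here; carol and dave change medium a full day later, which falls outside the window. A real system would presumably weigh far more context than timing alone, since most media switches are perfectly innocent.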

I can’t help but wonder how this type of software “smarts” is going to apply to postproduction technology. Recognizing intent and sentiment is a big step beyond where we are now, but will it come in the future?

I particularly like the paragraph toward the end where one of the developers of the technology applied it to his own company’s lawyers’ work from the 1980s and 1990s:

His human colleagues had been only 60 percent accurate, he found.

Creative work that can “only” be done by humans may not be so exclusively human after all. Software could detect the interesting moments in reality source footage just by analyzing the content (transcribed, of course).