
Apple’s latest billion-dollar lawsuit relies on a little-known technology called predictive coding. Here’s how it works–and why every company should learn to use it.

The Amazing Forensic Tech Behind The Next Apple, Samsung Legal Dust-Up (And How To Hack It)

By Chris Dannen

In a San Jose federal court today, Apple will attempt to capitalize on this summer’s $1 billion win against Samsung by alleging that six more Samsung devices, including the wildly popular Galaxy S III, boast features stolen from the iPhone.

To demonstrate that the Korean phone-maker intentionally ripped off Apple, lawyers will have to show a pattern of let’s-copy-Apple talk among tens of millions of Samsung’s internal documents–too many for any legal team to sort through manually, except at enormous cost. So when a billion-dollar verdict is on the line, companies like Samsung and Apple are turning to a relatively new forensic artificial intelligence technique to prove who’s wrong. And since this type of cyberforensics is being used by the offense, the ramifications for business could be massive.

With a single Linux box and some open-source machine-learning software, corporations can now undertake risky data-driven lawsuits that, until recently, would have been expensive, prolonged, and suspect in court. But “predictive coding,” as it’s known in legal circles, is also a boon for any company trying to take stock of what it knows, comply with regulators, reverse-engineer a decision, or complete a merger. And since the software is based on an open-source project, almost any company can use it. Here’s how it works.

How Machine Learning Makes Lawsuits

In 2009, Joe Looby led the team, appointed by case trustee Irving Picard, that cracked Bernie Madoff’s “black box” servers to determine whether any of Madoff’s $65 billion in trades were real. From his office high above Times Square, Looby can see into a neighboring dance studio where Broadway performers rehearse their acts. In much the same way, predictive coding gives his firm, FTI Consulting, a detailed forensic window into a company’s practice.

“A couple good things are happening now,” Looby says. “Courts are beginning to endorse predictive coding, and training a machine to do the information retrieval is a lot quicker than doing it manually.”

The process of “information retrieval” (or IR) is the first part of the “discovery” phase of a lawsuit, dubbed “e-discovery” when computers are involved. Normally, a small team of lawyers would have to comb through documents and manually search for pertinent patterns. With predictive coding, they can manually review a small portion, and use the sample to teach the computer to analyze the rest. (A variety of machine learning technologies were used in the Madoff investigation, says Looby, but he can’t specify which.)
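To make the review-a-sample, score-the-rest idea concrete, here is a minimal sketch in Python. It is an illustration, not the software Looby’s team uses, which the article does not name. It assumes a simple naive Bayes text classifier as a stand-in technique, and the sample documents, the function names (`train`, `relevance_score`), and the relevant/not-relevant labels are all hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Crude whitespace tokenizer; real systems do far more cleanup.
    return text.lower().split()


def train(labeled_docs):
    """Learn word statistics from (text, is_relevant) pairs reviewed by hand.

    Assumes at least one relevant and one non-relevant example.
    """
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, label in labeled_docs:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[True]) | set(word_counts[False])
    return word_counts, doc_counts, vocab


def relevance_score(model, text):
    """Naive Bayes log-odds of relevance; above 0 leans relevant."""
    word_counts, doc_counts, vocab = model
    score = math.log(doc_counts[True] / doc_counts[False])
    for label, sign in ((True, 1), (False, -1)):
        # Add-one (Laplace) smoothing so unseen words never zero out a class.
        total = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:
                score += sign * math.log((word_counts[label][word] + 1) / total)
    return score


# Hypothetical hand-reviewed sample: True means "relevant to the case".
reviewed = [
    ("make the icons look like the competitor phone", True),
    ("copy the competitor home screen design", True),
    ("match the competitor bounce scrolling feature", True),
    ("lunch menu for the cafeteria this week", False),
    ("quarterly travel expense report attached", False),
    ("parking garage closed for maintenance", False),
]

# Hypothetical documents no human has read yet.
unreviewed = [
    "design should copy the competitor icons",
    "cafeteria closed this week for maintenance",
]

model = train(reviewed)
# Rank the unread pile so reviewers see the likeliest hits first.
ranked = sorted(unreviewed, key=lambda d: relevance_score(model, d), reverse=True)
for doc in ranked:
    print(round(relevance_score(model, doc), 2), doc)
```

In practice the ranked list would go back to the lawyers, who review the top of the pile, correct the machine’s mistakes, and retrain, so the sample and the model improve together.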

“Every case is different, so it’s hard to give an estimate of the IR time saved,” says Looby, “but if we use reasonable assumptions and model the time for two similar teams to pass through a million documents, a team with a predictive discovery machine should finish the job in less than a third of the time it takes the manual team to finish.”


ABOUT THE AUTHOR

I've written about innovation, design, and technology for Fast Company since 2007. I was the co-founding editor of FastCoLabs.

