One of the very things that makes the Internet so useful to human traffickers is being turned against them with a new tool developed at the USC Viterbi School of Engineering’s Information Sciences Institute.
The sheer size of the Internet can make it easy for digital criminals to hide in plain sight (brazenly posting escort ads for underage individuals online, for example), knowing that millions of these ads already exist, with more created every day. There are far too many for law enforcement officers to search individually, and there are plenty of simple ways to obscure identifying information from large-scale searches (for example, listing a phone number as “Eight-1-eight, Five 55…”).
Pedro Szekely and Craig Knoblock of ISI treated it as a big data problem, creating a tool that combs through escort ads — mining, decoding and organizing the relevant data into an enormous but easily searchable database.
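The "decoding" step can be illustrated with a small sketch of how an obfuscated phone number might be normalized. This is not DIG's actual code, which is not shown here; the names `WORD_TO_DIGIT`, `DIGIT_TOKEN` and `normalize_digits` are hypothetical, and the approach (a simple word-to-numeral lookup) is an assumption chosen only for illustration.

```python
import re

# Hypothetical lookup table mapping spelled-out digits to numerals.
WORD_TO_DIGIT = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7",
    "eight": "8", "nine": "9",
}

# Match either a whole digit word or a single numeral.
# Word boundaries keep "one" from matching inside words like "someone".
DIGIT_TOKEN = re.compile(
    r"\b(?:" + "|".join(WORD_TO_DIGIT) + r")\b|\d",
    re.IGNORECASE,
)


def normalize_digits(text: str) -> str:
    """Return the digits found in text, with digit words converted to numerals."""
    tokens = DIGIT_TOKEN.findall(text)
    return "".join(WORD_TO_DIGIT.get(tok.lower(), tok) for tok in tokens)


if __name__ == "__main__":
    # The partial number quoted in the article, left incomplete as in the source.
    print(normalize_digits("Eight-1-eight, Five 55"))
```

A real system would need to handle far more variation (run-together words, letter-for-digit substitutions, numbers split across an ad), so this sketch only shows why such obfuscation defeats a plain text search but not a normalizing one.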
Known as DIG (for Domain-specific Insight Graphs), the tool allows officers searching for a missing child believed to be trapped in the escort industry to search by phone number, location, alias or even photo, and to pin down a way to reach the child.
“The Internet contains seemingly limitless information, but we’re constrained by our ability to search that information and come up with meaningful results. DIG solves that problem,” said Szekely, research associate professor at ISI.
DIG is simple enough that it won’t require special training to use. The database it draws on currently holds 50 million Web pages and 2 billion records, and it grows at a rate of roughly 5,000 pages per hour.
“As the database continues to grow, DIG will be able to uncover new connections and patterns in the data, making it even more useful,” said Knoblock, research professor and director of information integration at ISI.
The funding and initiative to create DIG came from Memex, a Defense Advanced Research Projects Agency (DARPA) program aimed at developing the next generation of Internet search tools in hopes of helping law enforcement agencies fight online human trafficking.
The code for DIG is open-source — and therefore free to law enforcement agencies — and will be upgraded quarterly over the course of the three-year project, which began in 2014. For example, Szekely and Knoblock plan to improve DIG so that it automatically flags potential victims and identifies trafficking rings through their ads and the victims under their control.
USC is a leader in the fight against human trafficking online, hosting the Technology and Human Trafficking Initiative, a project by the USC Annenberg Center on Communication Leadership and Policy to study the use and implications of communications technology in modern slavery.
DIG is in its sixth month. Szekely and Knoblock are now exploring other ways it might be useful, including the analysis of papers on materials science research.