- The collection and combination of massive datasets is revolutionizing our ability to model and predict individual human behavior.
- On the whole, such analyses hold promise for improving global socioeconomic outcomes but also raise troubling privacy and civil liberties concerns.
Sandy Pentland is, per Edge, “one of the most-cited computer scientists in the world and was named by Forbes as one of the world’s seven most powerful data scientists.” He directs both the Human Dynamics Laboratory and the Media Lab Entrepreneurship program at the Massachusetts Institute of Technology (MIT). MIT also recently announced the founding of the Intel Science and Technology Center (ISTC) for Big Data, along with several other big data initiatives.
Given Pentland’s background at MIT, his interview with Edge offers a revealing look at how a diverse set of groups is working to break down data silos, connect datasets, and perform advanced predictive analytics on large datasets, work often grouped under the umbrella term “big data.” In Edge’s edited transcript, which is part of a broader collection of Edge conversations on technology, Pentland summarizes his definition of big data as follows:
“It’s not about the things you post on Facebook, and it’s not about your searches on Google…Big Data comes from things like location data off of your cell phone or credit card, it’s the little data breadcrumbs that you leave behind you as you move around in the world. What those breadcrumbs tell is the story of your life. It tells what you’ve chosen to do. That’s very different than what you put on Facebook. What you put on Facebook is what you would like to tell people, edited according to the standards of the day…Big data is increasingly about real behavior, and by analyzing this sort of data, scientists can tell…whether you are the sort of person who will pay back loans…if you’re likely to get diabetes.”
As Pentland suggests, big data can support remarkable predictions. Political campaigns can (and do) combine detailed personal databases to estimate each voter’s probability of support. Taken to its full potential, big data could let healthcare and insurance companies develop hyper-personalized risk assessments, let disease control officials head off epidemics by identifying infected individuals before a broader outbreak occurs, and allow dramatic market failures such as the 2010 “flash crash” to be predicted and prevented.
But in the process, such data can also expose deeply private insights into a person’s life, such as predictions of a couple’s likelihood of divorce or an individual’s sexuality. Then there is the simple fact that strangers might gain access to timestamped data on everywhere you have been in the past few years, and possibly to your payment history and the company you keep. As Pentland puts it:
“The fact that we can now begin to actually look at the dynamics of social interactions and how they play out, and are not just limited to reasoning about averages like market indices is for me simply astonishing. To be able to see the details of variations in the market and the beginnings of political revolutions, to predict them, and even control them, is definitely a case of Promethean fire. Big Data can be used for good or bad, but either way it brings us to interesting times. We’re going to reinvent what it means to have a human society.”
Related to Pentland’s work are the recent U.S. Consumer Privacy Bill of Rights and the European Union’s proposed data protection reforms. Existing legislative frameworks and more recent reforms have generally fostered a primarily opt-in, market-like data protection system. In essence, private companies must secure individuals’ permission in order to use their data, permission they typically obtain by offering some kind of consumer benefit. If organizations provide enough of an incentive for citizens to give up their data, and then use that data within the applicable government regulations, Pentland considers it a win-win-win.
Big data and the computational science underlying it have the potential to play a dramatic supporting role, if not a leading one, in redesigning our government institutions.
For an extended tutorial on big data, check out this lecture by the Jožef Stefan Institute Artificial Intelligence Laboratory. For a shorter video about an MIT Media Lab project, watch this YouTube video on privacy-preserving personal data storage.
FOR ADDITIONAL STUDY
- Cox, Lauren. “Getting More Value from Cell-Phone Data.” Technology Review (2011).
- Issenberg, Sasha. The Victory Lab: The Secret Science of Winning Campaigns. Crown, 2012.
- Dong, Wen, Bruno Lepri, and Alex ‘Sandy’ Pentland. “Modeling Co-Evolution of Behavior and Social Relationships Using Mobile Phone Data.” Mobile and Ubiquitous Multimedia (2011).
- Dong, Wen, Katherine Heller, and Alex ‘Sandy’ Pentland. “Modeling Infection with Multi-Agent Dynamics.” International Conference on Social Computing, Behavioral-Cultural Modeling, and Prediction (2012).