While some places are think tanks, the GovLab is a Do Tank. We endeavor to undertake real world projects to test innovations in governance and see what works. Some places deliberate about the wisdom of which policy is the right one (or the left one). At the GovLab we focus on how to make policy; how to decide and solve problems more legitimately and effectively with the benefit of advances in science and technology and new insights from the social sciences. One of the most significant such innovations of the last five years is open data, which has been a key part of the Obama administration’s Open Government Initiative. It was therefore with the greatest pleasure that we celebrated last night the publication of Joel Gurin’s Open Data Now, a first-of-its-kind new book about the power of open data as a technique for solving complex problems. The book is remarkable for its lucidity, clarity and the range and depth of its stories.
Top, L to R: Steven Waldman (fmr Sr. Advisor, FCC), Laura Manley (Open Data 500), Joel Gurin. (All photos courtesy of Paloma Baytelman.)
Bottom, L to R, seated: Gurin, Marc DaCosta (Enigma), Krishna Venkatraman (OnDeck)
Data can be considered open data if it comes in computable datasets that are freely re-usable without significant legal and technological restriction. The rationale behind open data is that citizens are entitled to access and use data that they have paid to collect as taxpayers (in the case of government data) or that they are the subjects of (in the case of personal data). But the instrumental rationale for open data and the reason it is a powerful lever for problem solving is two-fold.
First, raw datasets in the right formats lend themselves to computation. Data that is merely readable to humans but not to machines, such as a PDF of a report of CEO salaries, can only be looked up one entry at a time. Data that is provided as open data in a format that computers can “read” can easily be used to create visualizations, models, analyses and new tools, enabling us to spot trends and patterns across a whole industry, such as who is the highest paid CEO or which industries have the lowest paid ones.
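To make the contrast concrete, here is a minimal sketch of what machine-readable salary data allows. The dataset, company names, column names and figures are all invented for illustration and do not come from any real source; the point is only that once the data is in a format like CSV, questions about a whole industry take a few lines of code rather than a page-by-page lookup.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data, invented purely for illustration.
# A real open dataset would be downloaded from a publisher's site.
SAMPLE_CSV = """company,industry,ceo,salary_usd
Acme Corp,Manufacturing,A. Rivera,1200000
Beta Bank,Finance,B. Chen,3400000
Gamma Foods,Retail,C. Okafor,800000
Delta Capital,Finance,D. Singh,2900000
Epsilon Mart,Retail,E. Novak,950000
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, converting salaries to integers."""
    rows = list(csv.DictReader(io.StringIO(text)))
    for row in rows:
        row["salary_usd"] = int(row["salary_usd"])
    return rows


def highest_paid(rows):
    """Return the row describing the highest paid CEO."""
    return max(rows, key=lambda row: row["salary_usd"])


def average_pay_by_industry(rows):
    """Return a dict mapping each industry to its mean CEO salary."""
    salaries = defaultdict(list)
    for row in rows:
        salaries[row["industry"]].append(row["salary_usd"])
    return {name: sum(vals) / len(vals) for name, vals in salaries.items()}


def lowest_paid_industry(rows):
    """Return the industry with the lowest mean CEO salary."""
    averages = average_pay_by_industry(rows)
    return min(averages, key=averages.get)


rows = load_rows(SAMPLE_CSV)
print("Highest paid CEO:", highest_paid(rows)["ceo"])
print("Lowest paid industry:", lowest_paid_industry(rows))
```

The same questions asked of a PDF would require someone to read and transcribe every entry first, which is the practical difference between data that is merely published and data that is open.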
Second, because open data are malleable and manipulable by people other than the owners of the data, they are the means to the end of fostering collaboration among diverse actors around a problem.
As Joel explains in Open Data Now, open data are driving more accountable and efficient government. When public spending data are released, we can scrutinize them for fraud, waste and abuse. Open data also enable people to cut through red tape and empower consumer safety and choice. FoodSafety.gov is a one-stop shop for people to find out about recalls without having to know which agency regulates meat pizza (USDA) and which one cheese pizza (FDA). It is made possible by combining data from various agencies.
While the open data movement is made possible by a government policy to make public data freely available, the real success story, according to Gurin, is what entrepreneurs do with that data. He writes about companies like CarFax, which pulls together data from dozens of different government agencies to provide consumers with a detailed purchase history of a used car, or Climate Corporation, whose data-driven weather insurance is helping farmers increase agricultural productivity by twenty percent or more.
At last night’s party, we got to hear from:
- Russell Graney, CEO of Aidin, a web application used by hospitals to find their patients the best post-acute care providers, provide referrals in the area, and save hospitals money.
- Krishna Venkatraman, SVP of Data and Analytics at OnDeck, a company that streamlines the borrowing process for small businesses by using open data to evaluate a company’s business performance.
- Marc DaCosta, Co-Founder, Enigma, whose software helps users see hidden facts and connections across the often messy universe of public data.
- David “Doc” Searls, Author, The Intention Economy; Co-Author, The Cluetrain Manifesto, founder of Project VRM at Harvard’s Berkman Center, and a thought leader on personal data and open data.
We also got to meet several of the groups that are helping to organize the movement's non-corporate participants, including the many activists, students, and geeks who are using this data to make useful tools. The event drew representatives of DataKind, which connects expert data scientists with social change organizations to help them visualize and understand data; the NYC Open Data Meetup Group, which helps visualize public open data and runs workshops and talks; and BetaNYC, a part of the Code for America Brigade program that mobilizes civic-minded volunteers around platforms for local government and community service.
Left to right: Venkatraman, DaCosta, Noveck, Gurin, Searls, Graney
There’s a virtuous cycle when government releases data, which, in turn, provides the asset for companies to build solutions that create jobs, lead to innovations and improve people’s lives. As Doc Searls pointed out to the assembled crowd, open data is like the Internet. It’s another example of an innovation where the government is leading the pack. And just like the Internet, which CIOs didn’t want at the desktop a generation ago, soon companies, too, will not only use government data, but start to realize the value of opening up and sharing the data that they collect from and hold about their customers.
The GovLab’s Open Data 500, led by Joel Gurin, is the first comprehensive study of U.S. companies using open government data to develop new products and services. The study will identify, describe, and analyze companies that use open government data in their businesses. It will help map the landscape of open data’s application today – and, we hope, help encourage even more uses of open data.
To be clear, open data is not big data. Big data are the enormous data collections of our data exhaust that we spew out every time we use our mobile phones, search on Google or make a credit card purchase. In recent months, there has been an outcry about the ways in which the NSA has its mouth directly on the tailpipe of this data exhaust and is mining these vast collections of aggregated human behavior data for its own purposes. (Companies can mine big data intrusively too, though the outcry there has been less.) Big data is what happens to us or, as Joel puts it, a “problem to be managed.” We are victims of surveillance and passive participants in the story of big data.
But open data is “data with a mission” to accomplish good. In Open Data Now, we learn about the pragmatic, positive and purposeful things we can affirmatively do with open data to improve all of our lives and create a world, as Gurin exhorts us, that is “fairer and more abundant.”