Around the world, national and local governments are making the data they collect available as Open Data, for a host of good reasons. Open Data can help make government more accountable, empower and engage citizens in the workings of government, and deliver public services more efficiently and effectively. In the private sector, Open Data can create new opportunities for entrepreneurs, help established businesses operate more strategically, give investors valuable information, and accelerate the pace of R&D.
While this is all to the good, several questions remain unanswered. Who, exactly, is using open government data? What datasets are they using, and how? And, most difficult to answer: What, exactly, is open government data worth?
These are more than theoretical questions. Recent policies set by the U.S., the UK, and the G8 have embraced the idea that government data should be “open by default” – that is, it should be open for the public to use unless there are security, privacy, or other compelling reasons to keep it closed. That’s great in principle. But in practice, the U.S. government manages about 10,000 information systems using different kinds of technology, with varying levels of data quality. To justify the time, effort, and expense of turning them all into usable, reliable open data, we need to be able to quantify the costs and benefits.
The Open Data 500 study, which I’m now directing at the GovLab, will give economists and other researchers a new information base to help assess open data’s value. It’s the first real-world, comprehensive mapping of companies that use government open data in health, finance, education, energy, and other sectors. By asking these companies specific questions about the open datasets they’re using, we also hope to help government agencies prioritize which kinds of data are most important for business use, and in what formats.
Determining the overall value of open data isn’t easy. Many companies that use it are so new that it’s too early to measure their success. At the same time, many established companies use open government data as just one resource among several, making it hard to figure out how much it contributes to their business. So far, efforts to put a value on open government data have come up with a wide range of numbers. The data may be worth:
- Between 30 and 140 billion euros across the European Union (about $40 billion to $185 billion U.S.)
- Between about $3 billion and $9 billion U.S. for data in the UK
- Between $1.5 billion and more than $30 billion annually for U.S. weather data
- $90 billion, or some smaller number, for U.S. GPS data
Why such different figures? The first two estimates – one done by Graham Vickery for the European Union, and one for the UK from the consulting firm Deloitte – are for what the Europeans call “public sector data.” This is data that’s publicly available but is not truly Open Data because it may require a fee or have restrictions on re-use. Each of those estimates shows the range between direct uses of the data, for example in new informational websites and apps, and the larger value that could come from indirect benefits, such as creating a market for data-analysis technology or making government more efficient.
The U.S. numbers depend on exactly what you count. Open weather data supports an estimated $1.5 billion in applications in the secondary insurance market – but much greater value comes from accurate weather predictions, which save the U.S. more than $30 billion annually. The value of GPS data has been estimated at $90 billion, but that number comes from an industry study and may be too high. Another ambiguous number: While a McKinsey study found that the value of U.S. health data could be $300 to $450 billion, that estimate included private data sources as well as public ones, making it difficult to tease out the value of open data on health. McKinsey is now working on a total estimate for the value of all open government data, to be released at an O’Reilly Strata conference later this month.
The Open Data 500 won’t give us better numbers immediately: We won’t be able to simply add up the total number of jobs or revenue for the companies in our database, multiply by some X factor, and get a value for all the data provided by the U.S. government. (We’re studying only U.S. companies in this first survey.) But we hope to provide data that will help economists develop more accurate and precise estimates going forward. We plan to make our findings available on a website where researchers can download our data, new companies can complete our survey, and members of the open data community can suggest future research.
We also hope to launch a long-overdue dialogue between data-holders (federal agencies) and data-users (the open data companies themselves). We’re asking each company to tell us what government data it uses, how useful each dataset is, and how the datasets could be improved. With this kind of feedback, which has never been collected systematically, agencies will be able to prioritize the most important datasets for public use and rapidly make them more useful. In the UK, an Open Data User Group advises the government on how to improve the data it provides. The same kind of feedback is important for U.S. federal agencies too.
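As a rough illustration of how this kind of feedback could be tallied, here is a minimal Python sketch. It is not the Open Data 500 survey's actual format or scoring: the company names, dataset names, 1-to-5 usefulness scale, and the `rank_datasets` helper are all invented for the example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical survey responses: (company, dataset, usefulness 1-5, suggestion).
# All names and the 1-5 scale are assumptions made up for this illustration.
responses = [
    ("Company A", "weather forecasts", 5, "publish as machine-readable files"),
    ("Company B", "weather forecasts", 4, "update more frequently"),
    ("Company B", "housing records", 3, "standardize address fields"),
    ("Company C", "GPS signals", 5, ""),
    ("Company C", "housing records", 2, "add bulk download"),
]

def rank_datasets(rows):
    """Rank datasets by number of responses, then by mean usefulness.

    Assumes each company rates a given dataset at most once, so the
    response count equals the number of companies using that dataset.
    """
    scores = defaultdict(list)
    for _company, dataset, usefulness, _suggestion in rows:
        scores[dataset].append(usefulness)
    summary = [(name, len(vals), mean(vals)) for name, vals in scores.items()]
    # Most widely used first; ties broken by higher mean usefulness.
    return sorted(summary, key=lambda item: (-item[1], -item[2]))

for name, users, avg in rank_datasets(responses):
    print(f"{name}: {users} companies, mean usefulness {avg:.1f}")
```

Ranking by breadth of use first is only one possible choice; an agency might instead weight the usefulness scores or the improvement suggestions more heavily.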
Going into this project, we know there are already several strong examples of companies that have profited from Open Data. Vivek Kundra, who served as the first U.S. Chief Information Officer under President Obama, recently pointed to three examples of “companies that were built using raw government data”: Zillow, which uses housing data, is valued at more than $1 billion; the Weather Channel, built on government weather data, sold for about $3.5 billion in 2008; and Garmin, using GPS data, has a market cap of more than $7 billion. Just last week, the Climate Corporation, a startup whose CEO I interviewed recently, was sold for about $1 billion.
The question isn’t whether Open Data can be a business resource, but how to make it as useful a resource as possible. The issues include data availability, data quality, data formats, and the even larger policy, technical, economic, and legal issues that can affect Open Data’s value. Now that we know that open government data can be good for business, we need to learn what works, when, and why.
At this writing, we’ve identified more than 300 companies that use open government data and have invited them to participate in the Open Data 500. If you have a company to recommend, please fill out the form here; if you think your own company is a good candidate, you can find our survey here. Tweet your suggestions to us at #OD500, or send them to us at [email protected]. All input is welcome as we map this new territory.
Read more about this project and our methodology here.