From Open Data to Data Collaboratives: Panel Reflects on the Case for Data Stewardship

by Andrew J. Zahuranec and Alexandra Shaw

From left to right: Niels Ploug (Danmarks Statistik), Nick Eng (LinkedIn), Adrienne Schmoeker (City of New York), Brennan Lake (Cuebiq), and Stefaan Verhulst (The GovLab).

As of 2019, over 2,000 datasets have been published on New York City’s open data platform, making the City a global leader in data sharing. This work has the potential to improve governance, empower citizens, create new economic opportunity, and help solve public problems.

These achievements are important to celebrate. But it’s also important to think about how the value of open data can be further amplified.

From March 1st through March 9th, The GovLab is commemorating the return of Open Data Week, a week-long event series recognizing New York City’s Open Data Law and all the advances the field has made in unlocking the value of open government data. To support the celebration, The GovLab hosted an official New York City Open Data Week panel on Monday, March 4th.

Together with Reaktor, The GovLab focused on how open data’s potential might be fully realized by supplementing it with data collected and held by the private sector and other actors. Using “data collaboratives”—where private and public actors work together around their data and where the supply and demand side meet to answer commonly agreed upon questions—society can tackle some of the biggest public challenges.

Through its work, The GovLab has compiled more than 150 examples of data collaboratives in action around the world, and has facilitated several of its own with partners.

The panelists represented both the demand and supply sides of data collaborative relationships. They included Adrienne Schmoeker (City of New York) and Niels Ploug (Danmarks Statistik) as well as Brennan Lake (Cuebiq) and Nick Eng (LinkedIn).

Stefaan Verhulst, The GovLab’s Co-Founder and Chief Research and Development Officer, moderated the discussion and raised questions about the potential and limitations of data collaboratives.

“The focus of this year is to think about how we expand open data to include corporate data through data collaboratives,” Stefaan said at the start of the conversation. “We need to ask how we can unlock the value of private data for good in a responsible way.”

The Potential of Data Collaboratives

In turn, each panelist provided a perspective on the ways data collaboratives could benefit the City and what the incentives are for the private sector to engage.

“Data collaboratives are incredibly powerful because the private sector is sitting on treasure troves of data that is often not accessible to those in the public sector,” said Brennan Lake, reflecting on the work done by Cuebiq’s Data for Good program.

The sentiment was echoed by Nick Eng, who noted how LinkedIn supported partners without racking up significant costs. The LinkedIn Economic Graph, for instance, uses job, company, and user data already uploaded to the company’s platform to provide a digital representation of the global economy.

“Most of the data we use [for these relationships] is already what was on hand,” he said. LinkedIn adheres to a “Members First” policy when conducting this work and makes a strong effort both to preserve member privacy and to respect users’ intentions for their data.

The participants discussed how data collaboratives could draw together siloed datasets and expertise. They also reviewed how partnerships could maximize data’s potential by allowing the most effective institutions and individuals to bring about new, innovative social solutions.

Importantly, they also noted the major corporate benefits.

“It gives us an opportunity to apply our data in novel and interesting ways,” said Brennan. “It is important to bring for-profit companies into the conversation so we can do things in a way that doesn’t inhibit growth, that lets others understand why we are doing what we are doing.”

“There are tertiary sales benefits through these insights,” said Nick, reflecting on both the reputational and revenue benefits. “Regional governments and others are now interested in what we have to bolster their economic workforce.”

The Need for Legal and Ethical Frameworks

While Adrienne Schmoeker and Niels Ploug agreed private companies have much to offer, they also wanted to ensure data usage took place in an ethical, legally compliant framework.

“The City of New York is hungry for new data. We are looking at what the private sector has to offer […] Something we have to keep in mind, though, is that we are responsible for considering both the worst-case scenario in addition to what the brightly lit future may hold. We are responsible for protecting the 8.6 million people who call New York their home,” Adrienne said.

She also voiced concerns about the City becoming dependent on data provided pro bono, which could create vulnerabilities should that “data philanthropy” dry up. By working carefully and being selective about partnerships, the City could ensure its use of private data is responsible and sustainable.

When asked about what would make his organization’s work easier, Niels Ploug suggested, “Legislation that gives us access to data with restrictions [would be useful] so we can ensure we don’t violate company or individual privacy.”

The business side of the conversation offered similar views.

“For us and a lot of organizations, there is both a financial and ethical imperative here,” said Brennan, explaining the need for data responsibility. “We are a for-profit company interested in long-term profit potential, so we need to make sure that the data we’re collecting contributes positively to society and the lives of the individual users who are generating that data.”

The Importance of Data Stewardship

The panel closed with audience questions ranging from the responsibilities of private companies reliant on public infrastructure to the transaction costs associated with data-sharing agreements.

Audience members asked about a variety of topics, from the costs of data-sharing agreements to the responsibilities of private corporations.

Still, all the panelists agreed that data stewards (responsible data leaders who seek new ways to create public value through cross-sector data collaboration) were integral to making both their organizations and these relationships systemic, sustainable, and responsible.

“There’s a slow march toward culture change taking place in the City of New York, an institution that is hundreds of years old,” Adrienne noted. “It’s slow, but the city has made immense progress thanks to the efforts of several tech champions.”

A member of the audience agreed.

“The good [collaboratives] have often been opportunistic, low-hanging fruit. We need institutional forms, data stewardship, that can align incentives.”

Establishing and sustaining these new collaboratives entail substantial risks. They also require significant resources and time-consuming efforts for both data holders on the supply side and institutions that represent the demand.

Organizations across the City of New York are studying and testing the possibilities presented by data collaboratives as well as the different uses of private-sector data. While there is still plenty of work to be done, this panel shows that the City is on the road to realizing its true data potential.