Jos Berens (Centre for Innovation, Leiden University) and Stefaan G. Verhulst (GovLab)
As part of an ongoing effort to build a knowledge base for the field of opening governance by organizing and disseminating its learnings, the GovLab Selected Readings series provides an annotated and curated collection of recommended works on key opening governance topics. In this edition, we explore the literature on Data Governance. To suggest additional readings on this or any other topic, please email [email protected]
Our work on Data Collaboratives starts from the assumption that sharing and opening up private sector datasets has great – and yet untapped – potential for promoting social good (see, for instance, the GovLab Selected Readings on Data Collaboratives). At the same time, the potential of data collaboratives depends on the level of societal trust in the exchange, analysis and use of the data. Strong data governance frameworks are essential to ensure responsible data use. Without such governance regimes, the emergent data ecosystem will be hampered and the (perceived) risks will dominate the (perceived) benefits. Further, without adopting a human-centered approach to the design of data governance frameworks, including iterative prototyping and careful consideration of the experience, the responses may fail to be flexible and targeted to real needs.
To help develop new approaches to sharing corporate data assets for social good, GovLab is working with Leiden University (The Netherlands) and the World Economic Forum Data-Driven Development project. Our Data Governance Project aims to design and implement the approaches and tools needed to unleash the datasets that could be used to improve people’s lives. Our work builds upon existing efforts and findings, some of them curated and documented below. For more information about the Data Governance Project please contact Jos Berens or Stefaan Verhulst.
Selected Readings List (in alphabetical order)
- Better Place Lab – Privacy, Transparency and Trust – a report looking specifically at the main risks development organizations should focus on to develop a responsible data use practice.
- The Brookings Institution – Enabling Humanitarian Use of Mobile Phone Data – this paper explores ways of mitigating privacy harms involved in using call detail records for social good.
- Centre for Democracy and Technology – Health Big Data in the Commercial Context – a publication treating some of the risks involved in using new sources of health related data, and how to mitigate those risks.
- Centre for Information Policy Leadership – A Risk-based Approach to Privacy: Improving Effectiveness in Practice – a whitepaper on the elements of a risk-based approach to privacy.
- Centre for Information Policy Leadership – Data Governance for the Evolving Digital Market Place – a paper describing the organizational reforms needed to effectively promote accountability within organizational structures.
- Crawford and Schultz – Big Data and Due Process: Toward a Framework to Redress Predictive Privacy Harms – a paper considering a rigorous ‘procedural data due process’.
- DataPop Alliance – The Ethics and Politics of Call Data Analytics – a paper exploring the risks involved in using call detail records for social good, and possible ways of mitigating those risks.
- Data for Development External Ethics Panel – Report of the External Ethics Review Panel – a report presenting the findings of the external expert panel overseeing the Data for Development Challenge.
- Federal Trade Commission – Mobile Privacy Disclosures: Building Trust Through Transparency – a report by the FTC looking at the privacy risks involved in mobile data sharing, and ways to mitigate these risks.
- Leo Mirani – How to use mobile phone data for good without invading anyone’s privacy – a paper on the use of data produced by mobile phone use, and the steps that need to be taken to ensure that user privacy is not intruded upon.
- Lucy Bernholz – Several Examples of Digital Ethics and Proposed Practices – a literature review listing multiple sources compiled for the Stanford Ethics of Data conference, 2014.
- Martin Abrams – A Unified Ethical Frame for Big Data Analysis – a paper from the Information Accountability Foundation on developing a unified ethical frame for data analysis that goes beyond privacy.
- NYU Centre for Urban Science and Progress – Privacy, Big Data and the Public Good – a book on the privacy issues surrounding the use of big data for promoting the public good.
- Neil M. Richards and Jonathan H. King – Big Data Ethics – a research paper arguing that the growing impact of big data on society calls for a set of ethical principles to guide big data use.
- OECD Revised Privacy Guidelines – a set of principles accompanied by explanatory text used globally to inform the governance and policy structures around data handling.
- White House Big Data and Privacy Working Group – Big Data: Seizing Opportunities, Preserving Values – a whitepaper documenting the findings of the White House big data and privacy working group.
- World Economic Forum – Pathways for Progress – a whitepaper considering the global data ecosystem and the constraints preventing data from flowing to those who need it most. A lack of well-defined and balanced governance mechanisms is considered one of the key obstacles.
Annotated Selected Readings List (in alphabetical order)
Better Place Lab, “Privacy, Transparency and Trust.” Mozilla, 2015. Available from: http://www.betterplace-lab.org/privacy-report.
- This report looks specifically at the risks involved in the social sector having access to datasets, and the main risks development organizations should focus on to develop a responsible data use practice.
- Focusing on five specific countries (Brazil, China, Germany, India and Indonesia), the report displays specific country profiles, followed by a comparative analysis centering around the topics of privacy, transparency, online behavior and trust.
- Some of the key findings mentioned are:
- A general concern about privacy, with cultural differences influencing conceptions of what privacy is.
- Cultural differences determining how transparency is perceived, and how much value is attached to achieving it.
- To build trust, individuals need to feel a personal connection or get a personal recommendation – it is hard to build trust regarding automated processes.
Montjoye, Yves-Alexandre de; Kendall, Jake; and Kerry, Cameron F. “Enabling Humanitarian Use of Mobile Phone Data.” The Brookings Institution, 2015. Available from: http://www.brookings.edu/research/papers/2014/11/12-enabling-humanitarian-use-mobile-phone-data.
- Focusing in particular on mobile phone data, this paper explores ways of mitigating privacy harms involved in using call detail records for social good.
- Key takeaways are the following recommendations for using data for social good:
- Engaging companies, NGOs, researchers, privacy experts, and governments to agree on a set of best practices for new privacy-conscientious metadata sharing models.
- Accepting that no framework for maximizing data for the public good will offer perfect protection for privacy, and that privacy concerns must be balanced against the potential for social good.
- Establishing systems and processes for recognizing trusted third-parties and systems to manage datasets, enable detailed audits, and control the use of data so as to combat the potential for data abuse and re-identification of anonymous data.
- Simplifying the process among developing-country governments with regard to the collection and use of mobile phone metadata for research and public good purposes.
Centre for Democracy and Technology, “Health Big Data in the Commercial Context.” Centre for Democracy and Technology, 2015. Available from: https://cdt.org/insight/health-big-data-in-the-commercial-context/.
- Focusing particularly on the privacy issues related to using data generated by individuals, this paper explores the overlap in privacy questions this field has with other data uses.
- The authors note that although the Health Insurance Portability and Accountability Act (HIPAA) has proven a successful approach in ensuring accountability for health data, most of these standards do not apply to developers of the new technologies used to collect these new data sets.
- For customer-facing technologies not covered by HIPAA, the paper proposes an alternative framework for considering privacy issues. The framework is based on the Fair Information Practice Principles and three rounds of stakeholder consultations.
Centre for Information Policy Leadership, “A Risk-based Approach to Privacy: Improving Effectiveness in Practice.” Centre for Information Policy Leadership, Hunton & Williams LLP, 2015. Available from: https://www.informationpolicycentre.com/privacy_risk_framework/.
- This white paper is part of a project aiming to explain what is often referred to as a new, risk-based approach to privacy, and the development of a privacy risk framework and methodology.
- With the pace of technological progress often outstripping the capabilities of privacy officers to keep up, this method aims to offer the ability to approach privacy matters in a structured way, assessing privacy implications from the perspective of possible negative impact on individuals.
- With the intended outcomes of the project being “materials to help policy-makers and legislators to identify desired outcomes and shape rules for the future which are more effective and less burdensome”, insights from this paper might also feed into the development of innovative governance mechanisms aimed specifically at preventing individual harm.
Centre for Information Policy Leadership, “Data Governance for the Evolving Digital Market Place”, Centre for Information Policy Leadership, Hunton & Williams LLP, 2011. Available from: http://www.huntonfiles.com/files/webupload/CIPL_Centre_Accountability_Data_Governance_Paper_2011.pdf.
- This paper argues that as a result of the proliferation of large scale data analytics, new models governing data inferred from society will shift responsibility to the side of organizations deriving and creating value from that data.
- Given the challenge corporations face in enabling agile and innovative data use, the paper notes that “In exchange for increased corporate responsibility, accountability [and the governance models it mandates, ed.] allows for more flexible use of data.”
- Proposed as a means to shift responsibility to the side of data-users, the accountability principle has been researched by a worldwide group of policymakers. Tracing the history of the accountability principle, the paper argues that it “(…) requires that companies implement programs that foster compliance with data protection principles, and be able to describe how those programs provide the required protections for individuals.”
- The following essential elements of accountability are listed:
- Organisation commitment to accountability and adoption of internal policies consistent with external criteria
- Mechanisms to put privacy policies into effect, including tools, training and education
- Systems for internal, ongoing oversight and assurance reviews and external verification
- Transparency and mechanisms for individual participation
- Means of remediation and external enforcement
Crawford, Kate; and Schultz, Jason. “Big Data and Due Process: Toward a Framework to Redress Predictive Privacy Harms.” NYU School of Law, 2014. Available from: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2325784&download=yes.
- Considering the privacy implications of large-scale analysis of numerous data sources, this paper proposes the implementation of a ‘procedural data due process’ mechanism to arm data subjects against potential privacy intrusions.
- The authors acknowledge that some privacy protection structures already include similar mechanisms. However, due to the “inherent analytical assumptions and methodological biases” of big data systems, the authors argue for a more rigorous framework.
Letouze, Emmanuel; and Vinck, Patrick. “The Ethics and Politics of Call Data Analytics.” DataPop Alliance, 2015. Available from: http://static1.squarespace.com/static/531a2b4be4b009ca7e474c05/t/54b97f82e4b0ff9569874fe9/1421442946517/WhitePaperCDRsEthicFrameworkDec10-2014Draft-2.pdf.
- Focusing on the use of Call Detail Records (CDRs) for social good in development contexts, this whitepaper explores both the potential of these datasets – in part by detailing recent successful efforts in the space – and political and ethical constraints to their use.
- Drawing from the Menlo Report Ethical Principles Guiding ICT Research, the paper explores how these principles might be unpacked to inform an ethics framework for the analysis of CDRs.
Data for Development External Ethics Panel, “Report of the External Ethics Review Panel.” Orange, 2015. Available from: http://www.d4d.orange.com/fr/content/download/43823/426571/version/2/file/D4D_Challenge_DEEP_Report_IBE.pdf.
- This report presents the findings of the external expert panel overseeing the Orange Data for Development Challenge.
- Several types of issues faced by the panel are described, along with the various ways in which the panel dealt with those issues.
Federal Trade Commission Staff Report, “Mobile Privacy Disclosures: Building Trust Through Transparency.” Federal Trade Commission, 2013. Available from: www.ftc.gov/os/2013/02/130201mobileprivacyreport.pdf.
- This report looks at ways to address privacy concerns regarding mobile phone data use. Specific advice is provided for the following actors:
- Platforms, or operating systems providers
- App developers
- Advertising networks and other third parties
- App developer trade associations, along with academics, usability experts and privacy researchers
Mirani, Leo. “How to use mobile phone data for good without invading anyone’s privacy.” Quartz, 2015. Available from: http://qz.com/398257/how-to-use-mobile-phone-data-for-good-without-invading-anyones-privacy/.
- This paper considers the privacy implications of using call detail records for social good, and ways to mitigate risks of privacy intrusion.
- Taking the example of the Orange D4D challenge and the anonymization strategy that was employed there, the paper describes how classic ‘anonymization’ is often not enough. The paper then lists further measures that can be taken to ensure adequate privacy protection.
Bernholz, Lucy. “Several Examples of Digital Ethics and Proposed Practices.” Stanford Ethics of Data conference, 2014. Available from: http://www.scribd.com/doc/237527226/Several-Examples-of-Digital-Ethics-and-Proposed-Practices.
- This list of readings prepared for Stanford’s Ethics of Data conference lists some of the leading available literature regarding ethical data use.
Abrams, Martin. “A Unified Ethical Frame for Big Data Analysis.” The Information Accountability Foundation, 2014. Available from: http://www.privacyconference2014.org/media/17388/Plenary5-Martin-Abrams-Ethics-Fundamental-Rights-and-BigData.pdf.
- Going beyond privacy, this paper discusses the following elements as central to developing a broad framework for data analysis:
Lane, Julia; Stodden, Victoria; Bender, Stefan; and Nissenbaum, Helen. “Privacy, Big Data and the Public Good.” Cambridge University Press, 2014. Available from: http://www.dataprivacybook.org.
- This book treats the privacy issues surrounding the use of big data for promoting the public good.
- The questions being asked include the following:
- What are the ethical and legal requirements for scientists and government officials seeking to serve the public good without harming individual citizens?
- What are the rules of engagement?
- What are the best ways to provide access while protecting confidentiality?
- Are there reasonable mechanisms to compensate citizens for privacy loss?
Richards, Neil M.; and King, Jonathan H. “Big Data Ethics.” Wake Forest Law Review, 2014. Available from: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2384174.
- This paper describes the growing impact of big data analytics on society, and argues that because of this impact, a set of ethical principles to guide data use is called for.
- The four proposed themes are: privacy, confidentiality, transparency and identity.
- Finally, the paper discusses how big data can be integrated into society, going into multiple facets of this integration, including the law, roles of institutions and ethical principles.
OECD, “OECD Guidelines on the Protection of Privacy and Transborder Flows of Personal Data”. Available from: http://www.oecd.org/sti/ieconomy/oecdguidelinesontheprotectionofprivacyandtransborderflowsofpersonaldata.htm.
- A globally used set of principles to inform thought about handling personal data, the OECD privacy guidelines serve as one of the leading standards for informing privacy policies and data governance structures.
- The basic principles of national application are the following:
- Collection Limitation Principle
- Data Quality Principle
- Purpose Specification Principle
- Use Limitation Principle
- Security Safeguards Principle
- Openness Principle
- Individual Participation Principle
- Accountability Principle
The White House Big Data and Privacy Working Group, “Big Data: Seizing Opportunities, Preserving Values”, White House, 2014. Available from: https://www.whitehouse.gov/sites/default/files/docs/big_data_privacy_report_5.1.14_final_print.pdf.
- Documenting the findings of the White House big data and privacy working group, this report lists, among others, the following key recommendations regarding data governance:
- Bringing greater transparency to the data services industry
- Stimulating international conversation on big data, with multiple stakeholders
- With regard to educational data: ensuring data is used for the purpose for which it is collected
- Paying attention to the potential for big data to facilitate discrimination, and expanding technical understanding to stop discrimination
Hoffman, William. “Pathways for Progress.” World Economic Forum, 2015. Available from: http://www3.weforum.org/docs/WEFUSA_DataDrivenDevelopment_Report2015.pdf.
- Among other things, this paper treats the lack of well-defined and balanced governance mechanisms as one of the key obstacles preventing data, corporate sector data in particular, from being shared in a controlled space.
- An approach that balances the benefits against the risks of large scale data usage in a development context, building trust among all stakeholders in the data ecosystem, is viewed as key.
- Furthermore, this whitepaper notes that new governance models are required not just by the growing amount of data and analytical capacity, and more refined methods for analysis. The current “super-structure” of information flows between institutions is also seen as one of the key reasons to develop alternatives to the current – outdated – approaches to data governance.