Guide to Building Data-Driven Organizations in the Public Sector

Best Practices for Open Data

(Team 6) Tommy and Dennis

Topic Overview

These chapters discuss why sharing and licensing data are important and how the US government is currently implementing Project Open Data.

Chapter Summaries

In this reading, the Sunlight Foundation has laid out some best practices for how we, as a society, can determine what data should be public, how we can go about making data public, and how to implement policy for making data public. There are 31 recommendations in this paper, all of which have merit and should be considered. I just want to highlight three that matter but will be difficult, to say the least, to implement.

First, specifying methods of determining the prioritization of data release. This is one that could get very political and be hampered by special interests. Because we need to ensure that data will not harm any entity when it is released to the world, all data should have some kind of delay before release. This delay could be as short as hours or as long as decades, depending on the type of data and its potential for harm.

Second, mandate data be explicitly license-free. Where do we start? The biggest issue here is information that is held by the government but is in some way proprietary. The government has an interest in protecting information from which it gains an advantage against other nations. The easiest example of this is the military, but there are probably other agencies that deal in data that was created for government use, is proprietary, and, although not classified, should not be immediately released for public consumption.

Lastly, ensure sufficient funding for implementation. In a fiscally constrained environment, this is a tough nut to crack. It will need to be made explicitly clear how and why this open data vision is a requirement over and above other items in the budget. Not to mention the question of who is actually paying for it. Gaining agreement within our own country is hard enough. Gaining agreement among all the nations of the world as to how data will be used, maintained, and paid for may be considered the next giant leap for mankind.

This reading focuses on the importance of licensing your data, how to license your data, the different kinds of rights (intellectual property or other), and how other countries interpret those rights. The material carries a pretty serious disclaimer that should be kept in mind: “This information is collected by altruistic individuals most of whom are not lawyers; those who are lawyers are not your lawyers nor experts in your situation. You use this information at your own risk. Nothing in this page should be considered as legal advice.” If that isn’t reassuring, I don’t know what is. The material wants to educate the reader on the importance of licensing data, not just to prevent others from using your data but also, if you want your data to be available to others, to address any legal limitations on their use of it.

The reading does emphasize that if you are planning to make your data available, you should put a license on it. The material is fairly technical but helps the reader think through the process and importance of licensing their data. The reading also seeks to help the reader understand the language of data collection and the importance of ensuring that the language is understood. For example: “the structural elements of a database will generally be covered by copyright. However, we need to be a bit careful because the word isn’t particularly precise: “data” can mean a few or even single items (for example a single bibliographic record, a lat/long etc) or “data” can mean a large collection (e.g. all the material in the database). To avoid confusion we shall reserve the term “contents” to mean the individual items, and data to denote the collection”. As my old boss used to say, words have meaning.

Forms of protection for your data will most likely fall into two cases: copyright for compilations and a sui generis (constituting a class alone, unique, peculiar) right for collections of data. It’s critical that you understand how countries will interpret the protection that will be afforded based on copyrights and licenses. The material does a good job of giving the reader an overview of how the European Union, Canada, and the United States interpret data protection. The overall reading is fairly technical but required if the owner of data wants to understand how their data will be protected and what they have to do if they want others to have access to the data.

Project Open Data

This reading is designed to educate the reader about the US Government project called “Project Open Data”. The reading is located on GitHub, which allows readers to participate in refining the website’s information. The US government recognized that “data is a valuable national resource and a strategic asset to the U.S. Government, its partners, and the public.” The website seeks to help data owners, “wherever possible, release it to the public in a way that makes it open, discoverable, and usable.” It also supports the implementation of the US Government Open Data program by providing tools to assist in the process.

The reading provides definitions of the principles of open data (public, accessible, described, reusable, complete, timely, and managed post-release); the process for developing standards, specifications, and formats; an open data glossary of common terms; and the Project Open Data metadata schema (guidance to support the use of the Project Open Data metadata to list agency datasets and application programming interfaces (APIs)). The definitions provide a common language that all organizations can reference, so that miscommunication is minimized and all participants can operate on a shared baseline of information that will be recognized by others. There are also references on how open data will be implemented by the US Government and on the tools that are available. The references are the Executive Orders and Presidential Memoranda that provide the framework and authority for implementation of the Open Data program. The references even describe how APIs will be documented. An API is an interface that allows one piece of software to talk to another. The tools provide technical information to assist with data transmission.
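To make the metadata schema a little more concrete, here is a minimal sketch of what checking a dataset entry might look like. The field names are an assumption based on my understanding of the schema's required dataset fields and should be verified against the published schema; the example dataset, the agency name, the contact details, and the helper function missing_fields are all hypothetical and invented for illustration.

```python
# Assumed subset of required dataset fields in the Project Open Data
# metadata schema; verify against the published schema before relying on it.
REQUIRED_FIELDS = (
    "title",
    "description",
    "keyword",
    "modified",
    "publisher",
    "contactPoint",
    "identifier",
    "accessLevel",
)


def missing_fields(dataset):
    """Return the assumed-required fields that are absent or empty."""
    return [field for field in REQUIRED_FIELDS if not dataset.get(field)]


# Hypothetical dataset entry, invented for illustration only.
example_dataset = {
    "title": "City Park Locations",
    "description": "Names and coordinates of public parks.",
    "keyword": ["parks", "recreation"],
    "modified": "2015-01-15",
    "publisher": {"name": "Example Agency"},
    "contactPoint": {"fn": "Jane Doe", "hasEmail": "mailto:jane.doe@example.gov"},
    "identifier": "example-agency-parks-001",
    "accessLevel": "public",
}

if __name__ == "__main__":
    # A complete entry reports nothing missing.
    print(missing_fields(example_dataset))
    # An entry with only a title reports every other assumed-required field.
    print(missing_fields({"title": "Untitled"}))
```

A check like this only confirms that fields are present. It says nothing about whether the values are accurate or whether the data itself meets the open data principles listed above.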

There is information on resources that are available to organizations that are required to implement the Open Data program, such as Metadata Resources, Business Case for Open Data, and Examples of Policy Documents, just to name a few. There is also a portion that covers case studies of organizations that are leading the way in open data, so that others can see how their own organization is doing and maybe find a best case to assist them with implementation.

The final portion is dedicated to engagement with others and with your own organization to seek feedback and act on that feedback. It provides a format for holding engagements and explains how to achieve the best results by setting objectives and seeking engagement with external partners. This reference helps educate not just the government employee, but also those organizations that seek access to government data for their own studies. By understanding the government’s requirement to implement open data, users may have a better understanding of what data will be available for external use and when.

Discussion Questions

  1. Do you see contradictions in your field between the benefits and drawbacks of open data?
  2. Do you know if your data will be protected? Why should I license my data, and if I do need to license my data, how do I do it?
  3. What is the US Government’s policy regarding the data that it creates each and every day? Is there data that I will be able to use in the future?

YouTube Summary Video LINK

References