Guide to Building Data-Driven Organizations in the Public Sector

Best Practices for Privacy

(Team 5)

Topic Overview

The concept of managing data privacy by putting it in the hands of users themselves served as a central theme in this week’s readings. In Social Physics, Pentland discusses the OpenPDS system he created to put data directly in the hands of users, giving them control over the content of their data and its use. Brandom discusses the General Data Protection Regulation (GDPR) in the European Union, which requires user opt-in before their data is shared and greater transparency from the companies that use it.

Generally, this idea of making a user’s data more accessible and being transparent about how it is used is progress in the right direction; however, there remains concern about helping users understand the implications of this shift in ownership. Once you put the power in the hands of the user, you have to equip them to be successful. For example, Arizona is home to one of the strongest school-choice landscapes in the nation. Parents have the option to send their child to just about any public school they desire. What has resulted is a need to help parents understand the options available to them and how to pick the best school for their child. Parents need to be critical consumers of information about schools. Instead of simply sending their child to the neighborhood school, they need to assess a school’s philosophy of education, curriculum, assessments, principal leadership style, parent group involvement, access to athletics and electives, and more to find the best fit for their child’s education. The parents who understand this information and how to use it win.

The same concept applies to shifting the power of data ownership to users themselves. If we are going to put control of data into their hands, then we must educate them on what the data means, why it is important, how it will be used, and how to change it when needed. Educating the public would require a massive, multi-year campaign of public outreach, paid media, and public relations. It could also spur the creation of an industry of organizations, nonprofit and for-profit alike, that help consumers manage their data.

It is also important that the businesses holding the data have practices that truly enable users to manage their data with ease. When users access their data, it should be in a format they can understand and manipulate easily, which will require some type of graphical interface, the ability for a user to create an account for ongoing access, and a way to request help when needed. Additionally, if users need to make changes or want to revoke access to their data, the businesses holding the data cannot make the process overwhelmingly difficult. For a comparison from today’s environment, when users call to correct something on their credit histories, story after story illustrates that it takes multiple phone calls and months of work to get those items resolved. The process is so difficult and complicated that it discourages users and does not inspire confidence. As users gain access to their data, these processes should be streamlined and made more efficient in order to support user requests.

Beyond public understanding and education, a host of other questions arise regarding data privacy, including developing technology to integrate billions of pieces of individual data, hardware and software to support such technology, guidelines or laws about usage and storage, and how to collect data on non-technology users, just to name a few. The power that personal data holds has the potential to bring advancement to almost every aspect of daily life, but it holds equivalent potential to wreak havoc. There is a give and take with data that must be considered, and the forthcoming decisions regarding personal data put the decision burden on the individual. Instead of an institutionalized process, each person will have to take personal ownership of and accountability for their data. Much as HIPAA laws are in place to protect medical health data, the next wave of data collection looks to institute a similar concept for all personal information.

Proponents of this type of data advancement present concepts about how the data would be self-managed and controlled, allowing users to share only the elements they are comfortable with or feel are of value in trading: Do I want my mapping software to know where I am so that it can help me navigate in return? As the data models become more sophisticated and the data collection becomes broader, there is concern that assumptions and conclusions will be drawn, thereby nullifying personal choice, individuality, and independence. What inferences are being made about the individual? Will they be accurate? Will a user be able to refute the interpretation? Will decisions be made on data alone with no regard for context or background? These are practical and ethically grounded questions that are worthy of proactive debate.

Often, data users are painted as the big bad wolf, but individual responsibility may be cause for suspicion as well. Consider someone looking for new healthcare coverage whose carriers require personal data about weight, diet, and exercise to be shared in order to consider coverage. If an individual could control that data, would they manipulate it to their advantage? If the carrier required the individual to average 10,000 steps per day, would Fitbits get strapped to bustling 4-year-olds worldwide, just to make the data work? Clearly this is just one example, but the idea is that gaming the data system is a very real possibility. With this in mind, how much regulation are users comfortable with in order to prevent the ‘gaming’? Would fingerprints or retina scans be required to prove authenticity? Some may see this as a farfetched question, but the possibility is not outrageous.

Data collection and evaluation is clearly a compelling and motivating arena. Numbers do not lie, but the same cannot be said of the collectors or the users of the data. We have discussed throughout this class that data alone is not sufficient for decision making, but is there reason to be concerned that this is exactly what may happen with the rise of the personal data store concept? Advocates for this type of data instrument show how these tools can unlock a better, brighter future, not only on a personal level but for society as a whole. Concerns like climate change, food shortages, and pollution are all potential areas that could be addressed and improved with better data collection and, more importantly, better data sharing. But there seem to be convincing reasons to be cautious, as this level of data collection and distribution has equal potential for adverse effects and unintended consequences. The production and circulation of data does not come without risk, and there is no such thing as having your cake and eating it too.

Chapter Summaries

Social Physics: How Social Networks Can Make Us Smarter, Chapter 10: Data-Driven Societies; Appendix 2: OpenPDS

Pentland charges into the idea of “a new deal on data,” advocating that privacy concerns can be dramatically minimized if the ownership of personal data belongs to the individual. He proposes that, in order to develop more effective and efficient use of big data, the information that is collected should not be stored in individual “silos” by each collecting entity but should be stored and shared by each person on an individual level. Currently, the model of data collection is that each app a person uses, each social media platform subscribed to, and each technology gadget used collects data singly. This data is neither shared nor aggregated. It is hoarded for internal use only, therefore limiting its potential to be used multilaterally for bigger, more universal purposes. Pentland goes on to point out that extensive data collection is already happening online, though often without the contributor’s awareness. This unregulated arrangement has been called into question and is already on the road to change, or, at a minimum, toward placing the sharing process in the hands of the user. Web giants such as Google have begun the discovery process and are leading the charge for change through more transparency about what information is or is not stored within their products.

The result of this idea is the concept of the personal data store (PDS). This system would essentially function as a portfolio of your personal data. No two portfolios would be alike because they are person specific, just as fingerprints or DNA are unique to each individual. The concept already exists within distinct technology platforms: these personal portfolios are how songs, movies, or books are suggested to users. But Pentland’s concept takes all of those preferences and data points and compiles them by person, resulting in a unique data set for every individual. Each person would then use their PDS individually, sharing what information they see fit, when and with whom they choose. The information would be owned and governed by the individual.
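To make the idea concrete, the sketch below is a hypothetical, highly simplified illustration of a personal data store: the individual holds a per-person portfolio of data elements and grants or revokes access to specific elements for specific requesters. The class, field, and method names are our own illustrative assumptions for this guide, not Pentland’s openPDS implementation.

```python
# Hypothetical sketch of a personal data store (PDS): the individual owns the
# data and decides which elements each requester may see. Illustrative only.
from dataclasses import dataclass, field


@dataclass
class PersonalDataStore:
    owner: str
    data: dict = field(default_factory=dict)    # e.g. {"steps_per_day": 8200}
    grants: dict = field(default_factory=dict)  # requester -> set of allowed keys

    def grant(self, requester: str, keys: set) -> None:
        """Owner shares only the elements they are comfortable with."""
        self.grants.setdefault(requester, set()).update(keys)

    def revoke(self, requester: str) -> None:
        """Owner withdraws a requester's access entirely."""
        self.grants.pop(requester, None)

    def share_with(self, requester: str) -> dict:
        """Return only the data elements this requester has been granted."""
        allowed = self.grants.get(requester, set())
        return {k: v for k, v in self.data.items() if k in allowed}


# Example: share location with a mapping app, but nothing else.
pds = PersonalDataStore(owner="alice", data={"location": "Tempe, AZ", "weight_kg": 70})
pds.grant("mapping_app", {"location"})
print(pds.share_with("mapping_app"))  # {'location': 'Tempe, AZ'}
pds.revoke("mapping_app")
print(pds.share_with("mapping_app"))  # {}
```

The point of the sketch is simply that access decisions live with the person, not with each collecting entity, which is the reversal of today’s siloed model described above.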

Pentland recognizes his concepts would be a radical change from current methods and would require a society-wide paradigm shift in order to succeed. Today, the influx of big data is simply too great to be managed in traditional formats of fixed confines and closed systems. Pentland argues that the stream has become a rushing river and the pond is no longer sufficient to contain it. He advocates for a ‘living laboratory’ where real-time data is used to test connections and new hypotheses. In essence, Pentland argues for a ‘guinea pig’ society that is willing to try out-of-the-box options in order to find revolutionary improvements. In so doing, society will need a much deeper understanding of data, and a shared language for discussing it, if it is to integrate the full potential of big data. And while societal change is quite an obstacle, Pentland and other advocates of big data offer that this change must happen if the world is to improve and conquer the problems of the future.

Reality Mining: Using Big Data to Engineer a Better World, Chapter 2: Using Personal Data in a Privacy-Sensitive Way to Make a Person’s Life Easier and Happier

Eagle and Greene use this chapter to highlight a number of big data-based solutions that have the potential to benefit users; however, they warn the public that these tools must be implemented with care with regard to data privacy. They share examples of data solutions that can help break bad habits, such as smoking and other behavioral issues, by identifying the triggers that lead to the habit and providing a warning to the user. They also discuss tracking personal behavior in order to send alerts when behavior is out of the norm, whether to monitor the elderly or to strengthen antitheft systems in cars. They also discuss employer-based health incentive programs that reduce the cost of healthcare (and sometimes provide a financial incentive to the insured), auto insurance programs that reduce premiums based on behavior, and efforts to protect victims of domestic violence.

These data-based solutions hold a lot of promise in changing behaviors, keeping people safe, and saving people money. However, Eagle and Greene caution that data ownership, legality, and other impacts on users need to be carefully thought through and meaningfully addressed. They argue for “various approaches to data collection and analysis that offer a range of privacy options.” Some options might include giving users the ability to opt out, providing clear and easily understood information about how data will be shared or used, and more.

Everything You Need to Know about the EU’s New Privacy Law: GDPR

The General Data Protection Regulation (GDPR) is a new law in the European Union (EU) that “sets new rules for how companies manage and share user data.” Because of the global nature of the Internet, the rules extend far beyond the EU and are affecting businesses across the globe, which is why there is a sudden proliferation of “click to proceed” requests online. GDPR requires that companies get permission from users any time they collect personal data from EU citizens. It also puts users more in the driver’s seat by giving them the ability to revoke that permission and to request all of the data a company holds about them.
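As a rough, hypothetical illustration of these obligations, the sketch below shows a consent record a company might keep for each user: nothing is collected without an opt-in, consent can be revoked, and the user can request an export of everything held about them. The class, method, and field names are our own assumptions for illustration and are not prescribed by the regulation.

```python
# Hypothetical sketch of GDPR-style consent handling: no collection without
# opt-in, revocable consent, and a user-requestable export of stored data.
from datetime import datetime


class ConsentRegistry:
    def __init__(self):
        self.consents = {}  # user_id -> {"granted": bool, "timestamp": datetime}
        self.records = {}   # user_id -> list of collected data points

    def opt_in(self, user_id: str) -> None:
        self.consents[user_id] = {"granted": True, "timestamp": datetime.utcnow()}

    def revoke(self, user_id: str) -> None:
        self.consents[user_id] = {"granted": False, "timestamp": datetime.utcnow()}

    def collect(self, user_id: str, datum: dict) -> bool:
        """Store a data point only if the user has opted in."""
        if not self.consents.get(user_id, {}).get("granted", False):
            return False  # no consent on file: do not collect
        self.records.setdefault(user_id, []).append(datum)
        return True

    def export(self, user_id: str) -> list:
        """Fulfill a user's request for all data held about them."""
        return list(self.records.get(user_id, []))


registry = ConsentRegistry()
registry.collect("u1", {"page": "/home"})     # ignored: no opt-in yet
registry.opt_in("u1")
registry.collect("u1", {"page": "/pricing"})  # stored
print(registry.export("u1"))                  # [{'page': '/pricing'}]
```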

The author notes that perhaps the most important thing GDPR does is place restrictions on behind-the-scenes data sharing between companies. The impact is that “companies have to rethink how they approach analytics, logins, and, above all, advertising.” Sites that share data with other companies will now have to bring those relationships into the open, and the companies receiving the data will have to justify its use.

The law took effect in May 2018, and since then companies have been scrambling to rewrite their terms of service and contracts. Companies that violate GDPR face a stiff penalty of up to 4 percent of their global turnover or €20 million, whichever is higher.

Brandom calls this a “sea change for how data is handled across the world” and writes that GDPR “could fundamentally flip the relationship between massive tech companies that gather data, and the users they gather it from.”

Key Take-Aways (for Yellowdig)

Video: https://voicethread.com/share/12143546/

Discussion Questions

GDPR shifts the power of online data to the user for EU citizens. Do you believe something like this would be beneficial to consumers in the United States? Would it give you greater comfort in knowing that you had the ability to request your data and revoke the websites’ ability to collect your data? Do you think it matters to the general public?

Would you sign up to have your car insurance company track your driving behavior so that you could get a discount if you were a “safe” driver? Likewise, would you sign up for your employer’s health incentive program where they would help you track your personal health data in exchange for a less expensive insurance premium? Is the tradeoff of a cheaper insurance bill worth having these elements of your life tracked?

If data were self-monitored, do you have concerns about whether it would be accurate and/or unbiased? If you knew you could edit your data to show a more positive or ‘cleaner’ version of yourself, would you? Do you think others would? If others could see your data (much like they can see your age, birthday, and location on social media), would that influence what you edited and what you didn’t? And if so, would the value of the data degrade, making it less impactful in decision making?

If a personal data store were developed, would you adopt it? Would it be as tightly guarded as a Social Security number? How would non-technology users navigate without a PDS?

References