Guide to Building Data-Driven Organizations in the Public Sector

Prediction Moneyball

“Playing Moneyball with Prediction”

(Team 5)

Topic Overview

Summary

These chapters discuss the use of data for predictive purposes. Eagle and Greene describe how big data can serve as a "real-time crystal ball" for tracking health threats, crime, and traffic. Fed into a mathematical model, big data can also predict behavior and simplify decision making in order to optimize resources. O'Neil shows how such models inform everyday decisions in Major League Baseball, access to credit and financing (FICO scores), policing (CompStat), recidivism, and more. But do the costs outweigh the benefits, and does the oversimplification inherent in big data actually make these models effective at solving the problems they were created to address?

Eagle and Greene make the point that trust should not be placed solely in big data; the data needs to be validated against what is happening in real life. The concerns that O'Neil raises in Weapons of Math Destruction about the dangers of mathematical models and the assumptions they rely on could be eased by this solutions-oriented approach of validating what is happening on the ground. Instead of creating their own "pernicious feedback loops," as she calls them, what if data models added a human touch by gathering real-time feedback from the people affected by them? Could they then create a feedback loop that was not grounded in assumptions?

This leads to another question: does data ever tell the full story? Arizona students take an annual statewide exam called the AzMERIT. The test provides end-of-year information on whether students are proficient in math and English, and it produces both small and big data. Parents use their child's score to understand how the child is doing: on track or behind? Schools use scores at the school and classroom levels to gauge how well their teachers are teaching and their students are learning. Policymakers and administrators use the scores at the largest scale to determine whether a school needs help or should receive a financial reward for good performance.

But does the data tell the full story of how a teacher taught during the school year or what a child fully understands? What if an A student woke up sick, had a bad day, and did poorly on the test as a result? What if students had no breakfast or experienced a traumatic event the day before? Could they perform at their best? For teachers, does one good test score mean they are incredible, or one bad score mean they are terrible? Many variables go into test score results that are not reflected in the data. Scores may not show that the majority of a class started the year three grade levels behind, and that the teacher worked overtime providing tutoring and Saturday classes so students could catch up a grade level and a half in a single year; those students would still sit a year behind their peers, making their scores look terrible. Nor may the data show that a 5th grade class was using the 6th grade curriculum (their school's norm), thus blowing away all other 5th graders across the state. In that case it was not the teaching that led to success, but the school's accelerated curriculum and the relative wealth of the student population. The data also doesn't show the obstacles students must overcome just to show up to class every day, let alone achieve. The point is that data does not tell the whole story.

Data models often prioritize efficiency of decision making over understanding the entirety of the situation the data reflects. They could be made stronger by adding a human layer that deepens understanding of the data. This could be accomplished through surveys, in person or electronic, or by the models' creators spending time in the field building a greater understanding of the lives of the people they are trying to account for. What if the mathematicians calculating FICO scores spent a day in the life of individuals with low, medium, and high levels of credit, then debriefed their observations with their peers at the end of the day? Could they learn something that would better inform their models or change the way they count their data? IDEO famously does this in product design, using a design thinking process that puts designers in the shoes of a product's ultimate users. Their job is to understand how the product would be used in real life and to build empathy for users so they can create a better product. Why can't mathematicians do the same? Could this personal touch add enough value to data models that they stop creating the "pernicious feedback loops" O'Neil warns of in her text?

Further, what happens when data models are flawed? Often these flaws are accepted, and those impacted are written off as collateral damage for the sake of progress. There is little recourse and little compassion for those stuck in this no-win situation. Those who do seek justice tend to be those with the means (money) and resources (connections) to make waves, but it is most often the underprivileged who fall victim to these Weapons of Math Destruction and who are least able to make said waves. What if the people affected by these models could understand what was in the model, how it was computed, and how to refute its contents? Often, when a decision is made by a WMD, the end user has no idea they have even been a victim. They are unaware of the decision process or how the conclusion was drawn; they are simply informed that they do not qualify for the loan or that they have not been offered the job. If the process were more transparent, would that nudge this trend of mechanical decision making in the other direction? Could people make the case that they are worthy of credit, despite a poor FICO score, by using other measures? FICO includes payment history (making payments on time), current levels of debt, types of credit held, length of credit history, and new accounts (The Motley Fool). Someone with $2 million in cash in the bank who has never used a credit card or taken out a loan would have a poor FICO score and would pay more in interest than someone with a stronger score. Yet with $2 million on hand, this person is unlikely to pose a threat to the lender and should receive a lower interest rate. What if individuals could use other measures, such as the amount they have in savings, their personal net worth, the amount they have invested, or their ability to pay short-term expenses in cash, to make the case that their FICO score should increase?

While this runs against the very nature of a model's purpose, it makes the case that models can be flawed, and their simplified nature does not tell the whole story of the people they are trying to represent.
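The blind spot described above can be made concrete with a toy calculation. The sketch below is illustrative only: it uses FICO's publicly stated category weightings (35% payment history, 30% amounts owed, 15% length of credit history, 10% credit mix, 10% new credit), but the 0-100 sub-scores and the mapping onto the 300-850 range are invented for the example, not the real scoring formula.

```python
# Toy credit score using FICO's published category weights.
# The sub-scores and scale mapping are hypothetical.
WEIGHTS = {
    "payment_history": 0.35,
    "current_debt": 0.30,
    "credit_history_length": 0.15,
    "credit_mix": 0.10,
    "new_accounts": 0.10,
}

def toy_score(subscores: dict) -> int:
    """Map weighted 0-100 sub-scores onto the 300-850 range."""
    weighted = sum(WEIGHTS[k] * subscores[k] for k in WEIGHTS)
    return round(300 + (weighted / 100) * 550)

# A cash-rich saver who has never borrowed: no payment history,
# no credit mix, no history length -- the model can only see gaps,
# not the $2 million sitting in the bank.
saver = {
    "payment_history": 0,        # never had a loan to repay
    "current_debt": 100,         # owes nothing
    "credit_history_length": 0,  # no accounts ever opened
    "credit_mix": 0,
    "new_accounts": 100,         # no recent credit inquiries
}
print(toy_score(saver))  # 520 -- near the bottom of the range
```

Nothing in the model's inputs can represent savings, net worth, or cash on hand, which is exactly why the saver scores poorly despite being an excellent credit risk.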

It has been made clear that the introduction of data was intended to make the decision process less biased and more quantitative. It was intended to add a finite element of measurement and eliminate the arbitrary, qualitative 'feelings' that can interfere with human judgment. But it can be argued that this change has only led to a similar outcome. The use of data as the primary driver of decisions has resulted in unfair judgments, biased calculations, and systematic stereotyping, all of which were the impetus for using data in the first place.

Chapter Summaries

Reality Mining: Using Big Data to Engineer a Better World

Chapter 6: Optimizing Resource Allocation

In Chapter 6, Eagle and Greene discuss how big data can be used to allocate resources most efficiently. They describe how crime and traffic data can be used to make predictions about future needs, such as proactively placing police officers in locations known for high levels of crime, predicting road infrastructure needs, and tracking the spread of disease. First, they look at how technology has enabled companies like Inrix and Google to track the speed of traffic and estimate travel times. They also discuss surprise modeling, which grew out of a project at Microsoft and enables more accurate prediction of unexpected events on the road (like a traffic jam forming or clearing) and how much delay those events might entail. Eagle and Greene also discuss the Vehicle Probe Project, which crowdsources data from GPS units in cars and other sources. The data provides more accuracy for drivers themselves, but it also gives transportation planners the ability to solve for infrastructure needs and to evacuate people more efficiently in a disaster. Finally, they discuss how traffic flow can be used to track how germs move within a city, giving public health professionals a window into how quickly, and in what places, a health issue may be spreading.

Chapter 10: Engineering a Safer and Healthier World

In Chapter 10, Eagle and Greene discuss how various forms of technology are being used to inform public health by tracking, predicting, and stopping the spread of communicable diseases. They discuss how Google searches, public posts on Facebook and Twitter, quick surveys, air travel and shipping routes, and digital footprints from cell phone use can help these efforts. Air travel and shipping routes, for example, track the movement of people and goods from place to place. If people are ill and infect others, disease spreads. Likewise, if produce moving from one country to another harbors infected mosquitoes, those mosquitoes can travel with it and facilitate the spread of disease. Eagle and Greene note that a possibly more direct predictor of the movement of people is data gathered from actual cell phone use and location: CDRs (Call Detail Records) can reveal how people move at a much greater scale than air routes. The authors discuss pairing CDR data with syndromic surveillance, which includes things like video of people coughing at train stations or orange juice sales, to ascertain public health trends. Likewise, they discuss how Google uses search terms and user locations to track possible incidences of the flu and how it may be spreading across the country. Google Flu Trends has tracked the spread of the flu in multiple countries, and Google expanded the approach to track dengue through Google Dengue Trends. Others have found similar success tracking the flu via Facebook and Twitter.

Eagle and Greene also discuss the potential of big data, and the mining of its contents, to serve as a "real-time crystal ball" for health threats. However, they also argue that trust should not be placed solely in big data. The data needs to be validated against what is actually happening on the ground, which could be accomplished through surveys delivered to people in the affected areas.

Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy

Chapter 1: Bomb Parts

O’Neil introduces the idea of Weapons of Math Destruction (WMDs) in greater depth in this chapter. She outlines the critical elements that make a WMD: opacity, scale, and damage. Opacity concerns how transparent the model is: can the people affected by the mathematical model be made aware of the data it includes and understand how it works? The second element is scale: does the model have the potential to grow and be applied to other people or to an entire industry? The third criterion is damage: does the model have the potential to negatively impact one person's, or many people's, lives?

She provides three examples of models and assesses their alignment with these criteria. The first is the Moneyball example, in which Major League Baseball teams use models to inform their offensive and defensive strategies. Using such a model, teams are better able to shift their fielders to the locations where the batter is most likely to hit in a given situation, with, say, two outs and a left-handed pitcher. O’Neil considers this type of model generally transparent, as the data is available publicly and to the players who are in the models. It is also constantly updated with new information and uses relevant data rather than proxies.

The second example is a model the author built around her family's meals. The users of the model (her family) could question the model and its results and she could explain it, they could weight the criteria differently, and while the model is useful for her family, it is unlikely to scale.

The third model focuses on recidivism. She cites the LSI-R questionnaire that prisoners fill out; they are not made aware of its purpose, nor do they ever learn the results. The model is not transparent and, she believes, has the ability to destroy lives given its pernicious feedback loop. She also discusses the model's scalability and its use already in a number of states.

Chapter 5: Justice in the Era of Big Data

The focus of this chapter is the overall fairness of the algorithms and formulas used in big data. O’Neil’s point is that these models are used to increase efficiency, but often at the cost of equality. She looks at several crime prevention programs, as well as “Stop and Frisk” policies, as primary examples of how Weapons of Math Destruction (WMDs) are mistakenly believed to be both scientific and fair. She highlights how many crime prevention programs (PredPol, CompStat) are actually self-fulfilling: they flag areas where crime is expected to be more likely, such as poorer neighborhoods. In response, police step up enforcement in those areas to deter criminal activity, making arrests on nominal charges in the belief that this prevents larger, more heinous incidents. Whatever the merits of that approach, these actions also inadvertently confirm the model and feed more data into future predictions. She offers the hypothetical comparison of police cracking down in the Gold Coast, the very affluent neighborhood off Lake Shore Drive in Chicago. Police might find violations like unpaid parking tickets, jaywalking, or other nominal infractions, and over time those arrests and citations would create a data trail indicating that crime was running rampant in that part of town! Data is used in the interest of efficiency, but at the cost of fairness. She offers a further comparison in recidivism: ‘high risk’ prisoners often get longer sentences, which makes it harder to find employment and thus harder to secure a stable income, dependable housing, reliable transportation, and so on. These factors make it more likely that a newly released inmate will become a repeat offender, and if that happens, the new data confirms the self-fulfilling cycle once again.

The flaw in this type of big data application is summed up by O’Neil this way: “All too often, they use data to justify the working of the system but not to question or improve the system.”
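The self-fulfilling cycle described above can be demonstrated with a minimal simulation. This is a toy model with made-up numbers, not PredPol's or CompStat's actual algorithm: two neighborhoods have the same underlying rate of petty offenses, but patrols are allocated in proportion to past recorded arrests, so the neighborhood that starts with more arrests on the books keeps "proving" it has more crime.

```python
import random

random.seed(1)

# Two neighborhoods with the SAME true rate of petty offenses.
TRUE_OFFENSE_RATE = {"Downtown": 0.05, "Gold Coast": 0.05}
# Historical data seeds the model: Downtown starts over-policed.
arrests = {"Downtown": 10, "Gold Coast": 1}
PATROLS_PER_DAY = 20
CHECKS_PER_PATROL = 30

for day in range(365):
    total = sum(arrests.values())
    for hood, past in list(arrests.items()):
        # Model allocates patrols in proportion to past arrests.
        patrols = round(PATROLS_PER_DAY * past / total)
        checks = patrols * CHECKS_PER_PATROL
        # More checks -> more recorded offenses, at the SAME true rate.
        arrests[hood] += sum(random.random() < TRUE_OFFENSE_RATE[hood]
                             for _ in range(checks))

# The initially over-policed area ends the year with far more recorded
# crime, even though both neighborhoods offend at an identical rate.
print(arrests)
```

The model never measures crime directly; it measures enforcement, then treats enforcement as evidence, which is exactly the "pernicious feedback loop" O'Neil describes.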

Chapter 8: Landing Credit

In Chapter 8, O’Neil takes us further down the scary path of big data and its impact on the finance landscape. Be it borrowing and buying or getting hired and promoted, data analytics and number crunching often have more to do with these decisions than we realize. Just as would-be criminals are identified by data algorithms as probable offenders, lending algorithms sort potential borrowers into the same kind of ‘probability bucket’ with others who share similar data points. O’Neil calls this the “birds of a feather” approach: subjects who exhibit similar qualities are assumed to have other qualities in common. If everyone in your zip code who drives the same model car and earns the same salary range defaults on their mortgage payments, then it is inferred that you will too; it is simple association by data points. And while many may point to a kernel of truth in this concept, the bigger picture is that computers can only account for so much. They cannot see the nuances. The data produces a general picture but cannot account for the finer details of context, situation, or common sense. The rub is that we as a society have essentially come full circle: data collection was intended as a means to incorporate fairness and equality, eliminating judgments based on race and religion, but the overuse of data and the resulting typecasting has introduced unintended consequences by being unforgivingly flawed and lacking human reasonability.

Key Take-Aways (for Yellowdig)

https://youtu.be/Q8SFAbqHYu4

Discussion Questions

1 - From your perspective, have data models added value to society, or are they causing more harm than good?

2 - Have data models eliminated human bias or just disguised it with technology? Do you think some positive things can come from the use of data models?

3 - Has a data model ever affected your life? How did you find out about it? How would you rank it on O’Neil’s three criteria: opacity (or transparency of the model), ability to scale, and damage caused?

4 - Should the big data industry be regulated? That is, should companies that collect and sell people’s data be held accountable to a set of rules and industry guidelines, including giving people the transparency to see what is being collected about them and the ability to correct what is wrong?

5 - What role do you think data algorithms should play in our day-to-day lives? Is there a level at which you feel it is appropriate? When is it too much?

6 - Have you experienced a situation where data was used to confirm or refute a decision that resulted in others being negatively impacted? (The proverbial collateral damage)

References