Guide to Building Data-Driven Organizations in the Public Sector

Harnessing Social Media Data

William Seeley and Lauren Zajac, Team 1

Topic Overview

This week's readings focused on innovations for managing Big Data and the volume of information on social media. As discussed in earlier chapters, this volume of information can be overwhelming and can cause information blindness. Patrick Meier explains that sifting through this vast quantity of information is even harder than finding a “needle in a hay stack…you are actually trying to find a needle in a meadow” (p. 96). Chapters 3 and 5 explore new tools, concepts, and platforms in development to help digital humanitarians make sense of vast quantities of information accurately and quickly. In a crisis, these tools can quickly classify social media posts into requests for help and offers of assistance, and they include built-in digital “rules,” such as having multiple people verify the same information, to provide quality control and assurance. Faster processing enables disaster response teams to respond faster. While the readings focused almost exclusively on the use of artificial intelligence and other data platforms by digital humanitarian response teams, they offer important management lessons for all of us, including the hope that we can better understand these technical innovations and begin to determine how they can help us tackle Big Data in our daily work.

Chapter Summaries

Chapter 3 describes some of the early attempts to find the needle in the meadow, including a “match” program used in 2010 in response to the fires in Russia. Patrick Meier explains that this was a “crisis map with purpose…to match people in need of help with those willing to provide help” (p. 50). The digital humanitarians also ran a “phone match” service for those who did not have access to the internet (p. 52). This match concept allowed the team to better manage Big Data.

Meier and his team of digital humanitarians, now known as the Standby Volunteer Task Force (SBTF), liked the concept of a match program to quickly get resources to people in need, but they also knew that manual processes for making sense of Big Crisis Data were ineffective and could be discouraging for volunteers. They began to experiment with “microtasking,” which is a process of slicing and dicing information into smaller, more manageable sections to increase accuracy and the ability to process a great deal of information (p. 62). This way, instead of 100 people each trying to work through 10,000 tweets, each person looks at only 100 distinct tweets. CrowdCrafting was an open source platform available at the time that used this concept, and with some alterations, it was used by the digital humanitarians following the typhoon in the Philippines. It also had an important quality control feature: it required that each piece of information be verified by three to five other people to increase accuracy and reliability (p. 65).
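The microtasking arithmetic above can be sketched in a few lines of Python. This is only an illustration under our own assumptions, not the code used by CrowdCrafting or the SBTF: the function name `split_into_microtasks` and the sample tweets are hypothetical, and the sketch covers only the splitting step, not the three-to-five-person verification.

```python
def split_into_microtasks(items, num_volunteers):
    """Divide items into near-equal, non-overlapping batches, one per volunteer.

    Hypothetical helper for illustration; not part of any real platform.
    """
    if num_volunteers <= 0:
        raise ValueError("num_volunteers must be positive")
    # base = minimum batch size; the first `extra` volunteers get one more item
    base, extra = divmod(len(items), num_volunteers)
    batches = []
    start = 0
    for i in range(num_volunteers):
        size = base + (1 if i < extra else 0)
        batches.append(items[start:start + size])
        start += size
    return batches


# Made-up stand-ins for 10,000 tweets shared among 100 volunteers
tweets = [f"tweet {n}" for n in range(10_000)]
batches = split_into_microtasks(tweets, 100)
```

With these numbers, each of the 100 batches holds 100 distinct tweets, so no volunteer sees the full set of 10,000.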

This concept led Meier and his team to develop and implement their “MicroMappers Platform,” which allows their digital humanitarians to quickly tag tweets, images, and videos. The program also has a voting methodology that requires three to five people to review and agree on each piece of information, which helps to ensure quality control and agreement on the severity of the damage (p. 69). MicroMappers also has an algorithm that identifies only unique tweets, which provides a faster way to sort through a great deal of information (p. 71).
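The two quality-control ideas described here, voting and filtering for unique tweets, can be sketched as follows. This is our own simplified illustration, not MicroMappers' actual algorithm: the readings do not spell out the exact agreement rule, so we assume a strict majority among at least three votes, and we assume that two tweets count as duplicates when they match after ignoring case and extra spaces. The function names are hypothetical.

```python
from collections import Counter


def majority_label(votes, min_votes=3):
    """Return the label most reviewers agree on, or None if there is no agreement.

    Assumption: agreement means a strict majority among at least `min_votes` votes.
    """
    if len(votes) < min_votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    return label if count > len(votes) / 2 else None


def unique_tweets(tweets):
    """Keep the first occurrence of each tweet, ignoring case and extra whitespace."""
    seen = set()
    result = []
    for text in tweets:
        key = " ".join(text.lower().split())  # normalized form used for comparison
        if key not in seen:
            seen.add(key)
            result.append(text)
    return result
```

For example, three votes of "severe", "severe", and "mild" would yield "severe", while only two votes, or three votes that all differ, would yield no agreed label and the item would need further review.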

These innovations were critical, but still imperfect. In Chapter 5, Patrick Meier explains some additional technological advances that also help to sort through the vast quantities of social media information available following a disaster. One innovation is data mining, or an “automatic analysis of large data sets” (p. 97). The concept was first used to help the World Health Organization monitor outbreaks through a “Health Map”; Harvard University then helped the digital humanitarians use the same concept to document human rights abuses and violence in Syria (p. 98). The program uses text classifications for social media posts on Facebook and Twitter. To be effective, these text classifications, or “codes,” need to be modified based upon the type of disaster, the location, the language, and the culture (p. 104). Patrick Meier and his team first applied this technology in the response to the tornado in Oklahoma, and discovered that out of over two million tweets, only several hundred were actually from individuals seeking or offering help (p. 103).
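To make the idea of customizable text classifications more concrete, here is a minimal keyword-based sketch. It is not the method described in the book, which relies on machine learning; it is a much simpler stand-in that we chose for illustration. The category names, keyword lists, and function name are all our own assumptions.

```python
import re

# Hypothetical keyword "codes"; a real deployment would tailor these to the
# disaster type, location, language, and culture.
CATEGORY_KEYWORDS = {
    "request": {"need", "needs", "help", "trapped", "urgent"},
    "offer": {"offering", "donate", "donating", "volunteer", "available"},
}


def classify(text, keywords=CATEGORY_KEYWORDS):
    """Label a post by the category whose keywords it matches most, else 'other'.

    Ties go to the category listed first in `keywords`.
    """
    words = set(re.findall(r"[a-z']+", text.lower()))
    scores = {label: len(words & terms) for label, terms in keywords.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "other"
```

Because the keyword sets are passed in as a parameter, a team could swap in a different dictionary for a different disaster or language without changing the function, which mirrors the point that the codes must be customized for each response.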

Because the text codes and classifications need to be customized, Meier and his team went on to build the Artificial Intelligence for Disaster Response platform, or AIDR (p. 104). This platform is user friendly and open source, and it allows digital humanitarians to “quickly crowdsource the creation of hundreds of classifiers” in response to a disaster (p. 104). AIDR has also been designed to require that multiple volunteers verify tweets and images to ensure quality control and accuracy, which also helps the AIDR program learn and improve its accuracy (p. 105). In addition to social media uses, Meier and his team are partnering with UNICEF to extend the AIDR platform so that the same concept can be applied to SMS text messaging (p. 108).

Key Take-Aways (for Yellowdig)

In conclusion, this week's readings explored a number of new concepts, platforms, and strategies for analyzing a great deal of information in a very short amount of time. This capability is critical for digital humanitarian efforts, and it will become increasingly important in our daily lives as Big Data makes it harder to process information manually. We are truly looking for a needle in a meadow.

YELLOWDIG BOARD

Discussion Questions

  1. Meier described microtasking as a method where information is sliced and diced into smaller, more manageable sections to assist in processing a great deal of information in a short amount of time (p. 62). Have you used any similar techniques in your data analysis?
  2. Artificial Intelligence (AI) seems like science fiction, but we continue to be exposed to more examples of it in our daily lives: “Just ask Siri.” What are some ways AI has affected or could affect your daily work?
  3. Emergency management has very rigid standards for data collection and responses, which makes additional data collection nearly impossible. Does your industry have the same issue? How can one overcome this barrier, especially in the age of social media?

References