Andy Whitford at the University of Georgia wrote a blog post on using DOI codes to find data appropriate for scholarship in public affairs.


These are a great resource for students looking for ideas and scholars looking for new data on a topic.

For example, this study by Ringwalt (2017) was designed to “develop a set of metrics to identify prescription drug providers with unusual or uncustomary prescribing practices and how these metrics can be used to mitigate the misuse, abuse, and diversion of controlled substances.”

Ringwalt, Chris. Synthetic North Carolina Prescription Drug Monitoring Program (PDMP) Data, 2009-2013. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2017-12-12.

The DOI page often makes it easy to quickly assess the data from the study by providing useful metadata:

Study Design

Researchers secured a copy of North Carolina’s prescription drug monitoring program (PDMP) data for the years 2009-2013 inclusive, after all patient-level data were de-identified. Provider and dispenser names also were removed from the dataset. The dataset received from the PDMP vendor required extensive cleaning and variable creation (see SAMPLING). After consulting with multiple state agencies, the researchers developed the following metrics:

  • identifying providers writing high numbers of prescriptions for high doses of opioids (greater than 100 morphine milligram equivalents)
  • identifying providers who consistently provide high levels of opioids that fall below this threshold to identify those who may seek to avoid detection
  • searching for providers who wrote multiple prescriptions for various classes of controlled substances regardless of dose (paying particular attention to those who co-prescribed opioids and benzodiazepines)
  • identifying providers who wrote high numbers of overlapping prescriptions (defined as a prescription written more than seven days before the expiration date of an earlier prescription for the same class of controlled substance)
  • examining patients manifesting unusual behaviors in regards to filling prescriptions for controlled substances
  • examining providers of multiple patients who traveled long distances from their homes either to secure prescriptions from their providers or fill these prescriptions at distant pharmacies
  • identifying providers of patients who visited multiple providers or pharmacies to secure or fill controlled substances
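Two of these metrics are concrete enough to sketch in code: the high-dose threshold (more than 100 morphine milligram equivalents) and the overlapping-prescription rule (written more than seven days before an earlier same-class prescription expires). The Python below is only an illustration, not the researchers' implementation. The `Rx` record and its field names are hypothetical, and it assumes overlaps are judged per patient and that an expiration date is available for each prescription.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date, timedelta

HIGH_DOSE_MME = 100                  # threshold quoted in the study design
OVERLAP_WINDOW = timedelta(days=7)   # "more than seven days before" expiration


@dataclass(frozen=True)
class Rx:
    """Hypothetical prescription record; field names are illustrative."""
    prescriber_id: str
    patient_id: str
    drug_class: str    # e.g. "opioid", "benzodiazepine"
    daily_mme: float   # 0.0 for non-opioids
    written: date
    expires: date      # assumed end of the supply


def high_dose_counts(rxs):
    """Count opioid prescriptions above the MME threshold, per prescriber."""
    return Counter(
        r.prescriber_id
        for r in rxs
        if r.drug_class == "opioid" and r.daily_mme > HIGH_DOSE_MME
    )


def overlap_counts(rxs):
    """Count, per prescriber, prescriptions written more than seven days
    before an earlier same-class prescription for the same patient expires."""
    counts = Counter()
    earlier_by_key = {}
    for r in sorted(rxs, key=lambda r: r.written):
        earlier = earlier_by_key.setdefault((r.patient_id, r.drug_class), [])
        if any(r.written < e.expires - OVERLAP_WINDOW for e in earlier):
            counts[r.prescriber_id] += 1
        earlier.append(r)
    return counts


# Made-up records for illustration only (not taken from the data set).
sample = [
    Rx("P1", "A", "opioid", 120.0, date(2012, 1, 1), date(2012, 1, 31)),
    Rx("P1", "A", "opioid", 90.0, date(2012, 1, 10), date(2012, 2, 9)),
    Rx("P2", "B", "opioid", 150.0, date(2012, 3, 1), date(2012, 3, 31)),
    Rx("P2", "B", "opioid", 60.0, date(2012, 3, 28), date(2012, 4, 27)),
]

print(dict(high_dose_counts(sample)))
print(dict(overlap_counts(sample)))
```

In the made-up sample, patient A's second prescription is written three weeks before the first one expires, so it counts as an overlap. Patient B's second prescription falls inside the final seven days of the first, so it does not.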


Sampling

After Prescription Drug Monitoring Program (PDMP) data was received, records with non-controlled substances and incomplete data were removed. All DEA numbers for pharmacies that were duplicated (presumably in error) were removed from the prescriber field. Records with Schedule V drugs were also removed due to having the least potential for abuse among the other legal controlled substances. Prescription drugs used to treat opioid addiction and controlled substances dispensed in vials were also excluded from the final data set.
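For readers who want to see what this kind of cleaning looks like in practice, here is a rough Python sketch of the exclusion rules just described. It is one possible reading, not the researchers' code. Every field name (`schedule`, `prescriber_dea`, `pharmacy_dea`, `treats_opioid_addiction`, `vial`) is hypothetical, and it assumes the DEA-number step means blanking a prescriber field that holds a pharmacy's DEA number.

```python
KEPT_SCHEDULES = {"II", "III", "IV"}          # Schedule V and non-controlled drugs are dropped
REQUIRED = ("prescriber_dea", "pharmacy_dea")  # assumed minimum for a "complete" record


def clean(records):
    """Apply the exclusion rules to a list of dict records (hypothetical fields)."""
    pharmacy_deas = {r["pharmacy_dea"] for r in records if r.get("pharmacy_dea")}
    cleaned = []
    for r in records:
        if r.get("schedule") not in KEPT_SCHEDULES:
            continue  # non-controlled substance or Schedule V
        if any(r.get(field) in (None, "") for field in REQUIRED):
            continue  # incomplete record
        if r.get("treats_opioid_addiction") or r.get("vial"):
            continue  # addiction-treatment drugs and vial-dispensed drugs
        row = dict(r)  # copy so the input is left untouched
        if row["prescriber_dea"] in pharmacy_deas:
            row["prescriber_dea"] = None  # pharmacy DEA number in the prescriber field
        cleaned.append(row)
    return cleaned


def make(rid, schedule, prescriber, pharmacy, addiction=False, vial=False):
    """Build one made-up record for the demonstration below."""
    return {
        "id": rid, "schedule": schedule,
        "prescriber_dea": prescriber, "pharmacy_dea": pharmacy,
        "treats_opioid_addiction": addiction, "vial": vial,
    }


# Made-up records for illustration only.
raw = [
    make(1, "II", "AB1", "PH1"),                   # kept
    make(2, "V", "AB1", "PH1"),                    # Schedule V
    make(3, None, "AB1", "PH1"),                   # non-controlled
    make(4, "III", "PH1", "PH1"),                  # kept, prescriber blanked
    make(5, "II", "AB2", "PH2", addiction=True),   # addiction treatment
    make(6, "IV", "AB3", ""),                      # incomplete
    make(7, "II", "AB4", "PH2", vial=True),        # dispensed in vials
]

cleaned = clean(raw)
print([row["id"] for row in cleaned])
```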

Time Method



Data collected by the North Carolina Controlled Substances Reporting System from dispensers on each controlled substance dispensed in Schedule II-V.

Unit(s) of Observation


Data Source

North Carolina Controlled Substance Reporting System

Method of Data Collection

administrative records data

Mode of Data Collection

record abstracts

Description of Variables

This study contains one Excel data set. The data file (2015_NC_PDMP_Synthetic_Data_Set.xlsx, n=10,000, 13 variables) includes a unique ID variable, deidentified dates of when the prescription was dispensed and written, a calculated date of when the prescription can be refilled, formulation of drug as patch versus non-patch, pharmaceutical class of drug, daily dose of opiate medications in morphine milligram equivalents (MME), and total MME in the prescription (deidentified). The file also includes deidentified distances of patient to pharmacy and patient to prescriber, as well as Pharmacy ID, Prescriber ID, and Patient (recipient) ID.
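A description like this is often enough to plan an analysis before downloading anything. As a rough sketch, the 13 variables could be modeled in Python as follows. The field names and types are guesses made for illustration; the actual column names in the spreadsheet are not given in the metadata above.

```python
from dataclasses import dataclass, fields
from datetime import date
from typing import Optional


@dataclass
class PdmpRecord:
    """One row of the synthetic PDMP file; all names are hypothetical."""
    record_id: int
    date_dispensed: date             # deidentified
    date_written: date               # deidentified
    refill_date: date                # calculated date the prescription can be refilled
    is_patch: bool                   # formulation: patch versus non-patch
    drug_class: str                  # pharmaceutical class of drug
    daily_mme: Optional[float]       # daily dose in MME, opiates only
    total_mme: Optional[float]       # total MME in the prescription (deidentified)
    dist_patient_pharmacy: float     # deidentified; units not stated in the metadata
    dist_patient_prescriber: float   # deidentified; units not stated in the metadata
    pharmacy_id: str
    prescriber_id: str
    patient_id: str


# Sanity check: the sketch has the 13 variables the description lists.
print(len(fields(PdmpRecord)))
```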