“This guide provides practical guidance on how to obtain and use nonpublic administrative data for a randomized evaluation. Administrative data are information collected, used, and stored primarily for administrative (i.e., operational), rather than research, purposes. Government departments and other organizations collect administrative data for the purposes of registration, transaction, and record keeping, usually during the delivery of a service. Examples of administrative data include credit card transactions, sales records, electronic medical records, insurance claims, educational records, arrest records, and mortality records. This guide focuses on nonpublic (i.e., proprietary or confidential) administrative data that may be used in an individual-level randomized evaluation….
Many of the concepts in this guide are applicable across countries and contexts. However, sections pertaining to compliance (particularly […] and specific ethics requirements) are directly applicable only in the United States. Other jurisdictions with similar regulatory contexts may have similar legislation (e.g., the European Union has legislation on data protection), resulting in the general applicability of concepts across countries.
This guide focuses on the following topics:
- Standardized processes for accessing administrative data
- The ethical and legal framework surrounding the use of administrative data for randomized evaluations
- Common challenges in using administrative data” (p.3).
The first half discusses the advantages of using administrative data; potential biases when using administrative data; how to find administrative data pertinent to your research; cost of administrative data; and ethics and compliance in using this type of data for impact studies.
The second half walks readers through the administrative data collection process. It starts with an overview of formulating a data request and encourages researchers to develop a data flow strategy during the design phase of their evaluation. Next, it discusses data use agreements (DUAs), including some of the items commonly found in them. The guide closes with a discussion of maintaining the confidentiality of administrative data and provides external resources for using these data.
(Abstractor: Author and Website Staff)
Major Findings & Recommendations
“There are a number of advantages to using administrative data for research” (p.6):
- “Cost and ease. Using administrative data may be less expensive and logistically easier than collecting new data” (p.6).
- “Reduced participant burden. Subjects are not required to provide information to researchers that has already been shared in other contexts” (p.6).
- “Near-universal coverage. Many existing administrative databases provide a near census of the individuals relevant to a given study” (p.6).
- “Accuracy. Administrative data may be more accurate than surveys in measuring characteristics that are complex or difficult for subjects to remember…” (p.6).
- “Minimized bias. Using administrative data that are captured passively, rather than actively reported by individuals or program staff, minimizes the risk of social desirability or enumerator bias” (p.6).
- “Long-term availability. Administrative data may be collected systematically and regularly over time, allowing researchers to observe outcomes for study participants across long spans of time” (p.6).
- “Cost data. Some administrative data sources are the authoritative data source of cost data, enabling research on public finances or cost-effectiveness analysis” (p.7).
The authors also advise allowing sufficient time to access data. “Gaining access to administrative data is a multifaceted process that should be initiated during the design phase of any research project. The time required to establish a data use agreement can vary widely, and depends on the data provider’s capacity to handle such data requests, the sensitivity of the data, and the levels of review that must be undertaken by both sides before a legal agreement can be signed. In a 2015 analysis of data acquisition efforts with 42 data agencies, the [authors] found that it typically takes 7 to 18 months from initial contact with a data provider to the completion of a legal agreement” (p.31).
When formulating a data request, the guide states that “[r]esearchers…should write clear…data requests that include:
- Timeframe for request, in calendar months/years
- Acceptable or preferable data format (e.g., ASCII, SAS, Stata)
- Data structure (e.g., multiple tables with unique keys for merging versus single, pre-merged data set, long or wide form)
- List of variables requested
- Clarification or notes for each variable” (p.18).
(Abstractor: Author and Website Staff)
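As an illustration only, the data request items quoted above could be tracked as a simple checklist before a request is sent to a data provider. The sketch below is not from the guide: the `DataRequest` class, its field names, and the example variables (`date_of_birth`, `claim_amount`) are hypothetical and chosen purely for demonstration.

```python
from dataclasses import dataclass, field


@dataclass
class DataRequest:
    """Hypothetical checklist mirroring the request items quoted from the guide."""
    timeframe: str = ""        # calendar months/years, e.g. "2015-01 to 2018-12"
    data_format: str = ""      # e.g. "ASCII", "SAS", "Stata"
    data_structure: str = ""   # e.g. "multiple tables with unique keys, long form"
    variables: dict = field(default_factory=dict)  # variable name -> clarifying note

    def missing_items(self):
        """Return the checklist items that are still empty."""
        missing = []
        if not self.timeframe:
            missing.append("timeframe")
        if not self.data_format:
            missing.append("data format")
        if not self.data_structure:
            missing.append("data structure")
        if not self.variables:
            missing.append("list of variables")
        # Each requested variable should carry a clarification or note.
        missing += [
            f"note for {name}"
            for name, note in self.variables.items()
            if not note
        ]
        return missing


# Example with made-up values: the structure and one variable note are absent.
request = DataRequest(
    timeframe="2015-01 to 2018-12",
    data_format="Stata",
    variables={"date_of_birth": "needed for matching", "claim_amount": ""},
)
print(request.missing_items())
# → ['data structure', 'note for claim_amount']
```

A check like this is only a drafting aid; the actual content and wording of a request would still be worked out with the data provider.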