Katherine McLaughlin is a statistical detective of sorts, employing sampling and data analysis methods to identify and understand hard-to-reach or hidden populations. An assistant professor in the Department of Statistics at Oregon State University, McLaughlin studies a wide variety of at-risk populations around the globe in collaboration with epidemiologists, statisticians, and public health officials.
Last summer, the onset of the pandemic took her statistical research in a different direction. McLaughlin was appointed co-principal investigator of Oregon State University’s nationally recognized TRACE-COVID-19 project, where she is the lead researcher for statistical analysis and guidance. In this role, McLaughlin develops robust and innovative sampling designs and data analyses for community testing, as well as the dissemination of public-facing results for TRACE-OSU.
The highly successful TRACE-COVID-19 project has set a national example in containing COVID-19 risk. Oregon State researchers, including McLaughlin, recently received a $2 million grant from the David and Lucile Packard Foundation to create a national TRACE Center that will expand OSU’s COVID-19 public health project to other states.
Team-based Rapid Assessment of Community-Level Coronavirus Epidemics, or TRACE-COVID-19, was launched by OSU in April 2020 with door-to-door sampling in Corvallis, home to Oregon State’s main campus. It later expanded to other cities around the state and added a wastewater testing component.
In late September, at the start of the academic year, TRACE also started conducting prevalence testing among OSU students, faculty and staff in Corvallis, at OSU-Cascades in Bend and at the Hatfield Marine Science Center in Newport, Oregon.
Putting statistics to use in the cause of public health
McLaughlin’s research on sampling methods made her the ideal scientist for TRACE. “My work on TRACE aligns well with my interest in sampling methodology, where the challenge is to design a data collection strategy tailored to the unique needs of the population that will best allow the research question to be addressed,” said McLaughlin.
While spearheading sampling and modeling methods at TRACE, McLaughlin has responded to unique challenges in different communities across Oregon, with several variables at play. For instance, she and her colleagues had to figure out how best to incorporate wastewater data collected in advance of TRACE sampling to inform the allocation of field teams to households within the community.
In her own research on vulnerable populations, McLaughlin has similarly estimated quantities such as the prevalence of HIV or the proportion of a population that are victims of human trafficking, in groups that typically elude standard sampling and estimation procedures. More broadly, McLaughlin is interested in social science applications of statistics, such as understanding how human behaviors contribute to missing data on surveys of at-risk populations.
This dimension of her research also exists in her work at TRACE, where she analyzes “how differential participation rates may be impacting our estimates.”
Hidden populations are often socially stigmatized groups whose members are reluctant to disclose their identities and so remain largely invisible to researchers. Because members of these small target populations are so difficult to locate, data on their characteristics and demographics are scarce.
In her statistical research, McLaughlin has developed new data sampling designs and computational models to estimate characteristics of hidden populations around the world. These include female sex workers (FSW), men who have sex with men (MSM), victims of sexual violence, people who inject drugs (PWID) and migrants, some of the groups most vulnerable to infectious diseases, substance misuse and behavioral health issues.
Her models to estimate the size of hidden populations attempt to address shortcomings in existing methods of population inference. Her research contributions have included important modifications and extensions to respondent-driven sampling, a type of chain-referral sampling used by the CDC, WHO, UNAIDS and other organizations. The method relies on the social and peer networks of a population to recruit and enroll individuals who may be at high risk for HIV/AIDS and related infections.
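To give a sense of how estimates from respondent-driven samples are typically adjusted, here is a minimal Python sketch of a standard inverse-degree-weighted proportion estimator (often called the Volz-Heckathorn or RDS-II estimator). It is a textbook illustration, not McLaughlin's own models; the function name and the example data are hypothetical. The idea is that people with larger social networks are more likely to be recruited, so each respondent is weighted by one over their self-reported network size.

```python
from typing import Sequence


def inverse_degree_proportion(has_trait: Sequence[bool],
                              degrees: Sequence[int]) -> float:
    """Estimate the share of a population with a trait from a
    respondent-driven sample, weighting each respondent by the
    inverse of their self-reported network size (degree)."""
    if len(has_trait) != len(degrees):
        raise ValueError("has_trait and degrees must have the same length")
    if len(degrees) == 0:
        raise ValueError("sample must not be empty")
    if any(d <= 0 for d in degrees):
        raise ValueError("every reported degree must be positive")

    # Well-connected respondents are over-represented in chain-referral
    # samples, so they receive smaller weights.
    weights = [1.0 / d for d in degrees]
    weighted_with_trait = sum(w for w, t in zip(weights, has_trait) if t)
    return weighted_with_trait / sum(weights)


# Hypothetical sample: two well-connected respondents with the trait,
# two less-connected respondents without it.
sample_trait = [True, True, False, False]
sample_degrees = [10, 10, 2, 2]

naive = sum(sample_trait) / len(sample_trait)
weighted = inverse_degree_proportion(sample_trait, sample_degrees)
print(f"naive: {naive:.3f}, inverse-degree weighted: {weighted:.3f}")
```

Because the weights depend directly on reported network size, errors in those reports carry straight through to the estimate.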
McLaughlin is currently working on new models that account for measurement error and examining their effectiveness using different parameters and in a wide variety of real data applications. According to McLaughlin, these models have the potential to correct numerous biases that arise from self-reported social network size due to missing data and intentional and unintentional misreporting.
She has worked with data from populations of FSW, MSM, migrants, and PWID in Morocco; women with sexual violence related pregnancies in the Democratic Republic of the Congo; FSW, MSM, and PWID in Armenia; MSM and PWID in Kosovo; MSM in Italy, Lithuania, Romania, and Slovakia; and PWID from the United States in collaboration with the Centers for Disease Control and Prevention (CDC).
Discovering statistics
McLaughlin took her first statistics course as an undergraduate student at UC Berkeley and, in her own words, “was hooked from the first class.” She loved learning how to gather information and solve complex real-world problems using quantitative methods.
An REU (Research Experiences for Undergraduates) project on election auditing introduced McLaughlin to research and inspired her to apply to graduate school. As a graduate student at UCLA, McLaughlin wanted to pursue a thesis topic that would have “real-world impact and would benefit people.” As she learned more about the statistical challenges of sampling hidden populations, she discovered there was room for much-needed improvement of standard sampling and estimation techniques, and it became a fruitful area to work on during her Ph.D.
“I was drawn to respondent-driven sampling because it merges my interest in designing specialized sampling methods tailored to the needs of a population with the possibility to help groups that are typically underserved and face elevated risk of HIV and other diseases,” said McLaughlin.
Several undergraduate and graduate mentors helped McLaughlin connect with statistics and succeed in the field. She considers herself fortunate for having the support of amazing mentors at all stages of her education.
“In particular, my undergraduate research and thesis mentor, Philip Stark, helped me develop research and scientific writing skills and encouraged me to apply to graduate school. My Ph.D. advisor, Mark Handcock, guided me through the complexities of academia and introduced me to a wide range of collaborators. Lisa Johnston, an epidemiologist I collaborate with frequently for respondent-driven sampling studies, has also been a great mentor for interdisciplinary work,” said McLaughlin.
She is currently involved with a wide range of research projects that revolve around making hidden population estimation more broadly applicable. These include using multiple years of data in a capture-recapture framework (a collaboration with Brian Kim from the University of Maryland) and extending the methodology for clustered hidden populations (work by Oregon State Ph.D. student Laura Gamble).
McLaughlin received the College of Science Research and Innovation Seed Program Award to support her research on estimating the number of people who inject drugs in metropolitan areas in the U.S. as a way to contain the HIV epidemic and slow the rate of transmission. In collaboration with the CDC, she is developing innovative sampling analysis and statistical methodologies to obtain more precise estimates of the size of hidden populations that are at high risk for contracting and transmitting HIV.
McLaughlin has also helped to broaden the use of respondent-driven sampling to other types of hidden populations, including trafficked populations.
From designing models to collect accurate information about social groups to estimating the spread of the COVID-19 pandemic in Oregon, McLaughlin’s research has had a tangible impact on public health. Her statistical models hold great promise for producing unbiased estimates of hidden populations and informing effective public health interventions.