How Statistics Can Help At-Risk Children


By the time March rolls around, winter in Provo doesn’t exactly feel like the most wonderful time of the year. Still, aside from the hassle of bundling up and stepping over the ubiquitous soggy slush, the weather this month isn’t all bad. In fact, we can all hope to see more sunlight, greener landscapes, and, finally, less illness soon.

Winter is famous for surges in the flu and the common cold, but there are other viral illnesses that deserve caution. Statistics graduate student Celeste Ingersoll studies one particularly pervasive yet overlooked disease: respiratory syncytial virus (RSV). Her goal is to help health care providers and parents predict the optimal time to administer preventative medicine to at-risk children.

The CDC reports that RSV is a common virus that causes “mild, cold-like symptoms”[1] and infects most children before age two. However, the illness can be fatal to infants and elderly people with certain lung and heart conditions. Each year, hospitals treat an estimated 57,000 children under age five for RSV infection, most of whom are discharged within a few days. There is currently no vaccine for the virus.

RSV’s potential severity can be concerning for families with young children, but parents are not entirely alone in protecting their kids from the illness. In some cases, parents can choose to have especially at-risk children receive three monthly injections of a preventative medicine called Palivizumab. For the drug to be most effective, doctors advise that children receive all of the injections before RSV season starts. However, when it comes to implementing that advice, doctors are left with a critical question: when exactly should children receive the first shot?

This is where Ingersoll comes in. Generally, researchers have known that the number of RSV cases in the US picks up from mid-September to mid-May, but the exact timing of RSV season varies from state to state. To get a better grasp of how RSV behaves in different parts of the US, Ingersoll analyzed a decade’s worth of county health records and modeled the annual RSV season in each state.

“Making a model for every county would be a lot because there’s a lot of counties and not enough data for us to do that, so we aggregated [the data] onto a state level,” Ingersoll said. “The overall goal is to be able to predict…when is the season going to start, peak, and end.”

This state-specific knowledge of RSV’s behavior is critical because it will help doctors determine the best time to administer Palivizumab and, perhaps someday, a vaccine.

However, the process of developing Ingersoll’s models was a challenge right down to her source of data, a database containing millions of medical records. To protect patient confidentiality, the data had been deliberately manipulated so that researchers could not determine the identities of individuals in the database.

“[Database creators] take the real data and scramble it a little bit and intentionally introduce noise,” Dr. Matthew Heaton, Ingersoll’s thesis advisor, said. “And so, we have to sort through that mess, first of all, and then…format [it] the way that we need it to do the analysis.”

For Ingersoll, this “formatting” involved logging on to the database’s special servers, running hours of computations, merging myriad data sets, and finally scouring the data for specific patient markers. Once those steps were complete, Ingersoll was at last ready to move on to analysis and modeling.

Ingersoll currently plans to publish her work in a peer-reviewed journal and submit her findings to the BYU Department of Public Health, which supplied her with the data. Heaton hopes that Ingersoll’s research will eventually enter a “broader body of knowledge” that will help health care providers stave off RSV once vaccines are available. Until then, the rest of us will have to continue to brace ourselves during winter and think spring.