Statistics research group answers the call of COVID-19

August 6, 2020

An Iowa State research team has developed groundbreaking statistical models that are helping the Centers for Disease Control and Prevention (CDC) track and predict COVID-19 deaths in the United States. Led by Lily Wang, associate professor of statistics, the team is the first to produce COVID-19 forecasts for all 3,104 counties in the continental United States.

And it all began with a January phone call.

A pandemic emerges across the Pacific

“My parents live in Shandong Province in China, which is 500 miles from the city of Wuhan, in Hubei Province—the epicenter of the COVID-19 crisis,” Wang said. “My relatives described an unknown virus that was rapidly spreading and they called to ask if I would use my statistics skills to help them understand the situation.”

Wang answered the call. She began running the numbers and framing out a statistical model that could provide insight into the spread and severity of this new, mysterious virus.

A frantic phone call to Wang (back left) from friends and family in China initiated the development of her team's COVID-19 statistical models. (Submitted photo)

“COVID-19 was overwhelming many areas of China,” Wang said. “I had many friends and relatives who were contacting me. They were very concerned and desperate for answers.”

Within days, COVID-19 spread around the globe. Soon, the first cases surfaced in the United States.

COVID-19 appears on America’s West Coast

It was Jan. 20 when COVID-19 was confirmed in Washington state. One case became two, then two became 10. Within days, the United States was in the throes of its first outbreak. Evidence of community spread appeared in California and swept across the state. On Jan. 30, the World Health Organization declared a global health emergency.

Wang pivoted her statistical analysis to the growing crisis in the United States. She began sifting through U.S. news reports, articles and press releases to confirm COVID-19 cases, deaths and potential hot zones. Wang was grappling with a deluge of data.

“A desire to help my relatives in China was my primary motivation for creating a statistical model,” Wang said. “But the exponential growth of COVID-19 in the United States grew the initial work into a complex research project that continues today.”

And a serious research project called for a serious research team.

Building a COVID-19 modeling team

Wang hand-picked the team’s members: Shan Yu (’20 statistics, Ph.D.); Myungjin Kim (’21 statistics, Ph.D.); Yueying Wang (’21 statistics, Ph.D.); and Zhiling Gu (’24 statistics, Ph.D.). Xinyi Li (’18 statistics, Ph.D.), a statistician at the Statistical and Applied Mathematical Sciences Institute and a postdoctoral fellow at the University of North Carolina, Chapel Hill, also joined the group. Wang served as Li’s adviser and mentor during graduate school.

Lei Gao, assistant professor of finance, also contributed expertise in data acquisition and risk analysis.

“It was amazing to assemble a research team with talented people that I trust,” Wang said. “We were constantly processing data that was coming in from all over the United States.”

Members of Lily Wang's research team attend a 2019 conference at Nankai University in Tianjin, China. From left: Shan Yu ('20 statistics, Ph.D.); Lily Wang; GuanNan Wang; and Xinyi Li ('18 statistics, Ph.D.). (Submitted photo)

Data collection during the initial outbreak was challenging because infection and fatality counts weren’t centralized or standardized. The team worked countless hours, often late into the night—scouring reports, detecting errors and correcting data discrepancies.

As their statistical model evolved, they added COVID-19 infection and fatality data from Johns Hopkins University, The New York Times, The Atlantic, USAFacts, the World Health Organization and the CDC.

A game-changing achievement

By late February, the group documented a critical breakthrough. Their models were producing accurate COVID-19 infection and fatality forecasts. This caught the attention of the COVID-19 Forecast Hub, a consortium of modeling teams that provide COVID-19 forecasts to the CDC.

In May, the team’s projections became an official part of the CDC’s COVID-19 website, which forecasts national and state COVID-19-related deaths over the next four weeks.

The far reach of local impacts

Wang is honored to be a part of the CDC’s national COVID-19 forecasts. But what she values most is the positive contribution her team’s county-level models make to local communities. Wang’s team is the first to provide such detailed COVID-19 modeling for every county in the lower 48 states.

“We delve deep into the dynamics of these counties, down to the number of hospital beds and the average age of residents,” Wang said.

The county models contain an impressive cache of information, including data from local health departments, census records and government news releases. Detailed demographics, such as male-to-female ratios, income levels and ethnicity statistics, are also analyzed, as well as local control policies such as shelter-in-place orders, social-distancing guidelines and regulations on area businesses and schools. In addition, anonymized cell-phone data from local Department of Transportation offices provides information about local mobility trends.

“County-level data allows us to provide a deeper and richer understanding of what is happening in local populations,” Wang said. “Whether a parent is deciding to return their child to daycare or a business owner is contemplating a re-opening, it is important that everyone has access to evidence-based decision-making tools that are rooted in science.”

Bringing essential COVID-19 data to the masses

To increase access to their data, Wang’s group developed an interactive, web-based dashboard.

Wang’s former Ph.D. student and long-time collaborator GuanNan Wang, assistant professor of mathematics at William & Mary in Virginia, spearheaded the website development and created the software and applications to support the interactive dashboard.

Web image from the COVID-19 dashboard, showing a map of the United States.

Lily Wang's research team developed a web-based dashboard where users can access COVID-19 infection and fatality data for U.S. states and counties.

With a few simple clicks, users can navigate state and county maps to view COVID-19 infection and fatality forecasts, as well as a detailed risk analysis for their local area. The website’s popularity has sparked plans for a mobile app.

“There are many uncertainties with a novel virus like COVID-19 and one way to answer these important questions is through statistical modeling,” Wang said. “It’s important that everyone—from policymakers and parents to CDC officials—has access to facts and data to keep themselves, their families and their communities safe.”