Mathematical modeling draws more accurate picture of coronavirus cases


Mathematical modeling can take what information is reported about the coronavirus, including the clearly underreported numbers of cases, factor in knowns like the density and age distribution of the population in an area, and compute a more realistic picture of the virus’ infection rate, numbers that will enable better prevention and preparation, modelers say.

“Actual pandemic preparedness depends on true cases in the population whether or not they have been identified,” says Dr. Arni S.R. Srinivasa Rao, director of the Laboratory for Theory and Mathematical Modeling in the Division of Infectious Diseases at the Medical College of Georgia at Augusta University. “With better numbers we can better assess how long the virus will persist and how bad it will get. Without these numbers, how can health care systems and workers prepare for what is needed?”

Better numbers also are critical to better protecting the population and to overall pandemic preparedness, Rao and his colleague Dr. Steven G. Krantz, professor of mathematics at Washington University in St. Louis, Missouri, write in the journal Infection Control and Hospital Epidemiology.


“We wanted to provide info on the real magnitude of the problem, not just the tip of the iceberg,” corresponding author Rao says.

Their mathematical model takes COVID-19 numbers from sources like the World Health Organization, then uses factors like an area’s population density, the proportion of the population living in urban areas, where people tend to live in closer proximity, and the populations in three age groups (ages zero to 14, 15 to 64 and 65+) to generate more accurate numbers. Because this virus is so infectious, they also considered “transmission probability,” Rao says.

They also looked at the number of new cases daily above 10 and up to the first reported peak, and the date ranges for those peaks as an indicator of the trend in reported case numbers. Emerging information about how long the virus survives on a variety of surfaces and in the air will further refine their model, Rao says. The cutoff date for this study was March 9.

They found, for example, that Italy did a comparatively good job of reporting early on, with 1 case reported for every four cases that Rao and Krantz projected. Images of jammed intensive care units there were one of the clearest indicators of the virus’ impact in this geographically small country, the fifth most densely populated in Europe, with a high urbanization score. According to their model, about 30,223 cases in Italy were not reported, and Rao noted that Italy had not reached its peak by their March 9 study deadline.

With such a small percentage of people actually being tested in all countries, particularly at that time, South Korea also was reporting 1 case for about every four likely cases. Spain was reporting 1 case for about every 53 likely actual cases, based on the mathematical model, which translates to about 87,405 cases not reported. Spain, where drive-thru funerals have been reported as occurring every 15 minutes, has had nearly 20,000 deaths and saw a peak in cases March 19, when it reported a 27% increase in active cases. The two modelers saw some of the higher numbers they projected actually playing out within a week of their study’s conclusion in several of these European countries, Rao says.

In China, with a population of more than 1.4 billion and widely perceived inconsistencies in data reporting, they projected a range for the ratio of reported to actual cases, from 1 in 149 to 1 in 1,104, which translates to anywhere from 12 million to 89 million cases not reported.
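The "1 in N" ratios and the unreported counts quoted for Italy, Spain and China are linked by simple arithmetic: if 1 of every N actual cases is reported, the unreported cases are N - 1 times the reported ones. The sketch below is an illustration only, not the modelers' method. It backs out the reported and total counts implied by the article's figures; the function name is hypothetical, and because the article's ratios and counts are rounded, the results are approximate.

```python
def implied_counts(unreported, one_in):
    """Back out (reported, total) from an unreported count and a '1 in N' ratio.

    If 1 of every `one_in` actual cases is reported, then
    unreported = reported * (one_in - 1).
    """
    if one_in <= 1:
        raise ValueError("one_in must be greater than 1")
    reported = unreported / (one_in - 1)
    total = reported * one_in
    return reported, total


# (label, unreported cases, N in "1 reported per N actual"), as quoted in the article
figures = [
    ("Italy", 30_223, 4),
    ("Spain", 87_405, 53),
    ("China, low end", 12_000_000, 149),
    ("China, high end", 89_000_000, 1_104),
]

for label, unreported, one_in in figures:
    reported, total = implied_counts(unreported, one_in)
    print(f"{label}: about {reported:,.0f} reported, about {total:,.0f} total")
```

Both ends of the China range imply a reported count of roughly 81,000, so the two ratios appear to be applied to the same reported figure.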

A rate could not be calculated March 9 for the United States, where the virus appears to have shown up later, and reported case numbers were just reaching 500, a baseline for projections by mathematical modelers. Rao suspects that the actual number at that date in the U.S. was probably more like 90,000 cases.

A quick follow-up assessment of U.S. numbers by Rao on April 6 using their model indicated more than 561,000 cases with 367,000 actually reported and 8,910 deaths at that date. That calculates to a reporting rate of 2 out of every 3 actual cases in the U.S., reflecting the improvement in tracing positive cases, he says.
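The "2 out of every 3" figure follows directly from the two April 6 numbers quoted above. A minimal check, using the article's rounded values (the article says "more than 561,000", so the result is approximate):

```python
from fractions import Fraction

estimated_total = 561_000  # model estimate of actual U.S. cases on April 6 (rounded in the article)
reported = 367_000         # cases actually reported by that date

unreported = estimated_total - reported
reporting_rate = reported / estimated_total

# Closest fraction with a denominator of 3 or less
nearest = Fraction(reporting_rate).limit_denominator(3)

print(f"unreported: {unreported:,}")
print(f"reporting rate: {reporting_rate:.1%}")
print(f"nearest simple fraction: {nearest}")
```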

Among those 194,000 not yet reported, he projects there are 3,298 children ages 14 and younger, 147,441 people ages 15-64 and 43,262 people ages 65 and older.
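As a quick consistency check, the three age-group projections can be summed and expressed as shares of the unreported total. This is only arithmetic on the figures quoted above, not a reproduction of the model:

```python
# Projected unreported U.S. cases by age group on April 6, as quoted in the article
unreported_by_age = {"0-14": 3_298, "15-64": 147_441, "65+": 43_262}

total = sum(unreported_by_age.values())
shares = {group: count / total for group, count in unreported_by_age.items()}

print(f"total unreported: {total:,}")
for group, share in shares.items():
    print(f"ages {group}: {share:.1%}")
```

The groups sum to 194,001, which matches the roughly 194,000 unreported cases, with about three-quarters of them in the 15-64 group.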

That also means that in the U.S., at least 194,000 people on April 6 likely did not know they were positive, clearer evidence of the need for social distancing and other preventive measures currently underway, Rao says.

The modelers visualized the disparities between reported cases and what they projected with a Meyer wavelet, which as the name implies goes up, peaks, and then recedes like a wave. In the consistent oscillations generated, the higher the wave, the higher the underreporting, and a lower wave means improved reporting, Rao says.

If reported numbers were more precise, mathematical models wouldn’t be needed, Rao says, noting that underreporting is a problem for many conditions, not just COVID-19, including common, noninfectious problems like heart disease. “A model tells us something which has not been directly observed,” he says. “It’s a biological experiment done on computers rather than in a lab.”

Rao notes the accuracy of reported cases likely has improved since March 9 with the slowly increased availability of testing, and that the earlier the testing, the earlier the actual peaking of infections.

In the meantime, he encourages everyone to continue to use steps like social isolation and self-quarantine to protect themselves and others by helping fight continued spread of the virulent virus.

“Social distancing is a must, must, must,” Rao says.

As of March 9, there were 109,000 cases and 3,800 deaths reported worldwide, the majority from China as well as Italy, South Korea, Iran, France, Germany and Spain. As of the first full week in April, about a month after their study deadline, nearly 1.4 million cases with more than 81,000 deaths were reported worldwide. More recent figures include the U.S. having more than 362,000 cases and about 11,000 deaths, Spain with more than 135,000 cases and more than 13,000 deaths, and Italy, Germany, France, China, Iran, the United Kingdom, Turkey and Switzerland falling next in line.

In a related research letter in the journal Current Science, they reported about 1 in 4 COVID-19 cases were detected in the month of March in India, that social distancing and other prevention/treatment policies should continue until new cases are not seen there and that spread from urban to more rural populations should be controlled.

Rao is also a professor in the Division of Health Economics and Modeling in the MCG Department of Population Health Sciences.



Written by
Toni Baker

Toni Baker is the Communications Director at the Medical College of Georgia at Augusta University. Contact her to schedule an interview on this topic or with one of our experts at 706-721-4421.
