Why vital regional COVID race and ethnicity data is still MIA nearly a year into the pandemic

The consequences could be serious, experts said: Potentially wider disparities could go unnoticed, and efforts to address them go unplanned.

Registered nurse Laura Moore, left, swabs Andry Laurens during testing for COVID-19 organized by Philadelphia FIGHT Community Health Centers, Thursday, Dec. 10, 2020, at Mifflin Square Park in South Philadelphia. (AP Photo/Matt Slocum)

Across the United States, communities of color are being disproportionately ravaged by the coronavirus. Yet despite being nearly a year into the pandemic, race and ethnicity data for COVID-19 is still largely unaccounted for in Pennsylvania.

One look at the state’s COVID-19 numbers and an entirely new race emerges: “Unknown.”

This population makes up more than 325,000 of Pennsylvania’s cases since the start of the pandemic. For comparison, Black people represent roughly 70,000 cases.

There is a similar issue with ethnicity, but with a twist: “Not reported” represents more cases than Hispanics and non-Hispanics combined.

The consequences of incomplete demographic information could be serious, experts said: The lack of clear and concise racial and ethnic data could mean that potentially wider disparities go unnoticed — and that programs and funding to address them go unplanned.

Lack of comprehensive data is not just an issue for the many small counties in rural Pennsylvania that don’t have local health departments. Counties in the Philadelphia region also are not collecting complete racial data for COVID-19 cases and vaccinations.

New Jersey and Delaware, like many other states, are having the same issue with virus case data, though to a lesser extent. With the recent rollout of vaccines, several states are noticing uneven delivery to communities of color, as well as a lot of missing data. On Thursday, several Democratic lawmakers, including U.S. Sens. Elizabeth Warren and Ed Markey and U.S. Rep. Ayanna Pressley, sent a letter to federal officials pushing for states to publish better demographic data.

Gaps in the data go all the way back to April, when there were concerns about who was receiving coronavirus testing, said Dr. Utibe Essien, an assistant professor of medicine at the University of Pittsburgh. After realizing there were racial and ethnic disparities, Essien and his team wanted to bring a related issue to light.

“So my colleagues and I back in May of last year published this report that showed if we looked at the 50 states and looked at COVID-19 deaths, yes, that disparity and mortality existed, but what was really striking was that only 28 of the states were reporting any race and ethnicity related to COVID-19,” Essien said.

Now, almost a year has passed. “And, unfortunately, the story is continuing to play out,” Essien said.

COVID-19 case information is both reported and collected through what is called the data supply chain. On one end of the chain are the hospitals, labs, and health care providers administering and confirming tests. In the middle of the chain are the state and local health departments. At the far end: the Centers for Disease Control and Prevention.

In Philly and the suburbs

WHYY News reached out to officials in Philadelphia and the surrounding suburban counties to figure out why race and ethnicity identifiers for COVID-19 cases remained “unknown.”

In Philadelphia, the racially unknown group makes up more than 24% of the total cases. For comparison, white Philadelphians represent roughly 23% of total cases. That would place the racially unknown group second behind Black Philadelphians for getting the virus.

“We encourage all of our testing providers, so again this is just for cases … to collect that information and put it into the record when they send the samples off to the lab. And we think that, to at least a large degree, they’re doing this,” said James Garrow, spokesperson for the city’s Department of Public Health.

Somewhere along the chain, however, there is miscommunication, according to Garrow.

“I don’t want to overstate this, but in enough cases, the testing provider will collect this information and put it to the lab report and that won’t get transmitted to us. So that makes our unknown counts for cases high,” Garrow said.

Philadelphia’s COVID-19 vaccination plan is independent of the rest of Pennsylvania, and a racial gap already exists among those who have received the shots. There are even a few unknowns — about 6%, representing a larger number than the Hispanic category.

The city attributed that largely to the Philly Fighting COVID vaccine clinic at which there was a “glitch” involving demographic data.

Because Delaware County does not have a health department of its own, Chester County has been taking point on both counties’ COVID-19 response. A peek at Chester County’s website shows race and ethnicity data nowhere to be found.

“We didn’t [post it], because there were so many missing. It didn’t seem like a good representation — a fair representation, because, even now, we’re not always getting good data on the race,” said Jeanne Casner, director of the Chester County Health Department.

Casner said that the state drives data requirements for what must be reported to a common centralized system, and that “a lot of the issues we’re seeing in regards to missing case data is related to labs being flooded with tests.”

“It may not necessarily be that the data was not collected. It may be that the data never made it into the system,” Casner said.

Race and ethnicity data for Bucks County is also unavailable on the county website.

WHYY News reached out to health officials for an interview and was told that they were busy and currently “slammed.” But a county spokesperson did comment on the data collection methods Bucks uses.

“Testing data is reported directly to the state Health Department by those administering the tests,” a spokesperson said.

When asked about the lack of complete racial data on COVID-19 cases, Montgomery County officials suggested it was a side effect of removing “barriers” to testing.

“There’s a couple of steps in that process that really what it comes down to is the individual’s willingness to self-report their [race],” said Dr. Valerie Arkoosh, who chairs the Montco Board of Commissioners. “So when people sign up for our testing site, they are asked to report their race, but it is not required because we want to make sure there’s just no barriers or perceived barriers to accessing our testing sites. It’s the same process for our vaccination sites, we are encouraging people to self-identify their race, and many do, but not everyone.”

When asked whether the county had received any guidance from the state on racial data collection, officials said that they had not — at least to their knowledge.

The Pennsylvania Department of Health said otherwise regarding its communication to county health officials and medical providers. In fact, it pointed to two statewide health alerts, issued in April and December, about data collection.

“This Health Alert Notice was sent to all health care personnel to remind them of the importance of reporting all demographic fields when collecting specimen samples … For COVID-19 deaths as well as vaccinations, providers are required by order to report patient information including but not limited to a patient’s race and ethnicity following the administration of the COVID-19 vaccine,” said a spokesperson for the state.

After a follow-up inquiry, a Montgomery County spokesperson provided some additional details regarding testing and vaccination data.

“We are using the PA DOH required registration system ‘PrepMod’ for vaccine scheduling. Race and ethnicity are required fields for PrepMod, but there is a menu choice for ‘Decline to Answer,’” the county spokesperson said in an email. “We are using a registration system called ‘SOLV’ for our COVID Test Site scheduling. Race and ethnicity are optional in SOLV. Our rates show that less than 10% of people are not putting in this information, which means that we are collecting it for most people. To reiterate, we are most focused on getting people tested without barriers. There are definitely people who would refuse testing if this identification was required.”

The potential for worsening disparities

Because both race and ethnicity data are required by order, the state’s own PrepMod registration system for vaccines essentially undermines its ability to get complete data by providing a “Decline to Answer” option under race and ethnicity.

Essien said he understands why self-reporting racial data might be perceived as a barrier for some.

“For decades, again, checking a box that says that you’re Black can literally change your life, it can change whether you get a loan at the bank, it can change whether you get that new job that you’ve been hoping for more, whether you’re even getting into a certain school,” Essien said.

Right now, however, Essien sees new consequences to not having that information.

“It’s just, in a way, potentially widening the disparities that we’re seeing,” Essien said. “I just believe as a researcher of course that data is power, and without those data we’re really gonna have a tough time moving this conversation forward.”

Philadelphia has epidemiologists on the back end working to fill in the gaps, but that effort usually falls well short. For the rest of the unidentified positives, Garrow said, the city assumes that the unknowns break down by race in roughly the same proportions as the cases whose race is known.

“Our assumption is that the breakdown in that unknown group is probably similar to what we’re seeing in the rest of cases,” he said. “So about a third of the cases that would be African American, about a quarter would be white, etc. So … it’s definitely not good enough for us to publish on, but it gives us at least the scope and scale of [what] the overall pandemic looks like in our case.”
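The assumption Garrow describes amounts to proportional allocation: the unknown cases are split across groups in the same shares as the cases whose race is known. A minimal Python sketch of that arithmetic follows. The function name `allocate_unknowns` and every count in it are hypothetical, invented for illustration, and are not Philadelphia's actual figures or method.

```python
def allocate_unknowns(known_counts, unknown_total):
    """Split unknown cases across groups in proportion to the known cases."""
    known_total = sum(known_counts.values())
    if known_total == 0:
        raise ValueError("need at least one case with a known race")
    return {
        group: count + unknown_total * count / known_total
        for group, count in known_counts.items()
    }


# Hypothetical counts, for illustration only.
known = {"Black": 400, "white": 300, "other": 300}
estimated = allocate_unknowns(known, unknown_total=200)

for group, cases in estimated.items():
    print(f"{group}: {cases:.0f} estimated cases")
```

By construction, this approach leaves each group's share of cases unchanged, so it cannot reveal a disparity that the known cases do not already show.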

That assumption has recently been put to the test, though — and research shows it may be flawed.

Katie Labgold is a doctoral student at Emory University in Atlanta, who with a team of fellow students and faculty was asked to partner with the Fulton County Board of Health on a study to analyze its missing COVID-19 data.

Some of the researchers were already working on developing methods for fixing epidemiologic errors such as missing data for cancer cases.

Then the lightbulb moment occurred: They thought that maybe they could apply those same methods to fill in the blanks for missing coronavirus case data.

“So in our study what we specifically did is, we use these types of epidemiology methods, known as quantitative bias analysis, to basically compute a couple of different measures, one of which is notification rates. So that’s the number of tests per population. We also calculated hospitalization rates and case fatality rates by each race and ethnicity group before accounting for the missing data and after accounting for missing data,” Labgold said.

It was a two-step process.

“We use information on the patient’s surname and where they lived to first attempt to predict what we usually call ‘impute their race and ethnicity,’ but we didn’t stop there. We know that [imputation] is not good for race and ethnicity most of the time,” Labgold said. “So we did use these quantitative bias analysis methods to actually correct for the fact that we knew the prediction would be wrong for people, and you’d be wrong a different amount of the time depending on what race you were predicted as.”
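One simple way to picture the second step is to reweight the predicted counts by how often each prediction turns out to be right. The Python sketch below is not the Emory team's method, which the article does not spell out. It assumes a hypothetical table, `p_true_given_pred`, that gives the probability of each true group for every predicted group, and all group labels and numbers are invented.

```python
def correct_predicted_counts(predicted_counts, p_true_given_pred):
    """Redistribute predicted counts using P(true group | predicted group)."""
    corrected = {}
    for pred_group, n in predicted_counts.items():
        for true_group, p in p_true_given_pred[pred_group].items():
            corrected[true_group] = corrected.get(true_group, 0.0) + n * p
    return corrected


# Hypothetical imputed counts and accuracy table, for illustration only.
predicted = {"A": 100, "B": 100}
p_true_given_pred = {
    "A": {"A": 0.9, "B": 0.1},  # predictions of "A" are right 90% of the time
    "B": {"A": 0.3, "B": 0.7},  # predictions of "B" are right 70% of the time
}
corrected = correct_predicted_counts(predicted, p_true_given_pred)
print(corrected)
```

Because the error rates differ by predicted group, the corrected totals shift toward group A here even though the raw predictions were evenly split.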

That allowed the researchers to create better measures by running simulations, according to Labgold. And what they found was an even wider racial disparity in cases between people of color and non-Hispanic whites.

“We actually saw that the disparity was 30% to 60% greater than the disparity that we calculated when we left out those missing cases,” Labgold said.

Labgold said that it’s standard to remove data when it is incomplete.

“And that sounds like that’s really crazy, but that’s really common practice for how we deal with missing data generally,” she said.

But it becomes a problem in cases such as this, where the racial data is not missing at random, she said.
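A toy calculation shows why that matters. Assume, hypothetically, that race goes unrecorded more often for one group than for another. Dropping the incomplete records then shrinks the apparent gap between the groups. The group labels, populations, case counts, and recording shares below are all invented for illustration.

```python
def rate_ratio(cases_a, pop_a, cases_b, pop_b):
    """Ratio of group A's case rate to group B's case rate."""
    return (cases_a / pop_a) / (cases_b / pop_b)


# Hypothetical populations and true case counts.
pop = {"A": 10_000, "B": 10_000}
true_cases = {"A": 600, "B": 300}

# Assumption: race is recorded for only half of group A's cases
# but for 80% of group B's, so the data is not missing at random.
share_recorded = {"A": 0.5, "B": 0.8}
recorded = {g: true_cases[g] * share_recorded[g] for g in true_cases}

true_rr = rate_ratio(true_cases["A"], pop["A"], true_cases["B"], pop["B"])
observed_rr = rate_ratio(recorded["A"], pop["A"], recorded["B"], pop["B"])

print(f"true rate ratio: {true_rr:.2f}")
print(f"rate ratio after dropping unknowns: {observed_rr:.2f}")
```

In this made-up example, group A's true case rate is twice group B's, but the complete records alone suggest a gap of only about 1.25 to 1.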

Labgold, like many of the public health officials WHYY News spoke to, said the testing labs were not prepared for something like this — and that despite federal and state mandates requiring data reporting, the issue is actually implementation.

Essien wants public officials to know that if they want to address this issue, there have to be more seats at the table.

“I think step one is really leaning on leaders in this space,” Essien said. “So not just having the handful of folks in the room who can be deemed to provide a diverse perspective, but actually going out to the experts — so folks who have been working in health equity for years and not just as researchers, but also doing and leading community-based participatory research.”
