Clinical Epidemiology, Volume 18
Association Between Race and COVID-19 Outcomes During the Pre-Vaccination Period in the United States (2020–2021)
Authors Adimadhyam S, Hawrusik R, Lee HS, Jjingo CJ, Gwira JA, Kempner ME, Wiley M, Petrone AB, Zhao Y, Stojanovic D, Eworuke E, Ajao A
Received 10 December 2025
Accepted for publication 17 March 2026
Published 24 April 2026 Volume 2026:18 581524
DOI https://doi.org/10.2147/CLEP.S581524
Review by Single anonymous peer review
Editor who approved publication: Dr Laura Horsfall
Sruthi Adimadhyam,1 Rebecca Hawrusik,2 Hye Seung Lee,3 Caroline J Jjingo,4 Jane A Gwira,5 Maria E Kempner,2 Megan Wiley,2 Andrew B Petrone,2 Yueqin Zhao,3 Danijela Stojanovic,6 Efe Eworuke,7 Adebola Ajao6
1Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, MA, USA; 2Department of Population Medicine, Harvard Pilgrim Health Care Institute, Boston, MA, USA; 3Division of Biometrics VII, Office of Biostatistics, Office of Translational Science, Center for Drug Evaluation and Research, Food and Drug Administration, Silver Spring, MD, USA; 4Division of Anti-Infectives, Office of Infectious Diseases, Office of New Drugs, Center for Drug Evaluation and Research, Food and Drug Administration, Silver Spring, MD, USA; 5Office of Biostatistics and Pharmacovigilance, Center for Biologics Evaluation and Research, Food and Drug Administration, Silver Spring, MD, USA; 6Office of Surveillance and Epidemiology, Center for Drug Evaluation and Research, Food and Drug Administration, Silver Spring, MD, USA; 7Epidemiology and Drug Safety, Real World Solutions, IQVIA Inc., Durham, NC, USA
Correspondence: Sruthi Adimadhyam, Email [email protected]
Objective: To determine the association between race, a proxy for social determinants of health, and COVID-19 outcomes adjusted for demographic, clinical, and socioeconomic differences.
Methods: We conducted a retrospective cohort study using administrative claims data in the Sentinel Distributed Database. We identified 841,628 individuals diagnosed with COVID-19; and separately, 133,773 individuals hospitalized with COVID-19 between April 1, 2020, and March 31, 2021, in the US. Eligible individuals required at least 6 months of enrollment in a health plan prior to cohort entry. Crude and adjusted associations between self-reported race (Asian, Black or African American, American Indian or Alaska Native [AIAN], Native Hawaiian or Other Pacific Islander [NHOPI], White, or Unknown) and COVID-19 outcomes (hospitalization with COVID-19; critical COVID; 30-day all-cause inpatient mortality) were determined using multivariable logistic regression.
Results: Of the 841,628 individuals with COVID-19, 45.5% were White, 42.1% were of Unknown race, 9.5% Black or African American, 2% Asian, 0.5% NHOPI, and 0.3% AIAN. All subpopulations had increased odds of hospitalization compared to the White population (AIAN, adjusted odds ratio 1.45 [95% confidence interval 1.11–1.90]; Asian population 1.58 [1.38–1.80]; Black or African American population 1.58 [1.40–1.78]; NHOPI 1.32 [1.22–1.42]). There were 133,773 individuals hospitalized with COVID-19, 56.2% of whom were White, 23.4% of Unknown race, 17.6% Black or African American, 1.5% Asian, 0.9% NHOPI, and 0.4% AIAN. Over half of all hospitalized individuals progressed to critical COVID (58%) and 14.2% died within 30 days. Progression to critical COVID was significantly higher in the NHOPI population compared to the White population (1.17 [1.03–1.32]). The AIAN population (1.52 [1.20–1.91]), Asian population (1.30 [1.13–1.49]), and NHOPI population (1.24 [1.06–1.45]) had significantly higher odds of 30-day inpatient mortality compared to the White population.
Conclusion: We identified significant differences in COVID-19 outcomes in different subgroups within a diverse US population.
Keywords: COVID-19, real-world data, multivariable analysis, health outcomes
Introduction
SARS-CoV-2, the causative agent of coronavirus disease 2019 (COVID-19), was first detected in the United States (US) on January 20, 2020.1 On May 11, 2023, the US federal government declared an end to the COVID-19 Public Health Emergency (PHE).2 As of January 2024, there had been over 6.7 million hospitalizations and over 1.1 million deaths from COVID-19 in the US.3
Race is a social construct often used in epidemiologic and health outcomes research across a range of diseases (eg, cardiovascular, metabolic, and infectious diseases such as HIV) as a proxy for social determinants of health, to capture cultural, systemic, and structural factors, including socioeconomic status and access to healthcare, across various subpopulations in the US.4–7
Prior literature estimated the burden of COVID-19 in different subgroups of the US population during the pre-vaccination era.8–12 These studies reported a higher burden of preexisting comorbidities such as asthma, chronic obstructive pulmonary disease (COPD), diabetes, hypertension, congestive heart failure (CHF), and chronic kidney disease (CKD),10,12,13 at the time of COVID-19 testing,11–13 and a higher likelihood of progression to COVID-19-related hospitalization8–10,14 among Black or African American and/or Hispanic/Latino populations compared to White populations. However, once hospitalized, Black or African American populations were no more likely than White populations to progress to severe COVID-19 (Intensive Care Unit [ICU] care, use of mechanical ventilation or renal replacement therapy) or in-hospital death.8,12–14 The validity of findings from these preliminary studies may be limited by small sample sizes, restriction to specific time periods or geographies, use of broad racial and/or ethnic categories, and lack of control for most socioeconomic factors.
The objective of our study was to assess the association between self-reported race as a proxy for social determinants of health and COVID-19 outcomes including hospitalization, critical COVID-19, and mortality, after controlling for baseline differences in demographic, clinical, and socioeconomic factors at a population level during the first full year of the COVID-19 pandemic, prior to widespread availability of COVID-19 vaccines. We used a large, nationally representative database for this study. The size of the database enabled us to retain granular race categories as captured in the database, allowing more accurate characterization of subpopulation-specific differences in the US. Preserving granularity in race categories prevents obscuring variation in disease outcomes and avoids the underestimation of disease burden in individual subpopulations that can result from averaging across heterogeneous groups. Reporting findings by granular categories can help better direct essential resources.
Materials and Methods
We conducted an observational retrospective cohort study using administrative claims data from national and regional insurers in the US to determine the association between race and COVID-19 outcomes from 2020 to 2021. This activity was conducted under the authority of the US Food and Drug Administration (FDA) as part of its Sentinel Initiative, a national system for medical product safety surveillance congressionally mandated by the FDA Amendments Act of 2007, Section 905. Consistent with the Health Insurance Portability and Accountability Act of 1996 (HIPAA) (45 C.F.R. §164.512(b)(1)(i)) and the Amended Common Rule, Sentinel activities are deemed to be public health surveillance by a relevant public health authority (here, FDA). Consequently, these activities are permitted under HIPAA without oversight by an Institutional Review Board or consent procedures. Data partners that participate in the Sentinel Initiative rely on FDA’s determination that Sentinel activities are public health surveillance and grant permission to access their data to Harvard Pilgrim Health Care Institute, which administers Sentinel activities acting under FDA authorization.15–17
Data Source
We used longitudinal administrative claims data from three insurers, including 2 large national health plans and 1 regional integrated delivery network, that provide medical and drug coverage for their commercially insured beneficiaries in the US. These 3 insurers participate in the FDA’s Sentinel Initiative18,19 and contribute the most up-to-date claims data as part of the Rapid COVID-19 Sentinel Distributed Database.20 This database includes deidentified data routinely collected as part of administering healthcare to insured beneficiaries including details on diagnoses recorded, procedures performed in the inpatient and outpatient care settings, and medications dispensed in the outpatient setting, billed to insurers. No primary data collection was conducted for this study. The Rapid COVID-19 Sentinel Distributed Database is a distributed network where each participating site retains physical and operational control of their own data that reside behind an institutional firewall to preserve patient privacy. Each site transforms their data into the Sentinel Common Data Model.21 These transformed data go through a rigorous quality assurance process prior to being made available for querying.22 Utilizing a Common Data Model format enables the distribution of data to all participating sites via one common analytic package.23
Study Cohorts
We created two study cohorts based on the following cohort entry (index) defining events: (1) COVID-19 diagnosis; and (2) hospitalization with COVID-19 occurring between April 1, 2020, and March 31, 2021. Our COVID-19 diagnosed cohort included any eligible individual with evidence of an International Classification of Diseases, Tenth Revision, Clinical Modification (ICD-10-CM) diagnosis code of U07.1 in any position on the discharge summary, in any care setting, or a positive result for a SARS-CoV-2 PCR test. Index date was the date of the first qualifying diagnosis or positive test result during the study period. The hospitalized with COVID-19 cohort included any eligible individual with evidence of an inpatient encounter with an ICD-10-CM diagnosis code of U07.1 documented in their discharge summary. Index date was the date of the first qualifying hospitalization during the study period. In both cohorts, eligible individuals had to be continuously enrolled in a health plan with medical and drug coverage for at least 183 days prior to the index date so that their baseline health could be characterized. We permitted gaps of up to 45 days when evaluating continuous enrollment, as these are typically considered administrative gaps and not lapses in insurance coverage.
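The continuous enrollment rule can be illustrated with a minimal Python sketch. This is not the Sentinel implementation: the function names, the inclusive-date convention, and the choice to treat the 183-day baseline window as ending the day before the index date are assumptions made for illustration only.

```python
from datetime import date, timedelta

def bridge_spans(spans, max_gap_days=45):
    """Merge enrollment spans separated by at most max_gap_days uncovered days.

    spans: iterable of (start, end) dates, both inclusive (assumed convention).
    """
    merged = []
    for start, end in sorted(spans):
        # Uncovered days between the previous span's end and this span's start.
        if merged and (start - merged[-1][1]).days - 1 <= max_gap_days:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

def meets_enrollment_requirement(spans, index_date,
                                 lookback_days=183, max_gap_days=45):
    """True if one bridged episode covers the whole baseline window."""
    window_start = index_date - timedelta(days=lookback_days)
    window_end = index_date - timedelta(days=1)  # assumption: window excludes index date
    return any(start <= window_start and end >= window_end
               for start, end in bridge_spans(spans, max_gap_days))

# Hypothetical member with a 30-day gap in June 2020 (bridged as administrative).
member_spans = [(date(2020, 1, 1), date(2020, 5, 31)),
                (date(2020, 7, 1), date(2020, 12, 31))]
print(meets_enrollment_requirement(member_spans, date(2020, 10, 1)))
```

Under these assumptions, a 30-day gap is bridged and the member qualifies, whereas a gap longer than 45 days splits the enrollment history into separate episodes, and only the episode containing the index date counts toward the 183-day requirement.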
Main Independent Variable
Race (American Indian or Alaska Native, Asian, Black or African American, Native Hawaiian or Other Pacific Islander, White, or Unknown race) was the main independent variable in the analyses. Race is self-reported by beneficiaries through closed-ended questionnaires administered at the time of enrollment in a health plan. The Sentinel Common Data Model accommodates one value of race per individual. Self-reported ethnicity was not adequately captured (58% missing24) in the Sentinel Distributed Database and therefore not analyzed.
Study Outcomes
Individuals with COVID-19 were evaluated for the occurrence of COVID-19 related hospitalization within 30 days of the index date. COVID-19 related hospitalization was defined as an inpatient encounter with ICD-10-CM diagnosis code of U07.1 on the discharge summary. In the cohort of individuals hospitalized with COVID-19, we evaluated the occurrence of 2 outcomes within 30 days of index date: (1) critical COVID-19; and separately, (2) inpatient mortality. Critical COVID-19 was defined as an inpatient encounter with ICD-10-CM diagnosis code of U07.1 with evidence of admission to the ICU or requiring mechanical ventilation, extracorporeal membrane oxygenation, or renal replacement therapy with an accompanying diagnosis of acute renal failure.25 Inpatient mortality was defined as documentation of a patient discharge disposition of expired during an inpatient encounter. Long-term outcomes were not considered because of the lack of knowledge at the time of study conduct as to what these might be.
Demographic, Clinical, and Socioeconomic Characteristics
We assessed age, sex, census bureau region, urbanicity, and proxies for socioeconomic status (median household income, median property value, and percent unemployment) on the index date. Individual-level data on income, housing, and employment are often lacking in administrative claims data. Therefore, proxies for socioeconomic status and urbanicity were measured at the ZIP code level of an individual’s most recent primary residence available on the index date and were aggregated across all individuals in each racial population. Socioeconomic markers for each ZIP code tabulation area in the US were gathered from the 2019 American Community Survey (ACS) and mapped to the level of ZIP codes.26–30
We described various clinical characteristics at baseline (defined as 183 days prior to index date) as evidence of at least one healthcare encounter with diagnosis, procedure, or treatments pertaining to the characteristics. These characteristics were deemed relevant to the prognosis and risk of severe illness due to COVID-19 by the US Centers for Disease Control and Prevention31 and included metabolic disease and related chronic comorbidities (diabetes, hypertension, obesity, cardiovascular disease, etc.), immunosuppressive conditions (cancer, cystic fibrosis, etc.), use of immunosuppressive medications, and respiratory conditions (asthma, COPD) (Table 1 and Additional File 1). Additionally, we described the use of select cardiovascular therapies 30 days prior to index date in line with hypotheses at the time of study conduct regarding increased COVID-19 risks for those taking these medications.32 We described the presence of COVID-19 related symptoms the day prior through 14 days after index date.
Table 1 Characteristics of Individuals with COVID-19* Between April 1, 2020, and March 31, 2021, in the Sentinel Distributed Database, by Race
Statistical Analysis
The association between race and COVID-19 outcomes was determined at each Data Partner using unadjusted and multivariable adjusted logistic regression. We ran two pre-specified multivariable adjusted models for each cohort. The first model adjusted for demographic and clinical factors associated with illness severity (age, sex, combined comorbidity score,33,34 alcohol or drug abuse, asthma, autoimmune conditions, cancer, CKD, COPD, CHF, coronary artery disease (CAD), cystic fibrosis, diabetes mellitus, HIV/AIDS, hypertension, interstitial lung disease (ILD), liver disease, neurological conditions, obesity, pulmonary conditions, sickle cell disease, smoking, solid organ or stem cell transplant, vascular disease, chemotherapy, immunosuppressants, immune modulators, non-systemic glucocorticoids, systemic glucocorticoids, angiotensin converting enzyme inhibitors, angiotensin receptor blockers, and vasopressors). The second model additionally adjusted for socioeconomic factors (median household income, median property value, and percent unemployment in ZIP code of primary residence) and was run in the subset of the cohort with non-missing values for all proxies of socioeconomic status. In anticipation of sparse data, we used the Firth option in the logistic regression procedure, a simple penalized-likelihood approach that mitigates the bias caused by rare events in a data set.35
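To illustrate what a Firth correction does with sparse data, the sketch below fits a penalized logistic model with an intercept and a single binary exposure from grouped counts. It is a simplified Python illustration, not the SAS procedure or the multivariable models used in the study; the function name and the example counts are hypothetical.

```python
import math

def firth_logit_binary(events_exposed, n_exposed, events_ref, n_ref,
                       max_iter=100, tol=1e-12):
    """Firth-penalized logistic fit of outcome ~ intercept + binary exposure.

    Returns (b0, b1); b1 is the penalized log odds ratio (exposed vs reference).
    """
    groups = [(0.0, n_ref, events_ref), (1.0, n_exposed, events_exposed)]
    b0 = b1 = 0.0
    for _ in range(max_iter):
        # Fisher information for (b0, b1), accumulated over both groups.
        i00 = i01 = i11 = 0.0
        fitted = []
        for x, n, y in groups:
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = n * p * (1.0 - p)
            i00 += w
            i01 += w * x
            i11 += w * x * x
            fitted.append((x, n, y, p, w))
        det = i00 * i11 - i01 * i01
        v00, v01, v11 = i11 / det, -i01 / det, i00 / det  # inverse information

        # Firth-modified score: ordinary residual plus h * (0.5 - p),
        # where h is the summed leverage (hat value) of the group.
        u0 = u1 = 0.0
        for x, n, y, p, w in fitted:
            h = w * (v00 + 2.0 * v01 * x + v11 * x * x)
            r = (y - n * p) + h * (0.5 - p)
            u0 += r
            u1 += r * x

        step0 = v00 * u0 + v01 * u1
        step1 = v01 * u0 + v11 * u1
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < tol:
            break
    return b0, b1

# Hypothetical sparse table: 0 events among 20 exposed, 10 among 100 reference.
b0, b1 = firth_logit_binary(0, 20, 10, 100)
print(f"penalized OR = {math.exp(b1):.3f}")
```

With zero events in one group, the ordinary maximum likelihood estimate of the log odds ratio is not finite, while the penalized estimate is. For this saturated two-group model, the Firth estimate coincides with the log odds ratio obtained after adding 0.5 to each cell of the 2×2 table, which gives a convenient check on the iteration.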
Execution of Distributed Analysis
Each site shared aggregated summary data and effect estimates generated from a common analytic package with the coordinating center. Results from each site were reviewed prior to aggregation in a random-effects meta-analysis.36 All analyses were conducted using the Sentinel Routine Querying System37 version 11.3.0 with additional custom programming using SAS 9.4 (SAS Institute, Cary, NC). The analytic package (ie, set of SAS programs) used to query the Sentinel Distributed Database is publicly posted on the Sentinel website for transparency and replicability.38 These programs may be independently executed by other researchers to replicate findings in similar databases, provided the data are formatted in the Sentinel Common Data Model.
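As an illustration of how site-level estimates can be pooled, the following Python sketch applies a DerSimonian-Laird random-effects estimator to log odds ratios. The choice of estimator is an assumption (the text specifies only a random-effects meta-analysis), the function names are hypothetical, and the site-level values are invented for the example rather than taken from the study.

```python
import math

def log_or_and_se(or_point, ci_low, ci_high, z=1.96):
    """Convert an odds ratio and its 95% CI into a log-OR and standard error."""
    return math.log(or_point), (math.log(ci_high) - math.log(ci_low)) / (2 * z)

def dersimonian_laird(log_ors, ses):
    """Pool site-level log-ORs; returns (pooled log-OR, its SE, tau^2)."""
    w = [1.0 / se ** 2 for se in ses]          # inverse-variance (fixed) weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_ors)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_ors))  # Cochran's Q
    k = len(log_ors)
    c = sw - sum(wi ** 2 for wi in w) / sw
    # Between-site variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 and c > 0 else 0.0
    w_re = [1.0 / (se ** 2 + tau2) for se in ses]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, log_ors)) / sum(w_re)
    return pooled, math.sqrt(1.0 / sum(w_re)), tau2

# Hypothetical site-level estimates (OR, lower, upper); not from the study.
site_results = [(1.60, 1.30, 1.97), (1.85, 1.55, 2.21), (1.45, 1.05, 2.00)]
log_ors, ses = zip(*(log_or_and_se(*r) for r in site_results))
pooled, se, tau2 = dersimonian_laird(log_ors, ses)
print(f"pooled OR = {math.exp(pooled):.2f} "
      f"[{math.exp(pooled - 1.96 * se):.2f}, {math.exp(pooled + 1.96 * se):.2f}]")
```

Only the effect estimate and its confidence interval from each site are needed, which matches a distributed design in which individual-level data never leave the participating sites.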
Results
Of the 985,837 individuals with a COVID-19 diagnosis claim in the database during the study period, 841,628 (85.4%) met the continuous enrollment eligibility requirement for cohort entry. Similarly, 133,773 (86.8%) of the 154,138 individuals hospitalized for COVID-19 met the continuous enrollment eligibility requirement.
Characteristics of Individuals Diagnosed with COVID-19
A total of 841,628 individuals were diagnosed with COVID-19 between April 1, 2020, and March 31, 2021. Of these, 45.5% were White, 42.1% Unknown, 9.5% Black or African American, 2% Asian, 0.5% Native Hawaiian or Other Pacific Islander, and 0.3% American Indian or Alaska Native. The mean age overall was 55.2 (±19.7) years, with the Asian population being the youngest (48.5 ± 15.4 years) and the Native Hawaiian or Other Pacific Islander population the oldest (66.6 ± 10.2 years). Prevalence of clinical comorbidities varied by race (Table 1). The Native Hawaiian or Other Pacific Islander population had the highest prevalence of CAD, liver disease, and ILD at baseline compared to all other populations. The American Indian or Alaska Native and Black or African American populations had the highest prevalence of alcohol and drug abuse. The Black or African American population had the highest prevalence of hypertension, diabetes, obesity, asthma, COPD, and CKD. All three proxies for socioeconomic status were available for 92% of this cohort. There were no numerical differences in baseline characteristics between the overall population and the subset of the population with known socioeconomic status (Additional File 2). Compared to other populations, the Asian population predominantly lived in neighborhoods with the highest household incomes and property values and the lowest unemployment, whereas the Black or African American population predominantly lived in neighborhoods with the lowest household incomes and property values and the highest unemployment.
Hospitalization with COVID-19 Following Diagnosis with COVID-19
Of the 841,628 individuals diagnosed with COVID-19, 129,802 (15.4%) had the study outcome of hospitalization with COVID-19 in the 30-day follow-up period. Individuals hospitalized with COVID-19 tended to be older (71.6 ± 13.1 vs 52.2 ± 19.2 years) and had a higher burden of comorbidities at baseline compared to those without the outcome at 30 days (Additional File 3).
Unadjusted analyses showed that the Black or African American population diagnosed with COVID-19 had higher odds of hospitalization with COVID-19 relative to the White population (odds ratio [OR] 1.55 [95% confidence interval (CI) 1.27–1.90]). After controlling for baseline demographic and clinical risk factors, all subpopulations had higher odds of hospitalization with COVID-19 relative to the White population (Table 2). The adjusted odds of hospitalization were highest for the Black or African American population (OR 1.70 [95% CI 1.46–1.97]), followed by the American Indian or Alaska Native, Asian, and Native Hawaiian or Other Pacific Islander populations. The increased odds of hospitalization were partially explained by baseline differences in socioeconomic factors, with the exception of the Asian population. Among the Asian population, additionally adjusting for socioeconomic factors further increased the odds of hospitalization relative to the White population.
Table 2 Association Between Race and COVID-19 Hospitalization Among Those with COVID-19 Between April 1, 2020, and March 31, 2021, in the Sentinel Distributed Database
Characteristics of Individuals Hospitalized with COVID-19
There were 133,773 individuals hospitalized with COVID-19 between April 1, 2020, and March 31, 2021. Of these, 56.2% were White, 23.4% Unknown, 17.6% Black or African American, 1.5% Asian, 0.9% Native Hawaiian or Other Pacific Islander, and 0.4% American Indian or Alaska Native. The mean age overall was 71.6 (±13.1) years, with the Asian population being the youngest (67.5 ± 12.9 years) and the White population the oldest (74.5 ± 11.1 years). Among hospitalized individuals, the White population had the highest prevalence of COPD, CAD, neurological conditions, and ILD when compared to other racial populations. The American Indian or Alaska Native population had the highest prevalence of alcohol or drug abuse, liver disease, autoimmune conditions, and pulmonary conditions, whereas the Black or African American population tended to have the highest prevalence of all other baseline chronic comorbidities. Similar to the cohort of individuals diagnosed with COVID-19, the Asian population resided in neighborhoods with the highest median incomes and property values and the lowest unemployment, while the Black or African American population resided in neighborhoods with the lowest median incomes and highest unemployment (Table 3 and Additional File 4).
Table 3 Characteristics of Individuals Hospitalized with COVID-19* Between April 1, 2020, and March 31, 2021, in the Sentinel Distributed Database, by Race
Critical COVID-19 Following Hospitalization with COVID-19
Of the 133,773 individuals hospitalized with COVID-19, 77,349 (58%) developed critical COVID within 30 days of admission. Individuals with critical COVID were slightly older than those without (72.1 ± 12.3 vs 71.0 ± 14.2 years) and had a higher prevalence of comorbidities at baseline (Additional File 5).
Crude analyses showed that the Native Hawaiian or Other Pacific Islander population was the only population with increased odds of critical COVID-19 relative to the White population (OR 1.23 [95% CI 1.09–1.39]) (Table 4). This increase persisted after controlling for demographic and clinical factors (OR 1.22 [1.04–1.42]) and socioeconomic factors (OR 1.17 [1.03–1.32]). After controlling for all risk factors, a small but non-significant increase in the odds of critical COVID-19 was also observed among the Asian population (OR 1.13 [0.99–1.29]).
Table 4 Association Between Race and Critical COVID or Inpatient Mortality Among Those Hospitalized with COVID-19 Between April 1, 2020, and March 31, 2021, in the Sentinel Distributed Database
Mortality Following Hospitalization with COVID-19
Of the 133,773 individuals hospitalized with COVID-19, 19,002 (14.2%) had a discharge disposition of expired within 30 days of admission. The deceased were on average older than those discharged alive (76.6 ± 10 vs 70.8 ± 13.4 years) and had a higher prevalence of comorbidities at baseline (Additional File 6).
Crude analyses did not show any significant difference in 30-day inpatient mortality following hospitalization with COVID-19 in the overall population (Table 4). After adjusting for demographic and clinical risk factors, the American Indian or Alaska Native population (OR 1.57 [1.24–1.97]), Asian population (OR 1.26 [1.10–1.44]), and Native Hawaiian or Other Pacific Islander population (OR 1.22 [1.04–1.42]) had increased odds of mortality compared to the White population. After adjusting for socioeconomic differences, the increased burden of mortality in these populations persisted (American Indian or Alaska Native population, OR 1.52 [1.20–1.91]; Asian population, OR 1.30 [1.13–1.49]; Native Hawaiian or Other Pacific Islander population, OR 1.24 [1.06–1.45]).
Discussion
We evaluated differences in COVID-19 outcomes experienced by various subpopulations within a large US population during the pre-vaccination era (between April 2020 and March 2021) and found that American Indian or Alaska Native, Asian, and Native Hawaiian or Other Pacific Islander populations had an increased burden of hospitalization with COVID-19 and 30-day inpatient mortality compared to the White population. We interpret these non-causal associations as a reflection of underlying social determinants of health in subpopulations. Our results are consistent with other smaller studies that have also shown a significantly increased risk of hospitalization in these populations compared to the White population after adjusting for baseline demographic, clinical8–11 and socioeconomic factors11 over the initial months of the pandemic. We found no increase in COVID-19 related mortality among the Black or African American population vs the White population, as shown in some prior smaller studies.9,10 Using a large national and geographically diverse dataset representing the insured US population, our study expands upon existing evidence on the disproportionate burden of the COVID-19 pandemic in the pre-vaccination period, when effective COVID-19 interventions were minimal, by including granular categories of self-reported race and more effectively addressing confounding variables such as clinical and socioeconomic factors.
We saw an increased risk of mortality among AIAN, Asian, and NHOPI populations compared to the White population. Our findings are corroborated in scientific literature reporting excess mortality due to COVID-19 for AIAN and Asian groups.39,40 Life expectancy for the AIAN population before the availability of COVID-19 vaccines in 2021 declined by 6 years compared to estimates from 2019, with differences attributed to poor baseline health and systemic factors such as access to quality healthcare and infrastructure. The Asian population appeared healthier and socioeconomically advantaged in our study at baseline. However, after adjusting for demographic, clinical, and socioeconomic factors, we saw higher odds of hospitalization and inpatient mortality in the Asian population compared to the White population. This is consistent with a study that examined COVID-19 disparities among Asian Americans.41 Potential explanations include factors such as healthcare-related occupation and multigenerational household living, which are not captured in administrative claims data but are associated with an increased risk of transmission of SARS-CoV-2.42 The AIAN and NHOPI subpopulations have not typically been represented in other studies owing to insufficient sample size or inconsistent data collection.43 Race, conceptualized in our study as a proxy for social determinants of health, can capture some but not all social experiences, and therefore may not offer a fully representative view of the population. Adequately capturing and accounting for social determinants of health in real-world data might advance our understanding of inequities in health outcomes across subpopulations.
While we observed a higher burden of chronic diseases among various subpopulations in the 183 days prior to COVID-19 diagnosis or hospitalization, it is important to note that such differences in prevalence of comorbidities predate the COVID-19 pandemic in the US. Preexisting differences in burden of chronic diseases, socioeconomic factors, and access to health care that can increase the risk of poor health outcomes were reinforced during the COVID-19 pandemic.4 These data suggest that targeted health interventions, such as funding for access to and delivery of effective preventive care services, better directed to communities in proportion to their need, are necessary to reduce the burden of comorbidities across the US population and disparities in health outcomes.
Our study findings should be considered in the context of some limitations. The administrative claims database used for this study has a high proportion of individuals of unknown race (42%). Unknown race can be a heterogeneous category comprising individuals who chose not to report their race, who, depending on the site, reported more than one race, or whose self-reported race was not available. We addressed this analytically by treating Unknown race as its own category of self-reported race alongside other subpopulations. We descriptively compared the baseline characteristics of those with known and unknown race and included the Unknown group alongside other categories of self-reported race in the regression analyses, with the White population as the referent category. The risk estimates for the Unknown group, while reported, are not interpreted or discussed; we deemed these estimates uninterpretable given the expected heterogeneity in the Unknown group. Socioeconomic status was captured at the neighborhood level as opposed to the individual level. Therefore, complete control of confounding by individual socioeconomic status was not feasible. Our study period included a period of rapid transmission of the virus as well as evolution of clinical practices. Therefore, our findings represent differences in COVID-19 burden across groups at an average, aggregate level. Our findings may be subject to bias if some subgroups were less likely to be admitted to the hospital due to discriminatory practices, which could limit the generalizability of our findings to those with severe presentation of COVID-19. However, we note increased risks even after controlling for measured baseline differences, highlighting potential persistent inequities. Lastly, our study population is representative of commercially insured beneficiaries in the US, so our data may underestimate the differences among uninsured individuals diagnosed or hospitalized with COVID-19.
As a proxy for social determinants of health, race is frequently linked to differences in health outcomes. There continue to be many challenges stemming from a multitude of factors, such as mistrust, language and cultural differences, health literacy, and limited access within the health care system, that limit access to quality educational, economic, residential, and health opportunities across all populations in the US. The scale of the COVID-19 pandemic in the US has further illuminated preexisting differences in disease burden and health outcomes across US subpopulations.4 Health outcomes research frequently utilizes secondary sources of data such as administrative claims data. Our study shows that these databases are imperfect in their ability to capture and account for social determinants of health. Therefore, there is a need for greater investment of resources to improve collection of social determinants of health through innovative data linkages44 in secondary healthcare databases, to improve interpretability of population-based health outcomes research and to inform targeting of healthcare resources to the right populations.
Conclusions
In conclusion, our study, leveraging a large US commercially insured population during the pre-vaccination era (April 2020–March 2021), highlighted significant differences in COVID-19 outcomes across US subpopulations. We found that American Indian or Alaska Native, Asian, and Native Hawaiian or Other Pacific Islander populations experienced an increased burden of hospitalization with COVID-19 and 30-day inpatient mortality compared to the White population, even after accounting for demographic, clinical, and neighborhood-level socioeconomic factors. The persistence of inequities, despite controlling for numerous baseline factors, points to the need for targeted healthcare interventions that can address the longstanding, pre-existing differences in the burden of chronic diseases across various US subpopulations to mitigate the impact of future public health emergencies. Our study shows that race, used as a proxy for social determinants of health, is imperfectly captured in administrative claims data that are often used for observational research in the US. Improved collection of social determinants of health, including race, in secondary databases will help researchers and organizations identify patient risk factors, design targeted interventions, and advance health equity in the US.
Data Sharing Statement
The data generated in this study are not publicly available. Sentinel uses a distributed data approach in which Data Partners maintain physical and operational control of their own electronic health data after transforming it into a common data model. Sentinel does not save, maintain, or post individual level datasets to preserve patient privacy. The code that was used for querying standardized Sentinel data used for this project and related documentation is posted on the Sentinel website and referenced in Methods.
Ethics Approval and Informed Consent
This project was deemed to be a public health surveillance activity conducted under the authority of the Food and Drug Administration (FDA) and was therefore exempt from review by an Institutional Review Board.
Acknowledgments
Many thanks are due to Data Partners who provided data used in the analysis: CVS Health (Aetna), Blue Bell, PA; Humana Healthcare Research Inc., Louisville, KY; Kaiser Permanente Colorado Institute for Health Research, Aurora, CO.
The data reported in this manuscript were presented at the 39th Annual Meeting of the International Society for Pharmacoepidemiology (ISPE), held August 23-27, 2023, in Halifax, Nova Scotia, Canada. The corresponding conference abstract was published in Pharmacoepidemiology and Drug Safety (doi:10.1002/pds.5687). The poster presentation is available on the Sentinel Initiative website: https://www.sentinelinitiative.org/news-events/publications-presentations/association-between-race-and-covid-19-outcomes-united-states
Funding
This publication was supported by the Food and Drug Administration (FDA) Office of Minority Health and Health Equity of the US Department of Health and Human Services (OMHHE/HHS) as part of a financial assistance award [FAIN] totaling $150,000 with 100% funded by FDA OMHHE/HHS. The contents are those of the author(s) and do not necessarily represent the official views of, nor an endorsement, by FDA/HHS, or the US Government.
Disclosure
SA, RH, MEK, MW, ABP, and AA are employees of Harvard Pilgrim Health Care Institute, a non-profit organization that conducts work for government and private organizations, including pharmaceutical companies. EE was an employee of the US Food and Drug Administration during the conduct of this study. The authors report no other conflicts of interest in this work.
References
1. Centers for Disease Control and Prevention. CDC museum COVID-19 timeline. 2023. Available from: https://www.cdc.gov/museum/timeline/covid19.html.
2. Centers for Disease Control and Prevention. End of the federal COVID-19 public health emergency (PHE) declaration. 2020. Available from: https://www.cdc.gov/coronavirus/2019-ncov/your-health/end-of-phe.html.
3. Centers for Disease Control and Prevention. COVID data tracker. 2023. Available from: https://covid.cdc.gov/covid-data-tracker/#datatracker-home.
4. Aleligne YK, Appiah D, Ebong IA. Racial disparities in coronavirus disease 2019 (COVID-19) outcomes. Curr Opin Cardiol. 2021;36(3):360–366. doi:10.1097/HCO.0000000000000847
5. Cooper RS. Social inequality, ethnicity and cardiovascular disease. Int J Epidemiol. 2001;30(suppl_1):S48. doi:10.1093/ije/30.suppl_1.S48
6. Hassan S, Gujral UP, Quarells RC, et al. Global Inequity in Diabetes 3. Lancet Diabetes Endocrinol. 2023;11(7):509–524. doi:10.1016/S2213-8587(23)00129-8
7. Davy-Mendez T, Napravnik S, Eron JJ, et al. Racial, ethnic, and gender disparities in hospitalizations among persons with HIV in the United States and Canada, 2005–2015. AIDS Lond Engl. 2021;35(8):1229–1239. doi:10.1097/QAD.0000000000002876
8. Adegunsoye A, Ventura IB, Liarski VM. Association of black race with outcomes in COVID-19 disease: a retrospective cohort study. Ann Am Thorac Soc. 2020;17(10):1336–1339. doi:10.1513/AnnalsATS.202006-583RL
9. Golestaneh L, Neugarten J, Fisher M, et al. The association of race and COVID-19 mortality. EClinicalMedicine. 2020;25:100455. doi:10.1016/j.eclinm.2020.100455
10. Egede LE, Walker RJ, Garacci E, Raymond JR. Racial/ethnic differences in COVID-19 screening, hospitalization, and mortality in Southeast Wisconsin. Health Affairs Project Hope. 2020;39(11):1926–1934. doi:10.1377/hlthaff.2020.01081
11. Muñoz-Price LS, Nattinger AB, Rivera F, et al. Racial disparities in incidence and outcomes among patients with COVID-19. JAMA Network Open. 2020;3(9):e2021892. doi:10.1001/jamanetworkopen.2020.21892
12. Price-Haywood EG, Burton J, Fort D, Seoane L. Hospitalization and mortality among Black patients and White patients with Covid-19. N Engl J Med. 2020;382(26):2534–2543. doi:10.1056/NEJMsa2011686
13. Yehia BR, Winegar A, Fogel R, et al. Association of race with mortality among patients hospitalized with coronavirus disease 2019 (COVID-19) at 92 US hospitals. JAMA Network Open. 2020;3(8):e2018039. doi:10.1001/jamanetworkopen.2020.18039
14. Gu T, Mack JA, Salvatore M, et al. Characteristics associated with racial/ethnic disparities in COVID-19 outcomes in an academic health care system. JAMA Network Open. 2020;3(10):e2025197. doi:10.1001/jamanetworkopen.2020.25197
15. 45 CFR Part 46 Subpart A -- Basic HHS policy for protection of human research subjects. Available from: https://www.ecfr.gov/current/title-45/subtitle-A/subchapter-A/part-46/subpart-A.
16. Federal policy for the protection of human subjects. Federal Register. 2017. Available from: https://www.federalregister.gov/documents/2017/01/19/2017-01058/federal-policy-for-the-protection-of-human-subjects.
17. Rosati K, Jorgensen N, Plc CB. HIPAA and Common Rule compliance in the Sentinel Initiative. 49.
18. Ball R, Robb M, Anderson SA, Pan GD. The FDA’s Sentinel Initiative—A comprehensive approach to medical product surveillance. Clin Pharmacol Ther. 2016;99(3):265–268. doi:10.1002/cpt.320
19. Platt R, Brown JS, Robb M, et al. The FDA Sentinel Initiative — an evolving national resource. N Engl J Med. 2018;379(22):2091–2093. doi:10.1056/NEJMp1809643
20. Cocoros NM, Fuller CC, Adimadhyam S, et al. A COVID-19-ready public health surveillance system: the Food and Drug Administration’s Sentinel System. Pharmacoepidemiol Drug Saf. 2021;30(7):827–837. doi:10.1002/pds.5240
21. Sentinel Common Data Model | Sentinel Initiative. Sentinel Initiative. Available from: https://www.sentinelinitiative.org/methods-data-tools/sentinel-common-data-model.
22. How Sentinel gets its data | Sentinel Initiative. Available from: https://www.sentinelinitiative.org/about/how-sentinel-gets-its-data.
23. Toh S, Platt R, Steiner JF, Brown JS. Comparative-effectiveness research in distributed health data networks. Clin Pharmacol Ther. 2011;90(6):883–887. doi:10.1038/clpt.2011.236
24. Key database statistics | Sentinel Initiative. Available from: https://www.sentinelinitiative.org/about/key-database-statistics#ethnicity-distribution-of-members-in-the-sentinel-distributed-database.
25. World Health Organization. COVID-19 therapeutic trial synopsis. 2020. Available from: https://cdn.who.int/media/docs/default-source/blue-print/covid-19-therapeutic-trial-synopsis.pdf?sfvrsn=44b83344_1&download=true.
26. U.S. Census Bureau. About the American community survey. Census.gov. Available from: https://www.census.gov/programs-surveys/acs/about.html.
27. U.S. Census Bureau. 2019: ACS 5-year estimates data profiles. Selected economic characteristics. Median household income [CSV data file]. Available from: https://api.census.gov/data/2019/acs/acs5/profile?get=NAME,DP03_0062E&for=zip%20code%20tabulation%20area:*.
28. U.S. Census Bureau. 2019: ACS 5-year estimates subject tables. Employment status. Percent unemployment [CSV data file]. Available from: https://api.census.gov/data/2019/acs/acs5/subject?get=NAME,S2301_C04_001E&for=zip%20code%20tabulation%20area:*.
29. U.S. Census Bureau. 2019: ACS 5-year estimates data profiles. Selected housing characteristics. Median property value [CSV data file]. Available from: https://api.census.gov/data/2019/acs/acs5/profile?get=NAME,DP04_0089E&for=zip%20code%20tabulation%20area:*.
30. ZIP code to ZCTA crosswalk – UDS mapper. Available from: https://udsmapper.org/zip-code-to-zcta-crosswalk/.
31. CDC. People with certain medical conditions. Centers for Disease Control and Prevention. 2022. Available from: https://www.cdc.gov/coronavirus/2019-ncov/need-extra-precautions/people-with-medical-conditions.html.
32. Sommerstein R, Kochen MM, Messerli FH, Gräni C. Coronavirus disease 2019 (COVID-19): do angiotensin-converting enzyme inhibitors/angiotensin receptor blockers have a biphasic effect? J Am Heart Assoc. 2020;9(7):e016509. doi:10.1161/JAHA.120.016509
33. Gagne JJ, Glynn RJ, Avorn J, Levin R, Schneeweiss S. A combined comorbidity score predicted mortality in elderly patients better than existing scores. J Clin Epidemiol. 2011;64(7):749–759. doi:10.1016/j.jclinepi.2010.10.004
34. Sun JW, Rogers JR, Her Q, et al. Adaptation and validation of the combined comorbidity score for ICD-10-CM. Med Care. 2017;55(12):1046–1051. doi:10.1097/MLR.0000000000000824
35. Karabon P. Rare events or non-convergence with a binary outcome? The power of Firth regression in PROC LOGISTIC. SAS Global Forum 2020. 2020;2020:9.
36. Hertzmark E, Spiegelman D. The SAS METAANAL Macro. 2017. Available from: https://ysph.yale.edu/cmips/research/software/metaanal_340162_284_47911_v2.pdf.
37. Browse Analytic Development / qrp - Sentinel Version Control System. Available from: https://dev.sentinelsystem.org/projects/AD/repos/qrp/browse.
38. Racial differences in COVID-19 outcomes (2020-2021) | Sentinel Initiative. Available from: https://www.sentinelinitiative.org/studies/drugs/individual-drug-analyses/racial-differences-covid-19-outcomes-2020-2021.
39. Yuan AY, Atanasov V, Barreto N, et al. Understanding racial/ethnic disparities in COVID-19 mortality using a novel metric: COVID excess mortality percentage. Am J Epidemiol. 2024;193(6):853–862. doi:10.1093/aje/kwae007
40. Goldman N, Park SS, Beltrán-Sánchez H. Life expectancy among Native Americans during the COVID-19 pandemic: estimates, uncertainty, and obstacles. Am J Epidemiol. 2023;193(6):846–852. doi:10.1093/aje/kwad244
41. Kalyanaraman Marcello R, Dolle J, Tariq A, et al. Disaggregating Asian race reveals COVID-19 disparities among Asian American patients at New York City’s public hospital system. Public Health Rep. 2022;137(2):317–325. doi:10.1177/00333549211061313
42. Chin MK, Đoàn LN, Chong SK, Wong JA, Kwon SC, Yi SS. Asian American subgroups and the COVID-19 experience: what we know and still don’t know. Health Aff Forefr. doi:10.1377/forefront.20210519.651079
43. Inconsistent data masks the pandemic’s toll on Native Hawaiians and Pacific Islanders. PBS News. 2022. Available from: https://www.pbs.org/newshour/show/inconsistent-data-masks-the-pandemics-toll-on-native-hawaiians-and-pacific-islanders.
44. Tonnu-Mihara I, Rojas-Lazic C, Mack M, Eshete B, Stephenson JJ, Grabner M. SA42 innovative methods for integrating social determinants of health data with administrative claims to facilitate health equity research. Value Health. 2023;26(6):S404–S405. doi:10.1016/j.jval.2023.03.2258
© 2026 The Author(s). This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at https://www.dovepress.com/terms and incorporate the Creative Commons Attribution - Non Commercial (unported, 4.0) License. By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms.
