International Journal of Chronic Obstructive Pulmonary Disease, Volume 21

Development and Validation of a Predictive Nomogram for Progression from Pre-COPD to Spirometric COPD: A Multicenter Retrospective Cohort Study

Authors Wu J, Zhang H, Yang L, Gan J, Wang G, Tang X, Xian J, Zhu L, Li Y, Li W

Received 1 December 2025

Accepted for publication 16 April 2026

Published 3 May 2026 Volume 2026:21 580462

DOI https://doi.org/10.2147/COPD.S580462

Checked for plagiarism Yes

Review by Single anonymous peer review

Peer reviewer comments 2

Editor who approved publication: Prof. Dr. Zijing Zhou



Jiaxuan Wu,1–5,* Huohuo Zhang,1–5,* Lan Yang,1–5,* Jiadi Gan,1–5 Guoqing Wang,6 Xiaolong Tang,1–5 Jinghong Xian,1–5 Lin Zhu,7 Yalun Li,1–5 Weimin Li1–5

1Department of Pulmonary and Critical Care Medicine, West China Hospital, State Key Laboratory of Respiratory Health and Multimorbidity, Sichuan University, Chengdu, Sichuan, People’s Republic of China; 2Institute of Respiratory Health, Frontiers Science Center for Disease-Related Molecular Network, West China Hospital, Sichuan University, Chengdu, Sichuan, People’s Republic of China; 3Precision Medicine Center, Precision Medicine Key Laboratory of Sichuan Province, West China Hospital, Sichuan University, Chengdu, Sichuan, People’s Republic of China; 4The Research Units of West China, Chinese Academy of Medical Sciences, West China Hospital, Chengdu, Sichuan, People’s Republic of China; 5Institute of Respiratory Health and Multimorbidity, West China Hospital, Sichuan University, Chengdu, Sichuan, People’s Republic of China; 6Beijing Tsinghua Changgung Hospital, School of Clinical Medicine, Tsinghua University, Beijing, 100084, People’s Republic of China; 7General Office, Cohort Study Center, Institute of Respiratory Health and Multimorbidity, West China Hospital, Sichuan University, Chengdu, People’s Republic of China

*These authors contributed equally to this work

Correspondence: Weimin Li, Department of Pulmonary and Critical Care Medicine, West China Hospital, Sichuan University, 37 Guoxue Lane, Wuhou District, Chengdu, Sichuan Province, 610041, People’s Republic of China, Email [email protected]; Yalun Li, Department of Pulmonary and Critical Care Medicine, West China Hospital, Sichuan University, 37 Guoxue Lane, Wuhou District, Chengdu, Sichuan Province, 610041, People’s Republic of China, Email [email protected]

Background: Early identification of pre-chronic obstructive pulmonary disease (pre-COPD) is vital for preventing irreversible lung damage. However, despite its high prevalence, there is a lack of practical tools to predict which individuals will progress to spirometry-defined COPD. This study aimed to identify independent risk factors and develop a clinical nomogram to quantify the risk of disease progression in a pre-COPD population.
Methods: We conducted a multicenter, retrospective cohort study in Southwest China (2019–2023), enrolling 1088 participants with pre-COPD. Baseline data, including demographic information, smoking status, comorbidities, lung function, and hematological and biochemical indicators, were analyzed. Independent predictors were identified via multivariate logistic regression, and a risk-prediction nomogram was constructed and validated.
Results: During follow-up, 54.6% of participants progressed to COPD. The final prediction model identified six independent predictors: age (OR=1.043), hypertension (OR=2.331), diabetes (OR=2.412), hemoglobin level (OR=1.016), lymphocyte count (OR=0.639), and basophil count (OR=1.411). The nomogram demonstrated robust discriminative ability, with an AUC of 0.758 in the training set and 0.718 in the validation set. Calibration curves showed high consistency, and Decision Curve Analysis (DCA) confirmed significant clinical net benefit.
Conclusion: Progression from pre-COPD to spirometry-defined COPD is highly prevalent and driven by age, comorbidities, and systemic inflammatory markers. Our validated nomogram provides a precise, non-invasive tool for clinicians to identify high-risk individuals, enabling targeted early intervention and optimized resource allocation in COPD prevention.

Keywords: pre-chronic obstructive pulmonary disease, nomogram, prediction model, progression, prevention

Introduction

Globally, chronic obstructive pulmonary disease (COPD) is a significant public health challenge due to its high prevalence and associated disability and mortality rates, which impose a severe economic and social burden.1 COPD is a persistent, partially reversible disease characterized by airflow limitation.2 The Global Initiative for Chronic Obstructive Lung Disease (GOLD) was formally launched in 1998 with the aim of raising awareness of COPD and improving prevention and treatment approaches for the condition. Since then, significant progress has been made in understanding the nature of COPD, its risk factors, how the disease progresses, treatment strategies, and prospects for prevention and rehabilitation.3–6 However, COPD continues to be a significant global health burden, and much work remains before the issue can be adequately resolved.7 Identifying high-risk individuals before the disease fully develops enables early intervention through risk factor management and timely treatment, thereby slowing disease progression.

Pre-chronic obstructive pulmonary disease (pre-COPD) is a concept that has emerged in recent years with the aim of prioritizing the prevention and treatment of COPD in high-risk populations. Currently, pre-COPD is defined as individuals exhibiting respiratory symptoms, structural lung changes, or respiratory physiology abnormalities, but who do not yet meet the criteria for airflow limitation.8,9 Current diagnostic criteria for COPD are typically fulfilled only after irreversible pathological changes have occurred, by which time the prognosis is already considerably compromised. The lack of early diagnostic tools results in delayed intervention, thereby accelerating disease progression.10 Even in the early stages of COPD, lung damage is often quite severe. While many individuals meet the spirometric criteria for pre-COPD, only a subset progress to clinically significant COPD. Understanding the factors that distinguish progressors from those who remain stable is critical for early intervention and risk stratification. Therefore, it is essential to prevent the decline of lung function at an early stage for patients who have not yet met the classic diagnostic criteria for COPD but are at risk of disease progression.11–14 Identifying individuals at high risk of developing COPD from the pre-COPD population has become an urgent issue requiring resolution. By focusing on this transitional stage, our work aims to provide practical tools for early risk assessment and to inform targeted monitoring and management strategies.

Multiple large cohort studies have provided crucial epidemiological evidence regarding the clinical outcomes of pre-COPD. An analysis of the Copenhagen General Population Study in Denmark concluded that the proportion of early COPD progressing to clinical COPD is higher regardless of smoking status.15 A Swedish cohort study involving over 6000 middle-aged and elderly participants found a 10-year cumulative incidence of COPD of 13.5%. Advanced age, smoking, and bronchial symptoms were identified as risk factors for COPD events.16 A prospective cohort study of 2749 young adults aged 18–30 years at baseline (the Coronary Artery Risk Development in Young Adults [CARDIA] Lung Study) demonstrated that chronic cough, sputum production, wheezing, and dyspnea are independently associated with accelerated long-term lung-function decline and incident airway obstruction.17 Pathological studies further reveal that in the pre-COPD population, small airway dysfunction is associated with accelerated decline in lung function, making it a viable indicator for early screening in this population.18 Additionally, several studies have independently indicated that pre-COPD patients exhibiting preserved ratio impaired spirometry (PRISm), reduced diffusion capacity, CT-detected emphysema, and increased airway wall thickness experience accelerated lung function decline,17,19–23 together with higher risks of hospitalization for obstructive lung disease, pneumonia-related hospitalization, and all-cause mortality.24

Despite growing epidemiological evidence on the clinical outcomes of individuals with pre-COPD features, existing risk prediction models for COPD development and disease progression remain limited in several important respects. Few studies have systematically constructed and validated predictive models for COPD incidence or progression, and most existing models rely heavily on demographic and spirometric variables, with limited discrimination and external generalizability. For example, systematic reviews of COPD predictive models have found that only a small number of models have been developed to predict COPD development, and none demonstrated robust predictive accuracy able to reliably discriminate future risk in diverse populations.25 Moreover, model performance was generally modest and the models lacked comprehensive external validation, highlighting the need for more informative predictors and rigorously validated prediction tools. In addition to traditional clinical factors, systemic biomarkers reflecting inflammation, immune activation, and hematologic status have emerged as potential predictors of COPD outcomes.26–28 The recognition of COPD as a systemic inflammatory disease has prompted extensive research into peripheral blood biomarkers. Studies have shown that markers including leukocyte subtypes, composite inflammatory indices, and acute-phase reactants are linked to disease severity, exacerbation risk, and long-term progression. These associations provide critical prognostic information that complements traditional spirometric assessment. Systemic inflammatory biomarkers not only correlate with accelerated lung function decline and higher exacerbation rates, but also provide complementary information on host susceptibility and pathobiology that is not captured by spirometry or symptom scores alone. 
Integrated models that incorporate routine hematological parameters therefore have the potential to improve individualized risk stratification while remaining feasible for broad clinical use. Recent systematic reviews and cohort studies support the prognostic value of such blood biomarkers when combined with clinical predictors in multidimensional models.29

Based on evidence from prior studies, we utilized routinely available clinical data, comprising blood cell counts, comorbidities, and demographic information, to screen for risk factors and develop the predictive model. Using data from a multicenter cohort study in Southwest China, we aim to identify risk factors for the progression of pre-COPD to establish a clear understanding of this stage and to develop and validate predictive models for progression from pre-COPD to COPD.

Methods

Patient Selection

This study enrolled participants from a COPD high-risk screening cohort across multiple regions in Southwest China (Longquan, Mianzhu, Pidu) between June 2019 and October 2023. Inclusion criteria were as follows: 1) Age ≥ 20 years and consent to follow-up. 2) Completion of baseline examinations: complete blood count, high-resolution chest CT (HRCT), and conventional pulmonary function tests. 3) Individuals diagnosed with pre-COPD at baseline. 4) Complete follow-up data, including at least one recorded pulmonary function outcome. The flowchart for this study is shown in Figure 1.

Figure 1 Flowchart of the study design. The diagram illustrates the enrollment, follow-up, grouping, and analysis steps. Patients from the COPD high-risk screening cohort (2019.6–2023.10) with pre-COPD and documented lung function were enrolled based on inclusion and exclusion criteria. They underwent continuous follow-up and were categorized into progression group (pre-COPD to COPD, n = 594) and non-progression group (continuous pre-COPD, n=494). Data were randomly split into training (n=762) and validation (n=326) sets at a 7:3 ratio. Logistic regression was applied to the training set to construct a nomogram prediction model. Cox regression was used to identify risk factors for disease progression, and Kaplan–Meier curves were plotted to compare differences between groups. Arrows indicate the direction of flow between consecutive steps.

Clinical Data Collection for COPD High-Risk Population Screening Cohort

The cohort for screening high-risk populations with COPD comprised individuals aged 20 years and above from western China. Screening initiatives were carried out in Longquan District, Pidu District (both in Chengdu), and Mianzhu City, Sichuan Province. Each participant completed a structured questionnaire that solicited detailed information on their lifestyle habits, medical history, and comorbidities. In addition to vital sign measurements, laboratory tests were administered, including a complete blood count, blood biochemistry assays, and urinalysis. Participants’ physical status was evaluated to determine their suitability for pulmonary function testing and imaging examinations. We systematically cleaned and integrated data across all study sites and identified subjects who had undergone multiple pulmonary function tests. Finally, subjects were stratified into three groups (normal, pre-COPD, and COPD) based on their baseline pulmonary function, clinical symptoms, medical history, and imaging findings. Demographic characteristics included age (years), gender (male/female), height (m), weight (kg), smoking status (never smoker/former smoker/current smoker), presence of hypertension (yes/no), and presence of diabetes (yes/no). Baseline pulmonary function parameters were collected, including MMEF, MEF50%, MEF25%, FEV1Pred, and FEV1/FVC. Baseline laboratory test results were collected, including red blood cell count, hemoglobin level, platelet count, white blood cell count, neutrophil count, lymphocyte count, monocyte count, eosinophil count, basophil count, albumin level, glucose level, triglycerides level, and cholesterol level. Subjects with missing or incomplete data were excluded.

Model Construction

Patients with baseline pre-COPD were followed longitudinally and categorized into progression and non-progression groups according to spirometric results at the final follow-up. Progression was defined as the development of spirometrically confirmed COPD during follow-up. Prior to model construction, continuous variables were assessed for distributional characteristics and retained in their original scale to preserve statistical power. Linearity in the logit was evaluated for continuous predictors, and no substantial deviations were observed. Multicollinearity among candidate variables was assessed using variance inflation factors (VIF), with a threshold of VIF > 5 indicating potential collinearity; no significant multicollinearity was detected. Eligible participants were randomly divided into a training set and validation set at a ratio of 7:3 using a computer-generated randomization procedure. The primary endpoint was progression to COPD. In the training cohort, univariate logistic regression was performed to screen candidate predictors. Variables with P < 0.05 in univariate analysis were entered into a multivariable logistic regression model using a simultaneous entry approach. Adjusted odds ratios (OR) and 95% confidence intervals (CI) were calculated. Based on independent predictors identified in the final multivariable model, a nomogram was constructed to estimate individual risk. Model discrimination was evaluated using receiver operating characteristic (ROC) curves and quantified by the area under the curve (AUC). Calibration was assessed using calibration plots comparing predicted and observed probabilities. Clinical utility was evaluated using decision curve analysis (DCA) to estimate net benefit across a range of threshold probabilities. To account for time-to-event information, univariate and multivariable Cox proportional hazards regression analyses were subsequently performed. The proportional hazards assumption was tested using Schoenfeld residuals. 
Time-dependent ROC curves were generated to assess predictive performance at different follow-up time points.
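As an illustration of the 7:3 random allocation described above, the following is a minimal Python sketch, not the authors' R procedure. The function name, the seed value, and the use of integer participant IDs are assumptions made for this example only.

```python
import random


def split_train_validation(ids, train_fraction=0.7, seed=2024):
    """Shuffle participant IDs reproducibly and split them into two sets.

    The seed is an arbitrary, hypothetical value; the article does not
    report the seed or the exact randomization routine that was used.
    """
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# 1088 participants, labelled 0..1087 purely for illustration.
train_ids, validation_ids = split_train_validation(range(1088))
print(len(train_ids), len(validation_ids))  # 762 326
```

Rounding 0.7 × 1088 = 761.6 to the nearest integer reproduces the reported set sizes of 762 and 326.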

Statistical Analysis

Continuous variables that were not normally distributed (P < 0.05) are summarized as median, while those following a normal distribution are expressed as mean ± standard deviation. Categorical variables are reported as frequency (percentage). For intergroup comparisons, categorical variables were analyzed using chi-square tests or Fisher’s exact tests, as appropriate. Continuous variables were compared using t-tests or analysis of variance (ANOVA). Univariate and multivariate analyses were conducted via logistic regression and Cox proportional hazards models, with results presented as odds ratios (OR) and hazard ratios (HR) and corresponding 95% confidence intervals (CI). A two-sided P-value < 0.05 was considered statistically significant. All analyses were performed using R software (version 4.3.2).

Results

Baseline Characteristics

A total of 1088 patients were included in this study and were randomly allocated into a training set (n=762) and a validation set (n=326) at a ratio of 7:3 (Table 1). The baseline characteristics of the training and validation sets were well-balanced and comparable. No significant differences were observed in demographic features including gender (male: 44.2% vs. 42.3%), age (median: 56 vs. 55 years), or BMI (23.72 vs. 23.63 kg/m²; all P > 0.05). Similarly, smoking history did not differ significantly between the two cohorts, with current smokers comprising 17.7% and 14.4%, respectively (P = 0.407). Comorbidity profiles were also consistent, with comparable rates of hypertension (26.0% vs. 28.2%; P = 0.491) and diabetes (19.9% vs. 22.1%; P = 0.473). Pulmonary function parameters, including MMEF%, MEF50%, MEF25%, FEV1%, and FEV1/FVC ratio, showed no statistically significant intergroup differences (all P > 0.05), indicating similar baseline respiratory function. Laboratory findings further confirmed this balance: hematologic parameters such as red blood cell count (4.62 vs. 4.62 × 10¹²/L), hemoglobin level (139 vs. 138 g/L), platelet count (179 vs. 183 × 10⁹/L), and differential leukocyte counts did not differ significantly (all P > 0.05). Likewise, metabolic markers including glucose level (5.01 vs. 5.10 mmol/L), triglycerides level (1.29 vs. 1.28 mmol/L), and cholesterol level (4.88 vs. 4.91 mmol/L) were comparable between groups (all P > 0.05).

Table 1 Baseline Characteristics of Patients in the Training and Validation Sets

In summary, the training and validation cohorts exhibited high consistency across all baseline demographic, clinical, functional, and laboratory measures. The 7:3 random allocation successfully minimized selection bias, ensuring group comparability and supporting the reliability of subsequent model development and validation. Additionally, we observed significant differences in baseline characteristics between the progression group and the non-progression group (all P values < 0.001). (Supplementary Table 1)

Independent Prognostic Factors in the Training Set: Logistic Regression Analysis

To identify factors associated with disease progression, univariate logistic regression was first performed on the training set. Variables significantly associated with progression included gender, age, BMI, smoking status, hypertension, diabetes, red blood cell count, hemoglobin, lymphocyte count, basophil count, and cholesterol (all P < 0.05; Table 2). These variables were subsequently entered into a multivariate logistic regression model.

Table 2 Logistic Regression of Risk Factors for Pre-COPD Progression (Training Set)

After adjusting for gender, age, BMI, smoking status, hypertension, diabetes, red blood cell count, hemoglobin, lymphocyte count, basophil count, and cholesterol in the multivariate logistic regression model, six factors remained independently associated with disease progression (Table 2). Older age (OR = 1.043, 95% CI: 1.030–1.057, P < 0.001), hypertension (OR = 2.331, 95% CI: 1.594–3.436, P < 0.001), diabetes (OR = 2.412, 95% CI: 1.577–3.732, P < 0.001), higher hemoglobin (OR = 1.016, 95% CI: 1.004–1.027, P = 0.007), lower lymphocyte count (OR = 0.639, 95% CI: 0.489–0.832, P < 0.001), and higher basophil count (OR = 1.411, 95% CI: 1.070–1.865, P = 0.015) emerged as significant independent predictors. These six factors were therefore identified as independent predictors of pre-COPD progression in the training cohort, with a higher lymphocyte count associated with lower odds of progression.
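Odds ratios close to 1 for continuous predictors can look small because they refer to a single unit of the predictor. The sketch below rescales a per-unit odds ratio to a larger increment. It assumes that the age OR is per 1 year and the hemoglobin OR is per 1 g/L, which this section does not state explicitly; the function names are illustrative.

```python
import math


def rescale_odds_ratio(or_per_unit, units):
    """Odds ratio for a change of `units`, given the per-unit odds ratio.

    In a logistic model the coefficient is beta = ln(OR), and a change of
    `units` multiplies the odds by exp(units * beta).
    """
    return math.exp(units * math.log(or_per_unit))


def rescale_interval(lower, upper, units):
    """Apply the same rescaling to both ends of a confidence interval."""
    return rescale_odds_ratio(lower, units), rescale_odds_ratio(upper, units)


# Assumption: OR = 1.043 is per 1 year of age.
age_or_10y = rescale_odds_ratio(1.043, 10)
age_ci_10y = rescale_interval(1.030, 1.057, 10)

# Assumption: OR = 1.016 is per 1 g/L of hemoglobin.
hb_or_10 = rescale_odds_ratio(1.016, 10)

print(round(age_or_10y, 2), round(hb_or_10, 2))
```

Under these assumptions, each additional decade of age corresponds to roughly 1.5 times the odds of progression, and each additional 10 g/L of hemoglobin to roughly 1.17 times the odds.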

Construction of the Nomogram Prediction Model

Based on the predictors identified in the training set, we developed a nomogram to predict progression from pre-COPD (Figure 2). The nomogram includes six variables: age, hypertension, diabetes, hemoglobin level, lymphocyte count, and basophil count. Each variable is mapped to a score on a standardized points scale. For example, the score increases with age, and patients with hypertension or diabetes receive additional points. The hematological indicators (hemoglobin level, lymphocyte count, and basophil count), which may reflect immune and inflammatory status, are also incorporated into the model. The scores for all variables are summed into a total score, which corresponds to the patient’s estimated risk of progression. Based on this score, clinicians can estimate whether a patient is likely to progress and plan early intervention for high-risk individuals. Because the scoring is standardized, the nomogram can be applied to individual patients, helping clinicians identify high-risk patients and implement interventions in clinical practice.

Figure 2 Nomogram for predicting the progression risk, incorporating factors such as age, hypertension, diabetes, hemoglobin level, lymphocyte count, and basophil count to calculate total points and corresponding progression risk.
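To make the scoring idea concrete, the following Python sketch shows one common way of turning logistic regression coefficients into nomogram points: the predictor with the largest effect across its axis is assigned 100 points and the others are scaled proportionally. The odds ratios are those reported in Table 2, but the axis ranges are hypothetical values chosen for illustration, so the resulting points do not reproduce Figure 2.

```python
import math

# Odds ratios reported for the multivariable model (Table 2).
ODDS_RATIOS = {
    "age": 1.043, "hypertension": 2.331, "diabetes": 2.412,
    "hemoglobin": 1.016, "lymphocyte": 0.639, "basophil": 1.411,
}

# HYPOTHETICAL axis ranges (low, high); not taken from the article.
AXIS_RANGES = {
    "age": (20, 90), "hypertension": (0, 1), "diabetes": (0, 1),
    "hemoglobin": (80, 200), "lymphocyte": (0.5, 4.0), "basophil": (0.0, 2.0),
}


def build_point_scale(odds_ratios, axis_ranges):
    """Return a function mapping (predictor, value) to nomogram points."""
    betas = {name: math.log(value) for name, value in odds_ratios.items()}
    # Effect of each predictor across its whole axis, on the log-odds scale.
    spans = {
        name: abs(betas[name]) * (axis_ranges[name][1] - axis_ranges[name][0])
        for name in betas
    }
    max_span = max(spans.values())

    def points(name, value):
        low, high = axis_ranges[name]
        # Zero points sit at the lowest-risk end of the axis: the low end
        # for risk factors, the high end for protective factors (OR < 1).
        reference = low if betas[name] >= 0 else high
        return 100.0 * abs(betas[name]) * abs(value - reference) / max_span

    return points


points = build_point_scale(ODDS_RATIOS, AXIS_RANGES)
example_patient = {"age": 65, "hypertension": 1, "diabetes": 0,
                   "hemoglobin": 145, "lymphocyte": 1.5, "basophil": 0.3}
total_points = sum(points(name, value) for name, value in example_patient.items())
```

Mapping the total points to a predicted probability additionally requires the model intercept, which is not reported in this section, so that step is omitted here. The sketch also does not clamp values that fall outside the assumed axis ranges.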

Evaluation and Validation of Nomogram

In the training set, the predictive model demonstrated good discriminatory capability, with an AUC of 0.758 (95% CI: 0.730–0.787). Specificity was high (83.4%), indicating good performance in identifying negative cases, while sensitivity was 54.4%, suggesting room for improvement in detecting positive cases (Figure 3A). In the validation set, the model maintained moderate discriminatory performance with an AUC of 0.718 (95% CI: 0.663–0.773). Compared with the training set, sensitivity was higher (75.3%), indicating enhanced ability to identify positive cases, whereas specificity was lower (56.8%), suggesting reduced discrimination of negative cases (Figure 3B). These results indicate that the model possesses a certain degree of generalization capability.

Figure 3 Receiver operating characteristic (ROC) curves evaluating the predictive performance. (A) ROC curve for the training set, with an area under the curve (AUC) of 0.758 (95% CI: 0.730–0.787). (B) ROC curve for the validation set, with an AUC of 0.718 (95% CI: 0.663–0.773).
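The quantities reported for Figure 3 can be illustrated with a short Python sketch. It computes the AUC as the probability that a randomly chosen progressor receives a higher predicted risk than a randomly chosen non-progressor, and derives sensitivity, specificity, and the Youden index at a threshold. The toy scores and labels are invented for illustration; only the sensitivity and specificity passed to `youden_index` at the end come from the article.

```python
def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def sensitivity_specificity(scores, labels, threshold):
    """Classify as positive when score >= threshold, then summarize."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def youden_index(sensitivity, specificity):
    """Youden's J statistic: sensitivity + specificity - 1."""
    return sensitivity + specificity - 1


# Invented toy data, for illustration only.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
toy_labels = [1, 1, 0, 1, 0, 0]
toy_auc = auc(toy_scores, toy_labels)
toy_sens, toy_spec = sensitivity_specificity(toy_scores, toy_labels, 0.5)

# Youden index at the operating points reported in the text.
j_training = youden_index(0.544, 0.834)
j_validation = youden_index(0.753, 0.568)
```

With the reported values, the Youden index is about 0.38 in the training set and about 0.32 in the validation set, which summarizes the shift between sensitivity and specificity across the two sets in a single number.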

The model was then evaluated by plotting calibration curves. Figure 4A specifically shows that the predicted probabilities on the training set closely match the actual occurrence probabilities, indicating precise model outputs. This shows that the model effectively learned the distribution patterns of the data during training and can generate reliable predictions. Figure 4B shows the model’s performance on the validation set, which demonstrates its strong calibration capabilities. The validation set, which is independent of the training set, reflects the model’s ability to generalize. The calibration curve shows that the trained model can accurately predict event occurrence probabilities with minimal deviation between the predicted and actual outcomes on the validation set. Overall, the calibration results on the training and validation sets demonstrate the model’s robust calibration capability and its ability to maintain consistency and high predictive accuracy across different datasets. These results indicate excellent calibration performance and high practical applicability. Finally, to assess the clinical applicability of the predictive model, we conducted a decision curve analysis. Figure 5 displays the decision curves of the model on the training set (A) and validation set (B). Results indicate that the disease progression prediction model constructed in this study demonstrates sound clinical decision-making value within a reasonable risk threshold range, aiding in the identification of high-risk patients and optimizing intervention strategies.

Figure 4 Calibration curves assessing model performance for the training set (A, n = 762) and validation set (B, n = 326). The curves show apparent (actual predictions), bias-corrected (optimism-corrected via bootstrap), and ideal (perfect calibration) predicted probabilities versus observed probabilities. The bias-corrected curves were generated using 1000 bootstrap resamples. The mean absolute error (MAE) between the bias-corrected and ideal curves is 0.006 for the training set and 0.024 for the validation set.
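The idea behind a calibration curve and its mean absolute error can be sketched in Python as follows. This is a simplified, binned version: predictions are sorted into equal-sized groups, and the mean predicted probability in each group is compared with the observed event rate. It is not the bootstrap bias-corrected procedure used for Figure 4, and the function names and toy data are assumptions for illustration.

```python
def calibration_bins(probabilities, outcomes, n_bins=5):
    """Return (mean predicted, observed rate) for equal-sized risk groups."""
    paired = sorted(zip(probabilities, outcomes))
    n = len(paired)
    bins = []
    for b in range(n_bins):
        # Integer arithmetic keeps the bin boundaries exact.
        chunk = paired[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue
        mean_predicted = sum(p for p, _ in chunk) / len(chunk)
        observed_rate = sum(y for _, y in chunk) / len(chunk)
        bins.append((mean_predicted, observed_rate))
    return bins


def mean_absolute_calibration_error(bins):
    """Average absolute gap between predicted and observed across bins."""
    return sum(abs(p - o) for p, o in bins) / len(bins)


# Invented toy data, for illustration only.
toy_probabilities = [0.1, 0.1, 0.9, 0.9]
toy_outcomes = [0, 0, 1, 1]
toy_bins = calibration_bins(toy_probabilities, toy_outcomes, n_bins=2)
toy_mae = mean_absolute_calibration_error(toy_bins)
```

A perfectly calibrated model would place every bin on the diagonal and give an error of zero; in the toy data each bin is off by 0.1.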

Decision curves for training (A) and validation (B) sets: model net benefit exceeds that of ‘All’ and ‘None’ strategies across thresholds.

Figure 5 Decision curve analysis evaluating the clinical utility of the model. (A) Decision curve for the training set. (B) Decision curve for the validation set. Curves compare the standardized net benefit of the model against “All” (treating all patients as high-risk) and “None” (treating no patients as high-risk) strategies across different high-risk thresholds and cost-benefit ratios.
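The net benefit underlying a decision curve can be computed as in the following Python sketch, which uses the usual unstandardized form: true positives per participant minus false positives per participant weighted by the odds of the threshold probability. The function names are illustrative, and the example uses only the group sizes reported in Figure 1 (594 progressors, 494 non-progressors), not individual predictions.

```python
def net_benefit(probabilities, outcomes, threshold):
    """Net benefit of treating everyone whose predicted risk >= threshold."""
    n = len(outcomes)
    tp = sum(1 for p, y in zip(probabilities, outcomes) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probabilities, outcomes) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(outcomes, threshold):
    """Net benefit of the 'All' strategy: everyone is treated as high-risk."""
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Outcome vector built from the reported group sizes only.
cohort_outcomes = [1] * 594 + [0] * 494
treat_all_at_half = net_benefit_treat_all(cohort_outcomes, 0.5)

# The 'None' strategy has a net benefit of 0 at every threshold.
print(round(treat_all_at_half, 3))  # 0.092
```

At a threshold of 0.5, treating everyone as high-risk yields a net benefit of (594 − 494)/1088, about 0.092. A model adds value at a given threshold only if its net benefit exceeds both this "All" value and zero.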

Independent Prognostic Factors for Disease Progression: Cox Regression Analysis

A univariate Cox analysis revealed that the following variables were associated with disease progression (P < 0.05): gender, age, BMI, smoking status, hypertension, and diabetes. Incorporating these variables into a multivariate Cox regression analysis yielded the following results: gender, age, smoking status, and comorbid hypertension and diabetes were significant independent risk factors (Table 3). These results suggest that the influence of these factors on disease progression may persist even after adjusting for other variables, which further supports their role in disease progression. Males exhibited a higher risk (HR = 1.582, 95% CI: 1.257–1.991, P < 0.001). Increasing age was a significant risk factor (HR = 1.029, 95% CI: 1.021–1.037, P < 0.001). Persistent smoking significantly increased the risk (HR = 1.785, 95% CI: 1.451–2.196, P < 0.001). Concurrent hypertension (HR = 1.404, 95% CI: 1.151–1.712, P = 0.001) and diabetes (HR = 1.785, 95% CI: 1.451–2.196, P < 0.001) remained significant risk factors in the multivariate analysis. Overall, the risk factors identified by logistic regression and Cox regression are largely consistent, suggesting that our statistical results exhibit high reliability and accuracy.

Table 3 Univariate and Multivariate Cox Regression Analyses of Risk Factors for Pre-COPD Progression

To evaluate the predictive performance of the Cox model at different time points, time-dependent ROC curves were generated (Supplementary Figure 1). The AUC was 0.746 at 1 year, 0.729 at 3 years, and 0.632 at 5 years. These results indicate that the model had moderate predictive accuracy for 1- and 3-year outcomes, but its performance decreased at 5 years, suggesting a gradual reduction in discriminative ability over time. Finally, each participant’s risk score was compared with the median risk score to classify participants into high-risk and low-risk groups. Kaplan–Meier curves were plotted to compare the groups, and the difference in progression between the high-risk and low-risk groups was statistically significant (P < 0.0001; Supplementary Figure 2).
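The median split and the Kaplan–Meier estimate described above can be sketched in Python as follows. The product-limit estimator multiplies, at each time with at least one progression event, the fraction of at-risk participants who did not progress. The toy follow-up times, event indicators, and risk scores are invented for illustration, and assigning scores equal to the median to the low-risk group is an assumption.

```python
import statistics


def split_by_median(risk_scores):
    """Label each score 'high' if above the median, otherwise 'low'."""
    median_score = statistics.median(risk_scores)
    return ["high" if score > median_score else "low" for score in risk_scores]


def kaplan_meier(times, events):
    """Product-limit estimate of the progression-free probability.

    `events` holds 1 for progression and 0 for censoring. Returns a list
    of (time, probability) pairs, one per distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    probability = 1.0
    curve = []
    i = 0
    while i < len(data):
        current_time = data[i][0]
        progressed = 0
        leaving = 0
        # Collect everyone who progressed or was censored at this time.
        while i < len(data) and data[i][0] == current_time:
            progressed += data[i][1]
            leaving += 1
            i += 1
        if progressed:
            probability *= 1 - progressed / at_risk
            curve.append((current_time, probability))
        at_risk -= leaving
    return curve


# Invented toy data, for illustration only (times in years).
toy_times = [1, 2, 2, 3, 4]
toy_events = [1, 1, 0, 1, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
toy_groups = split_by_median([0.2, 0.9, 0.5, 0.7, 0.1])
```

In practice the estimator would be applied separately to the high-risk and low-risk groups and the two curves compared with a log-rank test, which is not implemented in this sketch.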

Discussion

This study presents the first analysis of pre-COPD progression in a screening cohort of high-risk individuals in Southwest China. The results indicate that the progression from pre-COPD to COPD is closely associated with age, gender, comorbid hypertension and diabetes, hemoglobin level, lymphocyte count, and basophil count. Based on these findings, we developed a predictive model for pre-COPD progression that demonstrated good predictive efficacy and clinical value.

Despite years of efforts in preventing and treating COPD, the disease burden remains substantial, with neither incidence nor mortality showing a downward trend. Based on epidemiological studies related to GOLD Stage 0, the epidemiological status of pre-COPD also warrants serious attention.30–34 The China Pulmonary Health Study revealed an age-standardized prevalence of 7.2% for pre-COPD and 5.5% for PRISm.35 These findings reveal that pre-COPD imposes a substantial disease burden, emphasizing the imperative for early therapeutic strategies directed at subtypes susceptible to disease progression.

Previous studies suggest that airway damage may occur silently before airflow obstruction becomes apparent.16,17 Respiratory symptoms and structural or functional abnormalities may signal the early stages of COPD development in individuals. This study aims to reveal the natural progression trajectories of different COPD subtypes in the pre-COPD stage through long-term tracking and monitoring of symptoms, tissue structure, functional status, and underlying risk factors. The study also seeks to accurately assess individuals’ potential risk of developing COPD, thereby establishing a personalized screening model for disease progression prediction.

Evidence indicates that age is significantly correlated with the early progression of COPD. Age is a well-established risk factor for COPD, which can be attributed to its association with declined lung function, a higher prevalence of comorbidities, and diminished immune competence.36,37 Given the significant impact of age on the progression of COPD, the management of elderly COPD patients needs to be more refined. Larger studies in different regions are needed to determine the appropriate starting age for screening and thereby reduce the disease burden. We also identified comorbid hypertension and diabetes as high-risk factors for the early progression of COPD. With a prevalence of more than 50%, hypertension ranks among the most common concomitant conditions in individuals living with COPD.38,39 An epidemiological study has demonstrated that hypertension is an independent risk factor for impaired lung function. The underlying mechanism for this association is also linked to systemic inflammation.40 Previous studies have shown that individuals with type 2 diabetes have a higher probability of developing COPD and also exhibit a relatively higher mortality rate.41 Hypertension and diabetes are both associated with metabolic syndrome. These metabolic disorders may further exacerbate systemic inflammation and oxidative stress, leading to accelerated progression of COPD. Therefore, the management and assessment of comorbidities in COPD patients are also critically important. Patient outcomes can be effectively improved through comprehensive management of inflammation, optimized drug therapy, and multidisciplinary collaboration. Our study tentatively suggests that hemoglobin levels correlate with the progression of pre-COPD conditions. Patients with COPD often exhibit elevated hemoglobin levels due to prolonged hypoxemia. 
This represents a compensatory response to chronic hypoxia, aiming to enhance oxygen-carrying capacity by increasing hemoglobin level.42 Changes in hemoglobin levels are closely associated with the prognosis of COPD patients, and hemoglobin levels can serve as a biomarker for the prognosis of COPD patients.43 Our findings suggest that hemoglobin levels are elevated in patients with pre-COPD who are prone to progression. This likely represents a compensatory mechanism, indicating that elevated hemoglobin serves as an early warning signal for systemic hypoxia. This finding warrants attention and consideration in clinical practice. Previous studies have shown that patients with acute exacerbation of chronic obstructive pulmonary disease (AECOPD) who have a lymphocyte count below 0.8 × 109/L exhibit significantly higher in-hospital mortality rates (17.1% vs. 7.1%), as well as longer hospital stays and mechanical ventilation durations. This suggests that low lymphocyte counts may represent an independent risk factor for poor prognosis in patients with severe AECOPD.44 Our study also indicates that low lymphocyte counts represent a risk factor for progression to COPD. Lymphopenia may reflect impaired immune function in COPD patients, correlating with disease progression and poor prognosis. In summary, low lymphocyte counts not only serve as a poor prognostic marker for COPD exacerbations but are also closely associated with disease progression and immune dysfunction. Clinically, monitoring lymphocyte levels enables more precise assessment of COPD patients’ disease status and prognosis, thereby guiding individualized treatment strategies. Research has found that the cytokine IL-4 secreted by basophils promotes the transformation of monocytes into interstitial macrophages. The proteases released by these macrophages, such as MMP-12, damage lung tissue and lead to pulmonary emphysema. 
In experiments, mice with depleted basophils did not develop pulmonary emphysema, indicating that these cells play a crucial role in the formation of pulmonary emphysema.45 Our research revealed a potential association between basophils and COPD. This finding validates the hypothesis that basophils play a crucial role in COPD pathogenesis by regulating inflammatory responses and tissue damage. In the future, they may emerge as important targets for the precise treatment of COPD.

This study has several limitations that should be addressed in subsequent work. First, in terms of diagnostic criteria, this study used pre-bronchodilator spirometry values to diagnose COPD, which differs from the current GOLD guidelines’ recommendation of post-bronchodilator values combined with LLN criteria. Future research should adhere strictly to these standards to improve diagnostic accuracy. Second, the retrospective design and relatively small sample size may introduce selection bias and limit statistical power. Third, the absence of external validation restricts the generalizability of our findings. Multi-center prospective cohort studies are planned to expand the sample size and enhance the reliability of the results. Finally, the study sample was drawn exclusively from the southwestern region, which may have introduced geographical bias. Future research should recruit participants from a broader geographic area and extend follow-up periods to enhance the generalizability and credibility of the findings.
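To make the first limitation concrete, the following minimal Python sketch shows how the same individual can be classified differently depending on whether pre- or post-bronchodilator values are used. It is an illustration only, not the classification procedure used in this study. It assumes the conventional fixed FEV1/FVC cutoff of 0.70 and lets the caller supply an optional lower limit of normal (LLN); the function names and the spirometry values are hypothetical.

```python
from typing import Optional

# Conventional fixed-ratio cutoff for airflow obstruction (assumption of this sketch).
FIXED_RATIO_CUTOFF = 0.70


def fev1_fvc_ratio(fev1_l: float, fvc_l: float) -> float:
    """Return FEV1/FVC from volumes given in litres."""
    if fev1_l <= 0 or fvc_l <= 0:
        raise ValueError("FEV1 and FVC must be positive")
    return fev1_l / fvc_l


def is_obstructed(fev1_l: float, fvc_l: float,
                  lln_ratio: Optional[float] = None) -> bool:
    """Flag airflow obstruction.

    Uses the fixed 0.70 cutoff unless an LLN for the ratio is supplied.
    The LLN must come from an external reference equation; it is not
    computed here.
    """
    cutoff = FIXED_RATIO_CUTOFF if lln_ratio is None else lln_ratio
    return fev1_fvc_ratio(fev1_l, fvc_l) < cutoff


if __name__ == "__main__":
    # Hypothetical measurements (litres) for one person, before and after
    # bronchodilator administration. Not data from this study.
    pre_fev1, pre_fvc = 2.00, 3.00
    post_fev1, post_fvc = 2.20, 3.05
    print("pre-bronchodilator obstructed:", is_obstructed(pre_fev1, pre_fvc))
    print("post-bronchodilator obstructed:", is_obstructed(post_fev1, post_fvc))
```

In this hypothetical case the pre-bronchodilator ratio falls below the cutoff while the post-bronchodilator ratio does not, which is the kind of discrepancy that reliance on pre-bronchodilator values can introduce.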

Conclusion

In this retrospective cohort study, we identified age, hypertension, diabetes, hemoglobin level, lymphocyte count, and basophil count as independent predictors of progression from pre-COPD to COPD. Based on these routinely available clinical and hematological variables, we constructed and internally validated a nomogram that demonstrated satisfactory predictive performance.

Importantly, all variables included in the model are easily obtainable in standard clinical practice, without requiring advanced imaging or specialized molecular testing. This enhances the feasibility of implementing the model in real-world settings. By enabling individualized risk stratification among patients with pre-COPD, the model may support early identification of high-risk individuals, facilitate closer monitoring, and inform timely preventive interventions aimed at delaying or preventing the onset of overt COPD.
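As an illustration of how a nomogram of this kind converts routine variables into an individual risk estimate, the sketch below applies a logistic model to the six predictors named above. It is a sketch under stated assumptions, not the fitted model: the logistic form, the intercept, every coefficient, the direction assumed for basophil count, and the units are placeholders chosen for demonstration and are not taken from this study.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    """Predictors named in the Conclusion; units are assumptions of this sketch."""
    age_years: float
    hypertension: bool
    diabetes: bool
    hemoglobin_g_l: float
    lymphocytes_1e9_l: float
    basophils_1e9_l: float


# PLACEHOLDER values for illustration only; NOT the study's estimates.
INTERCEPT = -6.0
COEFFS = {
    "age_years": 0.04,          # older age -> higher risk
    "hypertension": 0.5,        # comorbidity present -> higher risk
    "diabetes": 0.6,            # comorbidity present -> higher risk
    "hemoglobin_g_l": 0.02,     # higher hemoglobin -> higher risk
    "lymphocytes_1e9_l": -0.5,  # lower lymphocyte count -> higher risk
    "basophils_1e9_l": 5.0,     # direction assumed for illustration
}


def linear_predictor(p: Patient) -> float:
    """Weighted sum of predictors; a nomogram's point axes encode these terms."""
    return INTERCEPT + sum(
        coef * float(getattr(p, name)) for name, coef in COEFFS.items()
    )


def predicted_risk(p: Patient) -> float:
    """Map the linear predictor to a probability with the logistic function."""
    return 1.0 / (1.0 + math.exp(-linear_predictor(p)))


if __name__ == "__main__":
    # Two hypothetical patients, not drawn from the study cohort.
    lower = Patient(50, False, False, 140.0, 2.0, 0.03)
    higher = Patient(70, True, True, 160.0, 1.0, 0.05)
    print(f"lower-risk example:  {predicted_risk(lower):.3f}")
    print(f"higher-risk example: {predicted_risk(higher):.3f}")
```

A printed nomogram is a graphical form of the same calculation: each predictor contributes points proportional to its term in the linear predictor, and the total is read off against a risk axis.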

Abbreviations

AECOPD, Acute Exacerbation of Chronic Obstructive Pulmonary Disease; AUC, Area Under the Curve; BMI, Body Mass Index; CI, Confidence Interval; COPD, Chronic Obstructive Pulmonary Disease; DCA, Decision Curve Analysis; FEV1%, Forced Expiratory Volume in 1 Second Percentage Predicted; GOLD, The Global Initiative for Chronic Obstructive Lung Disease; HR, Hazard Ratio; HRCT, High-Resolution Chest CT; MEF25%, Maximum Expiratory Flow at 25% of Vital Capacity; MEF50%, Maximum Expiratory Flow at 50% of Vital Capacity; MMEF%, Maximum Mid-Expiratory Flow; OR, Odds Ratio; PRISm, Preserved Ratio Impaired Spirometry; pre-COPD, Pre-chronic Obstructive Pulmonary Disease; ROC, Receiver Operating Characteristic.

Data Sharing Statement

The data that support the findings of this study are available on request from the corresponding author (Weimin Li MD PhD). The data are not publicly available due to privacy or ethical restrictions.

Ethics Approval and Consent to Participate

The West China Hospital Research Ethics Committee and relevant ethics committees approved the study (approval number: 2024(1762)), which did not interfere with clinical management. All human participants in this retrospective cohort study provided written informed consent prior to their involvement in the research, and the consent forms were completed in Chinese (the native language of all participants) to ensure full understanding of the study purpose, procedures, potential risks, and rights of participants. No consent waivers or exemptions were granted by the Institutional Review Board (IRB)/local ethics committees for this research. We adhered to the Declaration of Helsinki and Good Clinical Practice guidelines.

Acknowledgments

We greatly appreciate the contributions of all members to this study.

Author Contributions

All authors made a significant contribution to the work reported, whether that is in the conception, study design, execution, acquisition of data, analysis and interpretation, or in all these areas; took part in drafting, revising or critically reviewing the article; gave final approval of the version to be published; have agreed on the journal to which the article has been submitted; and agree to be accountable for all aspects of the work.

Funding

This study was supported by Noncommunicable Chronic Diseases-National Science and Technology Major Project (2023ZD0506103 / 2023ZD0506100 to W Li), State Key Laboratory of Respiratory Health and Multimorbidity, State Key Laboratory Special Fund (No. 2060204).

Disclosure

The authors declare no conflicts of interest in this work.

References

1. Safiri S, Carson-Chahhoud K, Noori M, et al. Burden of chronic obstructive pulmonary disease and its attributable risk factors in 204 countries and territories, 1990-2019: results from the Global Burden of Disease Study 2019. BMJ. 2022;378:e069679. doi:10.1136/bmj-2021-069679

2. Celli B, Fabbri L, Criner G, et al. Definition and nomenclature of chronic obstructive pulmonary disease: time for its revision. Am J Resp Crit Care Med. 2022;206(11):1317–1325. doi:10.1164/rccm.202204-0671PP

3. Suissa S, Dell’Aniello S, Ernst P. Long-term natural history of chronic obstructive pulmonary disease: severe exacerbations and mortality. Thorax. 2012;67(11):957–963. doi:10.1136/thoraxjnl-2011-201518

4. Xu J, Zeng Q, Li S, Su Q, Fan H. Inflammation mechanism and research progress of COPD. Front Immunol. 2024;15:1404615. doi:10.3389/fimmu.2024.1404615

5. Yang IA, Jenkins CR, Salvi SS. Chronic obstructive pulmonary disease in never-smokers: risk factors, pathogenesis, and implications for prevention and treatment. Lancet Resp Med. 2022;10(5):497–511. doi:10.1016/S2213-2600(21)00506-3

6. Christenson SA, Smith BM, Bafadhel M, Putcha N. Chronic obstructive pulmonary disease. Lancet. 2022;399(10342):2227–2242. doi:10.1016/S0140-6736(22)00470-6

7. Breyer-Kohansal R, Faner R, Breyer MK, et al. Factors associated with low lung function in different age bins in the general population. Am J Resp Crit Care Med. 2020;202(2):292–296. doi:10.1164/rccm.202001-0172LE

8. Han MK, Agusti A, Celli BR, et al. From GOLD 0 to pre-COPD. Am J Resp Crit Care Med. 2021;203(4):414–423. doi:10.1164/rccm.202008-3328PP

9. Agustí A, Celli BR, Criner GJ, et al. Global initiative for chronic obstructive lung disease 2023 report: GOLD executive summary. Eur Resp J. 2023;61(4):2300239. doi:10.1183/13993003.00239-2023

10. Hodgson DB, Saini G, Bolton CE, Steiner MC. Thorax in focus: chronic obstructive pulmonary disease. Thorax. 2012;67(2):171–176. doi:10.1136/thoraxjnl-2011-201231

11. Wijnant SRA, De Roos E, Kavousi M, et al. Trajectory and mortality of preserved ratio impaired spirometry: the Rotterdam study. Eur Resp J. 2020;55(1):1901217. doi:10.1183/13993003.01217-2019

12. Grant T, Brigham EP, McCormack MC. Childhood origins of adult lung disease as opportunities for prevention. J Allergy Clin Immunol Pract. 2020;8(3):849–858. doi:10.1016/j.jaip.2020.01.015

13. Çolak Y, Nordestgaard BG, Vestbo J, Lange P, Afzal S. Prognostic significance of chronic respiratory symptoms in individuals with normal spirometry. Eur Resp J. 2019;54(3):1900734. doi:10.1183/13993003.00734-2019

14. Woodruff PG, Barr RG, Bleecker E, et al. Clinical significance of symptoms in smokers with preserved pulmonary function. New Engl J Med. 2016;374(19):1811–1821. doi:10.1056/NEJMoa1505971

15. Çolak Y, Afzal S, Nordestgaard BG, Lange P, Vestbo J. Importance of early COPD in young adults for development of clinical COPD: findings from the Copenhagen general population study. Am J Resp Crit Care Med. 2021;203(10):1245–1256. doi:10.1164/rccm.202003-0532OC

16. Lindberg A, Jonsson AC, Rönmark E, Lundgren R, Larsson LG, Lundbäck B. Ten-year cumulative incidence of COPD and risk factors for incident disease in a symptomatic cohort. Chest. 2005;127(5):1544–1552. doi:10.1378/chest.127.5.1544

17. Kalhan R, Dransfield MT, Colangelo LA, et al. Respiratory symptoms in young adults and future lung disease. The CARDIA lung study. Am J Resp Crit Care Med. 2018;197(12):1616–1624. doi:10.1164/rccm.201710-2108OC

18. Stockley JA, Ismail AM, Hughes SM, Edgar R, Stockley RA, Sapey E. Maximal mid-expiratory flow detects early lung disease in α1-antitrypsin deficiency. Eur Resp J. 2017;49(3):1602055. doi:10.1183/13993003.02055-2016

19. Harvey BG, Strulovici-Barel Y, Kaner RJ, et al. Risk of COPD with obstruction in active smokers with normal spirometry and reduced diffusion capacity. Eur Resp J. 2015;46(6):1589–1597. doi:10.1183/13993003.02377-2014

20. Oh AS, Strand M, Pratte K, et al. Visual emphysema at chest CT in GOLD stage 0 cigarette smokers predicts disease progression: results from the COPDGene study. Radiology. 2020;296(3):641–649. doi:10.1148/radiol.2020192429

21. Mohamed Hoesein FA, de Jong PA, Lammers JW, et al. Airway wall thickness associated with forced expiratory volume in 1 second decline and development of airflow limitation. Eur Resp J. 2015;45(3):644–651. doi:10.1183/09031936.00020714

22. Jo YS, Rhee CK, Kim SH, Lee H, Choi JY. Spirometric transition of at risk individuals and risks for progression to chronic obstructive pulmonary disease in general population. Arch Bronconeumol. 2024;60(10):634–642. doi:10.1016/j.arbres.2024.05.033

23. Fan J, Fang L, Cong S, et al. Potential pre-COPD indicators in association with COPD development and COPD prediction models in Chinese: a prospective cohort study. Lancet Reg Health Western Pacific. 2024;44:100984. doi:10.1016/j.lanwpc.2023.100984

24. Çolak Y, Afzal S, Nordestgaard BG, Vestbo J, Lange P. Prevalence, characteristics, and prognosis of early chronic obstructive pulmonary disease. The Copenhagen general population study. Am J Resp Crit Care Med. 2020;201(6):671–680. doi:10.1164/rccm.201908-1644OC

25. Matheson MC, Bowatte G, Perret JL, et al. Prediction models for the development of COPD: a systematic review. Int J Chronic Obstruct Pulmonary Dis. 2018;13:1927–1935. doi:10.2147/COPD.S155675

26. Vanfleteren L, Weidner J, Franssen FME, et al. Biomarker-based clustering of patients with chronic obstructive pulmonary disease. ERJ Open Res. 2023;9(1):00301–2022. doi:10.1183/23120541.00301-2022

27. Xu Y, Zhang L, Zhu L, et al. Prognostic value of biomarkers in chronic obstructive pulmonary disease: a comprehensive review. Int J Chronic Obstruct Pulmonary Dis. 2025;20:3123–3134. doi:10.2147/COPD.S531935

28. Du D, Zhang G, Xu D, et al. Association between systemic inflammatory markers and chronic obstructive pulmonary disease: a population-based study. Heliyon. 2024;10(10):e31524. doi:10.1016/j.heliyon.2024.e31524

29. Wang L, Zhang S, Gao Z, Jiang D. Construction and validation of a risk prediction model for chronic obstructive pulmonary disease (COPD): a cross-sectional study based on the NHANES database from 2009 to 2018. BMC Pulmonary Med. 2025;25(1):317. doi:10.1186/s12890-025-03776-w

30. Vestbo J, Lange P. Can GOLD Stage 0 provide information of prognostic value in chronic obstructive pulmonary disease? Am J Resp Crit Care Med. 2002;166(3):329–332. doi:10.1164/rccm.2112048

31. Stavem K, Sandvik L, Erikssen J. Can global initiative for chronic obstructive lung disease stage 0 provide prognostic information on long-term mortality in men? Chest. 2006;130(2):318–325. doi:10.1378/chest.130.2.318

32. de Marco R, Accordini S, Cerveri I, et al. An international survey of chronic obstructive pulmonary disease in young adults according to GOLD stages. Thorax. 2004;59(2):120–125. doi:10.1136/thorax.2003.011163

33. Probst-Hensch NM, Curjuric I, Pierre-Olivier B, et al. Longitudinal change of prebronchodilator spirometric obstruction and health outcomes: results from the SAPALDIA cohort. Thorax. 2010;65(2):150–156. doi:10.1136/thx.2009.115063

34. Li Y, Tang X, Zhang R, et al. Research progress in early states of chronic obstructive pulmonary disease: a narrative review on PRISm, pre-COPD, young COPD and mild COPD. Expert Rev Resp Med. 2025;19:1–17.

35. Lei J, Huang K, Wu S, et al. Heterogeneities and impact profiles of early chronic obstructive pulmonary disease status: findings from the China pulmonary health study. Lancet Reg Health Western Pacific. 2024;45:101021. doi:10.1016/j.lanwpc.2024.101021

36. Rule AD, Grossardt BR, Weston AD, et al. Older tissue age derived from abdominal computed tomography biomarkers of muscle, fat, and bone is associated with chronic conditions and higher mortality. Mayo Clin Proceed. 2024;99(6):878–890. doi:10.1016/j.mayocp.2023.09.021

37. Ding F, Liu W, Hu X, Gao C. Factors related to the progression of chronic obstructive pulmonary disease: a retrospective case-control study. BMC Pulmonary Med. 2025;25(1):5. doi:10.1186/s12890-024-03346-6

38. Finks SW, Rumbak MJ, Self TH. Treating hypertension in chronic obstructive pulmonary disease. New Engl J Med. 2020;382(4):353–363. doi:10.1056/NEJMra1805377

39. Chinese Thoracic Society, Chinese Medical Association. Chinese expert consensus on the management of cardiovascular comorbidities in patients with chronic obstructive pulmonary disease. Zhonghua Jie He He Hu Xi Za Zhi. 2022;45(12):1180–1191. doi:10.3760/cma.j.cn112147-20220505-00380

40. Expert Group of the Chronic Obstructive Pulmonary Disease Assembly; Chinese Thoracic Society, Chinese Medical Association. Chinese expert consensus on the management of cardiovascular comorbidities in patients with chronic obstructive pulmonary disease. Zhonghua Jie He He Hu Xi Za Zhi. 2022;45(12):1180–1191. Chinese. doi:10.3760/cma.j.cn112147-20220505-00380

41. Raslan AS, Quint JK, Cook S. All-cause, cardiovascular and respiratory mortality in people with type 2 diabetes and chronic obstructive pulmonary disease (COPD) in England: a cohort study using the clinical practice research datalink (CPRD). Int J Chronic Obstruct Pulmonary Dis. 2023;18:1207–1218. doi:10.2147/COPD.S407085

42. Sarkar M, Rajta PN, Khatana J. Anemia in chronic obstructive pulmonary disease: prevalence, pathogenesis, and potential impact. Lung India. 2015;32(2):142–151. doi:10.4103/0970-2113.152626

43. Sato K, Inoue S, Ishibashi Y, et al. Association between low mean corpuscular hemoglobin and prognosis in patients with exacerbation of chronic obstructive pulmonary disease. Resp Investig. 2021;59(4):498–504. doi:10.1016/j.resinv.2021.01.006

44. Hu Y, Long H, Cao Y, Guo Y. Prognostic value of lymphocyte count for in-hospital mortality in patients with severe AECOPD. BMC Pulmonary Med. 2022;22(1):376. doi:10.1186/s12890-022-02137-1

45. Shibata S, Miyake K, Tateishi T, et al. Basophils trigger emphysema development in a murine model of COPD through IL-4-mediated generation of MMP-12-producing macrophages. Proceed Nat Acad Sci United States Ame. 2018;115(51):13057–13062. doi:10.1073/pnas.1813927115

Creative Commons License © 2026 The Author(s). This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at https://www.dovepress.com/terms and incorporate the Creative Commons Attribution - Non Commercial (unported, 4.0) License. By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms.