Identification of Misdiagnosis Factors in Allergic Bronchopulmonary Mycosis Using Explainable Machine Learning

Xuemei Chen; Zekai Yu; Xiaoxiao Gong; Runjin Cai; Huan Ge; Yunbing Jia; Jiale Tang; Leng Huang; Xiaozhao Li; Juntao Feng

doi:10.2147/JAA.S605864

Journal of Asthma and Allergy, Volume 19

Original Research

Received 28 February 2026

Accepted for publication 4 May 2026

Published 9 May 2026 Volume 2026:19 605864

Review by Single anonymous peer review

Editor who approved publication: Dr Luis Garcia-Marcos

Xuemei Chen,¹,²,* Zekai Yu,³,* Xiaoxiao Gong,¹,²,* Runjin Cai,¹,²,* Huan Ge,¹,² Yunbing Jia,¹,² Jiale Tang,¹,² Leng Huang,¹,² Xiaozhao Li,²,⁴ Juntao Feng¹,²

¹Department of Respiratory Medicine, National Key Clinical Specialty, Branch of National Clinical Research Center for Respiratory Disease, Xiangya Hospital, Central South University, Changsha, Hunan, 410008, People’s Republic of China; ²National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, Hunan, 410008, People’s Republic of China; ³School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, Zhejiang Province, 310018, People’s Republic of China; ⁴Department of Nephrology, Xiangya Hospital, Central South University, Changsha, People’s Republic of China

*These authors contributed equally to this work

Correspondence: Juntao Feng, Department of Respiratory Medicine, National Key Clinical Specialty, Branch of National Clinical Research Center for Respiratory Disease, Xiangya Hospital, Central South University, Changsha, Hunan, People’s Republic of China, Email [email protected]; Xiaozhao Li, Department of Nephrology, Xiangya Hospital, Central South University, Changsha, Hunan, People’s Republic of China, Email [email protected]

Background: Allergic bronchopulmonary aspergillosis/mycosis (ABPA/ABPM) is frequently misdiagnosed as pulmonary tuberculosis due to overlapping clinical and radiological features.
Methods: In a retrospective cohort of 89 multidisciplinary team (MDT)–confirmed ABPA/ABPM patients, we investigated determinants of misdiagnosis and disentangled causal drivers from spurious associations. Machine learning models with SHAP interpretation were utilized alongside a Double Machine Learning framework to estimate the average treatment effects of five key features while adjusting for major confounders.
Results: Immunological markers were identified as the dominant contributors to misdiagnosis. Total serum IgE showed the strongest protective causal effect against misdiagnosis, followed by Aspergillus fumigatus–specific IgE. Bronchiectasis demonstrated a modest protective effect. These findings were robust across covariate balance, overlap, and placebo analyses.
Conclusion: Our results indicate that ABPA/ABPM misdiagnosis is driven primarily by underrecognition of immunological features rather than imaging findings, underscoring the importance of systematic immunologic assessment to reduce diagnostic delay and unnecessary anti-tuberculosis treatment, particularly in tuberculosis-endemic settings.

Graphical abstract: Infographic in three sections. Data and Processing: 89 patients with MDT-diagnosed ABPA/ABPM (55 correctly diagnosed, 34 misdiagnosed), z-standardization of the data, and feature selection by LASSO with cross-validation. Methodology: six machine learning models, SHAP analysis for model interpretation, and Double Machine Learning estimation of average treatment effects. Results and Clinical Implications: the glmnet model performed best (AUC 0.808); low sIgE and total IgE levels were the key factors associated with misdiagnosis, and their protective effect was confirmed, supporting combined immunological and radiological assessment to identify patients at high risk of misdiagnosis and reduce inappropriate treatment.

Keywords: allergic bronchopulmonary aspergillosis, allergic bronchopulmonary mycosis, machine learning, SHAP, diagnosis

Introduction

Allergic bronchopulmonary mycoses (ABPM) comprise a group of pulmonary conditions driven by immune responses to fungal colonization of the airways, most commonly involving Aspergillus fumigatus. These disorders are typically observed in patients with underlying chronic respiratory diseases, particularly asthma or cystic fibrosis (CF),^1–4 but may also occur, less frequently, in individuals without predisposing conditions or in those with other chronic lung diseases such as bronchiectasis or chronic obstructive pulmonary disease (COPD).^5–8 Among these entities, allergic bronchopulmonary aspergillosis (ABPA) is considered the most representative phenotype and arises from a hypersensitivity reaction to airway colonization by Aspergillus fumigatus.⁹ A hallmark feature of ABPA is the presence of fungus-specific immunoglobulin E (sIgE) together with markedly elevated total serum IgE levels.¹⁰ In clinical practice, serum IgE plays a dual role, being essential not only for diagnosis but also for monitoring disease activity and treatment response.¹¹ Accordingly, IgE is regarded as a key mediator in the immunopathogenesis of ABPA.¹²

Because of its nonspecific clinical presentation and the limited awareness among clinicians, ABPA/ABPM is often diagnosed with substantial delay, which may extend to nearly a decade.¹³ In routine clinical practice, it can be difficult to distinguish this condition from other respiratory diseases, including pulmonary tuberculosis, pneumonia, lung cancer, and asthma,^14,15 with tuberculosis representing the most common and challenging differential diagnosis. The underlying reason lies in the fact that airway fungal colonization and the associated hypersensitivity response can closely resemble tuberculosis in both clinical symptoms and radiological findings. In regions with a high burden of tuberculosis, particularly in Africa and Asia, patients are frequently treated empirically for tuberculosis despite negative microbiological evidence.^16,17 As a result, a consistently high rate of misdiagnosis of ABPA/ABPM as tuberculosis has been reported in these settings.¹⁸ China, which ranks among the countries with the highest tuberculosis burden globally,¹⁹ continues to face substantial under-recognition of ABPA/ABPM, with up to 57% of patients reportedly misdiagnosed as pulmonary tuberculosis, pneumonia, or lung abscess.²⁰ Similar patterns have been observed in other high-burden regions. For example, a study conducted in India in 2006 reported that 59 out of 126 patients with ABPA were initially misdiagnosed as pulmonary tuberculosis and received anti-tuberculosis therapy.²¹ In another retrospective analysis published in 2009, as many as 91% of patients were initially treated with anti-tuberculosis therapy. Additional reports have indicated that approximately 21% of patients with ABPA/ABPM were initially diagnosed as having tuberculosis,^22,23 while around 10% were managed as smear-negative tuberculosis cases and subjected to prolonged anti-tuberculosis treatment.²⁴

Such misdiagnosis not only imposes unnecessary economic and healthcare burdens but also exposes patients to adverse effects such as hepatotoxicity,¹³ gastrointestinal discomfort,²⁵ and the emergence of drug-resistant Mycobacterium tuberculosis strains.²⁵ Moreover, inappropriate therapy may further compromise immune function, leading to uncontrolled fungal infection and disease progression.

Despite increasing awareness of ABPA/ABPM in recent years, misdiagnosis and missed diagnosis remain common in clinical practice. Systematic investigations into the factors associated with ABPA/ABPM misdiagnosis are still relatively limited. Machine learning (ML) provides a flexible analytical framework for modeling complex, nonlinear relationships between multidimensional clinical features and disease outcomes.²⁶ It has been increasingly adopted in diagnostic and prognostic research, offering new approaches for improving diagnostic accuracy and identifying factors associated with misdiagnosis in ABPA/ABPM. Based on a confirmed cohort of 89 patients, this study integrates multiple ML algorithms with SHapley Additive exPlanations (SHAP) to systematically identify key contributors to ABPA/ABPM misdiagnosis, providing exploratory evidence for improving clinical recognition and optimizing diagnostic strategies.

Methods

Participants and Data Collection

Patients with ABPA/ABPM diagnosed by a multidisciplinary team (MDT) at Xiangya Hospital, Central South University, between January 2017 and March 2025 were retrospectively enrolled in this study. In addition, established diagnostic criteria (ISHAM 2013/2024, Asano 2021, and RS criteria) were applied in a descriptive manner to illustrate how the cohort would be classified under current diagnostic frameworks. Patients were excluded if they met any of the following conditions: age ≤ 18 years, pregnancy, presence of other diffuse pulmonary diseases, receipt of systemic antifungal therapy within 2 months prior to diagnosis, use of oral corticosteroids for more than 3 weeks within the preceding 3 months, immunocompromised status, or malignancy. To minimize diagnostic uncertainty, all included patients were carefully evaluated to exclude active tuberculosis. Individuals with microbiological evidence of tuberculosis (a positive acid-fast bacilli smear, mycobacterial culture, or nucleic acid amplification test, eg, Xpert MTB/RIF) were not included. In addition, patients with clinical or radiological findings highly suggestive of active tuberculosis who were subsequently confirmed or responded to anti-tuberculosis therapy were also excluded. Misdiagnosis as pulmonary tuberculosis was defined as a case in which the patient was initially diagnosed with and treated for tuberculosis based on clinical or radiological suspicion, despite negative microbiological evidence, but was subsequently confirmed as ABPA/ABPM by MDT evaluation and showed no evidence of active tuberculosis during follow-up.

Comprehensive baseline information was obtained for all patients, including demographic characteristics, interval from symptom onset to diagnosis, and relevant clinical manifestations. The diagnosis of asthma was established in accordance with the 2025 Global Initiative for Asthma (GINA) guidelines. Clinical evaluation included laboratory investigations, chest imaging, pulmonary function testing, and bronchoscopic assessment as appropriate for ABPA/ABPM diagnosis. Serum specific IgE (sIgE) levels were determined using a fluorescence enzyme immunoassay (m3, gold domain medical test laboratory), with values ≥ 0.35 kUA/L considered positive. Specific IgG (sIgG) was measured using an automated fluorescence enzyme immunoassay, and levels > 120 AU/mL were defined as positive. Chest computed tomography (CT) images were independently assessed by two experienced radiologists to identify characteristic features, including bronchiectasis, pulmonary infiltrates, and central high-attenuation mucus (HAM).

Ethical approval for this study was obtained from the Institutional Review Board of Xiangya Hospital, Central South University (No. 2023121128). The study was conducted in accordance with the principles outlined in the Declaration of Helsinki. Given its retrospective and observational design, the requirement for informed consent was waived by the ethics review committee.

Feature Preprocessing

Feature selection was carried out using the Least Absolute Shrinkage and Selection Operator (LASSO) regression with cross-validation. The cv.glmnet function was applied to automatically determine the optimal penalty parameter (λ), balancing model complexity and predictive performance. Variables with nonzero coefficients at the optimal λ were retained as input features for subsequent machine learning modeling.
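For orientation, the penalized objective that this kind of selection minimizes can be written out explicitly. The expression below is a sketch under stated assumptions: a binomial (logistic) LASSO with the glmnet mixing parameter α = 1, where y_i = 1 denotes a misdiagnosed patient and x_i the vector of standardized candidate features. The source does not report the model family or α, so these are illustrative choices rather than the authors' confirmed settings.

```latex
(\hat{\beta}_0, \hat{\beta})(\lambda) = \arg\min_{\beta_0,\,\beta}
  \left\{ -\frac{1}{n}\sum_{i=1}^{n}\Big[ y_i\,(\beta_0 + x_i^{\top}\beta)
  - \log\!\big(1 + e^{\beta_0 + x_i^{\top}\beta}\big) \Big]
  + \lambda \lVert \beta \rVert_1 \right\}
```

Larger values of λ shrink more coefficients exactly to zero, which is what makes the nonzero set usable as a feature selector. cv.glmnet evaluates this objective over a grid of λ values and reports both the λ with minimum cross-validated error and the largest λ within one standard error of it; the source does not state which of the two was taken as optimal.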

Statistical Analysis and Machine Learning Modeling

Statistical analyses were performed using R software (version 4.4.2), GraphPad Prism 10, and SPSS 27. A two-sided P value < 0.05 was considered statistically significant. Normality of continuous variables was assessed prior to analysis, and data were summarized as mean (standard deviation) or median (interquartile range), as appropriate. Group comparisons for continuous variables were conducted using Student’s t-test or the Mann–Whitney U-test. Categorical variables were presented as counts and percentages, and compared using the χ²-test or Fisher’s exact test when appropriate.

The modeling workflow was developed within the mlr3 framework and included six machine learning classifiers, namely regularized logistic regression (glmnet), naïve Bayes, k-nearest neighbors (K-KNN), decision tree (rpart), random forest (ranger), and extreme gradient boosting (XGBoost). To prevent overfitting and enhance generalizability, a 5-fold cross-validation approach was employed for model training and evaluation.
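The fold construction behind a 5-fold scheme can be illustrated with a short sketch. The study itself used the mlr3 framework in R; the Python below is only an illustration with the standard library, and it assumes stratification by outcome, which the source does not state. The function name stratified_folds and the seed are hypothetical.

```python
import random
from typing import List, Sequence


def stratified_folds(labels: Sequence[int], k: int = 5, seed: int = 0) -> List[List[int]]:
    """Split sample indices into k folds while keeping class proportions similar."""
    rng = random.Random(seed)
    folds: List[List[int]] = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(members)
        # Deal the shuffled members of this class round-robin across the folds.
        for pos, i in enumerate(members):
            folds[pos % k].append(i)
    return folds


# Cohort sizes taken from the text: 34 misdiagnosed (1) and 55 non-misdiagnosed (0).
labels = [1] * 34 + [0] * 55
folds = stratified_folds(labels, k=5, seed=1)

for test_idx in folds:
    held_out = set(test_idx)
    train_idx = [i for i in range(len(labels)) if i not in held_out]
    n_pos = sum(labels[i] for i in test_idx)
    print(f"train={len(train_idx)} test={len(test_idx)} misdiagnosed_in_test={n_pos}")
```

With 89 patients, each held-out fold contains only 17 or 18 cases (6 or 7 of them misdiagnosed), which helps explain why fold-level performance estimates in a cohort of this size are noisy.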

A benchmarking strategy was adopted to enable systematic evaluation and comparison of the performance of different machine learning models. Each model was assessed on standardized datasets using a consistent set of performance metrics, including accuracy, balanced accuracy, F-beta score, sensitivity, and specificity, allowing for objective comparison across models. Model performance was further evaluated using bootstrap resampling, with the area under the ROC curve (AUC) and its 95% confidence interval, Brier score, and calibration curves estimated accordingly. The model achieving the highest AUC was identified as the optimal model, whereas the remaining metrics were used for supplementary assessment. Decision curve analysis (DCA) was subsequently conducted to assess clinical utility. In addition, SHapley Additive exPlanations (SHAP) were applied to quantify the contribution of each feature to model predictions, enhancing model interpretability. All variables, including IgE-related indicators, were treated as continuous features without predefined stratification thresholds.
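The main metrics named above have simple definitions, sketched here in plain Python on a small made-up set of labels and predicted probabilities. This is not the authors' code (the study used mlr3 in R); the toy data, the 0.5 cutoff, and the function names are assumptions for illustration only.

```python
from typing import Dict, Sequence


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    Ties count one half; this equals the Mann-Whitney U statistic / (n_pos * n_neg).
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def brier(y_true: Sequence[int], probs: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, probs)) / len(y_true)


def confusion_metrics(y_true: Sequence[int], probs: Sequence[float],
                      threshold: float = 0.5) -> Dict[str, float]:
    """Sensitivity, specificity, accuracy and balanced accuracy at one cutoff."""
    pred = [1 if p >= threshold else 0 for p in probs]
    tp = sum(1 for y, h in zip(y_true, pred) if y == 1 and h == 1)
    fn = sum(1 for y, h in zip(y_true, pred) if y == 1 and h == 0)
    tn = sum(1 for y, h in zip(y_true, pred) if y == 0 and h == 0)
    fp = sum(1 for y, h in zip(y_true, pred) if y == 0 and h == 1)
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "accuracy": (tp + tn) / len(y_true),
        "balanced_accuracy": (sens + spec) / 2.0,
    }


# Toy example (hypothetical values): 1 = misdiagnosed, 0 = correctly diagnosed.
y = [1, 1, 1, 0, 0, 0, 0, 1]
p = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]

auc_value = auc(y, p)
brier_value = brier(y, p)
metrics = confusion_metrics(y, p, threshold=0.5)
print(auc_value, brier_value, metrics)
```

AUC is threshold-free, whereas sensitivity, specificity, and accuracy depend on the chosen cutoff; the Brier score additionally penalizes probabilities that are far from the observed outcome, so it reflects calibration as well as discrimination.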

Causal Inference Analysis

To assess whether the key features identified by the prediction model have a causal effect on misdiagnosis rather than a purely statistical correlation with it, we employed a Double Machine Learning (DML) approach.^27,28 This method uses orthogonalization to remove the influence of confounding variables (eg, age, sex, asthma history) and isolate the “net effect” of specific features on the outcome.
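The orthogonalization idea can be made concrete with a minimal sketch. It assumes the partially linear form of DML with two-fold cross-fitting, a single confounder, and ordinary least squares as the nuisance learner; the source does not report which DML score, learners, or number of folds were used, so every one of these choices is illustrative. The data are simulated, and the names fit_line and dml_plr are hypothetical.

```python
import random
from typing import List, Tuple


def fit_line(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = a + b * x with one predictor; returns (a, b)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def dml_plr(x: List[float], d: List[float], y: List[float],
            n_folds: int = 2, seed: int = 0) -> float:
    """Cross-fitted partialling-out estimate of theta in y = theta * d + g(x) + noise."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::n_folds] for k in range(n_folds)]
    num = 0.0
    den = 0.0
    for k, test in enumerate(folds):
        train = [i for j, f in enumerate(folds) if j != k for i in f]
        # Nuisance models are fitted on the other folds only (cross-fitting).
        a_d, b_d = fit_line([x[i] for i in train], [d[i] for i in train])
        a_y, b_y = fit_line([x[i] for i in train], [y[i] for i in train])
        for i in test:
            d_res = d[i] - (a_d + b_d * x[i])  # part of the feature not explained by x
            y_res = y[i] - (a_y + b_y * x[i])  # part of the outcome not explained by x
            num += d_res * y_res
            den += d_res * d_res
    return num / den


# Simulated data (not study data): x confounds both the feature d and the outcome y.
rng = random.Random(42)
n = 4000
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
d = [0.8 * xi + rng.gauss(0.0, 1.0) for xi in x]
y = [-0.2 * di + 0.5 * xi + rng.gauss(0.0, 0.5) for di, xi in zip(d, x)]

theta_hat = dml_plr(x, d, y)
naive_slope = fit_line(d, y)[1]
print(f"DML estimate: {theta_hat:.3f}  naive slope: {naive_slope:.3f}  truth: -0.200")
```

In this simulation the unadjusted slope is pulled toward zero and even changes sign because the confounder raises both the feature and the outcome, while the residual-on-residual estimate recovers a value close to the true effect. In practice the nuisance functions would be fitted with flexible learners on several confounders rather than a single straight line.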

We calculated the Average Treatment Effect (ATE) for five core factors: total serum IgE, Aspergillus fumigatus-specific IgE (F.sIgE), high-attenuation mucus (HAM), bronchiectasis, and history of pulmonary infiltrates (HPI). To ensure the robustness of the causal estimation, a three-step quality control procedure was performed for each factor: (1) Covariate Balance Check (Love Plot) to ensure fair grouping; (2) Propensity Score Overlap Check (Overlap Plot) to verify sample comparability; and (3) Placebo Test to confirm the statistical significance of the results against random noise.
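Two of these checks reduce to short computations, sketched below with the Python standard library. The standardized mean difference (SMD) is the quantity plotted in a Love plot, and the placebo test compares the actual estimate with estimates obtained after randomly permuting the treatment labels. The sketch substitutes a simple difference in means for the DML estimator and uses made-up data, so it illustrates the logic of the checks rather than reproducing the authors' procedure; all function names are hypothetical.

```python
import random
from statistics import mean, variance
from typing import List, Optional, Sequence, Tuple


def smd(treated: Sequence[float], control: Sequence[float],
        w_treated: Optional[Sequence[float]] = None,
        w_control: Optional[Sequence[float]] = None) -> float:
    """Standardized mean difference; optional weights (eg, IPW) adjust the means only.

    The denominator is the unweighted pooled standard deviation, one common convention.
    """
    def wmean(values: Sequence[float], weights: Optional[Sequence[float]]) -> float:
        if weights is None:
            return mean(values)
        return sum(w * v for w, v in zip(weights, values)) / sum(weights)

    pooled_sd = ((variance(treated) + variance(control)) / 2.0) ** 0.5
    return (wmean(treated, w_treated) - wmean(control, w_control)) / pooled_sd


def diff_in_means(treat: Sequence[int], outcome: Sequence[int]) -> float:
    """Stand-in effect estimator: outcome rate in treated minus rate in controls."""
    t = [o for d, o in zip(treat, outcome) if d == 1]
    c = [o for d, o in zip(treat, outcome) if d == 0]
    return mean(t) - mean(c)


def placebo_test(treat: Sequence[int], outcome: Sequence[int],
                 n_perm: int = 100, seed: int = 0) -> Tuple[float, List[float], float]:
    """Compare the actual estimate with estimates under permuted treatment labels."""
    rng = random.Random(seed)
    actual = diff_in_means(treat, outcome)
    shuffled = list(treat)
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null.append(diff_in_means(shuffled, outcome))
    extreme = sum(1 for e in null if abs(e) >= abs(actual))
    p_value = (1 + extreme) / (1 + n_perm)
    return actual, null, p_value


# Hypothetical covariate (age) in two groups before any weighting.
age_smd = smd([60, 62, 64, 66], [50, 52, 54, 56])

# Hypothetical outcome: misdiagnosis (1) is rare when the marker is high (treat = 1).
treat = [1] * 20 + [0] * 20
outcome = [0] * 18 + [1] * 2 + [1] * 18 + [0] * 2
actual, null, p_value = placebo_test(treat, outcome, n_perm=100, seed=0)
print(f"SMD={age_smd:.2f}  effect={actual:.2f}  placebo p={p_value:.3f}")
```

An absolute SMD below 0.1 after weighting is the usual balance target, and an actual estimate that lies outside the bulk of the permuted estimates is what the placebo histograms are meant to show.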

Results

Baseline Characteristics

A total of 89 patients clinically diagnosed with ABPA/ABPM were included. Among them, 36 fulfilled the ISHAM (2013) diagnostic criteria, 45 met the updated ISHAM (2024) criteria, and 57 met the Asano (2021) criteria, with 24 classified as possible cases. According to the Revised Scoring (RS) criteria,²⁹ 67 patients were diagnosed with ABPA/ABPM and 16 were classified as possible cases.

Overall, 34 patients (38.2%) had a previous misdiagnosis of pulmonary tuberculosis, and 14 of them (41.2%) had received empirical anti-tuberculosis therapy. In addition, 24 cases were initially diagnosed as pneumonia, 2 as recurrent asthma, and 9 as lung cancer. The remaining patients were correctly diagnosed with ABPA/ABPM at initial presentation. Since lung cancer and pneumonia are generally easier to differentiate from ABPA/ABPM based on radiologic and clinical characteristics, and misdiagnosis as pneumonia typically does not result in prolonged or severe mistreatment, our study primarily focused on patients misdiagnosed with pulmonary tuberculosis.

To explore clinical differences associated with misdiagnosis, patients were stratified into a misdiagnosed group (n = 34) and a non-misdiagnosed group (n = 55). The baseline characteristics of both groups are presented in Table 1. Comparisons between the two groups revealed no significant differences in age, sex, or asthma prevalence. In contrast, the time from symptom onset to definitive diagnosis was significantly prolonged in the misdiagnosed group (93.5 [68.8–208.5] months vs. 36.0 [6.0–84.0] months, p = 0.001).

Table 1 Clinical Characteristics

Patients in the misdiagnosed group exhibited a lower frequency of wheezing (55.8% vs. 83.6%, p = 0.006) and had significantly lower levels of total serum IgE (458.0 [353.3–662.9] vs. 1572.0 [1235.0–2155.0] IU/mL, p < 0.001), sIgE (0.6 [0.3–2.0] vs. 9.4 [6.4–14.9] kUA/L, p < 0.001), sIgG (92.5 [52.4] vs. 202.2 [111.1] kUA/L, p < 0.001) and CEA (4.8 [1.6] vs. 5.9 [2.7] ng/mL, p = 0.023). No significant intergroup differences were found in other symptoms, peripheral blood eosinophil count, or other hematological parameters. Notably, IgE-related indicators in the misdiagnosed group were frequently within low or borderline ranges, whereas markedly elevated levels were predominantly observed in correctly diagnosed patients, suggesting a gradient relationship between IgE levels and misdiagnosis risk.

Pulmonary function analysis showed that the misdiagnosed group had significantly higher FEV₁ and FEV₁/FVC ratios compared with the non-misdiagnosed group. On imaging assessment, bronchiectasis (91.2% vs. 70.9%, p = 0.032) and pulmonary infiltrates (70.6% vs. 47.3%, p = 0.047) were more frequently observed in the misdiagnosed group, whereas mucus plug (32.4% vs. 60.0%, p = 0.016) and HAM (14.7% vs. 43.6%, p = 0.005) were less frequent. No statistically significant differences were identified between the two groups in mycological findings or bronchoscopic detection of mucus plug.

Feature Selection

Least absolute shrinkage and selection operator (LASSO) regression was used for feature selection. Figure 1A shows the cross-validation curve used to select the penalty parameter λ, and Figure 1B shows the coefficient trajectories of the variables. The optimal penalty parameter was determined through fivefold cross-validation. Eleven variables closely associated with the diagnosis were retained: sIgE, total serum IgE, HAM, fungal evidence, bronchiectasis, eosinophil count, history of pulmonary infiltration, sIgG, asthma, CEA, and mucus plug. These features were subsequently used for model construction.

Figure 1 Lasso regression-based variable screening.

Note: (A) The process of selecting the optimal value of the parameter λ in the lasso regression model is carried out by the cross-validation method; (B) Variation characteristics of variable coefficients.

Model Performance Comparisons

Six machine learning models were developed to explore factors associated with the misdiagnosis of ABPA/ABPM as pulmonary tuberculosis. Figure 2 illustrates the discriminative performance of each model based on receiver operating characteristic (ROC) curves. The glmnet model achieved the highest AUC (0.808, 95% CI: 0.567–1.000). The AUCs of the remaining models were 0.773 (95% CI: 0.311–1.000) for ranger, 0.769 (95% CI: 0.471–1.000) for XGBoost, 0.750 (95% CI: 0.485–1.000) for naïve Bayes, 0.683 (95% CI: 0.422–0.943) for K-KNN, and 0.615 (95% CI: 0.318–0.913) for rpart. The confidence intervals were wide and overlapped substantially across models.

Figure 2 ROC curves of the machine learning models based on bootstrap resampling.

Notes: glmnet: Regularized Logistic Regression; naïve Bayes: Naïve Bayes; kknn: K-Nearest Neighbors; rpart: Regression Trees; ranger: Random Forest; XGBoost: Extreme Gradient Boosting.

Table 2 summarizes the comparative performance of six machine learning models in differentiating misdiagnosed cases. Overall, the glmnet, XGBoost, and ranger models demonstrated relatively high and balanced classification metrics across all evaluation indices. Among all models, glmnet demonstrated the best overall performance, yielding the highest sensitivity (0.831), specificity (0.752), and F-beta score (0.786). The XGBoost model achieved comparable accuracy (0.730) and balanced accuracy (0.750), while ranger showed slightly lower specificity (0.656) but maintained strong sensitivity (0.815). In contrast, naïve Bayes and rpart models exhibited moderate classification ability, with accuracies ranging from 0.618 to 0.661 and balanced accuracies around 0.65–0.69. The K-KNN model performed least effectively across all indices, showing reduced sensitivity and overall discriminative capacity. Collectively, the glmnet model achieved the most favorable trade-off between sensitivity and specificity. Given its superior discriminative performance and robustness, glmnet was ultimately chosen for subsequent calibration and decision curve analyses.

Table 2 Benchmark Results

Decision curve analysis (DCA) indicated that the glmnet model provided a stable net clinical benefit over a broad range of threshold probabilities (Figure 3A), supporting its potential clinical applicability. Calibration analysis revealed a Brier score of 0.120, reflecting good agreement between predicted probabilities and observed outcomes, as well as satisfactory model discrimination and calibration performance (Figure 3B). Overall, these results justified the use of the glmnet model as the basis for subsequent SHAP-based interpretability analysis.
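The quantity plotted in a decision curve is the net benefit at each threshold probability. A minimal sketch of that calculation follows, using the standard definition (true positives minus false positives weighted by the threshold odds, both divided by the sample size) on made-up predictions. The toy data and function names are assumptions for illustration and are not taken from the study.

```python
from typing import Sequence


def net_benefit(y_true: Sequence[int], probs: Sequence[float], threshold: float) -> float:
    """Net benefit of acting on every patient whose predicted risk is >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, probs) if p >= threshold and y == 0)
    odds = threshold / (1.0 - threshold)  # harm of a false positive relative to a true positive
    return tp / n - (fp / n) * odds


def net_benefit_treat_all(y_true: Sequence[int], threshold: float) -> float:
    """Reference strategy: act on everyone regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy example (hypothetical values): 1 = misdiagnosed, 0 = correctly diagnosed.
y = [1, 1, 1, 0, 0, 0, 0, 1]
p = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]

nb_model_50 = net_benefit(y, p, 0.50)
nb_all_50 = net_benefit_treat_all(y, 0.50)
nb_model_25 = net_benefit(y, p, 0.25)
nb_all_25 = net_benefit_treat_all(y, 0.25)
print(nb_model_50, nb_all_50, nb_model_25, nb_all_25)
```

A model is considered clinically useful at a given threshold when its net benefit exceeds both the treat-all value and zero, the net benefit of acting on no one.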

Figure 3 Calibration capability and clinical benefit of the model.

Note: (A) Decision curve analysis (DCA); (B) Calibration curve.

Interpretability Analysis

The SHAP summary plots (Figure 4A and B) illustrate the relative importance of each feature in correctly diagnosing ABPA/ABPM within the machine learning model. Reported as mean absolute SHAP values, sIgE (0.142), total serum IgE (0.097), HAM (0.045), fungal evidence (0.039), and eosinophil count (0.030) were the key predictors exerting the strongest positive influence on the model’s diagnostic performance. In contrast, bronchiectasis (0.036) and pulmonary infiltrates (0.022) contributed in the negative direction, suggesting that the presence of these features may lead the model to incorrectly classify ABPA/ABPM cases as tuberculosis. To further elucidate the contribution and interaction of individual features within the predictive framework, force plots (Figure 4C) and dependence scatter plots (Figure 5) were generated using the shapviz package. The force plot visualizes the ranking of feature contributions and their cumulative impact on the prediction of ABPA/ABPM for an individual patient, with the final predicted probability reaching as high as 0.98.
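For a linear model such as a regularized logistic regression, SHAP values on the log-odds scale have a closed form when features are treated as independent: each feature contributes its coefficient times the deviation of its value from the feature mean. The sketch below illustrates this with hypothetical coefficients and standardized inputs; it is not the fitted model from the study, and the authors computed their values with the shapviz package in R.

```python
from typing import List

# Hypothetical coefficients on the log-odds scale (not the study's fitted values).
feature_names = ["sIgE_z", "total_IgE_z", "bronchiectasis"]
intercept = -0.3
beta = [-1.2, -0.8, 0.6]

# Hypothetical patients: two standardized IgE measures and a 0/1 imaging finding.
X = [
    [1.0, 0.5, 1.0],
    [-1.0, -0.5, 1.0],
    [0.5, 1.0, 0.0],
    [-0.5, -1.0, 0.0],
]


def linear_predictor(row: List[float]) -> float:
    """Log-odds output of the linear model for one patient."""
    return intercept + sum(b * v for b, v in zip(beta, row))


def linear_shap(rows: List[List[float]]) -> List[List[float]]:
    """SHAP values of a linear model under feature independence: beta_j * (x_j - mean_j)."""
    n = len(rows)
    means = [sum(r[j] for r in rows) / n for j in range(len(beta))]
    return [[b * (v - m) for b, v, m in zip(beta, r, means)] for r in rows]


phi = linear_shap(X)

# Baseline: the average model output over the data.
base_value = sum(linear_predictor(r) for r in X) / len(X)

# Global importance as shown in a SHAP bar plot: mean absolute SHAP value per feature.
importance = [sum(abs(row[j]) for row in phi) / len(phi) for j in range(len(beta))]

for name, value in zip(feature_names, importance):
    print(f"{name}: {value:.2f}")
```

The per-patient values add up to the difference between that patient's prediction and the baseline, which is the additivity property that force plots visualize. Because the mean absolute value discards sign, the direction of each feature's effect has to be read from the summary or dependence plots rather than from the bar plot.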

Figure 4 SHAP Summary Plot.

Abbreviation: F.sIgE, fungal specific IgE; HAM, high attenuation mucus; F.evidence, fungus evidence; HPI, history of pulmonary infiltrates; F.sIgG, fungal specific IgG.

Note: (A) SHAP summary plot; (B) SHAP summary bar plot; (C) SHAP Force plot; SHAP summary plot showing the contribution of each feature to the ABPA/ABPM diagnostic model. Features are ranked by the mean absolute SHAP value. Each point represents a patient, with position on the x-axis indicating the direction and magnitude of its impact. Points are colored by the feature value (yellow = low, purple = high).

Figure 5 SHAP dependence plot.

Note: (A) Fungus-specific IgE; (B) Total serum IgE; (C) High attenuation mucus; (D) Fungal evidence; (E) Bronchiectasis; (F) Eosinophil counts; (G) History of pulmonary infiltrates; (H) Asthma; (I) Carcinoembryonic Antigen; (J) Fungus-specific IgG; (K) Presence of mucus plug. The SHAP dependence plot shows the feature value’s impact on prediction, with colors indicating another feature’s value.

Figure 5 shows that higher sIgE, total IgE, HAM, and fungal evidence strongly drive predictions toward ABPA/ABPM, while pulmonary infiltrates bias results toward tuberculosis. Importantly, these effects were derived from continuous variable modeling rather than predefined categorical thresholds, allowing the identification of nonlinear relationships across the full range of IgE values.

Causal Effect Analysis of Misdiagnosis Factors

To move beyond correlation and understand the drivers of misdiagnosis, we quantified the causal effect of key clinical features (Figure 6). The analysis revealed a clear hierarchy of feature importance:

Figure 6 Causal Effect of Key Clinical Factors on Misdiagnosis Risk.

Abbreviation: ATE, Average Treatment Effect; F.sIgE, Aspergillus fumigatus-specific IgE; HAM, High-attenuation mucus; HPI, History of pulmonary infiltrates.

Notes: This forest plot illustrates the Average Treatment Effect (ATE) of five key clinical features on the probability of misdiagnosis, estimated using the Double Machine Learning (DML) framework. Green bars (negative ATE values) represent protective factors that significantly reduce the risk of misdiagnosis. Red bars (positive or near-zero ATE values) indicate risk factors or features with negligible causal impact. The analysis reveals that Total serum IgE (ATE = −0.213) and F.sIgE (ATE = −0.135) are the strongest causal drivers for preventing misdiagnosis. In contrast, radiological signs such as High-Attenuation Mucus (HAM) and History of Pulmonary Infiltrates (HPI) show ATE values close to zero (0.028 and 0.027, respectively), indicating limited independent diagnostic value in the absence of immunological evidence.

First, immunological markers are the definitive protective factors. Total serum IgE demonstrated the strongest causal protection (ATE = −0.213), implying that high IgE levels reduce the probability of misdiagnosis by approximately 21.3 percentage points after controlling for confounders. Similarly, F.sIgE also showed a significant protective effect (ATE = −0.135).

Second, radiological and historical features showed limited independent causal value. Despite being typical signs of ABPA, High-Attenuation Mucus (HAM) and History of Pulmonary Infiltrates (HPI) yielded ATE values near zero (0.028 and 0.027, respectively). This suggests that without immunological confirmation, these signs are insufficient to prevent clinicians from misdiagnosing the condition as tuberculosis.

Interestingly, Bronchiectasis showed a moderate protective effect (ATE = −0.091), possibly because severe central bronchiectasis triggers a broader differential diagnosis including fungal etiologies.

The validity of these causal findings was rigorously verified (Figure 7). Covariate balance checks (Love Plots) for all five factors showed that standardized mean differences were reduced to <0.1 after adjustment. Propensity score distributions (Overlap Plots) confirmed good comparability between groups. Most importantly, Placebo Tests demonstrated that the estimated effects for IgE and sIgE significantly deviated from the random noise distribution, confirming their robust clinical significance.

Figure 7 Quality Control and Validation of Causal Inference Models.

Abbreviations: IPW, Inverse Probability Weighting; SMD, Standardized Mean Difference; ATE, Average Treatment Effect; DML, Double Machine Learning.

Note: Comprehensive validation plots for the five analyzed clinical features, organized into five rows: (A–C) Total serum IgE; (D–F) F.sIgE; (G–I) High-Attenuation Mucus (HAM); (J–L) Bronchiectasis; (M–O) History of Pulmonary Infiltrates (HPI). Left Column (A, D, G, J, M): Covariate Balance Check (Love Plot). These plots display the Standardized Mean Differences (SMD) of confounding variables before (grey dots) and after (blue diamonds) IPW adjustment. The convergence of blue diamonds within the 0.1 threshold (vertical red dotted line) indicates that the model successfully balanced the baseline characteristics between groups. Middle Column (B, E, H, K, N): Placebo Test. These plots validate the statistical significance of the estimated causal effects. The grey histogram represents the distribution of effects from 100 random permutations (noise), while the vertical red line marks the Actual Average Treatment Effect (ATE) estimated by the DML model. For Total serum IgE (B), F.sIgE (E), and Bronchiectasis (K), the red lines significantly deviate from the noise distribution, confirming robust protective effects. In contrast, for HAM (H) and HPI (N), the red lines fall within the noise distribution, indicating negligible causal impact. Right Column (C, F, I, L, O): Propensity Score Overlap Plot. These plots show the probability density distribution of propensity scores for the control (blue) and treated (red) groups. The extensive overlap area (common support) in all plots confirms the comparability of the samples.

Discussion

Misdiagnosis of ABPA/ABPM as pulmonary tuberculosis has been widely reported, particularly in high TB-burden settings.^15,30 This is largely driven by substantial overlap in clinical presentation and radiological features, often leading to inappropriate anti-tuberculosis treatment and delayed recognition of the underlying disease.^17,21,24 Given this diagnostic complexity, several studies have attempted to identify factors associated with such misdiagnosis. However, most of these investigations have been limited to univariate comparisons and lack systematic analysis. A study published in 2022 reported that there were no statistically significant differences in pulmonary function parameters, age, or total IgE levels between misdiagnosed and correctly diagnosed patients. However, patients who were not misdiagnosed were more likely to exhibit eosinophilia, bronchiectasis, and mucus plugs on CT imaging.¹⁵ In addition, Nousheen Iqbal et al reported that the only symptomatic difference between the two groups was hemoptysis, while chest X-ray findings lacked specificity, with pulmonary nodules being the most common radiological feature.¹⁷ Nonetheless, these studies largely relied on traditional group comparisons, which are unable to capture nonlinear relationships or interactions among multiple variables, and cannot quantitatively assess the relative contribution of each variable to misdiagnosis risk.

This study systematically analyzed the causes underlying the misdiagnosis of ABPA/ABPM as pulmonary tuberculosis, dissecting the key drivers of misdiagnosis from both predictive and causal perspectives. The results demonstrated that immunological markers play a central and decisive role in preventing misdiagnosis. Both the SHAP interpretability model and the Double Machine Learning causal analysis consistently showed that total IgE and sIgE exert significant protective effects. After adjustment for potential confounders, including age, sex, and other clinical variables, the protective association remained stable, suggesting that IgE-related indicators are not simply correlated with correct diagnosis but may have a direct causal effect on clinical decision-making processes. Importantly, the contribution of this study lies not in identifying new biomarkers, but in providing a more comprehensive analytical approach. Patients with relatively low or borderline total IgE and sIgE levels tended to be misdiagnosed, especially during early or atypical disease stages when diagnostic thresholds were not satisfied.

In contrast, radiological features traditionally regarded as characteristic of ABPA, such as high-attenuation mucus (HAM) and pulmonary infiltrates, demonstrated limited independent causal value. Although these imaging findings are commonly considered hallmark features of ABPA, our results suggest that, in the absence of immunological confirmation, reliance on radiological manifestations alone is insufficient to reliably distinguish ABPA/ABPM from infectious diseases. This finding highlights the need to establish an “immunology-first” diagnostic hierarchy in clinical practice.

Notably, although our model highlighted the importance of IgE-related indicators, their performance in practice is constrained by predefined thresholds. Our findings suggest that IgE-related risk may follow a more continuous pattern rather than a strict cutoff: misdiagnosed patients often had low or borderline IgE levels that did not meet diagnostic cutoffs, particularly in early or atypical stages, which may indicate limitations of strict threshold-based approaches in specific clinical contexts. While lowering thresholds may reduce missed diagnoses, it inevitably increases the risk of overdiagnosis. Therefore, a more flexible approach that integrates immunological, clinical, and radiological features may be more appropriate. The model may function as a supportive tool for identifying patients at increased risk of misdiagnosis, particularly among those initially suspected of tuberculosis, thereby facilitating further diagnostic evaluation. Nevertheless, its performance is influenced by the characteristics of the underlying dataset, which was relatively limited in size and may restrict generalizability. Validation in larger and more heterogeneous cohorts is therefore required to confirm these findings and to establish clinically applicable thresholds.

The present study did not predefine a fixed probability threshold for classification. Instead, the model was evaluated across a range of decision thresholds to assess its robustness and generalizability under different clinical scenarios. While this approach enhances flexibility, the absence of a clearly defined clinically optimized cutoff may limit its immediate applicability in routine practice. Future research should focus on identifying optimal decision thresholds that balance clinical interpretability and practical utility.
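The threshold-sweep evaluation described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names, the toy labels and probabilities, and the choice of beta = 2 for the F-beta score are all assumptions made for the example.

```python
from typing import Dict, List, Sequence


def confusion_counts(y_true: Sequence[int], y_prob: Sequence[float],
                     threshold: float) -> Dict[str, int]:
    """Count TP/FP/TN/FN when a case is flagged if its probability >= threshold.

    Labels are assumed to be 0/1, with 1 = misdiagnosed (hypothetical coding).
    """
    tp = fp = tn = fn = 0
    for label, prob in zip(y_true, y_prob):
        flagged = prob >= threshold
        if flagged and label == 1:
            tp += 1
        elif flagged and label == 0:
            fp += 1
        elif not flagged and label == 0:
            tn += 1
        else:
            fn += 1
    return {"tp": tp, "fp": fp, "tn": tn, "fn": fn}


def f_beta(tp: int, fp: int, fn: int, beta: float = 2.0) -> float:
    """F-beta score; beta > 1 weights sensitivity more than precision."""
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


def sweep_thresholds(y_true: Sequence[int], y_prob: Sequence[float],
                     thresholds: Sequence[float],
                     beta: float = 2.0) -> List[Dict[str, float]]:
    """Evaluate sensitivity, specificity, and F-beta at each threshold."""
    rows = []
    for t in thresholds:
        c = confusion_counts(y_true, y_prob, t)
        positives = c["tp"] + c["fn"]
        negatives = c["tn"] + c["fp"]
        rows.append({
            "threshold": t,
            "sensitivity": c["tp"] / positives if positives else 0.0,
            "specificity": c["tn"] / negatives if negatives else 0.0,
            "f_beta": f_beta(c["tp"], c["fp"], c["fn"], beta),
        })
    return rows


if __name__ == "__main__":
    # Toy data for illustration only; not taken from the study cohort.
    labels = [1, 1, 0, 0, 1, 0]
    probs = [0.9, 0.4, 0.6, 0.1, 0.7, 0.3]
    for row in sweep_thresholds(labels, probs, [0.2, 0.35, 0.5, 0.65]):
        print(row)
```

Tabulating the metrics this way makes the trade-off explicit: lowering the threshold flags more patients for further immunological work-up at the cost of more false alarms, which is the balance a clinically optimized cutoff would need to settle.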

The proposed model is designed as a complementary decision-support tool rather than a replacement for existing diagnostic criteria. It may assist in the early identification of patients at elevated risk of misdiagnosis, particularly those initially suspected of tuberculosis, thereby supporting the prioritization of further immunological investigations and potentially reducing unnecessary anti-tuberculosis treatment.

In addition, the composition of the study population represents an important limitation. The study cohort consisted exclusively of patients with confirmed ABPA/ABPM, while patients with active tuberculosis without concomitant ABPA/ABPM were not included as a control group. As a result, the model was not designed to perform a direct differential diagnosis between ABPA/ABPM and tuberculosis. Previous studies have described several clinical contexts in which ABPA/ABPM and tuberculosis may coexist or overlap, including ABPA occurring in patients with a history of tuberculosis, concurrent ABPA with active tuberculosis, Aspergillus sensitization in individuals with current or past tuberculosis, and situations where ABPA is initially misdiagnosed as tuberculosis.³⁰ The lack of such heterogeneous clinical presentations in our dataset may restrict the generalizability of our findings to real-world diagnostic practice. Future studies should explore optimal threshold strategies and compare the performance of different diagnostic criteria in larger, more diverse cohorts, particularly those including patients with confirmed tuberculosis and overlapping conditions, to further validate and extend the applicability of our findings.

Code Availability

The code used for the analysis in this study is available upon reasonable request from the corresponding author.

Abbreviations

ABPA, Allergic bronchopulmonary aspergillosis; ABPM, Allergic bronchopulmonary mycosis; IgE, Immunoglobulin E; IgG, Immunoglobulin G; F.sIgE, Aspergillus/filamentous fungi–specific IgE; F.sIgG, Aspergillus/filamentous fungi–specific IgG; Total serum IgE, Total immunoglobulin E in serum; EOS.count, Peripheral eosinophil count; BALF, Bronchoalveolar lavage fluid; NGS, Next-generation sequencing; Bx, Bronchoscopy; CT, Computed tomography; HAM, High-attenuation mucus plug; HPI, History of pulmonary infiltrates; ML, Machine learning; SHAP, SHapley Additive exPlanations; AUC, Area under the receiver operating characteristic curve; F-beta, F-beta score; Glmnet, Regularized logistic regression; NB, Naive Bayes; K-KNN, K-nearest neighbors; RPART, Recursive partitioning and regression trees (decision tree); Ranger, Random forest; XGBoost, Extreme gradient boosting; DCA, Decision curve analysis.

Data Sharing Statement

Written informed consent for public sharing of participant data was not obtained in this study. Therefore, in view of the sensitive nature of the data, the supporting datasets are not publicly available.

Ethics Approval and Consent to Participate

Ethical approval for this retrospective study was obtained from the Institutional Review Board of Xiangya Hospital of Central South University (No. 2023121128). The study was carried out in accordance with the Declaration of Helsinki. Given its retrospective and observational design, the requirement for written informed consent was waived by the ethics review committee.

Author Contributions

All authors made a significant contribution to the work reported, whether that is in the conception, study design, execution, acquisition of data, analysis and interpretation, or in all these areas; took part in drafting, revising or critically reviewing the article; gave final approval of the version to be published; have agreed on the journal to which the article has been submitted; and agree to be accountable for all aspects of the work.

Funding

This study was funded by the National Natural Science Foundation of China (Grant No. 82270033) and the Project Program of the National Clinical Research Center for Geriatric Disorders (Xiangya Hospital, Grant No. 2023LNJJ06).

Disclosure

The authors declare no competing interests in this work.

References

1. Agarwal R, Muthu V, Sehgal IS, Dhooria S, Prasad KT, Aggarwal AN. Allergic bronchopulmonary aspergillosis. Clin Chest Med. 2022;43(1):99–125. doi:10.1016/j.ccm.2021.12.002

2. Kosmidis C, Denning DW. The clinical spectrum of pulmonary aspergillosis. Thorax. 2015;70(3):270–277. doi:10.1136/thoraxjnl-2014-206291

3. Agarwal R, Denning DW, Chakrabarti A, Kirk M. Estimation of the burden of chronic and allergic pulmonary aspergillosis in India. PLoS One. 2014;9(12):e114745. doi:10.1371/journal.pone.0114745

4. Maturu VN, Agarwal R. Prevalence of Aspergillus sensitization and allergic bronchopulmonary aspergillosis in cystic fibrosis: systematic review and meta-analysis. Clin Exp Allergy. 2015;45(12):1765–1778. doi:10.1111/cea.12595

5. Oguma T, Taniguchi M, Shimoda T, et al. Allergic bronchopulmonary aspergillosis in Japan: a nationwide survey. Allergol Int. 2018;67(1):79–84. doi:10.1016/j.alit.2017.04.011

6. Muthu V, Sehgal IS, Prasad KT, et al. Allergic bronchopulmonary aspergillosis (ABPA) sans asthma: a distinct subset of ABPA with a lesser risk of exacerbation. Med Mycol. 2020;58(2):260–263. doi:10.1093/mmy/myz051

7. Muthu V, Prasad KT, Sehgal IS, Dhooria S, Aggarwal AN, Agarwal R. Obstructive lung diseases and allergic bronchopulmonary aspergillosis. Curr Opin Pulm Med. 2021;27(2):105–112. doi:10.1097/MCP.0000000000000755

8. Sehgal IS, Dhooria S, Prasad KT, et al. Sensitization to A fumigatus in subjects with non-cystic fibrosis bronchiectasis. Mycoses. 2021;64(4):412–419. doi:10.1111/myc.13229

9. Agarwal R, Chakrabarti A, Shah A, et al. Allergic bronchopulmonary aspergillosis: review of literature and proposal of new diagnostic and classification criteria. Clin Exp Allergy. 2013;43(8):850–873. doi:10.1111/cea.12141

10. Agarwal R, Sehgal IS, Muthu V, et al. Revised ISHAM-ABPA working group clinical practice guidelines for diagnosing, classifying and treating allergic bronchopulmonary aspergillosis/mycoses. Eur Respir J. 2024;63(4):2400061. doi:10.1183/13993003.00061-2024

11. Agarwal R, Aggarwal AN, Sehgal IS, Dhooria S, Behera D, Chakrabarti A. Utility of IgE (total and Aspergillus fumigatus specific) in monitoring for response and exacerbations in allergic bronchopulmonary aspergillosis. Mycoses. 2016;59(1):1–6. doi:10.1111/myc.12423

12. Asano K, Tomomatsu K, Okada N, Tanaka J, Oguma T. Treatment of allergic bronchopulmonary aspergillosis with biologics. Chinese Med J Pulmonary Critical Care Med. 2025;3(1):6–11. doi:10.1016/j.pccm.2024.11.005

13. Adhvaryu MR, Reddy N, Vakharia BC. Prevention of hepatotoxicity due to anti tuberculosis treatment: a novel integrative approach. World J Gastroenterol. 2008;14(30):4753–4762. doi:10.3748/wjg.14.4753

14. Agarwal R, Sehgal IS, Muthu V, Dhar R, Armstrong‐James D. Allergic bronchopulmonary aspergillosis in India. Clin Exp Allergy. 2023;53(7):751–764. doi:10.1111/cea.14319

15. Zeng Y, Xue X, Cai H, et al. Clinical characteristics and prognosis of allergic bronchopulmonary aspergillosis: a retrospective cohort study. J Asthma Allergy. 2022;15:53–62. doi:10.2147/JAA.S345427

16. Ekeng BE, Edem K, Akintan P, Oladele RO. Histoplasmosis in African children: clinical features, diagnosis and treatment. Therapeutic Adv Infect Dis. 2022;9:20499361211068592. doi:10.1177/20499361211068592

17. Iqbal N, Amir Sheikh MD, Jabeen K, Awan S, Irfan M. Allergic bronchopulmonary aspergillosis misdiagnosed as smear negative pulmonary tuberculosis; a retrospective study from Pakistan. Ann Med Surg. 2021;72:103045. doi:10.1016/j.amsu.2021.103045

18. Le Thuong V, Nguyen Ho L, Tran Van N. Allergic bronchopulmonary aspergillosis masquerading as recurrent bacterial pneumonia. Med Mycol Case Rep. 2016;12:11–13. doi:10.1016/j.mmcr.2016.06.004

19. Lv H, Wang L, Zhang X, et al. Further analysis of tuberculosis in eight high-burden countries based on the global burden of disease study 2021 data. Infect Dis Poverty. 2024;13(1):70. doi:10.1186/s40249-024-01247-8

20. Jiang N, Xiang L. Allergic bronchopulmonary aspergillosis misdiagnosed as recurrent pneumonia. Asia Pac Allergy. 2020;10(3):e27. doi:10.5415/apallergy.2020.10.e27

21. Agarwal R, Gupta D, Aggarwal AN, Behera D, Jindal SK. Allergic bronchopulmonary aspergillosis: lessons from 126 patients attending a chest clinic in north India. Chest. 2006;130(2):442–448. doi:10.1378/chest.130.2.442

22. Zhang C, Jiang Z, Shao C. Clinical characteristics of allergic bronchopulmonary aspergillosis. Clin Respir J. 2020;14(5):440–446. doi:10.1111/crj.13147

23. Zhang M, Gao J. Clinical analysis of 77 patients with allergic bronchopulmonary aspergillosis in Peking Union Medical College Hospital. Zhongguo Yi Xue Ke Xue Yuan Xue Bao. 2017;39(3):352–357. doi:10.3881/j.issn.1000-503X.2017.03.009

24. Zou M-F, Yang Y, Liu L, Sun E-H, Dong L. Clinical characteristics of fifty patients with allergic bronchopulmonary aspergillosis. Chin Med J. 2018;131(9):1108–1109. doi:10.4103/0366-6999.230734

25. Pant A, Das B, Arimbasseri GA. Host microbiome in tuberculosis: disease, treatment, and immunity perspectives. Front Microbiol. 2023;14:1236348. doi:10.3389/fmicb.2023.1236348

26. Ning C, Ouyang H, Xiao J, et al. Development and validation of an explainable machine learning model for mortality prediction among patients with infected pancreatic necrosis. EClinicalMedicine. 2025;80:103074. doi:10.1016/j.eclinm.2025.103074

27. Wendling T, Jung K, Callahan A, Schuler A, Shah NH, Gallego B. Comparing methods for estimation of heterogeneous treatment effects using observational data from health care databases. Stat Med. 2018;37(23):3309–3324. doi:10.1002/sim.7820

28. Chernozhukov V, Chetverikov D, Demirer M, et al. Double/debiased machine learning for treatment and structural parameters. Econometrics J. 2018;21(1):C1–C68. doi:10.1111/ectj.12097

29. Cai R, Ge H, Liu B, et al. Proposal and verification of new revised criteria for ABPA/ABPM diagnosis. J Asthma Allergy. 2025;18:467–477. doi:10.2147/JAA.S514664

30. Patil S, Patil R. “Fleeting pulmonary infiltrates in allergic bronchopulmonary aspergillosis” Misdiagnosed as tuberculosis. Int J Mycobacteriol. 2018;7(2):186–190. doi:10.4103/ijmy.ijmy_57_18

Creative Commons License © 2026 The Author(s). This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at https://www.dovepress.com/terms and incorporate the Creative Commons Attribution - Non Commercial (unported, 4.0) License. By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms.