Drug Design, Development and Therapy, Volume 20

An Interpretable PK-Informed Hybrid Model for Voriconazole Exposure Prediction: Roles of CYP2C19 Genotype and Inflammation

Authors Zhou Y, Yun Y, Chen S, Liu X, Liu H, Ren J, Lu M, Ling J, Yang X, Zhou Z, Osei JA, Soheili R, Zou J, Hu N

Received 27 February 2026

Accepted for publication 30 April 2026

Published 7 May 2026 Volume 2026:20 489712

DOI https://doi.org/10.2147/DDDT.S489712

Checked for plagiarism Yes

Review by Single anonymous peer review

Peer reviewer comments 2

Editor who approved publication: Professor Anastasios Lymperopoulos



Yehui Zhou,1,2,* Yuting Yun,1,2,* Shiting Chen,1,2,* Xinyi Liu,2,3,* Hao Liu,4 Jiaxin Ren,1,2 Mengmeng Lu,1,2 Jing Ling,5 Xuping Yang,5 Zhou Zhou,2 James Afriyie Osei,1,2 Respina Soheili,1,2 Jianjun Zou,2 Nan Hu5

1School of Basic Medicine and Clinical Pharmacy, China Pharmaceutical University, Nanjing, People’s Republic of China; 2Department of Pharmacy, Nanjing First Hospital, Nanjing Medical University, Nanjing, People’s Republic of China; 3Department of Clinical Pharmacology, Nanjing First Hospital, Nanjing Medical University, Nanjing, People’s Republic of China; 4State Key Laboratory of Natural Medicines, Key Laboratory of Drug Metabolism, China Pharmaceutical University, Nanjing, People’s Republic of China; 5Department of Pharmacy, The Third Affiliated Hospital of Soochow University, Changzhou, People’s Republic of China

*These authors contributed equally to this work

Correspondence: Jianjun Zou, Department of Pharmacy, Nanjing First Hospital, Nanjing Medical University, No. 68 Changle Road, Nanjing, 210006, People’s Republic of China, Email [email protected] Nan Hu, Department of Pharmacy, The Third Affiliated Hospital of Soochow University, Changzhou, 213000, People’s Republic of China, Email [email protected]

Purpose: Voriconazole (VCZ) exhibits nonlinear pharmacokinetics, a narrow therapeutic window, and substantial interindividual variability. Inaccurate dosing may lead to underexposure or overexposure, causing treatment failure or toxicity. Existing population pharmacokinetic (PPK)-machine learning (ML) models either lack mechanistic interpretability or inadequately characterize VCZ exposure. Therefore, we propose a hybrid model embedding ML within a PPK framework to associate clinical covariates with VCZ exposure.
Patients and Methods: A total of 489 inpatients receiving VCZ at the Third Affiliated Hospital of Soochow University between March 2020 and May 2024 were included. We identified candidate predictors of CL/F using dual-feature selection with Boruta and LASSO. Overlapping features were used to train four ML algorithms to estimate CL/F. The predicted CL/F values were incorporated into a steady-state PPK equation to back-calculate VCZ concentrations, followed by quadratic calibration to reduce bias. Causal mediation analysis assessed pathways from key covariates to VCZ concentration via CL/F, and SHapley Additive exPlanations (SHAP) values were used to quantify feature contributions.
Results: Under the PK-informed hybrid strategy, XGBoost achieved the best concentration prediction (R2 = 0.739, MAE = 0.357, RMSE = 0.526, MAPE = 7.78%), outperforming a direct ML approach treating PK-related variables as inputs (CatBoost: R2 = 0.459). The temporal external validation performance of the hybrid model remained stable (R2 = 0.661, MAE = 0.473, RMSE = 0.651, MAPE = 14.71%). Mediation analysis demonstrated that CRP affected VCZ exposure primarily through CL/F, whereas albumin and age acted as modifiers. A web-based calculator was developed for real-time individualized prediction and assistance with clinical-dose adjustment.
Conclusion: The hybrid model improved VCZ concentration prediction versus direct ML modeling while preserving CL/F-centered mechanistic interpretability. It may help guide dose adjustment and reduce clinically relevant misdosing. This framework may be generalizable to other narrow-therapeutic-window drugs.

Keywords: voriconazole, population pharmacokinetics, machine learning, hybrid modeling strategy, causal mediation analysis, plasma concentration prediction

Introduction

Voriconazole (VCZ) is a widely used triazole antifungal agent for treating invasive fungal infections. Its therapeutic efficacy hinges on achieving appropriate drug exposure within a narrow therapeutic window.1–4 However, VCZ is associated with exposure-related hepatotoxicity and neurotoxicity, and exhibits nonlinear pharmacokinetics driven by saturable metabolism, which contributes to substantial interindividual variability in exposure. This variability complicates prediction in routine clinical practice, necessitating individualized dosing and therapeutic drug monitoring (TDM).5,6 Factors such as CYP2C19 genotype, age, weight, hepatic impairment, and drug interactions have been identified as contributors to the variability in VCZ exposure.7 Existing studies have mainly focused on a single or a limited number of covariates.8–10 However, in real-world populations, patients often present with complex pathophysiological conditions, where multiple interrelated clinical characteristics coexist. Therefore, a comprehensive framework is required to identify key influencing factors and elucidate their pathways in determining VCZ exposure, thereby providing a more clinically meaningful basis for individualized dosing, TDM-guided dose adjustment, and optimization of therapy in routine practice.

Methodologically, traditional population pharmacokinetic (PPK) models and machine learning (ML) approaches each offer distinct advantages: PPK models are based on a well-defined pharmacokinetic structure, ensuring strong biological interpretability,11,12 while ML methods provide greater flexibility in capturing complex, nonlinear relationships.13–16 Recent research has attempted to integrate PK parameters into ML models for predicting VCZ concentrations.17–19 For example, Liu et al employed an ensemble model of three ML algorithms to forecast VCZ concentrations in elderly patients.5 However, these models primarily treat pharmacokinetic parameters as input features, with ML dominating the modeling process. Consequently, pharmacokinetic structure serves more as a source of features than as a structural constraint, limiting the mechanistic interpretability of the model and reducing its utility for clinical dosing decisions. In other words, although these approaches may improve predictive performance, they do not fully preserve the mechanistic relationship between clinical covariates, pharmacokinetic behavior, and final drug exposure, which is essential for individualized dosing in real-world settings.

Building on this foundation, we developed a hybrid PPK–ML framework that embeds ML within a PPK-guided structure rather than treating PK-related variables merely as model inputs. By preserving the mechanistic linkage between clinical covariates, CL/F, and VCZ exposure, this framework may provide a more interpretable and clinically relevant approach to exposure prediction. Accordingly, the objective of this study was to develop and validate this framework in real-world patients receiving VCZ therapy, thereby better supporting individualized dosing decisions.

Materials and Methods

Study Population

This retrospective study included adult inpatients (≥18 years) receiving VCZ for confirmed, probable, or suspected invasive fungal infections at the Third Affiliated Hospital of Soochow University. Data from March 2020 to July 2023 constituted the internal dataset; data from August 2023 to May 2024 served as the temporal external validation dataset. Patients receiving concomitant antifungal agents or drugs with known significant VCZ interactions were excluded. Full inclusion and exclusion criteria were detailed in Supplementary S1. The study was approved by the institutional Ethics Committee (No. 2023–038) and conducted in accordance with the Declaration of Helsinki. The requirement for informed consent was waived given the retrospective design without additional biological sample collection.

Data Collection and VCZ Concentration Measurement

Demographic, genetic, laboratory, concomitant medication, and VCZ dosing data were extracted from electronic medical records (EMR). Given that the elimination half-life of VCZ is approximately 6–12 hours, steady state is typically achieved after 4–5 half-lives. Therefore, for patients receiving a loading dose, who can reach target exposure levels more rapidly, the first plasma sample was collected on day 3 of treatment. For patients not receiving a loading dose, plasma samples were collected after completing 5 days of treatment. All samples were obtained within 30 minutes before the next dose. VCZ plasma concentrations were measured using a validated high-performance liquid chromatography coupled with tandem mass spectrometry (HPLC–MS/MS) method (linear range: 0.1–20 mg/L; intra- and inter-day precision: 1.92% and 4.60%, respectively).9

Genotyping and Phenotype Assignment

CYP2C19 genotyping was performed using a fluorescence-based commercial assay (Xi’an Tianlong Technology, China). Phenotypes were classified per Clinical Pharmacogenetics Implementation Consortium (CPIC) guidelines into ultrarapid (UM), rapid (RM), normal (NM), intermediate (IM), and poor metabolizers (PM). The genotype-to-phenotype mapping was provided in Supplementary S1.

Data Processing and Feature Selection

The internal dataset was randomly split into a training cohort and a testing cohort at a ratio of 8:2. Variables with a missing rate exceeding 20% were excluded; for the remaining variables, missing values were imputed using K-nearest neighbors (KNN) fitted on the training cohort and applied to the testing cohort, followed by 1–99% Winsorization. Variables with potential information leakage or high multicollinearity were excluded. Categorical variables were one-hot encoded. The same preprocessing pipeline was applied to the internal and external datasets. Details were provided in Supplementary S2.
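The Winsorization step can be illustrated with a minimal sketch. It assumes that the 1st and 99th percentile limits are estimated on the training cohort only and then reused for the testing and external data, mirroring how the KNN imputer is described; the source does not state where the limits were estimated, and the function names below are hypothetical.

```python
def percentile(sorted_vals, q):
    """Linear-interpolation percentile (q in [0, 100]) of a pre-sorted list."""
    if not sorted_vals:
        raise ValueError("empty input")
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def fit_winsor_limits(train_values, lower_q=1.0, upper_q=99.0):
    """Estimate clipping limits from the training cohort only (assumption)."""
    ordered = sorted(train_values)
    return percentile(ordered, lower_q), percentile(ordered, upper_q)


def winsorize(values, limits):
    """Clip values to the previously fitted limits."""
    low, high = limits
    return [min(max(v, low), high) for v in values]


# Hypothetical laboratory values: limits learned on "training" data,
# then applied unchanged to new observations.
limits = fit_winsor_limits(list(range(0, 101)))
clipped = winsorize([-5, 50, 250], limits)
```

Estimating the limits on the training cohort alone keeps information from the testing and external cohorts out of the preprocessing step.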

Feature selection combined Boruta and LASSO regression, both performed in the training cohort. The intersection of the two selected sets was adopted as the core feature set. Multicollinearity was assessed using variance inflation factors, with VIF < 5 considered acceptable. This dual approach ensured that selected features captured nonlinear predictive information while avoiding redundancy. The individual CL/F values used for model development were derived from a validated one-compartment PPK model with first-order (linear) elimination.9

Hybrid PPK–ML Framework Construction

The modeling strategy comprised three steps: (i) ML-based prediction of individual apparent clearance (CL/F) to characterize metabolic capacity, (ii) PK equation-based back-calculation of steady-state VCZ concentrations, and (iii) calibration of the back-calculated concentrations by evaluating polynomial degree and trough-sampling window against clinically measured trough concentrations. Within this framework, the PK model served as a structural constraint, whereas ML was used to estimate individual CL/F from selected clinical covariates:

$$\widehat{CL/F}_i = f_{\mathrm{ML}}(X_i)$$

where $\widehat{CL/F}_i$ denotes the ML-predicted CL/F for patient $i$, and $X_i$ denotes the corresponding features. This ML component was used to capture potentially nonlinear and heterogeneous relationships between clinical covariates and individual CL/F.

Specifically, we first constructed an ML-based CL/F prediction model. Four common ML algorithms were employed for modeling: categorical boosting (CatBoost), random forest (RF), light gradient boosting machine (LightGBM), and eXtreme Gradient Boosting (XGBoost). Each algorithm underwent parameter tuning via 5-fold cross-validation. Model performance was evaluated using the coefficient of determination (R2), mean absolute error (MAE), root mean square error (RMSE), and mean absolute percentage error (MAPE). The final hybrid PPK-ML model was selected based on both the robustness of CL/F prediction and plasma concentration prediction performance in the testing cohort.
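For reference, the four evaluation metrics can be computed as in the following minimal sketch, which uses their standard definitions. The function name is hypothetical, and MAPE is expressed as a percentage of the observed value, which is an assumption about the convention used.

```python
import math


def regression_metrics(observed, predicted):
    """Return R2, MAE, RMSE and MAPE (%) using the standard definitions."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # MAPE is undefined when an observed value is zero.
    mape = 100.0 * sum(abs(e) / abs(o) for e, o in zip(errors, observed)) / n
    mean_obs = sum(observed) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot
    return {"R2": r2, "MAE": mae, "RMSE": rmse, "MAPE": mape}


# Hypothetical observed and predicted values, for illustration only.
metrics = regression_metrics([2.0, 4.0], [3.0, 3.0])
```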

The daily dose served as the known-dosing input in the pharmacokinetic formula, enabling subsequent concentration calculation and dose optimization scenario simulations. The predicted CL/F was then substituted into the pharmacokinetic equation to calculate the theoretical steady-state average concentration ($C_{ss,avg}$):

$$C_{ss,avg} = \frac{Dose_{daily}}{\widehat{CL/F} \times 24}$$

where $Dose_{daily}$ is the daily dose (mg), $\widehat{CL/F}$ is the ML-predicted apparent clearance (L/h), and 24 is the number of hours per day.

It should be noted that $C_{ss,avg}$ reflects the average exposure level within the dosing interval, whereas clinical monitoring employs trough concentration as the instantaneous minimum concentration prior to dosing. Additionally, blood sample collection timing in clinical practice exhibited a certain degree of deviation. Therefore, to map $C_{ss,avg}$ to clinically measured trough concentrations, we compared multiple candidate calibration models and selected the final calibration model based on comparative post-calibration performance. The calibration step was generally expressed as:

$$C_{trough,pred} = g(C_{ss,avg})$$

where $g(\cdot)$ denotes a candidate calibration function mapping the theoretical steady-state average concentration to the measured trough concentration. For the quadratic candidate model, $g$ was specified as:

$$g(C_{ss,avg}) = \beta_0 + \beta_1 C_{ss,avg} + \beta_2 C_{ss,avg}^2$$

The selected calibration model was then used for subsequent trough concentration prediction analyses. Although VCZ exhibits nonlinear pharmacokinetics, the steady-state equation serves as a local linear approximation within the observed therapeutic concentration range. Based on prior multiple-dose pharmacokinetic studies,20 VCZ was reported to reach relative steady state approximately 5–8 days after administration. Accordingly, to determine the optimal calibration time window, trough concentrations collected from days 5–8 after treatment initiation and thereafter were stratified for calibration assessment, and the window demonstrating the best calibration performance was selected for subsequent analyses.
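The back-calculation and calibration steps can be sketched as follows. The sketch assumes the textbook steady-state relationship, in which the average concentration equals the daily dose divided by CL/F multiplied by 24 h, and a quadratic calibration function. The function names and the calibration coefficients are hypothetical placeholders, not the fitted values from this study.

```python
def css_avg_mg_per_l(daily_dose_mg, cl_f_l_per_h):
    """Theoretical steady-state average concentration (mg/L).

    Assumes the standard relationship: daily dose / (CL/F * 24 h).
    """
    if cl_f_l_per_h <= 0:
        raise ValueError("CL/F must be positive")
    return daily_dose_mg / (cl_f_l_per_h * 24.0)


def quadratic_calibration(css_avg, b0, b1, b2):
    """Map the steady-state average concentration to a predicted trough."""
    return b0 + b1 * css_avg + b2 * css_avg ** 2


# Worked example: 400 mg/day with a predicted CL/F of 4.0 L/h gives
# 400 / (4.0 * 24) = 4.17 mg/L before calibration.
css = css_avg_mg_per_l(400.0, 4.0)
# Hypothetical coefficients, for illustration only.
trough = quadratic_calibration(css, b0=0.5, b1=0.8, b2=0.01)
```

Because CL/F enters the equation in the denominator, a given relative error in predicted CL/F translates into a comparable relative error in the back-calculated concentration, which is one reason a calibration step against measured troughs is useful.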

Model Interpretability Analysis

To interpret the CL/F prediction model, SHapley Additive exPlanations (SHAP) values were computed for each feature. Global feature importance was summarized by the mean absolute SHAP value, and SHAP swarm plots were used to visualize feature effects. Additionally, a directed acyclic graph (DAG) was constructed to describe potential causal relationships among variables. Confounders, mediators, and the minimal sufficient adjustment set were defined based on the DAG. Causal inference analysis was used to supplement SHAP results and characterize the direction of effects.
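Global importance as the mean absolute SHAP value can be illustrated with a minimal sketch. It assumes that per-patient SHAP values have already been computed by an explainer and are available as rows of numbers; the function name and the values shown are hypothetical.

```python
def global_shap_importance(shap_rows, feature_names):
    """Rank features by mean absolute SHAP value across patients."""
    if not shap_rows:
        raise ValueError("no SHAP rows supplied")
    totals = [0.0] * len(feature_names)
    for row in shap_rows:
        if len(row) != len(feature_names):
            raise ValueError("row length does not match feature count")
        for j, value in enumerate(row):
            totals[j] += abs(value)
    means = {name: t / len(shap_rows) for name, t in zip(feature_names, totals)}
    # Largest mean |SHAP| first.
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


# Two hypothetical patients, two features.
ranking = global_shap_importance(
    [[0.5, -0.1], [-0.3, 0.1]],
    ["age", "TBIL"],
)
```

Taking the absolute value before averaging prevents positive and negative contributions from cancelling, so the ranking reflects magnitude only; the direction of each effect is read from the swarm plot.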

Statistical Analysis

All statistical tests were two-sided, with p < 0.05 considered statistically significant. Statistical analyses were performed using R (version 4.2.2) and Python (version 3.9.0). Normality of continuous variables was assessed using the Shapiro–Wilk test. Based on normality test results, continuous variables were expressed as mean ± standard deviation (SD) or median (interquartile range, IQR); categorical variables were expressed as counts (percentages). Comparisons of continuous variables between groups were performed using independent samples t-tests or Mann–Whitney U-tests, while comparisons of categorical variables were performed using χ²-tests or Fisher’s exact tests.

Results

Participants and Baseline Characteristics

Patients were randomly assigned to the training cohort (n = 310) and the testing cohort (n = 77) (Table 1), and all trough concentration samples from a given patient followed the patient-level assignment. Given that patients with invasive fungal infections often present with unstable conditions and fluctuating gastrointestinal absorption capacity, intravenous infusion was uniformly administered to all hospitalized patients in this study to ensure reliable and predictable plasma drug exposure levels during the initial treatment phase. Among them, 135 patients received intravenous loading doses of 600 mg or 400 mg twice daily, followed by maintenance doses of 300 mg or 200 mg twice daily; 252 patients received intravenous doses of 200 mg twice daily without a loading dose. The infusion rate was less than 3 mg/kg per hour. During subsequent treatment, physicians adjusted the dose based on VCZ concentrations. VCZ trough concentrations exhibited considerable variability, with a median of 4.10 mg/L and a range of 0.30–12.90 mg/L; 72.7% (397/546) of trough concentrations fell within the 0.5–5.0 mg/L range. Among study subjects, the CYP2C19 phenotypes were NM (37.4%), IM (48.5%), and PM (14.1%); no UM or RM phenotypes were included. The workflow for data processing, algorithm selection, and modeling was depicted in Figure 1.
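A patient-level split of this kind can be sketched as below. The sketch randomizes patients rather than samples, so that repeated trough measurements from one patient never appear in both cohorts. The function name, the seed, and the identifiers are hypothetical; the source does not describe the randomization procedure in more detail.

```python
import random


def split_by_patient(sample_patient_ids, train_fraction=0.8, seed=0):
    """Return (train_idx, test_idx) so each patient's samples stay together."""
    patients = sorted(set(sample_patient_ids))
    rng = random.Random(seed)
    rng.shuffle(patients)
    n_train = round(len(patients) * train_fraction)
    train_patients = set(patients[:n_train])
    train_idx = [i for i, pid in enumerate(sample_patient_ids)
                 if pid in train_patients]
    test_idx = [i for i, pid in enumerate(sample_patient_ids)
                if pid not in train_patients]
    return train_idx, test_idx


# Hypothetical sample-level patient identifiers (some patients have
# more than one trough sample).
ids = ["p1", "p1", "p2", "p3", "p3", "p4", "p5"]
train_idx, test_idx = split_by_patient(ids, train_fraction=0.8, seed=42)
```

With 387 patients and a training fraction of 0.8, rounding gives 310 training and 77 testing patients, which agrees with the cohort sizes reported above.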

Table 1 Baseline Characteristics of the Training and Testing Cohorts

Figure 1 Schematic of the study workflow.

Abbreviations: VCZ, voriconazole; TDM, therapeutic drug monitoring; CL/F, clearance; XGBoost, eXtreme Gradient Boosting; LightGBM, light gradient boosting machine; CatBoost, categorical boosting; RF, random forest; ML, machine learning; KNN, K-nearest neighbors; VIF, variance inflation factors; MAE, mean absolute error; RMSE, root mean square error; MAPE, mean absolute percentage error; PPK, population pharmacokinetic; SHAP, SHapley Additive exPlanations; ALB, albumin; CRP, C-reactive protein; TBIL, total bilirubin; DAG, directed acyclic graph.

Overall Relationships Among Dose, Concentration, and CL/F

Individual CL/F was estimated using a PPK model developed from prior VCZ studies,9 with detailed model parameters provided in Supplementary Table 1. VCZ trough concentrations exhibited a positive correlation with the administered dose (Figure 2A), yet substantial inter-individual variability persisted at equivalent dose levels. CL/F showed high dispersion within each CYP2C19 metabolic phenotype, and the distribution ranges between phenotypes overlapped substantially (Figure 2B). In summary, these findings indicated that genotype alone cannot fully explain the variability in CL/F and VCZ exposure, necessitating the integration of non-genetic clinical covariates for analysis.

Figure 2 Hybrid model performance of VCZ clearance and plasma concentration. (A) Scatter plot of VCZ dose versus plasma concentration. Solid red line indicates regression fit; shaded area denotes 95% confidence interval; dashed lines represent commonly used plasma concentration thresholds. (B) Distribution of CL/F across CYP2C19 genotypes (PM, IM, NM). Inner boxplots indicate median and interquartile range. (C) Full model for predicting VCZ CL/F using XGBoost. (D) Simplified model for predicting VCZ CL/F using XGBoost. (E) Performance of the hybrid model in predicting VCZ plasma concentration. The dashed line represents the ideal prediction line, and the gray shaded area indicates the ±30% prediction error range. (F) SHAP summary of feature effects on CL/F. (G) SHAP feature importance ranking for CL/F prediction.

Abbreviations: VCZ, voriconazole; CL/F, clearance; XGBoost, eXtreme Gradient Boosting; SHAP, SHapley Additive exPlanations; ALB, albumin; TBIL, total bilirubin; CRP, C-reactive protein; NM, normal metabolizer; IM, intermediate metabolizer; PM, poor metabolizer; MAE, mean absolute error; RMSE, root mean square error; MAPE, mean absolute percentage error.

Hybrid PPK-ML Framework Performance

Dual-feature selection (Boruta and LASSO) identified seven core features for PPK-ML model construction: age, C-reactive protein (CRP), albumin (ALB), CYP2C19 genotype, sex, weight, and total bilirubin (TBIL). All variables exhibited VIF values below 5, indicating no significant multicollinearity (see Supplementary Table 2). Using this reduced seven-feature set, XGBoost showed better predictive performance than the 29-variable setting for CL/F prediction (R2 = 0.916 vs. 0.898; RMSE = 0.383 vs. 0.423; Figure 2C and D), and additional algorithm comparisons were shown in Supplementary Figures 1–3.

To determine the final calibration strategy, polynomial calibration models of different degrees were evaluated across trough-sampling windows defined as ≥5, ≥6, ≥7, and ≥8 days after dosing. Quadratic calibration using trough concentrations collected ≥7 days after dosing achieved the best or near-best overall predictive performance across the evaluated combinations and was therefore selected as the final calibration strategy for subsequent analyses (Table 2). Further increasing the polynomial degree or extending the sampling threshold beyond ≥7 days did not provide material performance gains, and some metrics deteriorated in certain settings.
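Fitting one candidate calibration polynomial can be sketched with ordinary least squares. This is an illustrative implementation through the normal equations, not necessarily the fitting routine used in the study; the function names and the data are hypothetical. Comparing degrees and sampling windows would repeat the fit for each combination and compare the resulting error metrics.

```python
def fit_polynomial(x, y, degree):
    """Least-squares coefficients [b0, b1, ..., bd] via the normal equations."""
    m = degree + 1
    if len(x) != len(y) or len(x) < m:
        raise ValueError("need at least degree + 1 paired observations")
    # Normal equations A b = c, with A[j][k] = sum(x^(j+k)), c[j] = sum(y * x^j).
    a = [[sum(xi ** (j + k) for xi in x) for k in range(m)] for j in range(m)]
    c = [sum(yi * xi ** j for xi, yi in zip(x, y)) for j in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("singular system; x values may be degenerate")
        a[col], a[pivot] = a[pivot], a[col]
        c[col], c[pivot] = c[pivot], c[col]
        for r in range(col + 1, m):
            factor = a[r][col] / a[col][col]
            for k in range(col, m):
                a[r][k] -= factor * a[col][k]
            c[r] -= factor * c[col]
    # Back substitution.
    coef = [0.0] * m
    for j in range(m - 1, -1, -1):
        tail = sum(a[j][k] * coef[k] for k in range(j + 1, m))
        coef[j] = (c[j] - tail) / a[j][j]
    return coef


def evaluate_polynomial(coef, x):
    """Evaluate b0 + b1*x + ... + bd*x^d."""
    return sum(b * x ** j for j, b in enumerate(coef))


# Hypothetical pairs: theoretical average concentration vs measured trough.
css_values = [0.0, 1.0, 2.0, 3.0, 4.0]
troughs = [1.0 + 2.0 * v + 0.5 * v * v for v in css_values]
coef = fit_polynomial(css_values, troughs, degree=2)
```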

Table 2 Predictive Performance of PPK–ML Models After Polynomial Calibration Across Different Trough-Sampling Windows in Testing Cohort

The final PPK–ML model was selected by jointly considering CL/F prediction and downstream concentration prediction performance. Although CatBoost showed slightly better overall fit for CL/F, XGBoost yielded lower concentration prediction errors after extrapolation and calibration (MAE = 0.357, RMSE = 0.526, MAPE = 7.78% vs. CatBoost: MAE = 0.391, RMSE = 0.563, MAPE = 8.36%; Table 3). Consequently, PPK–XGBoost was selected as the final model, and its results were shown in Figure 2E. After nonlinear calibration, concentration prediction achieved R2 = 0.739, with most predictions within ±30% relative error, supporting reliable CL/F estimation and concentration extrapolation.

Table 3 Evaluating the Predictive Performance of Different PPK-ML Models for Voriconazole Clearance and Plasma Concentrations in Testing Cohort

To further evaluate the contribution of the hybrid PPK–ML design, three VCZ concentration prediction strategies were compared: a concentration prediction model without PK parameters, a concentration prediction model incorporating PK parameters, and a hybrid model strategy predicting CL/F first and then extrapolating plasma concentrations. The hybrid strategy achieved the most favorable overall performance, indicating that using CL/F as an intermediate PK-linked target improved the consistency of concentration prediction (Table 4).

Table 4 Comparison of Model Predictive Performance Under Different Plasma Concentration Prediction Strategies in Testing Cohort

Feature Importance and Model Interpretability

The SHAP method was used to interpret the CL/F prediction process, quantifying feature importance and the contribution of each feature to prediction outcomes (Figure 2F and G). Scatter plot colors indicated the magnitude of the feature value, with red representing higher values and blue representing lower values. Results showed that age and CRP exhibited the highest importance among all features. ALB showed a positive correlation with the prediction outcome, while age, CRP, weight, and TBIL exhibited negative correlations. The DAG (Supplementary Figure 4) illustrated the influence pathways of demographic factors (age, sex, weight), genetic factors (genotype), treatment-related factors (daily dose), inflammatory markers (CRP), and liver function indicators (ALB, TBIL) on the outcome variable (VCZ plasma concentration) through the pharmacokinetic mediator variable (CL/F).

To enhance the clinical accessibility and usability of the hybrid PPK–ML prediction model, a real-time VCZ plasma concentration predictor was deployed as a cloud-hosted web application (https://voriconazole-mipd-tool.streamlit.app/; Figure 3). This user-friendly tool enables clinicians to obtain predictions by entering relevant patient information.

Figure 3 Web-based real-time calculator for VCZ plasma concentration. Users can predict plasma concentrations by clicking the “Estimate Concentration & Optimize Dose” button in the online tool.

Abbreviations: VCZ, voriconazole; CL/F, clearance; CRP, C-reactive protein.

Impact Effect Analysis of CYP2C19 Genotypes and CRP on CL/F

Given the prominent importance of CRP in the CL/F prediction phase as indicated by SHAP results, we further analyzed the association between CYP2C19 genotypes and CL/F under different inflammatory states. Using CRP as the stratification variable, patients were categorized into low-inflammation and high-inflammation groups based on a CRP threshold of 100 mg/L,21 and CL/F distributions across metabolic phenotypes were compared within each CRP stratum.

Results (Figure 4A) showed significant differences in CL/F across CYP2C19 genotypes in the low-CRP group (n = 382, p = 7.89 × 10⁻¹⁰), with substantial intergroup variability (PM: 2.97 ± 1.27 L/h; IM: 3.52 ± 1.14 L/h; NM: 4.19 ± 1.57 L/h). Under high-CRP conditions (n = 164), although CL/F differences among genotypes remained statistically significant (p = 2.66 × 10⁻⁸), the magnitude of inter-phenotype differences was markedly decreased, and distributional overlap increased substantially (PM: 1.88 ± 0.99 L/h; IM: 2.27 ± 0.62 L/h; NM: 2.98 ± 0.98 L/h). These findings indicated that high-level inflammation attenuated the effect of CYP2C19 genotype on CL/F.

Figure 4 Results of causal mediation analysis. (A) Distribution of VCZ CL/F across CYP2C19 genotypes in different inflammatory states. Boxes denote interquartile ranges, the center line indicates the median, and individual observations are represented as scatter points. (B) Distribution of standardized NIE estimated from 1000 bootstrap resamples. (C) Scatter plots and regression lines of VCZ plasma concentration versus CL/F, stratified by CRP level (Low CRP: ≤100 mg/L; High CRP: >100 mg/L). (D) Relationship between CL/F and CRP, stratified by age group (Age ≤65 years; Age >65 years). (E) Relationship between VCZ plasma concentration and CRP, stratified by ALB level (ALB ≤35 g/L; ALB 35–55 g/L; ALB >55 g/L). (F) Standardized regression coefficients of predictors for CL/F from multivariable linear regression. (G) Standardized regression coefficients of predictors for VCZ plasma concentration from multivariable linear regression.

Abbreviations: VCZ, voriconazole; CL/F, clearance; NIE, natural indirect effect; CRP, C-reactive protein; ALB, albumin; TBIL, total bilirubin.

Counterfactual causal mediation analysis (Table 5) revealed a significant positive total effect (TE > 0) of CRP on VCZ plasma concentration. Decomposition of the TE indicated that CRP predominantly exerted a substantial natural indirect effect (NIE) by reducing CL/F, while the natural direct effect (NDE) was relatively minor. These findings suggest that CL/F plays a dominant mediating role in the influence of CRP on VCZ exposure. This mediating effect was stable in bootstrap analysis (Figure 4B). Figure 4C illustrated the negative correlation between CL/F and plasma concentration, with a steeper decline observed at high-CRP states (CRP > 100 mg/L). These findings indicated that increased inflammation further amplified CL/F reduction, thereby increasing the risk of elevated VCZ concentrations.
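For readers less familiar with the counterfactual notation, the decomposition can be written out using the standard definitions from the mediation literature. The symbols are assumptions introduced here for illustration: $A$ denotes the exposure (CRP, with reference level $a^{*}$ and comparison level $a$), $M$ the mediator (CL/F), and $Y$ the outcome (VCZ concentration). The exact estimands and contrast levels used in Table 5 are not restated in the text.

```latex
\begin{aligned}
\mathrm{NDE} &= E\!\left[Y\big(a, M(a^{*})\big)\right] - E\!\left[Y\big(a^{*}, M(a^{*})\big)\right] \\
\mathrm{NIE} &= E\!\left[Y\big(a, M(a)\big)\right] - E\!\left[Y\big(a, M(a^{*})\big)\right] \\
\mathrm{TE}  &= \mathrm{NDE} + \mathrm{NIE}
              = E\!\left[Y\big(a, M(a)\big)\right] - E\!\left[Y\big(a^{*}, M(a^{*})\big)\right] \\
\text{Proportion mediated} &= \frac{\mathrm{NIE}}{\mathrm{TE}}
\end{aligned}
```

Under these definitions, a large NIE relative to the NDE corresponds to most of the CRP effect on concentration passing through the change in CL/F.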

Table 5 Counterfactual Mediation Analysis of the Effect of CRP on Voriconazole Exposure Mediated by CL/F

Overall, elevated inflammatory burden significantly attenuated the influence of CYP2C19 genotype on CL/F yet did not completely eliminate genotype-related differences. This manifested as an inflammation-related reduction in genotype effects rather than complete phenotypic reversal.

Age and ALB Effects on the Modulation of Inflammation–PK Relationships

Stratified analysis revealed heterogeneity in the association between CRP and PK parameters across different population characteristics (Figure 4D). After stratification by age, the negative correlation between elevated CRP and CL/F was more pronounced in the elderly group (>65 years) than in the younger group (≤65 years), with a steeper trend observed. Figure 4E indicated that the association pattern between CRP and VCZ plasma concentrations differed across ALB strata. In the low-ALB group (≤35 g/L) and moderate-ALB group (35–55 g/L), elevated CRP showed a significant positive correlation with increased plasma concentrations. In contrast, this association was markedly attenuated or even directionally reversed in the high-ALB group (>55 g/L).

Path Decomposition Analysis of the CL/F-Dominated Mediating Mechanism

Path decomposition analysis quantified the relative contributions of predictors to VCZ pharmacokinetics. Figure 4F demonstrated that in the multivariable regression model with CL/F as the outcome, CYP2C19 genotype and sex exhibited strong predictive effects, while the direct effects of CRP and certain liver function indicators (eg, ALB, TBIL) were relatively weaker. CRP remained associated with CL/F after multivariable adjustment, but with a smaller effect size. After further controlling for key pharmacokinetic intermediate variables, the direct effects of most covariates on VCZ plasma concentrations approached zero, with only a few variables (such as CRP or certain liver function markers) retaining a slight direct association (Figure 4G).

Overall, these results indicated that covariate effects on VCZ exposure were largely mediated through CL/F, consistent with the causal mediation analysis and supporting CL/F as a key intermediary in the inflammation–exposure association.

Temporal External Validation and Dose-Adjustment Scenario Simulation

The temporal external validation cohort included 102 patients with 166 blood samples. The model maintained good performance in predicting CL/F (R2 = 0.765) (Figure 5A). Furthermore, after nonlinear calibration, plasma concentration prediction achieved R2 = 0.661, MAE = 0.473, RMSE = 0.651, MAPE = 14.71%, with 84.9% of prediction points falling within ±30% relative error (Figure 5B). These results indicated that the model retained acceptable predictive performance in the temporal external validation cohort.
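The proportion of predictions within ±30% relative error can be computed as in the sketch below. It assumes that the relative error is taken with respect to the observed concentration; the source does not state the denominator explicitly, and the function name and values are hypothetical.

```python
def fraction_within_relative_error(observed, predicted, tolerance=0.30):
    """Share of predictions whose absolute error is within tolerance * observed."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        1 for o, p in zip(observed, predicted)
        if abs(p - o) <= tolerance * abs(o)
    )
    return hits / len(observed)


# Hypothetical observed and predicted trough concentrations (mg/L).
share = fraction_within_relative_error(
    [1.0, 2.0, 4.0, 5.0],
    [1.2, 3.0, 4.1, 3.0],
)
```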

Figure 5 Performance of the hybrid model in temporal external validation and dose-adjustment scenarios. (A) Temporal external validation performance of the hybrid model (CL/F prediction). (B) Temporal external validation performance of the hybrid model (Concentration prediction). (C) Dose-adjustment scenario simulation performance.

Abbreviations: CL/F, clearance; MAE, mean absolute error; RMSE, root mean square error; MAPE, mean absolute percentage error.

A predefined dose-adjustment scenario simulation was then performed in 147 high-exposure individuals with plasma concentrations >5.0 mg/L. Simulations assumed steady-state conditions and short-term stability of key covariates. As shown in Figure 5C, after simulated dose adjustment, predicted trough concentrations shifted toward the therapeutic range, with a mean theoretical daily-dose reduction of 192 mg. Individualized simulated-dose recommendations and corresponding predicted trough concentrations were summarized in Supplementary Table 3. No predicted trough concentrations fell below 0.5 mg/L in this simulation. This analysis was based on model-generated projections under predefined assumptions and did not represent evidence of clinical efficacy after actual dose adjustment.
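One way such a scenario simulation could be set up is sketched below. This is a hypothetical illustration and not the procedure used in the study: it holds the predicted CL/F fixed, steps the daily dose down in 50 mg decrements, and returns the first dose whose predicted trough falls within 0.5–5.0 mg/L. The step size, function names, and calibration coefficients are all assumptions.

```python
def predicted_trough(daily_dose_mg, cl_f_l_per_h, calib):
    """Steady-state average concentration followed by quadratic calibration."""
    css = daily_dose_mg / (cl_f_l_per_h * 24.0)
    b0, b1, b2 = calib
    return b0 + b1 * css + b2 * css * css


def suggest_daily_dose(current_dose_mg, cl_f_l_per_h, calib,
                       lower=0.5, upper=5.0, step_mg=50):
    """Highest dose not above the current one with trough in [lower, upper].

    Intended for high-exposure cases only; returns None if no dose qualifies.
    """
    dose = current_dose_mg
    while dose > 0:
        trough = predicted_trough(dose, cl_f_l_per_h, calib)
        if lower <= trough <= upper:
            return dose, trough
        dose -= step_mg
    return None


# Hypothetical patient: 400 mg/day, predicted CL/F of 2.0 L/h, and an
# identity calibration (b0=0, b1=1, b2=0) used purely for illustration.
result = suggest_daily_dose(400, 2.0, (0.0, 1.0, 0.0))
```

In this example the uncalibrated prediction at 400 mg/day is 8.33 mg/L, and the search stops at 200 mg/day with a predicted trough of 4.17 mg/L. As in the simulation reported above, such output is a model projection under steady-state assumptions rather than evidence of clinical benefit.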

Discussion

In this study, we developed a hybrid PPK–ML framework that predicts VCZ exposure by embedding ML-estimated CL/F within a pharmacokinetic structural model. Using dual-feature selection (Boruta and LASSO), seven predictors were identified: CYP2C19 genotype, CRP, age, ALB, sex, weight, and TBIL. XGBoost was applied to estimate individual CL/F values, which were subsequently mapped to steady-state concentrations through a PK-derived algebraic equation with nonlinear calibration. The model achieved R2 = 0.739 (MAE = 0.357, RMSE = 0.526, MAPE = 7.78%) in internal validation and R2 = 0.661 (MAE = 0.473, RMSE = 0.651, MAPE = 14.71%) in temporal external validation, with 84.9% of external predictions falling within ±30% relative error.

A major strength of our framework is that it uses the PPK model as an explicit structural constraint rather than merely a source of input features. Within this structure-constrained, data-driven framework, CL/F serves as the key intermediate linking clinical covariates to drug exposure. By embedding ML into a predefined pharmacokinetic structure, the model enhances predictive performance while preserving interpretability. Unlike purely data-driven concentration models, our approach maps covariate effects to CL/F and then to exposure through the PPK equation, providing a clearer basis for individualized dosing decisions. Dose-adjustment simulations further support the feasibility of model-informed decision-making under steady-state assumptions, which is particularly relevant for drugs with narrow therapeutic windows.22 This strategy is conceptually consistent with prior PK/PK–PD and Monte Carlo–based dose optimization frameworks.23–25 However, many existing ML approaches have incorporated PK information only as model inputs, rather than using PK structure as an explicit constraint, which may limit mechanistic interpretability.5,17–19 In addition, despite achieving predictive performance broadly comparable to ours, they generally did not incorporate CYP2C19 genotype and CRP, two clinically important determinants of VCZ metabolism that reflect inherited metabolic capacity and inflammatory status. Our framework therefore provides a more biologically informed, interpretable, and clinically actionable approach to individualized dosing support.

Consistent with this, a key biological finding is the significant interaction between CYP2C19 genotype and inflammatory burden. Under low-CRP conditions, CL/F differed markedly across metabolic phenotypes. Under high inflammatory burden (CRP ≥100 mg/L), these differences narrowed substantially, although they remained statistically significant. This pattern is consistent with cytokine-mediated suppression of CYP expression during systemic inflammation,26–28 and supports the meta-analytic findings of Bolcato et al, which demonstrated attenuation of genotype effects under inflammatory stress.29 Importantly, genotype-related differences were not fully abolished, which supports “partial masking” rather than complete phenoconversion. From a clinical perspective, this suggests that genotype-based dosing should not be interpreted in isolation, but rather in conjunction with the patient’s current inflammatory status.

Counterfactual causal mediation analysis provided additional mechanistic insight: CRP influenced VCZ concentrations predominantly through an indirect pathway via CL/F reduction, whereas its direct effect independent of CL/F was minimal. This finding is biologically coherent with inflammation-mediated suppression of CYP activity, leading to reduced metabolic capacity and elevated exposure.30–32 Path decomposition further confirmed that, after conditioning on CL/F, the direct effects of most covariates (including CRP) on VCZ concentrations approached zero, reinforcing CL/F as the dominant mechanistic intermediary. These results suggest that CRP-related increases in VCZ exposure are largely mediated by reduced CL/F, emphasizing the clinical relevance of monitoring inflammatory status when assessing individualized dosing risk.
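The logic of separating an indirect (CL/F-mediated) effect from a direct effect can be illustrated with the classical linear product-of-coefficients decomposition, sketched below in Python. This is a simplified stand-in and not the counterfactual procedure used in the study; the two approaches coincide only under linear models without exposure-mediator interaction and without unmeasured confounding. All variable names and the synthetic values are assumptions made for illustration.

```python
def _cross(u, v):
    """Centered cross-product sum: sum((u - mean(u)) * (v - mean(v)))."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


def linear_mediation(x, m, y):
    """Ordinary least squares mediation with one exposure and one mediator.

    Fits m ~ a*x and y ~ direct*x + b*m (both with intercepts) and returns
    the direct effect, the indirect effect a*b, and their sum (total effect).
    """
    sxx, smm, sxm = _cross(x, x), _cross(m, m), _cross(x, m)
    sxy, smy = _cross(x, y), _cross(m, y)
    det = sxx * smm - sxm ** 2
    if sxx == 0 or det == 0:
        raise ValueError("exposure and mediator must vary and not be collinear")
    a = sxm / sxx                                # exposure -> mediator
    b = (smy * sxx - sxy * sxm) / det            # mediator -> outcome, given exposure
    direct = (sxy * smm - smy * sxm) / det       # exposure -> outcome, given mediator
    indirect = a * b
    total = sxy / sxx                            # equals direct + indirect in OLS
    return {"a": a, "b": b, "direct": direct, "indirect": indirect,
            "total": total, "proportion_mediated": indirect / total}


# SYNTHETIC illustration: CRP (mg/L), CL/F (L/h), and a concentration (mg/L)
# constructed so that almost all of the CRP effect passes through CL/F.
crp = [5.0, 20.0, 50.0, 80.0, 120.0, 200.0]
cl_f = [6.0, 5.6, 4.9, 4.6, 3.8, 3.1]
conc = [8.0 - 1.0 * c + 0.001 * x for x, c in zip(crp, cl_f)]
print(linear_mediation(crp, cl_f, conc))
```

In this constructed example the direct coefficient is close to zero and the proportion mediated is close to one, mirroring the qualitative pattern described above without reproducing the study's estimates.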

Beyond inflammation and genetics, host-state variables demonstrated effect-modifier characteristics. The CRP–CL/F association was steeper in elderly patients (>65 years), consistent with age-related reductions in hepatic metabolic reserve. Low ALB (≤35 g/L) amplified the positive association between CRP and VCZ concentrations, likely reflecting ALB as an inverse acute-phase reactant and a marker of impaired hepatic synthetic capacity or greater inflammatory burden, rather than serving as an independent mediating pathway.32 These observations indicate that age and overall host inflammatory status may further modify exposure risk in the presence of inflammation. Therefore, a more integrated assessment of patient status is necessary when making dosing decisions.

Based on these findings, our clinical decision tool may be more appropriately considered as an adjunct to TDM rather than a replacement for it. Its potential value lies in helping clinicians identify situations in which exposure risk may be changing. For instance, when CRP rises rapidly, ALB decreases, or medication changes occur, the system alerts clinicians to an increased exposure risk and recommends closer monitoring. Similarly, it strengthens warnings about out-of-therapeutic-window risks in patients with genetically indicated low metabolic capacity and high inflammatory burden. In this way, the tool is intended to support earlier recognition of overexposure risk and more individualized decisions on monitoring or dose reassessment, rather than providing a single definitive concentration estimate. Previous studies have demonstrated an association between TDM and clinical outcomes, with risks particularly concentrated during periods of excessive or insufficient exposure.33–36 Future prospective studies should evaluate whether integrating this tool into routine TDM workflows can improve earlier risk recognition and individualized dose adjustment in clinical practice.

Although the model demonstrated stability in temporal external validation, several limitations should be noted. First, individual CL/F values used as ML targets were post-hoc empirical Bayes estimates derived from a validated PPK model.9 Although η-shrinkage was low (16.2%), suggesting substantial information from observed TDM data, some degree of estimation uncertainty may propagate through the ML–PK prediction chain. Second, the steady-state algebraic equation is structurally consistent with the source PPK model; however, its performance may be reduced at higher exposure ranges where saturable metabolism becomes clinically relevant. Third, despite consecutive screening, training-cohort–anchored preprocessing, calibration, and temporal external validation, residual bias from the retrospective design, missing data, real-world trough-sampling variability, and the absence of geographic validation cannot be excluded. Finally, the dose-adjustment simulation was based on hypothetical scenarios under idealized assumptions rather than on direct evidence of clinical benefit. Prospective multicenter validation, implementation studies in routine TDM practice, and comparison with Bayesian MIPD approaches will be important for determining the real-world utility of this framework.
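For reference, η-shrinkage is commonly reported using the standard-deviation-based definition below. The source does not state which definition was used, so this form is an assumption; here η̂ᵢ denotes the individual empirical Bayes estimates of the random effect on CL/F and ω the model-estimated standard deviation of that random effect.

```latex
\[
\text{shrinkage}_{\eta} \;=\; 1 - \frac{\mathrm{SD}\!\left(\hat{\eta}_{i}\right)}{\omega}
\]
```

Under this definition, a shrinkage of 16.2% corresponds to SD(η̂ᵢ)/ω = 0.838, meaning that the spread of the individual estimates retains most of the estimated between-subject variability.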

Conclusions

In summary, we developed a novel hybrid PPK–ML framework that combines ML-based prediction of CL/F with PPK structure-based inference of VCZ exposure. By retaining pharmacokinetic structure within the prediction process, this approach achieves robust predictive performance while preserving mechanistic interpretability. Our results further support a biologically and clinically meaningful pathway in which inflammation suppresses metabolic capacity, reduces CL/F, and increases drug exposure. In clinical practice, this framework may help identify overexposure risk earlier, support safer and more individualized dose adjustment during TDM, and ultimately improve VCZ treatment management. It also provides a practical basis for future multicenter prospective studies to determine whether model-informed dosing can translate into better clinical outcomes. In addition, this structure-constrained strategy may be extended to other drugs with narrow therapeutic windows, where both predictive accuracy and interpretability are essential.

Data Sharing Statement

All generated datasets can be obtained by contacting Professor Nan Hu. For reproducibility, our code is available at GitHub: https://github.com/Harkool/Voriconazole.

Ethics Statement

The study was approved by the Ethics Committee of the Third Affiliated Hospital of Soochow University (Approval No. 2023-038) and was conducted in accordance with the Declaration of Helsinki and its later amendments. All patient data were anonymized before analysis, and strict confidentiality was maintained throughout the study in accordance with institutional regulations.

Acknowledgments

The authors would like to express gratitude to the Third Affiliated Hospital of Soochow University for providing data and financial support.

Author Contributions

All authors made a significant contribution to the work reported, whether that is in the conception, study design, execution, acquisition of data, analysis and interpretation, or in all these areas; took part in drafting, revising or critically reviewing the article; gave final approval of the version to be published; have agreed on the journal to which the article has been submitted; and agreed to be accountable for all aspects of the work.

Funding

This work was supported by the National Natural Science Foundation of China (82173899), the Jiangsu Pharmaceutical Association (H202108, A2021024, Q202202, JY202207, Z04JKM2023E040), and the Top Talent of Changzhou “The 14th Five-Year Plan” High-Level Health Talents Training Project (2022CZBJ04).

Disclosure

The authors report no conflicts of interest in this work.

References

1. Zonios D, Yamazaki H, Murayama N, et al. Voriconazole metabolism, toxicity, and the effect of cytochrome P450 2C19 genotype. J Infect Dis. 2014;209(12):1941–1948. doi:10.1093/infdis/jiu017

2. Lass-Flörl C, Steixner S. The changing epidemiology of fungal infections. Mol Aspects Med. 2023;94:101215. doi:10.1016/j.mam.2023.101215

3. Dolton MJ, McLachlan AJ. Voriconazole pharmacokinetics and exposure-response relationships: assessing the links between exposure, efficacy and toxicity. Int J Antimicrob Agents. 2014;44(3):183–193. doi:10.1016/j.ijantimicag.2014.05.019

4. Herbrecht R, Denning DW, Patterson TF, et al. Voriconazole versus amphotericin B for primary therapy of invasive aspergillosis. New Engl J Med. 2002;347(6):408–415. doi:10.1056/NEJMoa020191

5. Liu R, Ma P, Chen D, et al. A real-time plasma concentration prediction model for voriconazole in elderly patients via machine learning combined with population pharmacokinetics. Drug Des Devel Ther. 2025;19:4021–4037. doi:10.2147/DDDT.S495050

6. Jiang L, Lin Z. Voriconazole: a review of adjustment programs guided by therapeutic drug monitoring. Front Pharmacol. 2024;15:1439586. doi:10.3389/fphar.2024.1439586

7. Chen K, Zhang X, Ke X, Du G, Yang K, Zhai S. Individualized medication of voriconazole: a practice guideline of the Division of Therapeutic Drug Monitoring, Chinese Pharmacological Society. Ther Drug Monit. 2018;40(6):663–674. doi:10.1097/FTD.0000000000000561

8. Klomp SD, Veringa A, Alffenaar JC, et al. Inflammation altered correlation between CYP2C19 genotype and CYP2C19 activity in patients receiving voriconazole. Clin Transl Sci. 2024;17(7):e13887. doi:10.1111/cts.13887

9. Ling J, Yang X, Dong L, Jiang Y, Zou S, Hu N. Influence of C-reactive protein on the pharmacokinetics of voriconazole in relation to the CYP2C19 genotype: a population pharmacokinetics analysis. Front Pharmacol. 2024;15:1455721. doi:10.3389/fphar.2024.1455721

10. Encalada Ventura MA, van Wanrooy MJ, Span LF, et al. Longitudinal analysis of the effect of inflammation on voriconazole trough concentrations. Antimicrob Agents Chemother. 2016;60(5):2727–2731. doi:10.1128/AAC.02830-15

11. Theuretzbacher U, Ihle F, Derendorf H. Pharmacokinetic/pharmacodynamic profile of voriconazole. Clin Pharmacokinet. 2006;45(7):649–663. doi:10.2165/00003088-200645070-00002

12. Li X, Frechen S, Moj D, et al. A physiologically based pharmacokinetic model of voriconazole integrating time-dependent inhibition of CYP3A4, genetic polymorphisms of CYP2C19 and predictions of drug-drug interactions. Clin Pharmacokinet. 2020;59(6):781–808. doi:10.1007/s40262-019-00856-z

13. Ribba B, Dudal S, Lavé T, Peck RW. Model-informed artificial intelligence: reinforcement learning for precision dosing. Clin Pharmacol Ther. 2020;107(4):853–857. doi:10.1002/cpt.1777

14. Doupe P, Faghmous J, Basu S. Machine learning for health services researchers. Value Health. 2019;22(7):808–815. doi:10.1016/j.jval.2019.02.012

15. Obermeyer Z, Emanuel EJ. Predicting the future - big data, machine learning, and clinical medicine. New Engl J Med. 2016;375(13):1216–1219. doi:10.1056/NEJMp1606181

16. Rowe M. An introduction to machine learning for clinicians. Acad Med. 2019;94(10):1433–1436. doi:10.1097/ACM.0000000000002792

17. Shen L, Hu M, Xu X, et al. Precision dosing of voriconazole in immunocompromised children under 2 years: integrated machine learning and population pharmacokinetic modeling. Front Pharmacol. 2025;16:1671652. doi:10.3389/fphar.2025.1671652

18. Huang Y, Zhou Y, Liu D, et al. Comparison of population pharmacokinetic modeling and machine learning approaches for predicting voriconazole trough concentrations in critically ill patients. Int J Antimicrob Agents. 2025;65(2):107424. doi:10.1016/j.ijantimicag.2024.107424

19. Cheng L, Zhao Y, Liang Z, et al. Prediction of plasma trough concentration of voriconazole in adult patients using machine learning. Eur J Pharm Sci. 2023;188:106506. doi:10.1016/j.ejps.2023.106506

20. Hanai Y, Ueda T, Hamada Y, et al. Optimal timing for therapeutic drug monitoring of voriconazole to prevent adverse effects in Japanese patients. Mycoses. 2023;66(12):1035–1044. doi:10.1111/myc.13639

21. Encalada Ventura MA, Span LF, van den Heuvel ER, Groothuis GM, Alffenaar JW. Influence of inflammation on voriconazole metabolism. Antimicrob Agents Chemother. 2015;59(5):2942–2943. doi:10.1128/AAC.04789-14

22. Liu P, Mould DR. Population pharmacokinetic-pharmacodynamic analysis of voriconazole and anidulafungin in adult patients with invasive aspergillosis. Antimicrob Agents Chemother. 2014;58(8):4727–4736. doi:10.1128/AAC.02809-13

23. Wang T, Chen S, Sun J, et al. Identification of factors influencing the pharmacokinetics of voriconazole and the optimization of dosage regimens based on Monte Carlo simulation in patients with invasive fungal infections. J Antimicrob Chemother. 2014;69(2):463–470. doi:10.1093/jac/dkt369

24. Jiang Z, Wei Y, Huang W, et al. Population pharmacokinetics of voriconazole and initial dosage optimization in patients with talaromycosis. Front Pharmacol. 2022;13:982981. doi:10.3389/fphar.2022.982981

25. Lin XB, Li ZW, Yan M, et al. Population pharmacokinetics of voriconazole and CYP2C19 polymorphisms for optimizing dosing regimens in renal transplant recipients. Br J Clin Pharmacol. 2018;84(7):1587–1597. doi:10.1111/bcp.13595

26. Morgan ET. Impact of infectious and inflammatory disease on cytochrome P450-mediated drug metabolism and pharmacokinetics. Clin Pharmacol Ther. 2009;85(4):434–438. doi:10.1038/clpt.2008.302

27. Aitken AE, Richardson TA, Morgan ET. Regulation of drug-metabolizing enzymes and transporters in inflammation. Annu Rev Pharmacol Toxicol. 2006;46:123–149. doi:10.1146/annurev.pharmtox.46.120604.141059

28. Stanke-Labesque F, Gautier-Veyret E, Chhun S, Guilhaumou R. Inflammation is a major regulator of drug metabolizing enzymes and transporters: consequences for the personalization of drug treatment. Pharmacol Ther. 2020;215:107627. doi:10.1016/j.pharmthera.2020.107627

29. Bolcato L, Khouri C, Veringa A, et al. Combined impact of inflammation and pharmacogenomic variants on voriconazole trough concentrations: a meta-analysis of individual data. J Clin Med. 2021;10(10):2089. doi:10.3390/jcm10102089

30. Veringa A, Ter Avest M, Span LF, et al. Voriconazole metabolism is influenced by severe inflammation: a prospective study. J Antimicrob Chemother. 2017;72(1):261–267. doi:10.1093/jac/dkw349

31. van Wanrooy MJ, Span LF, Rodgers MG, et al. Inflammation is associated with voriconazole trough concentrations. Antimicrob Agents Chemother. 2014;58(12):7098–7101. doi:10.1128/AAC.03820-14

32. Li X, Lai F, Jiang Z, et al. Effects of inflammation on voriconazole levels: a systematic review. Br J Clin Pharmacol. 2022;88(12):5166–5182. doi:10.1111/bcp.15495

33. Park WB, Kim NH, Kim KH, et al. The effect of therapeutic drug monitoring on safety and efficacy of voriconazole in invasive fungal infections: a randomized controlled trial. Clin Infect Dis. 2012;55(8):1080–1087. doi:10.1093/cid/cis599

34. Miyakis S, van Hal SJ, Ray J, Marriott D. Voriconazole concentrations and outcome of invasive fungal infections. Clin Microbiol Infect. 2010;16(7):927–933. doi:10.1111/j.1469-0691.2009.02990.x

35. Jin H, Wang T, Falcione BA, et al. Trough concentration of voriconazole and its relationship with efficacy and safety: a systematic review and meta-analysis. J Antimicrob Chemother. 2016;71(7):1772–1785. doi:10.1093/jac/dkw045

36. Hamada Y, Seto Y, Yago K, Kuroyama M. Investigation and threshold of optimum blood concentration of voriconazole: a descriptive statistical meta-analysis. J Infect Chemother. 2012;18(4):501–507. doi:10.1007/s10156-011-0363-6

© 2026 The Author(s). This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at https://www.dovepress.com/terms and incorporate the Creative Commons Attribution - Non Commercial (unported, 4.0) License. By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms.