Neuropsychiatric Disease and Treatment, Volume 22

Machine Learning Models for Predicting Antipsychotic Effectiveness and Separate Cost-Effectiveness Analysis in Hospitalized Schizophrenia Patients

Authors Zhang J, Xu Q, Jiang W, Sun D, Peng L

Received 27 November 2025

Accepted for publication 26 February 2026

Published 10 March 2026 Volume 2026:22 582314

DOI https://doi.org/10.2147/NDT.S582314

Checked for plagiarism Yes

Review by Single anonymous peer review

Peer reviewer comments 3

Editor who approved publication: Dr Rakesh Kumar



Jiatong Zhang,1 Qian Xu,2 WenLong Jiang,3 DaWei Sun,3 LongYan Peng4

1College of Pharmacy, Qiqihar Medical University, Qiqihar, Heilongjiang, People’s Republic of China; 2Department of Pharmacy, The Third Hospital of Daqing City, Daqing, Heilongjiang, People’s Republic of China; 3Department of Psychiatry, The Third Hospital of Daqing City, Daqing, Heilongjiang, People’s Republic of China; 4Department of Psychology, The Third Hospital of Daqing City, Daqing, Heilongjiang, People’s Republic of China

Correspondence: Qian Xu, Department of Pharmacy, The Third Hospital of Daqing City, Daqing, Heilongjiang, People’s Republic of China, Tel +8619845661688, Email [email protected]

Purpose: Schizophrenia is a burden on patients’ health and finances and long-term antipsychotic treatment is required; treatment response differs among patients. This study aims to leverage data from Chinese hospitals to develop a machine learning (ML) model that predicts antipsychotic treatment efficacy in patients with schizophrenia and to conduct a payer-perspective cost-effectiveness analysis to inform clinical practice.
Patients and Methods: This single-center, real-world retrospective cohort study included 834 patients with schizophrenia from a Chinese hospital. Eight ML models were constructed and their performance was assessed. The best-performing model was selected based on the area under the receiver operating characteristic curve (AUC). We used Shapley Additive Explanations (SHAP) values to determine the relative importance of each factor. Cost-effectiveness and incremental cost-effectiveness analyses were performed to compare treatments, and a univariate sensitivity analysis was conducted to validate the results.
Results: The top 10 strongly correlated variables, identified through the Boruta algorithm, were selected to construct the model. After a comprehensive evaluation, the gradient boosting machine (GBM) demonstrated the best performance. On the independent test set, the model achieved an AUC of 0.879 (95% CI: 0.833–0.924), an accuracy of 0.836, and a recall of 0.823. Based on this model, we developed and made publicly available an online prediction calculator to assist in clinical decision-making. Among all the treatment regimens, risperidone was the most cost-effective.
Conclusion: The GBM model and its online calculator predict the treatment efficacy for hospitalized schizophrenia patients, aiding doctors in tailoring personalised treatment strategies. Risperidone tablets exhibit the highest cost-effectiveness in treatment, guiding the optimization of treatment plans and cost reduction.

Keywords: machine learning, schizophrenia, prediction, antipsychotics, cost-effectiveness

Introduction

Schizophrenia is a chronic, multifaceted, and heterogeneous mental disorder that has long posed significant challenges to public health.1 Epidemiological surveys in China have shown that the weighted lifetime prevalence of schizophrenia is 0.7%.2 Research indicates that men typically experience the onset of schizophrenia earlier than women, with the initial occurrence in men generally between the ages of 20 and 24, while in women it tends to occur approximately five years later or even beyond.3 Modern medicine has made significant progress in mental health, and institutions now offer systematic treatment plans for schizophrenia that combine antipsychotic medications with psychosocial treatments. These plans aim to improve patients’ condition and quality of life. Relapse nevertheless remains a problem: the relapse rate can reach 63% within 2 years and 80% within 5 years.4,5 Keeping patients stable requires significant psychosocial resources and long-term use of antipsychotic drugs.6 Doctors typically choose antipsychotic drugs based on the drugs’ efficacy and side effects.7 Among Chinese patients, 87.3% received monotherapy; of these, 78.9% were treated primarily with oral second-generation antipsychotics (APs). The most common combination regimen involved coadministration of two oral APs, while only 0.3% of patients received combinations of three or more drugs.8 In public health, the economic efficiency of drugs is assessed by weighing efficacy, side effects, and costs to optimize treatment results within patients’ financial means.7

Machine learning (ML), a fundamental aspect of artificial intelligence, is increasingly showing significant potential in clinical applications.9–11 Machine learning dynamically assesses medical data using artificial intelligence to extract treatment outcome-related features from intricate interactive variables, encompassing clinical, genetic, and environmental factors. This approach develops an efficacy prediction model applicable for real-world clinical analysis, facilitating medical decision-making.10,12,13 Machine learning models are widely used in medical research. However, their “black-box” nature produces opaque decision-making logic that presents a major obstacle to clinical translation and practical application.14 To improve model interpretability and clarify feature contributions, we applied the Shapley Additive exPlanations (SHAP) framework for post-hoc attribution analysis. By quantifying each feature’s contribution to classification outcomes, SHAP systematically evaluated the relative importance of features and provided a transparent, interpretable basis for model decision-making.15

In machine learning modeling, selecting outcome indicators requires consideration of data accessibility, clinical relevance, and comparability with prior research. For schizophrenia, efficacy assessment should encompass core dimensions such as symptom severity and changes in treatment response. We selected the Brief Psychiatric Rating Scale (BPRS) primarily because it provides a comprehensive evaluation of symptom severity and treatment response and meets the practical needs of real-world clinical follow-up. The BPRS is widely used in real-world studies of schizophrenia and related heterogeneous populations; it effectively quantifies psychopathology, assesses intervention effects, and has been validated as a pragmatic cross-diagnostic tool with high clinical compatibility and low measurement bias.16

Combined antipsychotic drug regimens are common in clinical practice. However, their utility as a quantifiable predictor in machine-learning models developed from real-world data has not been systematically evaluated. Most current prediction models rely primarily on conventional variables—clinical data, neuroimaging, and monotherapy—and do not incorporate integrated analyses of complex treatment regimens.17–19 More importantly, existing studies generally focus on developing single prediction models and lack an integrated evaluation framework that combines treatment-effect prediction, individualized risk assessment, and cost-effectiveness analysis. This disconnect among components has impeded the effective translation of machine-learning models into clinical decision-support tools. Therefore, the goal of future research should be to translate these findings into clinical practice.20

This study aimed to systematically construct and validate a ML model that predicted antipsychotic drug efficacy at 6 weeks in patients with schizophrenia, using retrospective inpatient data from Chinese hospital information systems. We evaluated the importance of key predictors and applied the SHAP framework to enhance model interpretability, clarifying how individual variables influenced treatment outcomes within single samples.21 Building on these results, we developed an online clinical calculator that translated model outputs into individualized risk assessments. Concurrently, a cost-effectiveness analysis from the payer’s perspective provided a foundation for optimizing health resource allocation. Ultimately, the study established a full-chain methodological framework that integrated machine learning prediction, digital tool assessment, and health economic evaluation, offering real-world evidence–based support for clinical decision-making in precision treatment of schizophrenia.

Research Methods

Data Sources and Study Subjects

This retrospective study analyzed electronic medical records of schizophrenia patients hospitalized at a tertiary psychiatric hospital in Daqing City, Heilongjiang Province, China, from January 2022 to December 2024, using data from the hospital information system. Inclusion criteria were: (1) diagnosis of schizophrenia (ICD-10 F20); and (2) age >18 years. Exclusion criteria were: (1) missing data; (2) diagnosis of other primary mental disorders, such as schizoaffective disorder (ICD-10 F25), bipolar disorder, or substance-induced mental disorder; (3) presence of neurological or organic somatic conditions capable of directly producing psychotic symptoms (eg, dementia, epilepsy, brain tumor, thyroid crisis); (4) severe, uncontrolled somatic diseases likely to compromise treatment tolerance or outcome assessment (eg, end-stage renal failure, decompensated liver cirrhosis, active malignant tumor); and (5) history of alcohol or illicit drug abuse or dependence. We included 834 patients with complete demographic and clinical data; 595 received one AP and 239 received two APs.

Definitions and Clinical Variables

A total of 44 clinical factors potentially influencing therapeutic effects were extracted from the hospital information system. These factors encompassed patients’ basic characteristics, including marital status, educational level, gender, age, body mass index (BMI), allergy history, family history, smoking and drinking histories, length of hospital stay, and history of schizophrenia. The comorbidities considered were anemia, hypertension (HTN), hyperlipidemia (HLD), diabetes mellitus (DM), coronary heart disease (CHD), thyroid disease, cerebrovascular accident (CVA), hyperprolactinemia (HPRL), extrapyramidal symptoms (EPS), common cold, and constipation. Pre-medication laboratory test indicators included high-density lipoprotein cholesterol (HDL-C), heart rate (HR), fasting blood glucose (FBG), hemoglobin (Hb), neutrophils (NEUT), aspartate aminotransferase (AST), low-density lipoprotein cholesterol (LDL-C), triglyceride (TG), serum creatinine (SCR), alanine aminotransferase (ALT), γ-glutamyl transferase (GGT), blood urea nitrogen (BUN), total cholesterol (TC), red blood cell (RBC), white blood cell (WBC), and lactate dehydrogenase (LDH). Treatment measures comprised the administration of one or two antipsychotic drugs, sedative-hypnotic drugs, and modified electroconvulsive therapy (MECT). During treatment, the Treatment Emergent Symptom Scale (TESS) and the BPRS were used: the BPRS to measure clinical improvement and the TESS to measure adverse reactions.

The primary outcome of this study was the short-term therapeutic efficacy of antipsychotic drugs. Efficacy was assessed at the sixth week of treatment (approximately 42 days). This time point was selected with reference to relevant clinical practice guidelines and aligned with the routine observation cycle at our institution, aiming to evaluate the early core response to pharmacotherapy. Symptom changes were measured using the BPRS and dichotomized for analysis: after 6 weeks of treatment, patients whose total BPRS score decreased by < 25% from baseline were classified as “treatment non-responders” and designated as positive examples (target category) for the ML model; all others were classified as “treatment responders.”
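The dichotomization rule above can be sketched as follows. This is a minimal illustration in Python rather than the study's R code, and it assumes the percentage reduction is computed on raw BPRS totals without subtracting the scale minimum, which the text does not specify. The function names and the three example patients are hypothetical.

```python
def bprs_percent_reduction(baseline: float, week6: float) -> float:
    """Percentage decrease in total BPRS score from baseline.

    Assumption: raw totals are used, with no correction for the scale floor.
    """
    if baseline <= 0:
        raise ValueError("baseline BPRS total must be positive")
    return (baseline - week6) / baseline * 100.0


def is_non_responder(baseline: float, week6: float, threshold_pct: float = 25.0) -> bool:
    """True when the reduction is strictly below the threshold (the positive class)."""
    return bprs_percent_reduction(baseline, week6) < threshold_pct


# Hypothetical patients: (baseline total, week-6 total)
patients = {"A": (40, 34), "B": (40, 30), "C": (44, 28)}
labels = {pid: is_non_responder(b, w) for pid, (b, w) in patients.items()}
```

Patient B sits exactly on the 25% boundary and is therefore labeled a responder, since only reductions strictly below 25% count as non-response.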

Feature Selection and Model Construction

Enrolled patients were randomly assigned to a training set (70%, n = 584) and a test set (30%, n = 250). The training set was used to develop the model, while the test set was used to evaluate its performance.

Feature selection is a crucial component of data preprocessing: it identifies the most representative and valuable features in the set while removing unnecessary ones. This process enhances model performance, making it a critical aspect of model construction. In this study, the Boruta algorithm was employed to screen baseline variables and identify the most predictive features.22 Boruta is a feature-selection method that relies on variable-importance measures derived from random forests. Its core procedure compares the Z-values of real features with those of randomly generated “shadow features” and systematically retains variables whose importance exceeds random noise, thereby constructing a core feature set for the final prediction model.23 Based on Boruta’s importance ranking, the top ten variables were selected as the core features for model construction.
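The shadow-feature comparison can be illustrated with a deliberately simplified sketch. It is not the Boruta package used in the study: as stated assumptions, it substitutes the absolute Pearson correlation for random-forest importance Z-values and a majority-of-rounds rule for Boruta's statistical test. All names and data are hypothetical.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def shadow_screen(features, target, n_iter=20, seed=0):
    """Keep features that beat the best shuffled 'shadow' copy in most rounds.

    Importance is |Pearson r| here (an assumption made for brevity); Boruta
    itself uses random-forest importance Z-values and a statistical test.
    """
    rng = random.Random(seed)
    importance = {name: abs(pearson(col, target)) for name, col in features.items()}
    hits = {name: 0 for name in features}
    for _ in range(n_iter):
        best_shadow = 0.0
        for col in features.values():
            shadow = col[:]      # copy the column, then break its link to the target
            rng.shuffle(shadow)
            best_shadow = max(best_shadow, abs(pearson(shadow, target)))
        for name in features:
            if importance[name] > best_shadow:
                hits[name] += 1
    # Retain features that exceeded the best shadow in more than half of the rounds.
    return [name for name, h in hits.items() if h > n_iter / 2]


# Hypothetical data: 'signal' tracks the target closely, 'noise' is unrelated.
rng = random.Random(1)
target = [rng.random() for _ in range(200)]
features = {
    "signal": [t + rng.gauss(0, 0.05) for t in target],
    "noise": [rng.random() for _ in range(200)],
}
kept = shadow_screen(features, target)
```

The point of the shuffled copies is that they give an empirical estimate of how large an importance score can get by chance alone, so a real feature is retained only if it consistently exceeds that baseline.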

The model incorporated 10 variables, and the full sample comprised 834 patients (584 in the training set). Consequently, the sample-to-variable ratio far exceeded the commonly accepted empirical guideline in clinical prediction modeling, namely that the sample size should be at least 10–20 times the number of predictors (EPV ≥ 10), thereby providing substantial protection against overfitting and supporting robust model training and validation.
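The ratio can be checked with a few lines of arithmetic. The sketch below uses the sample sizes reported in this paper; the helper name is hypothetical. Note that events per variable (EPV) is, strictly, the number of outcome events divided by the number of predictors, and the event count is not reported in this section, so only the sample-per-predictor ratio is computed.

```python
def per_predictor_ratio(count: int, n_predictors: int) -> float:
    """Observations (or events) available per candidate predictor."""
    if n_predictors <= 0:
        raise ValueError("n_predictors must be positive")
    return count / n_predictors


N_PREDICTORS = 10                                     # variables retained after Boruta
full_ratio = per_predictor_ratio(834, N_PREDICTORS)   # whole cohort
train_ratio = per_predictor_ratio(584, N_PREDICTORS)  # training set only

# Minimum number of outcome events needed to satisfy EPV >= 10 with 10 predictors.
min_events_for_epv10 = 10 * N_PREDICTORS
```

Both ratios (83.4 for the full cohort, 58.4 for the training set) exceed the 10–20 guideline quoted above.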

Eight algorithms were used to build a prediction model for antipsychotic drug efficacy in patients with schizophrenia: k-nearest neighbours (KNN), extreme gradient boosting (XGBoost), support vector machine (SVM), logistic regression (LR), categorical boosting (CatBoost), neural networks (NN), light gradient boosting machine (LightGBM), and gradient boosting machine (GBM). The models can be categorized as follows: LR, KNN, SVM, and NN are traditional single learners, while XGBoost, LightGBM, GBM, and CatBoost belong to the family of boosting ensemble algorithms. Unlike logistic regression, ensemble learning models can detect more complex latent patterns in high-dimensional data.24

In this study, the model was developed using 10-fold cross-validation, repeated five times. Hyperparameters were automatically optimised through a grid search process in the training phase.25 To evaluate the prediction model’s performance systematically, we employed a multi-dimensional validation strategy. First, we assessed the model’s overall discriminative ability by plotting the receiver operating characteristic (ROC) curve and calculating the area under the curve (AUC). We reported results as AUC values with 95% confidence intervals (CI) for the training set and the test set, respectively. In addition, we used core classification metrics, namely accuracy, precision, recall, and the F1 score, as supplementary evaluations to address prediction correctness, positive-prediction reliability, recognition completeness, and their trade-offs. Given the class-imbalanced distribution of the data in this study, we also plotted a precision–recall (PR) curve and calculated the average precision (AP) to provide a more thorough assessment of the model’s performance in identifying positive cases. Finally, we applied decision curve analysis (DCA) to evaluate the model’s clinical net benefit across different decision thresholds, thereby quantifying its practical application value. SHAP serves as a comprehensive framework for interpreting ML models, evaluating prediction outcomes by assessing the contribution and significance of each feature.23 We used the SHAP tool to interpret the ML model with the optimal performance.
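The classification metrics and the net benefit used in decision curve analysis can be written out explicitly. The sketch below is illustrative Python, not the study's R code; the confusion-matrix counts are hypothetical and are not the study's results. Net benefit follows the standard decision-curve definition, TP/n − (FP/n) × p_t/(1 − p_t), where p_t is the threshold probability.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


def net_benefit(tp: int, fp: int, n: int, threshold: float) -> float:
    """Net benefit at one threshold probability, as plotted on a decision curve."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    return tp / n - fp / n * threshold / (1 - threshold)


# Hypothetical counts for 100 patients (non-responder = positive class).
metrics = classification_metrics(tp=40, fp=10, fn=10, tn=40)
nb = net_benefit(tp=40, fp=10, n=100, threshold=0.2)
```

A decision curve is obtained by evaluating `net_benefit` over a range of thresholds, recomputing the true and false positive counts at each one.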

Cost-Effectiveness Analysis

This study evaluated the cost-effectiveness ratios of monotherapy using four antipsychotic drugs. Both direct and indirect costs typically factor into treatment cost calculations; in this instance, however, all patients received oral formulations, and apart from sedative-hypnotics, the costs of other treatment items were largely comparable across groups and effectively balanced each other out. We therefore simplified the calculation by using only the drug price and excluding other costs. Cost-effectiveness analysis and incremental cost-effectiveness analysis were performed, with costs measured in Chinese Yuan (CNY). Because cost-effectiveness outcomes depend strongly on parameter selection, a one-way sensitivity analysis of the results was also performed.
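For reference, the two quantities are conventionally defined as below. The paper does not spell out its formulas, so this is the standard textbook form, with assumed notation: C_i and E_i are the cost and effectiveness of regimen i, and regimen 0 is the reference regimen.

```latex
\[
\mathrm{CER}_i = \frac{C_i}{E_i},
\qquad
\mathrm{ICER}_{i\ \mathrm{vs}\ 0} = \frac{C_i - C_0}{E_i - E_0}
\]
```

A lower cost-effectiveness ratio means less cost per unit of effect, while the incremental ratio expresses the additional cost per additional unit of effect relative to the reference regimen.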

The drug prices were obtained from the winning bid prices of centralized drug procurement by medical institutions on the medical insurance service platform in Heilongjiang Province, China. Quetiapine was obtained from Guangdong Dongyangguang Pharmaceutical Company Limited (Dongguan, Guangdong, China), with the batch number H20213870. Clozapine was provided by Jiangsu Enhua Pharmaceutical Joint Stock Company Limited (Xuzhou, Jiangsu, China), with the batch number H32022962. Olanzapine was supplied by Qilu Pharmaceutical Company Limited (Jinan, China), with the batch number H20183501. Risperidone was supplied by Qilu Pharmaceutical Company Limited (Jinan, China), with the batch number H20041808.

The quetiapine group received quetiapine tablets, 100 mg × 30 tablets, costing 0.92 yuan per tablet, with a maintenance dose of 300–450 mg/day. The clozapine group received clozapine tablets, 25 mg × 100 tablets, costing 0.04 yuan per tablet, with a maintenance dose of 100–200 mg/day. The olanzapine group received olanzapine tablets, 5 mg × 14 tablets, costing 1.01 yuan per tablet, with a maintenance dose of 5–20 mg/day. The risperidone group received risperidone tablets, 1 mg × 30 tablets, costing 0.05 yuan per tablet, with a maintenance dose of 2–6 mg/day.
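The implied daily drug cost range for each regimen follows from the tablet strength, unit price, and maintenance dose given above. The sketch below assumes that daily cost equals price per tablet times (daily dose / tablet strength) and that fractional tablets are allowed. The paper does not state how costs were aggregated over the treatment course, so this is an illustration only.

```python
from decimal import Decimal

# (tablet strength in mg, price per tablet in CNY, min dose mg/day, max dose mg/day),
# all taken from the text above.
REGIMENS = {
    "quetiapine": (100, "0.92", 300, 450),
    "clozapine": (25, "0.04", 100, 200),
    "olanzapine": (5, "1.01", 5, 20),
    "risperidone": (1, "0.05", 2, 6),
}


def daily_cost_range(strength_mg, price, dose_min, dose_max):
    """Lowest and highest daily drug cost (CNY) across the maintenance dose range."""
    unit_price = Decimal(price)
    low = unit_price * Decimal(dose_min) / Decimal(strength_mg)
    high = unit_price * Decimal(dose_max) / Decimal(strength_mg)
    return low, high


daily = {drug: daily_cost_range(*spec) for drug, spec in REGIMENS.items()}
```

`Decimal` is used instead of floats so that prices such as 0.92 CNY are represented exactly.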

Statistical Analysis

In this study, data analysis, development, and validation were conducted using R 4.5.1. Count data were presented as percentages (%), and inter-group comparisons were conducted using either the χ2 test or Fisher’s exact test, as appropriate. To assess whether measurement data adhered to a normal distribution, kurtosis and skewness tests were conducted; all measurement data in this study were found to follow a non-normal distribution. For data not following a normal distribution, results were expressed as medians with interquartile ranges (M [P25, P75]). The Mann–Whitney U-test or the Kruskal–Wallis test was employed for inter-group comparisons where necessary.

All statistical analyses and modeling in this study were conducted in the R environment (version 4.5.1). Feature screening was performed with the “Boruta” package, while model training and hyperparameter optimization were carried out using the “caret” package; hyperparameters were automatically tuned via that package’s built-in grid search. Model interpretability analyses were completed using the “shapviz” package. A complete list of R packages and their specific version numbers (eg, kernelshap_0.5.0) required for the analyses, together with the hyperparameter grid used during model training, are provided in the supplementary materials (Table S1 and List S1, S2) to ensure reproducibility.

Results

Baseline Characteristics

A total of 834 eligible patients were included and allocated to a training set (n = 584) and a test set (n = 250). In the training and test sets, 149 (25.5%) and 81 (32.4%) patients were smokers, respectively. Median HDL-C and LDL-C in the training set were 1.33 (IQR: 1.22, 1.40) and 3.04 (IQR: 2.46, 3.35); corresponding medians in the test set were 1.31 (IQR: 1.24, 1.40) and 3.04 (IQR: 2.50, 3.35). At admission, the median BPRS score was 41 (IQR: 36, 45) in the training set and 40 (IQR: 36, 46) in the test set. As shown in Table 1, the training and test groups were well balanced on the key clinical and demographic characteristics selected by the Boruta algorithm. No statistically significant differences were observed between the two groups for the 10 core predictive variables—including smoking history, laboratory indicators, and baseline symptom severity (BPRS score) (all p > 0.05). In this retrospective cohort, the actual median treatment duration was 48 days (IQR: 41, 56), which closely matched the pre-set efficacy evaluation time point of 6 weeks and thus supports the feasibility of that time point with real-world data. All baseline characteristics are provided in Supplementary Table S2. The results showed no systematic bias among the collected variables between groups, further supporting the homogeneity of the study population.

Table 1 Baseline Characteristics of the Dataset

Feature Selection

The Boruta algorithm distinguishes strongly correlated variables from weak ones and thus increases prediction accuracy. The yellow boxes represent shadow features automatically generated by the algorithm, which have been excluded from the analysis to ensure focus on the most influential core variables.26,27 In this study, we applied the Boruta algorithm to the included variables and thereby identified the key predictors: smoking, HLD, CHD, DM, CVA, BPRS, GGT, ALT, HDL-C, and LDL-C (Figure 1).

Figure 1 Boruta algorithm feature selection.

Abbreviations: APs, antipsychotics; HLD, hyperlipidemia; CHD, coronary heart disease; HPRL, hyperprolactinemia; DM, diabetes mellitus; CVA, Cerebrovascular Accident; HTN, hypertension.

Notes: Variables are ranked by importance. The final decision for each variable is color-coded: green (Confirmed), yellow (Tentative), and red/blue (Rejected).

Model Evaluation

This study evaluated model performance based on accuracy, precision, sensitivity, F1 score, and AUC (Table 2). The AUC values of Logistic, Neural Network, XGBoost, LightGBM, SVM, GBM, CatBoost, and KNN on the test set were 0.842, 0.855, 0.879, 0.850, 0.879, 0.831, 0.822, and 0.838, respectively (Figure 2 and Table 2). Both XGBoost and GBM achieved a test-set AUC of 0.879. For GBM, the difference in AUC between the training and test sets was 0.044, indicating less fitting to noise and greater generalization stability, whereas for XGBoost the difference was 0.071, implying more overfitting. In the calibration curve for the test set (Figure 2D), the GBM curve lies closest to the diagonal, indicating the greatest agreement between predicted and observed probabilities. By contrast, models such as XGBoost exhibit more pronounced calibration deviations. At the clinical decision-making level, the DCA based on GBM provided substantially higher clinical net benefit than other models across the entire reasonable threshold range (Figure 2F). The precision–recall curve showed that GBM had the smallest decline in average precision (ΔAP = 0.02) between the training and test sets, demonstrating superior generalization robustness on class-imbalanced data (Figure 3). Overall, GBM maintained excellent discriminative performance on the test set (AP = 0.94) and achieved the best balance of high precision and robustness. Considering its calibration, clinical net benefit, and generalization ability, GBM offered greater prediction reliability, clinical utility, and resistance to overfitting. Given this performance, GBM was the most suitable predictor for this data set, and we therefore selected it for further analysis.

Table 2 Model Performance Metrics Across the Training and Testing Sets

Figure 2 Performance evaluation of eight machine learning classifiers.

Abbreviations: ROC, Receiver Operating Characteristic; ML, machine learning; DCA, Decision curves analysis.

Notes: (A) ROC curves of ML models in training set. (B) ROC curves of ML models in test set. (C) Calibration curves of ML models in training set. (D) Calibration curves of ML models in test set. (E) DCA of ML models in training set. (F) DCA of ML models in test set.

Figure 3 Precision-recall curves of machine learning models.

Abbreviation: PR, Precision-Recall.

Notes: (A) Precision-Recall curves for the training set. (B) Precision-Recall curves for the independent test set.

GBM Model Analysis and the Accompanying Web Calculator

This study assessed the importance and contribution of the strongly correlated variables in the GBM model using SHAP (Figure 4A). The model identified DM, baseline BPRS score, pre-medication ALT, pre-medication LDL-C, and smoking as the top five predictors in terms of importance. SHAP values distinguished positive and negative factors influencing efficacy prediction, enhancing the understanding of correlations between predicted and actual outcomes (Figure 4B). The model’s interpretability was further illustrated using an individual prediction case. The results showed that patients with lower BPRS scores, no smoking history, no history of CHD or DM, and normal lipid indicators had higher predicted treatment efficacy (Figure 4C).

Figure 4 GBM model analysis.

Abbreviations: CVA, cerebrovascular accident; CHD, coronary heart disease; DM, diabetes mellitus; GBM, gradient boosting machine; HLD, hyperlipidemia; SHAP, SHapley Additive exPlanations.

Notes: (A) SHAP variable importance ranking of the GBM model. (B) Beeswarm plot of SHAP values for the GBM model. (C) Examples of SHAP interpretation for efficacy.

For continuous variables, a deeper purple indicates a smaller value, whereas a more vibrant yellow signifies a larger value. Comorbid physical conditions such as DM, CVA, and CHD, along with baseline BPRS score, smoking, and elevated pre-medication ALT and LDL-C, were found to negatively correlate with efficacy prediction. Conversely, increased pre-medication HDL-C positively correlated with efficacy prediction. An online calculator, based on the GBM model, has been developed at https://treatment-efficacy.shinyapps.io/make_web/ to assist in predicting treatment efficacy for hospitalized schizophrenia patients.

Cost-Effectiveness Analysis and Sensitivity Analysis

The Boruta algorithm excluded both combination antipsychotic drug treatment and monotherapy as weakly correlated variables, suggesting that combination drug treatment did not significantly enhance treatment efficacy in patients. This study chose monotherapy for the cost-effectiveness analysis. Patients were allocated to four treatment groups: quetiapine, clozapine, olanzapine and risperidone. No statistically significant differences were observed in the therapeutic effects, adverse reactions, or average 6-week treatment courses among the four patient groups (Table 3).

Table 3 Patient Characteristics and Baseline Variables

Statistical differences were observed among the four patient groups in their use of sedative-hypnotics, educational background, marital status, family history, and history of schizophrenia (P < 0.005). Other variables showed no statistical differences (P > 0.005). Further detailed analysis is available in Supplementary Table S3.

In this study, the Risperidone group exhibited the lowest cost-effectiveness ratio. Using the group with the lowest treatment effectiveness rate as a benchmark, the Clozapine group demonstrated the most favourable incremental cost-effectiveness ratio (Table 4). The uncertainty in drug pricing directly impacts its cost-effectiveness ratio, necessitating a sensitivity analysis. Assuming a 10% reduction in drug price while all other parameters remain unchanged, a one‑way sensitivity analysis was conducted (Table 4).
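A one-way price sensitivity analysis of this kind can be sketched as follows. The costs and effectiveness rates below are hypothetical placeholders, not the values in Table 4, and the function names are illustrative. The sketch assumes, as in the Methods, that drug price is the only cost component, so a uniform price change rescales every cost by the same factor.

```python
def cer(cost: float, effect: float) -> float:
    """Cost-effectiveness ratio: cost per unit of effect."""
    return cost / effect


def icer(cost: float, effect: float, ref_cost: float, ref_effect: float) -> float:
    """Incremental cost-effectiveness ratio against a reference regimen."""
    return (cost - ref_cost) / (effect - ref_effect)


def apply_price_change(costs: dict, factor: float) -> dict:
    """Rescale every regimen's cost; factor=0.9 models a 10% price reduction."""
    return {name: c * factor for name, c in costs.items()}


# Hypothetical inputs: 6-week drug cost (CNY) and effectiveness rate (%).
costs = {"reference": 100.0, "comparator": 160.0}
effects = {"reference": 50.0, "comparator": 70.0}

base_icer = icer(costs["comparator"], effects["comparator"],
                 costs["reference"], effects["reference"])

reduced = apply_price_change(costs, 0.9)
reduced_icer = icer(reduced["comparator"], effects["comparator"],
                    reduced["reference"], effects["reference"])
reduced_cer = {name: cer(reduced[name], effects[name]) for name in reduced}
```

Because effectiveness is held fixed, a uniform 10% price cut lowers every ratio by 10% and leaves the ranking of regimens unchanged; the ranking could shift only if prices changed by different amounts across drugs.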

Table 4 Cost-Effectiveness Analysis and Sensitivity Analysis

Discussion

Schizophrenia, affecting 0.32% of the global population, is a major cause of disability worldwide.28,29 Schizophrenia significantly impacts both the health and economic well-being of patients, potentially reducing their lifespan by several decades, with a threefold higher mortality risk compared to the general population.30,31 Treatment options encompass medications and psychological interventions, and the average monthly cost for patients is over four times greater than for non-patients with similar demographics.32 Treatment responses vary significantly among individuals.1 Predicting the efficacy of schizophrenia treatments and assessing the costs associated with atypical antipsychotic treatments using real-world data from China therefore holds considerable clinical value.

The literature has become increasingly focused on how artificial intelligence can predict treatment outcomes for schizophrenia.33 Various studies have utilised models incorporating diverse input features to forecast these outcomes. Most of this research focuses on neuroimaging and clinical data, with relatively few studies examining the role of antipsychotic drugs.34,35 While traditional logistic regression is simple and easily interpretable, it falls short in capturing the nonlinear relationships between variables and outcomes.36 Machine learning is well suited to handling high-dimensional nonlinear data and producing highly accurate predictions.37 Over the last three years, many publications in psychiatry have focused on ML.38

The findings of this study should be interpreted within the context of a real-world inpatient cohort. This study used eight different ML algorithms to construct a predictive model of treatment effectiveness for patients with schizophrenia. We found that GBM obtained a high AUC in the training set and maintained strong predictive performance in the test set (AUC = 0.879, 95% CI: 0.833–0.924). Although GBM performed well in this work, the data set is class-imbalanced, which may compromise stability and generalization when the model is applied in broader settings.

The opaque nature of ML algorithms has sparked concerns within the academic community regarding their transparency and potential biases, which has restricted their widespread use in clinical settings.39 The SHAP algorithm was employed to elucidate the specific influence of each feature on efficacy prediction.40 The findings revealed that somatic comorbidities, including DM, CHD, and CVA, alongside smoking, liver function indicators, and blood lipid levels, were pivotal in forecasting treatment outcomes. It is well known that patients with schizophrenia have a high incidence of diseases such as anaemia, cardiomyopathy, DM, CVA, and CHD.41 The impact of these diseases on treatment prognosis is frequently underestimated.42 Many patients with schizophrenia have physical comorbidities that can interact and complicate treatment, and chronic somatic diseases may lead to psychiatric readmission.42,43 Patients with these comorbidities often suffer adverse reactions, potentially disrupting otherwise effective treatments.28 The initial severity of psychiatric symptoms is directly related to treatment outcome: high BPRS scores at admission tend to be detrimental to treatment response. Poor liver function can lead to accumulation of active metabolites of antipsychotic drugs, changing blood drug concentrations and affecting treatment. Patients with liver cirrhosis exhibit a significantly different clearance rate of quetiapine 48 hours after treatment compared with patients with normal liver function, so dosages should be increased carefully in individuals with liver disease.44 Research has shown that metabolic syndrome is an important predictor of schizophrenia treatment efficacy and correlates with treatment response, which is consistent with the role of dyslipidemia observed here.18 Smoking was also found to be a predictor of treatment response.
Patients with schizophrenia smoke more frequently and find it harder to quit than the general population.45,46 Smoking has been shown to be associated with decreased effectiveness of olanzapine therapy, depending on CYP1A1/1A2 genotype and CYP2D6 metabolic status.47 Constituents of tobacco smoke can increase the metabolic rate of antipsychotic drugs, leading to reduced concentrations of these drugs in smokers.48 The baseline severity of psychiatric symptoms and somatic disorders were found to be predictive factors for antipsychotic treatment outcomes.49–51 The GBM model in this study demonstrated good predictive capability, indicating its feasibility and adaptability for clinical use. SHAP-based interpretability analysis in this study showed that combined antipsychotic medication regimens were not strong predictors of treatment outcomes, consistent with prior meta-analytic evidence that combination therapy is not significantly superior to monotherapy in efficacy.52 This finding suggests that patients’ intrinsic pathophysiological characteristics may provide greater predictive value than the specific treatment regimen. Accordingly, the model serves not only as a prognostic tool but also as a methodological framework for empirically testing or challenging clinical hypotheses using real-world data. Clinicians can therefore use the online calculator to predict patients’ 6-week treatment responses and to tailor personalized treatment plans for those forecast to have poor responses, including individualized options such as pharmacogenetic testing.

Improving medication adherence is essential for patients with schizophrenia in order to prevent psychotic relapse and to achieve remission and recovery.28 This study ranked the efficacy of monotherapies as follows: Clozapine, Risperidone, Olanzapine, and Quetiapine. Clozapine’s relatively high incidence of adverse reactions has constrained its clinical use, and it is reserved as the first-line option only for treatment-resistant schizophrenia.53 A network meta-analysis of 32 APs showed that Clozapine, Amisulpride, Olanzapine, Risperidone, and Zotepine were more effective than other APs for the primary outcomes, with no difference in efficacy among the remaining drugs.54 The cost-effectiveness analysis showed that Risperidone tablets cost 0.12 CNY per unit of effect and were therefore the most economical choice, which reduces the burden on patients and families. These results are consistent with previous research showing that Amisulpride, Olanzapine, and Risperidone are the top three antipsychotic drugs in terms of cost-effectiveness.7 The cost-effectiveness analysis also showed that Clozapine tablets cost 40.95 CNY per unit of effect, which is the lowest cost for refractory patients.55 Beyond clinical effectiveness, clinicians should consider economic implications when treating patients with schizophrenia.56
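The "cost per unit of effect" figures above are cost-effectiveness ratios (cost divided by effect), and comparisons between options are usually expressed as an incremental cost-effectiveness ratio (difference in cost divided by difference in effect). The sketch below shows both calculations under stated assumptions: the drug labels, costs, and effect values are hypothetical placeholders, not the study's data, and the function names are ours.

```python
def cost_effectiveness_ratio(cost, effect):
    """Cost per unit of effect (C/E)."""
    if effect <= 0:
        raise ValueError("effect must be positive")
    return cost / effect


def icer(cost_new, effect_new, cost_ref, effect_ref):
    """Incremental cost-effectiveness ratio of a new option versus a reference."""
    delta_effect = effect_new - effect_ref
    if delta_effect == 0:
        raise ValueError("ICER is undefined when effects are equal")
    return (cost_new - cost_ref) / delta_effect


# Hypothetical inputs: cost in CNY, effect as response rate in percent.
drug_a = {"cost": 120.0, "effect": 80.0}
drug_b = {"cost": 300.0, "effect": 90.0}

cer_a = cost_effectiveness_ratio(drug_a["cost"], drug_a["effect"])
cer_b = cost_effectiveness_ratio(drug_b["cost"], drug_b["effect"])
icer_b_vs_a = icer(drug_b["cost"], drug_b["effect"],
                   drug_a["cost"], drug_a["effect"])

print(f"C/E drug A: {cer_a:.2f} CNY per unit of effect")
print(f"C/E drug B: {cer_b:.2f} CNY per unit of effect")
print(f"ICER of B versus A: {icer_b_vs_a:.2f} CNY per additional unit of effect")
```

With these placeholder numbers, drug A has the lower cost per unit of effect, while the ICER gives the extra cost of each additional unit of effect gained by choosing drug B instead.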

This study addresses a practical gap: when first-line antipsychotic agents demonstrate comparable efficacy at the population level, clinicians often lack quantitative tools to combine individual-level efficacy risk with treatment costs. We developed a framework that first identifies patients at high risk of treatment failure using ML models and then integrates an independent cost-effectiveness analysis, thereby offering clinicians a more cost-effective decision-support strategy grounded in individualized risk assessment and health-economics comparisons among options with similar efficacy. The clinical manifestations of this disorder are, however, highly heterogeneous: patients display substantial variation in symptom profiles, disease trajectories, and treatment responses. Such heterogeneity complicates efforts to capture the complex interactions among multiple variables and may reduce the accuracy and robustness of prediction models.

This study had three limitations. First, external generalizability at the research-design level requires further verification. As a single-center retrospective study, we depended on one institutional cohort; multi-center, large-sample prospective cohort studies are therefore needed for external validation. That step is indispensable before the conclusions can be broadly applied in clinical practice. Second, regarding assessment tools, most existing studies use the PANSS scale, while only a small number use the BPRS as the primary efficacy-evaluation instrument.57–59 We employed the BPRS for symptom assessment. Although the BPRS is widely used in clinical settings and yields readily accessible data, we did not include the PANSS for cross-validation, which may have limited the breadth of symptom-assessment dimensions. Subsequent studies should administer both scales concurrently and perform cross-validation across instruments to improve the comprehensiveness and accuracy of symptom assessment. Third, regarding outcome indicators, this study focused on short-term measures. Although these findings have direct implications for optimizing in-hospital treatment plans, the study lacked follow-up on patients’ long-term prognoses. Future studies should extend the follow-up period, incorporate long-term outcomes such as recurrence rate, social function recovery, and quality of life, and construct a more complete efficacy-evaluation system to provide more comprehensive evidence for the long-term standardized management of schizophrenia.

Conclusion

Eight ML models were constructed from the hospitalization data of schizophrenia patients in the hospital information system to develop an efficacy prediction model and an online calculator intended to improve efficacy prediction in clinical practice. A cost-effectiveness analysis was used to evaluate the drugs and determine which option best balances clinical benefit and medical expense.

Institutional Review Board Statement

The Ethics Committee of Daqing Third Hospital, Heilongjiang Province, approved this study (Ethics Approval Number: (2025) KY No. 04), which complies with the Declaration of Helsinki. The need for written informed consent was waived by the Ethics Review Committee of Daqing Third Hospital because of the retrospective design of the study. Data were sourced from patient medical records in the hospital information system, ensuring no interference with patient diagnosis or treatment. The research data excluded any personally identifiable information, fully safeguarding patient privacy and security.

Data Sharing Statement

Because the data concerning schizophrenia patients are sensitive, access to these data must be approved by the Third Hospital of Daqing City. Requests should be directed to the corresponding author.

Author Contributions

All authors made a significant contribution to the work reported, whether that is in the conception, study design, execution, acquisition of data, analysis and interpretation, or in all these areas; took part in drafting, revising or critically reviewing the article; gave final approval of the version to be published; have agreed on the journal to which the article has been submitted; and agree to be accountable for all aspects of the work.

Funding

This work was supported by the Innovation Fund Project for Graduate Students of Qiqihar Medical University (Grant No. QYYCX2024-23).

Disclosure

The authors declare no competing interests.

References

1. McCutcheon RA, Reis Marques T, Howes OD. Schizophrenia-an overview. JAMA Psychiatry. 2020;77(2):201–15. doi:10.1001/jamapsychiatry.2019.3360

2. Huang Y, Wang Y, Wang H, et al. Prevalence of mental disorders in China: a cross-sectional epidemiological study. Lancet Psychiatry. 2019;6(3):211–224. doi:10.1016/S2215-0366(18)30511-X

3. Kahn RS, Sommer IE, Murray RM, et al. Schizophrenia. Nat Rev Dis Primers. 2015;1:15067. doi:10.1038/nrdp.2015.67

4. Sato A, Moriyama T, Watanabe N, et al. Development and validation of a prediction model for rehospitalization among people with schizophrenia discharged from acute inpatient care. Front Psychiatry. 2023;14:1242918. doi:10.3389/fpsyt.2023.1242918

5. Rattehalli RD, Zhao S, Li BG, et al. Risperidone versus placebo for schizophrenia. Cochrane Database Syst Rev. 2016;12(12):CD006918. doi:10.1002/14651858.CD006918.pub3

6. Jauhar S, Johnstone M, McKenna PJ. Schizophrenia. Lancet. 2022;399(10323):473–486. doi:10.1016/S0140-6736(21)01730-X

7. Zhou J, Millier A, Aballea S, et al. Cost-effectiveness of ten commonly used antipsychotics in first-episode schizophrenia in the UK: economic evaluation based on a de novo discrete event simulation model. Br J Psychiatry. 2025;227(2):545–552. doi:10.1192/bjp.2024.251

8. Qiu H, He Y, Zhang Y, et al. Antipsychotic polypharmacy in the treatment of schizophrenia in China and Japan. Aust N Z J Psychiatry. 2018;52(12):1202–1212. doi:10.1177/0004867418805559

9. Xue B, Li D, Lu C, et al. Use of machine learning to develop and evaluate models using preoperative and intraoperative data to identify risks of postoperative complications. JAMA Network Open. 2021;4(3):e212240. doi:10.1001/jamanetworkopen.2021.2240

10. English M, Kumar C, Ditterline BL, et al. Machine learning in neuro-oncology, epilepsy, Alzheimer’s Disease, and Schizophrenia. Acta Neurochir Suppl. 2022;134:349–361. doi:10.1007/978-3-030-85292-4_39

11. Skorobogatov K, De Picker L, Wu CL, et al. Immune-based machine learning prediction of diagnosis and Illness State in Schizophrenia and bipolar disorder. Brain Behav Immun. 2024;122:422–432. doi:10.1016/j.bbi.2024.08.013

12. Speiser JL, Callahan KE, Houston DK, et al. Machine learning in aging: an example of developing prediction models for serious fall injury in older adults. J Gerontol A. 2021;76:647–654. doi:10.1093/gerona/glaa138

13. Hunter DJ, Holmes C. Where medical statistics meets artificial intelligence. N Engl J Med. 2023;389:1211–1219. doi:10.1056/NEJMra2212850

14. Verma AA, Murray J, Greiner R, et al. Implementing machine learning in medicine. CMAJ. 2021;193(34):E1351–E1357. doi:10.1503/cmaj.202434

15. Wiggerthale J, Reich C. Explainable machine learning in critical decision systems: ensuring safe application and correctness. AI. 2024;5:2864–2896. doi:10.3390/ai5040138

16. Hofmann AB, Schmid HM, Jabat M, et al. Utility and validity of the Brief Psychiatric Rating Scale (BPRS) as a transdiagnostic scale. Psychiatry Res. 2022;314:114659. doi:10.1016/j.psychres.2022.114659

17. Di Camillo F, Grimaldi DA, Cattarinussi G, et al. Magnetic resonance imaging-based machine learning classification of schizophrenia spectrum disorders: a meta-analysis. Psychiatry Clin Neurosci. 2024;78(12):732–743. doi:10.1111/pcn.13736

18. Kim EY, Kim J, Jeong JH, et al. Machine learning prediction model of the treatment response in schizophrenia reveals the importance of metabolic and subjective characteristics. Schizophr Res. 2025;275:146–155. doi:10.1016/j.schres.2024.12.018

19. Li Y, Zhang L, Zhang Y, et al. A random forest model for predicting social functional improvement in Chinese patients with Schizophrenia after 3 months of atypical antipsychotic monopharmacy: a cohort study. Neuropsychiatr Dis Treat. 2021;17:847–857. doi:10.2147/NDT.S280757

20. Del Fabro L, Bondi E, Serio F, et al. Machine learning methods to predict outcomes of pharmacological treatment in psychosis. Transl Psychiatry. 2023;13(1):75. doi:10.1038/s41398-023-02371-z

21. Lundberg SM, Erion G, Chen H, et al. From local explanations to global understanding with explainable AI for trees. Nat Mach Intell. 2020;2(1):56–67. doi:10.1038/s42256-019-0138-9

22. Dang T, Fermin A, Machizawa MG. oFVSD: a Python package of optimized forward variable selection decoder for high-dimensional neuroimaging data. Front Neuroinf. 2023;17:1266713. doi:10.3389/fninf.2023.1266713

23. Huang D, Gong L, Wei C, et al. An explainable machine learning-based model to predict intensive care unit admission among patients with community-acquired pneumonia and connective tissue disease. Respir Res. 2024;25(1):246. doi:10.1186/s12931-024-02874-3

24. Pantanowitz L, Pearce T, Abukhiran I, et al. Nongenerative artificial intelligence in medicine: advancements and applications in supervised and unsupervised machine learning. Mod Pathol. 2025;38(3):100680. doi:10.1016/j.modpat.2024.100680

25. Poldrack RA, Huckins G, Varoquaux G. Establishment of best practices for evidence for prediction: a review. JAMA Psychiatry. 2020;77:534–540. doi:10.1001/jamapsychiatry.2019.3671

26. Su Y, Li Y, Zhang H, et al. Machine learning model for prediction of permanent stoma after anterior resection of rectal cancer: a multicenter study. Eur J Surg Oncol. 2024;50(7):108386. doi:10.1016/j.ejso.2024.108386

27. Gao L, Wang GD, Yang XY, et al. Development of a risk prediction model for sepsis-related delirium based on multiple machine learning approaches and an online calculator. PLoS One. 2025;20(7):e0323831. doi:10.1371/journal.pone.0323831

28. Saboori Amleshi R, Ilaghi M, Rezaei M, et al. Predictive utility of artificial intelligence on schizophrenia treatment outcomes: a systematic review and meta-analysis. Neurosci Biobehav Rev. 2025;169:105968. doi:10.1016/j.neubiorev.2024.105968

29. Velligan DI, Rao S. The Epidemiology and Global Burden of Schizophrenia. J Clin Psychiatry. 2023;84(1):e21078. doi:10.4088/JCP.MS21078COM5

30. Guo LK, Su Y, Zhang YY, et al. Prediction of treatment response to antipsychotic drugs for precision medicine approach to schizophrenia: randomized trials and multiomics analysis. Mil Med Res. 2023;10(1):24. doi:10.1186/s40779-023-00459-7

31. Gatov E, Rosella L, Chiu M, et al. Trends in standardized mortality among individuals with schizophrenia, 1993–2012: a population-based, repeated cross-sectional study. CMAJ. 2017;189(37):1177–1187. doi:10.1503/cmaj.161351

32. Barbosa WB, Costa JO, de Lemos LLP, et al. Costs in the treatment of schizophrenia in adults receiving atypical antipsychotics: an 11-year cohort in Brazil. Appl Health Econ Health Policy. 2018;16(5):697–709. doi:10.1007/s40258-018-0408-4

33. Chakravarty MM. Guest editorial: special issue on machine learning in schizophrenia. Schizophr Res. 2019;214:1–2. doi:10.1016/j.schres.2019.10.044

34. Tarcijonas G, Sarpal DK. Neuroimaging markers of antipsychotic treatment response in schizophrenia: an overview of magnetic resonance imaging studies. Neurobiol Dis. 2019;131:104209. doi:10.1016/j.nbd.2018.06.021

35. Koutsouleris N, Kahn RS, Chekroud AM, et al. Multisite prediction of 4-week and 52-week treatment outcomes in patients with first-episode psychosis: a machine learning approach. Lancet Psychiatry. 2016;3(10):935–946. doi:10.1016/S2215-0366(16)30171-7

36. Grant SW, Collins GS, Nashef SAM. Statistical Primer: developing and validating a risk prediction model. Eur J Cardiothorac Surg. 2018;54(2):203–208. doi:10.1093/ejcts/ezy180

37. Choi RY, Coyner AS, Kalpathy-Cramer J, et al. Introduction to machine learning, neural networks, and deep learning. Transl Vis Sci Technol. 2020;9(2):14. doi:10.1167/tvst.9.2.14

38. Tandon N, Tandon R. Using machine learning to explain the heterogeneity of schizophrenia: realizing the promise and avoiding the hype. Schizophr Res. 2019;214:70–75. doi:10.1016/j.schres.2019.08.032

39. Castelvecchi D. Can we open the black box of AI? Nature. 2016;538(7623):20–23. doi:10.1038/538020a

40. Yue L, Chen WG, Liu SC, et al. An explainable machine learning based prediction model for Alzheimer’s disease in China longitudinal aging study. Front Aging Neurosci. 2023;15:1267020. doi:10.3389/fnagi.2023.1267020

41. Attar R, Valentin JB, Freeman P, et al. The effect of schizophrenia on major adverse cardiac events, length of hospital stay, and prevalence of somatic comorbidities following acute coronary syndrome. Eur Heart J Qual Care Clin Outcomes. 2019;5(2):121–126. doi:10.1093/ehjqcco/qcy055

42. Tian YE, Di Biase MA, Mosley PE, et al. Evaluation of brain-body health in individuals with common neuropsychiatric disorders. JAMA Psychiatry. 2023;80(6):567–576. doi:10.1001/jamapsychiatry.2023.0791

43. Filipcic I, Simunovic Filipcic I, Ivezic E, et al. Chronic physical illnesses in patients with schizophrenia spectrum disorders are independently associated with higher rates of psychiatric rehospitalization; a cross-sectional study in Croatia. Eur Psychiatry. 2017;43:73–80. doi:10.1016/j.eurpsy.2017.02.484

44. Thyrum PT, Wong YW, Yeh C. Single-dose pharmacokinetics of quetiapine in subjects with renal or hepatic impairment. Prog Neuropsychopharmacol Biol Psychiatry. 2000;24(4):521–533. doi:10.1016/s0278-5846(00)00090-7

45. Mann-Wrobel MC, Bennett ME, Weiner EE, et al. Smoking history and motivation to quit in smokers with schizophrenia in a smoking cessation program. Schizophr Res. 2011;126(1–3):277–283. doi:10.1016/j.schres.2010.10.030

46. De Leon J, Diaz FJ. A meta-analysis of worldwide studies demonstrates an association between schizophrenia and tobacco smoking behaviors. Schizophr Res. 2005;76(2–3):135–157. doi:10.1016/j.schres.2005.02.010

47. Djordjevic N, Radmanovic B, Cukic J, et al. Cigarette smoking and heavy coffee consumption affecting response to olanzapine: the role of genetic polymorphism. World J Biol Psychiatry. 2020;21(1):29–52. doi:10.1080/15622975.2018.1548779

48. Veldhuizen S, Behal A, Zawertailo L, et al. Outcomes among people with schizophrenia participating in general-population smoking cessation treatment: an observational study. Can J Psychiatry. 2023;68(5):359–369. doi:10.1177/07067437231155693

49. Soldatos RF, Cearns M, Nielsen MØ, et al. Prediction of early symptom remission in two independent samples of first-episode psychosis patients using machine learning. Schizophr Bull. 2022;48(1):122–133. doi:10.1093/schbul/sbab107

50. Fonseca de Freitas D, Kadra-Scalzo G, Agbedjro D, et al. Using a statistical learning approach to identify sociodemographic and clinical predictors of response to clozapine. J Psychopharmacol. 2022;36(4):498–506. doi:10.1177/02698811221078746

51. Anderson JP, Icten Z, Alas V, et al. Comparison and predictors of treatment adherence and remission among patients with schizophrenia treated with paliperidone palmitate or atypical oral antipsychotics in community behavioral health organizations. BMC Psychiatry. 2017;17(1):346. doi:10.1186/s12888-017-1507-8

52. Højlund M, Köhler-Forsberg O, Gregersen AT, et al. Prevalence, correlates, tolerability-related outcomes, and efficacy-related outcomes of antipsychotic polypharmacy: a systematic review and meta-analysis. Lancet Psychiatry. 2024;11(12):975–989. doi:10.1016/S2215-0366(24)00314-6

53. Albitar O, Harun SN, Sheikh Ghadzi SM. Semi-physiological pharmacokinetic model of clozapine and norclozapine in healthy, non-smoking volunteers: the impact of race and genetics. CNS Drugs. 2024;38(7):571–581. doi:10.1007/s40263-024-01092-1

54. Huhn M, Nikolakopoulou A, Schneider-Thoma J, et al. Comparative efficacy and tolerability of 32 oral antipsychotics for the acute treatment of adults with multi-episode schizophrenia: a systematic review and network meta-analysis. Lancet. 2019;394(10202):939–951. doi:10.1016/S0140-6736(19)31135-3

55. Basu A. Cost-effectiveness analysis of pharmacological treatments in schizophrenia: critical review of results and methodological issues. Schizophr Res. 2004;71(2–3):445–462. doi:10.1016/j.schres.2004.02.012

56. Bobes J, Cañas F, Rejas J, et al. Economic consequences of the adverse reactions related with antipsychotics: an economic model comparing tolerability of ziprasidone, olanzapine, risperidone, and haloperidol in Spain. Prog Neuropsychopharmacol Biol Psychiatry. 2004;28(8):1287–1297. doi:10.1016/j.pnpbp.2004.06.017

57. Homan P, Argyelan M, DeRosse P, et al. Structural similarity networks predict clinical outcome in early-phase psychosis. Neuropsychopharmacology. 2019;44(5):915–922. doi:10.1038/s41386-019-0322-y

58. Blessing EM, Murty VP, Zeng B, et al. Anterior hippocampal-cortical functional connectivity distinguishes antipsychotic naïve first-episode psychosis patients from controls and may predict response to second-generation antipsychotic treatment. Schizophr Bull. 2020;46(3):680–689. doi:10.1093/schbul/sbz076

59. Smucny J, Davidson I, Carter CS. Comparing machine and deep learning-based algorithms for prediction of clinical improvement in psychosis with functional magnetic resonance imaging. Hum Brain Mapp. 2021;42(4):1197–1205. doi:10.1002/hbm.25286

Creative Commons License © 2026 The Author(s). This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at https://www.dovepress.com/terms and incorporate the Creative Commons Attribution - Non Commercial (unported, 4.0) License. By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms.