Contrast-Enhanced CT Shell Features and Deep Learning for Predicting Early Transarterial Chemoembolization Refractoriness in Hepatocellular Carcinoma

Qinglong Zhao; Wei Zhang; Zhuo Wang; Xingyuan Liu; Xinyu He; Jiayi Yang; Liming Cui; Xiaoping Leng

doi:10.2147/JHC.S605522

Journal of Hepatocellular Carcinoma, Volume 13

Original Research

Received 2 March 2026

Accepted for publication 14 April 2026

Published 21 April 2026 Volume 2026:13 605522

DOI https://doi.org/10.2147/JHC.S605522

Checked for plagiarism Yes

Review by Single anonymous peer review

Peer reviewer comments 2

Editor who approved publication: Prof. Dr. Imam Waked

Qinglong Zhao,¹ Wei Zhang,² Zhuo Wang,³ Xingyuan Liu,⁴ Xinyu He,¹ Jiayi Yang,¹ Liming Cui,¹ Xiaoping Leng³,⁵,⁶

¹Department of Interventional Radiology, The Second Affiliated Hospital of Harbin Medical University, Harbin, People’s Republic of China; ²Department of Interventional Vascular Surgery, General Hospital of Beidahuang Group, Harbin, People’s Republic of China; ³Department of Ultrasound, The Second Affiliated Hospital of Harbin Medical University, Harbin, People’s Republic of China; ⁴Department of Radiology, The Second Affiliated Hospital of Harbin Medical University, Harbin, People’s Republic of China; ⁵Ultrasound Molecular Imaging Joint Laboratory of Heilongjiang Province (International Cooperation), Harbin, People’s Republic of China; ⁶State Key Laboratory of Frigid Zone Cardiovascular Diseases, Ministry of Science and Technology, Harbin, People’s Republic of China

Correspondence: Xiaoping Leng, Department of Ultrasound, The Second Affiliated Hospital of Harbin Medical University, Harbin, People’s Republic of China, Email [email protected]

Purpose: The aim of this study was to develop and validate a predictive model for early refractoriness to transarterial chemoembolization (TACE)—termed early TACE refractoriness (ETR)—in patients with hepatocellular carcinoma (HCC). The model integrates contrast-enhanced CT (CECT) shell features (annular features at the tumor-liver parenchyma interface) with the Vision-Mamba (Vim) architecture, known for its efficiency in handling high-resolution medical images.
Patients and Methods: This was a two-center retrospective study. Patients from center 1 were divided into the training set (n=254) and validation set (n=108), while patients from center 2 were used as the testing set (n=75). A joint model was constructed to predict ETR, and four Vim models without clinical features and 14 machine learning models based on clinical features were also developed for comparison. Model performance was evaluated by accuracy, area under the curve (AUC), calibration curve, sensitivity, specificity, decision curve analysis (DCA), and the DeLong test. SHapley Additive exPlanations (SHAP) analysis was used to explain the predictions.
Results: The combined model based on the Vim framework outperformed the other models. The AUCs of the combined model in the training, validation, and test sets were 0.959, 0.956, and 0.942, respectively. The calibration curve and DCA supported the clinical utility of the combined model. SHAP provided a visual interpretation of the model.
Conclusion: The Vim-based model integrating CECT and shell features shows promise for ETR prediction, offering a preliminary stratification tool. However, it remains a promising step rather than a definitive solution, requiring prospective validation due to the retrospective design and limited validation.

Keywords: hepatocellular carcinoma, transarterial chemoembolization refractoriness, vision-mamba, contrast-enhanced CT, shell feature

Introduction

Studies have shown that patients with TACE refractoriness have a poor survival prognosis.^1,2 The determination of TACE refractoriness is made after a patient has undergone at least two to three TACE procedures. However, repeated ineffective TACE not only further compromises liver function and causes patient suffering, but also compromises patients’ eligibility for subsequent therapies. Early identification of patients at high risk of ETR, combined with the timely initiation of additional therapies such as systemic treatment, could lead to improved tumor response, a reduced incidence of vascular invasion or extrahepatic metastasis, and more favorable survival outcomes.

Currently, there is a critical clinical need for an effective predictive model for ETR. Although several predictive models for TACE refractoriness have been developed based on clinical information and radiomics, they have not achieved widespread clinical acceptance or application.^3–5 These clinical-only scoring systems are convenient for clinical use, but their ability to mine imaging features reflecting tumor heterogeneity is limited. Radiomics demonstrates predictive value for response to TACE by quantitatively extracting whole-tumor features from imaging data.³ However, radiomics is constrained by two key limitations: its dependence on target volume delineation and its representation of features by predefined values, both of which can lead to the loss of complex details. In contrast, end-to-end deep learning methods facilitate holistic processing from input to output, enabling models to learn relevant features directly from the input data. Previous studies have also demonstrated that features captured by deep learning algorithms can effectively predict treatment response, recurrence, distant metastasis and survival outcomes in patients with HCC.^6–8 On the other hand, the definition of TACE refractoriness in HCC encompasses not only the intra-tumoral response to TACE but also vascular invasion and distant metastasis, reflecting the tumor’s invasive and metastatic properties. The tumor-liver parenchyma interface is the region of most active tumor growth. Research has found that the interaction between the tumor and its microenvironment is correlated with local invasion and distant metastasis.⁹ Typical changes in cancer cells at the tumor boundary, such as epithelial-to-mesenchymal transition, manifest as dual alterations in both cell morphology and function. This leads to decreased intercellular adhesion and enhanced migratory properties. 
Some of these cancer cells can acquire metastatic, cancer stem cell-like characteristics, with increased heterogeneity, invasiveness, and therapeutic resistance. This suggests that the interface between the tumor and liver tissue may provide phenotypic information related to tumor invasion and metastasis, thereby making the development of predictive models possible. Extracting quantitative imaging features associated with the cellular phenotype of the tumor-liver parenchyma interface may allow for a quantitative reflection of the pathological heterogeneity at the tumor margin. The predictive value of features extracted from the tumor-normal tissue interface has been successfully demonstrated. Specifically, tumor shell features (consisting of outer voxels around the tumor boundary) have proven effective in forecasting distant metastasis post-treatment in patients with non-small cell lung cancer and cervical cancer.¹⁰ Zhang et al. further support this potential by showing that peritumoral radiomics features are strong predictors of peritumoral microvascular invasion and survival outcomes in HCC.^11,12 Accordingly, we sought to develop an ETR prediction model integrating the Vim deep learning method with CECT images and shell features.

Materials and Methods

Patient Enrollment

This study was conducted in accordance with the Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD) guidelines. A total of 437 patients from 2 centers were enrolled: (1) Center 1, 362 patients with data from November 2016 to December 2024, who were randomly divided into a training set (70% of patients) and a validation set (30% of patients); (2) Center 2, 75 patients admitted from January 2019 to December 2023, who were used as the external test set (Figure 1). Inclusion criteria: (1) HCC diagnosed according to the European Association for the Study of the Liver (EASL) criteria; (2) BCLC stage A or B (patients with BCLC stage A were those ineligible for surgical resection or radiofrequency ablation); (3) age ≥ 18 years; (4) abdominal CECT performed within 1 month before the initial TACE procedure; (5) Child-Pugh class A or B; (6) Eastern Cooperative Oncology Group (ECOG) performance status of 0 or 1. Exclusion criteria: (1) poor-quality preoperative CECT images; (2) concomitant other malignancies; (3) a history of prior HCC-related treatment; (4) incomplete clinical data.

Figure 1 Flowchart of patients’ enrollment. Light blue boxes (top and middle) represent the original data and patient inclusion process. The light red box (middle) indicates the exclusion criteria. Light green boxes (bottom) denote the final cohort groupings.

Abbreviations: HCC, hepatocellular carcinoma; TACE, transarterial chemoembolization.

TACE Procedure and Follow-Up

TACE procedures were performed by interventional radiologists with over 15 years of experience and their assistants. Conventional TACE (cTACE) was performed on all patients. For cTACE, a mixed emulsion of a chemotherapeutic agent (epirubicin, 10–40 mg) and lipiodol (5–10 mL) was first injected, followed by the injection of an appropriate amount of gelatin sponge particles to embolize the tumor-feeding arteries.

Follow-up, including non-contrast chest CT, upper abdominal CECT or contrast-enhanced MRI, and alpha-fetoprotein (AFP) levels, was conducted 4 to 8 weeks after the initial TACE. Subsequent follow-ups using the same modalities were performed 1 to 3 months after each subsequent TACE. Based on the results of each re-evaluation, the need for repeat TACE or a transition to another treatment plan (such as combination with systemic therapy) was assessed.

ETR Evaluation

The efficacy of TACE for each patient was evaluated according to the modified Response Evaluation Criteria in Solid Tumors (mRECIST).¹³ For this study, we defined ETR as events occurring within three months of the first two consecutive TACE procedures. For patients with BCLC stage A/B, refractoriness was met if any of the following were observed: (i) residual activity in target intrahepatic lesions exceeding 50% of the baseline, or the emergence of new intrahepatic lesions at each follow-up; (ii) new vascular invasion; (iii) new extrahepatic metastasis; or (iv) a continuous rise in post-procedure AFP.
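The four criteria above can be read as a simple decision rule. The following is a minimal sketch of that rule, not the authors' implementation: the `FollowUp` record and its field names are hypothetical, criteria (ii) and (iii) are assumed to count when seen at any follow-up, "continuous rise" in AFP is interpreted as strictly increasing values from baseline, and the three-month time window is not modeled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FollowUp:
    """One post-TACE assessment. All field names are hypothetical."""
    viable_fraction: float             # viable target tumor relative to baseline
    new_intrahepatic_lesion: bool
    new_vascular_invasion: bool
    new_extrahepatic_metastasis: bool
    afp: float                         # post-procedure AFP level


def is_etr(baseline_afp: float, followups: List[FollowUp]) -> bool:
    """Return True if any of criteria (i)-(iv) is met."""
    if len(followups) < 2:
        raise ValueError("ETR is assessed after two consecutive TACE procedures")
    # (i) >50% residual activity or new intrahepatic lesions at each follow-up
    intrahepatic = all(f.viable_fraction > 0.5 or f.new_intrahepatic_lesion
                       for f in followups)
    # (ii) and (iii): assumed to count if observed at any follow-up
    vascular = any(f.new_vascular_invasion for f in followups)
    extrahepatic = any(f.new_extrahepatic_metastasis for f in followups)
    # (iv) "continuous rise" read as strictly increasing values from baseline
    afp = [baseline_afp] + [f.afp for f in followups]
    afp_rising = all(later > earlier for earlier, later in zip(afp, afp[1:]))
    return intrahepatic or vascular or extrahepatic or afp_rising
```

A patient whose viable tumor fraction stays above 0.5 at both assessments would be labeled ETR under this reading, whereas a patient with shrinking viable tumor, no new lesions, and falling AFP would not.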

Image Acquisition and Processing

Before TACE, all patients underwent CECT on a 64-slice CT scanner according to the standard liver scan protocol. The detailed scanning parameters are shown in Table S1. During preprocessing, all CECT images were density-normalized and resampled to an isotropic resolution of 1 × 1 × 1 mm³. HCC lesions before the initial TACE treatment were defined as regions of interest (ROI). These ROIs were manually delineated on the images by two independent physicians, each with over five years of clinical experience, using 3D Slicer software. The ROIs were further reviewed and manually refined by a senior physician with over 20 years of experience in the field. Any discrepancies were resolved through group discussion. The ROIs were then cropped from the CT images to isolate the tumor regions for analysis (Figure 2a).

Figure 2 (a) Overview of prediction model construction, evaluation, and interpretation. First, the CT images were preprocessed to obtain the ROI. The red dashed rectangle indicates the HCC lesion on the CT image. The red solid rectangle shows a locally magnified view of the HCC after boundary delineation. Solid elliptical contours in red, blue, and green represent the HCC ROIs. The clinical features were also collected. The ellipsis (…) indicates that additional clinical factors (e.g., age, sex, tumor location) were collected but omitted here due to space constraints. Second, cropped ROI images, enhanced CT, radiomics feature maps, clinical features, and shell features were input to the Vision-Mamba model. Additionally, selected clinical features were input into other machine learning models. The plus sign inside a circle (⊕) represents integrating selected clinical features with the enhancement model to develop the combined model. Finally, the prediction models were evaluated using ROC curves, calibration curves, decision curve analysis, and DeLong tests, with SHAP value maps visualizing the regions of CT images focused on by the Vision-Mamba model. (b) The dual-pathway Vision-Mamba model architecture. The clinical features and shell features were concatenated with the features extracted by the dual-branch Vision-Mamba at the concatenation layer. The red dashed rectangle indicates the HCC lesion on the CT image.

Abbreviations: AFP, alpha-fetoprotein; BCLC, Barcelona Clinic Liver Cancer; SHAP, Shapley Additive Explanations; ETR, early transarterial chemoembolization refractoriness; NETR, non-early transarterial chemoembolization refractoriness.

Clinical Feature Selection and Machine Learning Model Development

Clinical factors were retrospectively collected using the PACS system. In the training set, univariate and multivariate logistic regression analyses were performed, and clinical features associated with the occurrence of ETR were screened using a p-value < 0.05. A binary classification machine learning model was then built using the screened clinical features (Figure 2a).

Voxel-Level Radiomics Feature Map Extraction

Voxel-level radiomics feature maps were generated using a sliding-window approach. Specifically, for each voxel within the lesion bounding box, a local neighborhood patch (kernel size: 5×5×5 voxels) centered on that voxel was extracted, and a predefined set of first-order and texture-based radiomic descriptors (including GLCM energy, entropy, contrast, and homogeneity, as well as first-order statistics such as mean, variance, and kurtosis) was computed within this local window. For each descriptor, the resulting scalar value was assigned back to the central voxel, yielding a 3D feature map of the same spatial dimensions as the input volume (resampled to 32×64×64). This process was repeated independently for each selected radiomic feature, producing 6 feature maps in total, where 6 denotes the number of features retained after dimensionality reduction. Each feature map was subsequently intensity-normalized (zero mean, unit variance) prior to model input.
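The sliding-window procedure can be illustrated with a small pure-Python sketch. It is not the authors' pipeline: only two first-order descriptors (local mean and variance) stand in for the full descriptor set, GLCM texture features are omitted, windows are assumed to be clipped at the volume border, and the function names are hypothetical. A practical implementation would use an array library and a radiomics toolkit.

```python
from statistics import fmean, pvariance


def local_feature_map(volume, stat, kernel=5):
    """Apply `stat` to the kernel^3 neighborhood of every voxel.

    `volume` is a nested list indexed as volume[z][y][x]. Windows are
    clipped at the border (an assumption; padding is another option).
    """
    r = kernel // 2
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    out = [[[0.0] * nx for _ in range(ny)] for _ in range(nz)]
    for z in range(nz):
        for y in range(ny):
            for x in range(nx):
                patch = [volume[k][j][i]
                         for k in range(max(0, z - r), min(nz, z + r + 1))
                         for j in range(max(0, y - r), min(ny, y + r + 1))
                         for i in range(max(0, x - r), min(nx, x + r + 1))]
                # The scalar is assigned back to the central voxel.
                out[z][y][x] = stat(patch)
    return out


def zscore(feature_map):
    """Normalize a feature map to zero mean and unit variance."""
    flat = [v for plane in feature_map for row in plane for v in row]
    mu = fmean(flat)
    sd = pvariance(flat, mu) ** 0.5 or 1.0   # guard against constant maps
    return [[[(v - mu) / sd for v in row] for row in plane]
            for plane in feature_map]
```

Calling `zscore(local_feature_map(volume, fmean))` and `zscore(local_feature_map(volume, pvariance))` would give two normalized maps with the same spatial dimensions as `volume`, one per descriptor.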

Vim Deep Learning Model Construction and Evaluation

To fully exploit spatial complexity and high-dimensional feature interactions, three phases of CECT images were input into a multi-input three-dimensional deep neural network based on the Vim architecture (Figure 2b). The Vim model is a recently proposed efficient visual representation architecture. Its core advantage lies in effectively processing a large number of high-resolution, multi-layered sequence features in medical images using a “bidirectional state space model (SSM)”, while achieving spatial perception through position embedding. Compared with the traditional transformer self-attention mechanism, this method processes long sequences with linear complexity, making it more efficient and suitable for medical scenarios.¹⁴

The ROI was first cropped from each of the three CECT phases. The images from the different CT phases were then concatenated along the relevant dimension so that the combined phases formed a single tensor containing the full multimodal information. Voxel-wise radiomics feature maps were then derived from this tensor and fed into the model as a separate input. This approach retains the spatial consistency of the data and enables the model to exploit the integrated multimodal information fully.

The model input is a 4D tensor of shape (X+1, 32, 64, 64), where X is the number of features selected from the voxel-level radiomics feature maps and “+1” corresponds to the contrast-enhanced CT image. Each input is first processed by a structurally equivalent 3D convolutional block (3D Conv Block). This block consists of convolution, batch normalization (BatchNorm), an activation function (e.g., ReLU), and spatial down-sampling; it extracts low-level local spatial features and reduces each input to a unified spatial scale (each output is a tensor of size 128×8×16×16). During this process, the convolutional kernels adaptively learn to capture texture and intensity patterns from the various channels, akin to clinical “visual interpretation” but in a higher-dimensional setting. Because each input passes through its own initial convolution, differences in high-resolution spatial structure are maintained and the loss of hierarchical information that results from low-dimensional summarization is avoided. The convolutional features from all inputs are then concatenated along the channel dimension and fed into the Vim Block for global modelling. Vim differs fundamentally from Transformers in that it uses a state-space model instead of self-attention for global modelling. The block first projects the input feature sequence and then derives the SSM parameters from it: the input coupling B, the output coupling C, and the time step dt. The resulting features are then aggregated by global attention and pooling. Finally, the multi-path features are concatenated with the clinical and shell feature vectors and input into a fully connected layer that outputs a binary classification of non-early TACE refractoriness versus ETR. To make the loss function more robust to class imbalance, weighted binary cross-entropy (BCEWithLogitsLoss) was used.
SHAP was also used to produce saliency maps showing that the model concentrates its attention on tumor lesion areas. Additionally, we developed three single-path Vim models (using single-phase CECT images) and one dual-path Vim model (using three-phase CECT images, radiomics feature maps and shell features) for comparison. Shell features were calculated from the best performing phase of the three single-path Vim models.
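To make the class-weighted loss concrete, the sketch below reimplements the per-sample formula used by PyTorch's `BCEWithLogitsLoss` with a `pos_weight` term in plain Python. The weight value is an assumption: the paper does not report it, and the ratio of negative to positive training cases (148/106, from the 254 training patients with 106 ETR cases) is only one common choice.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def weighted_bce_with_logits(logits, targets, pos_weight):
    """Mean of -[w * y * log(sigmoid(x)) + (1 - y) * log(1 - sigmoid(x))].

    Uses log(sigmoid(x)) = -softplus(-x) and
    log(1 - sigmoid(x)) = -softplus(x).
    """
    total = 0.0
    for x, y in zip(logits, targets):
        total += pos_weight * y * softplus(-x) + (1 - y) * softplus(x)
    return total / len(logits)


# Assumed weighting: negatives / positives in the training set.
POS_WEIGHT = 148 / 106
```

With `pos_weight` above 1, a missed ETR case contributes more to the loss than a missed non-ETR case of equal confidence, which counteracts the lower ETR prevalence in the training set.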

During model training, the data from center 1 was randomly split at a ratio of 7:3 into training and validation sets, while the data from center 2 formed the external test dataset. The Adam optimizer was used with an initial learning rate of 1e-4. This was reduced by half at regular intervals during training, based on the validation performance, to ensure stable and accurate convergence. Several data augmentation methods were implemented to enhance the model’s generalization capability and avoid overfitting. In particular, the training samples were randomly flipped and rotated with a 50% probability, and a random scaling ratio between 0.9 and 1.1 was chosen. We also introduced elastic deformation (Gaussian elastic deformation parameters are typically set to alpha = 15 and sigma = 3) to increase the diversity of the training samples. These methods effectively broadened the data distribution and enhanced the model’s robustness to diverse inputs. Furthermore, batch normalization was introduced at appropriate locations in the network structure using standard parameters (momentum 0.1, epsilon 1e-5) to accelerate convergence and stabilize the training process. Dropout (with a drop rate of 0.5) was also applied in the fully connected layer to further inhibit co-adaptation. The model was trained with early stopping, with the number of training epochs adjusted adaptively according to the loss on the validation set, to produce a model that generalizes better.
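The interplay of learning-rate halving and early stopping can be sketched by replaying a sequence of validation losses. This is an illustrative reading, not the reported configuration: the patience values and the function name are hypothetical, and the halving is assumed to be triggered by a validation plateau.

```python
def replay_schedule(val_losses, lr=1e-4, factor=0.5,
                    lr_patience=5, stop_patience=15):
    """Return (stop_epoch, final_lr, best_epoch) for a validation-loss trace.

    The learning rate is multiplied by `factor` after `lr_patience` epochs
    without improvement; training stops after `stop_patience` such epochs.
    Both patience values are assumptions, not values from the paper.
    """
    best, best_epoch = float("inf"), -1
    since_best = since_lr_drop = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch = loss, epoch
            since_best = since_lr_drop = 0
            continue
        since_best += 1
        since_lr_drop += 1
        if since_lr_drop >= lr_patience:
            lr *= factor              # halve the learning rate on a plateau
            since_lr_drop = 0
        if since_best >= stop_patience:
            return epoch, lr, best_epoch   # early stop; keep best_epoch weights
    return len(val_losses) - 1, lr, best_epoch
```

In a real training loop the weights from `best_epoch` would be restored before evaluation on the external test set.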

Multi-Modal Fusion and Channel Concatenation Dimensions

In the dual-branch architecture, multi-modal fusion is performed at two levels. First, within the imaging branch, the three-phase CT volumes (plain, arterial, and venous phases) are concatenated along the channel dimension prior to the 3D Conv Block, forming a tensor of shape (3, 32, 64, 64) as input. The X voxel-level radiomics feature maps are similarly organized as an X-channel volume of identical spatial dimensions. These two streams are processed by separate but structurally identical 3D Conv Blocks, each producing an output tensor of shape (128, 8, 16, 16) after three stages of convolution-BN-ReLU-downsampling. The two convolutional outputs are then concatenated along the channel dimension to yield a (256, 8, 16, 16) tensor, which is subsequently flattened and fed into the Vision-Mamba Block for global sequence modeling. Second, after the Mamba Block produces a global feature vector, it is concatenated with the shell feature vector (length 256, after a two-layer MLP projection) and the selected clinical feature vector along the feature dimension, forming the final fused representation (total length ~640) that is passed to the fully connected classification head.
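The shape bookkeeping described above can be checked with a few lines of arithmetic. This sketch rests on two assumptions. First, the stated output shape implies an overall spatial reduction by a factor of 4 per axis, and how that reduction is distributed over the three convolution stages is not specified. Second, the fused tensor is assumed to be flattened into one token per spatial position. The function names are hypothetical.

```python
def conv_block_shape(in_shape, out_channels=128, spatial_factor=4):
    """Output shape (C, D, H, W) of one 3D Conv Block.

    Assumes an overall spatial reduction of `spatial_factor` per axis,
    as implied by (32, 64, 64) -> (8, 16, 16).
    """
    _, d, h, w = in_shape
    return (out_channels, d // spatial_factor,
            h // spatial_factor, w // spatial_factor)


def fused_shapes(n_radiomics_maps):
    """Shapes after each branch, after concatenation, and after flattening."""
    ct = conv_block_shape((3, 32, 64, 64))                    # three CT phases
    rad = conv_block_shape((n_radiomics_maps, 32, 64, 64))    # X feature maps
    assert ct[1:] == rad[1:], "branches must share spatial size"
    fused = (ct[0] + rad[0],) + ct[1:]                        # channel concat
    n_tokens = fused[1] * fused[2] * fused[3]
    # Assumed flattening: one token per spatial position, channel-sized.
    return ct, fused, (n_tokens, fused[0])
```

For the six retained radiomics maps, `fused_shapes(6)` gives a fused tensor of (256, 8, 16, 16) and, under the flattening assumption, a sequence of 2048 tokens with 256 features each.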

Shell Feature Generation

This study uses CECT data of HCC and manually drawn 3D masks as input to construct a shell feature map that highlights the heterogeneity of the tumor-liver parenchyma interface. First, the images and masks are resampled to a unified voxel spacing and matrix size, and a consistent interpolation strategy and intensity normalization are used to reduce cross-device and reconstruction differences. Then, on each axial slice containing the tumor, a fixed-size grayscale patch centered on the tumor centroid of that slice is extracted, and morphological dilation and erosion are performed on the corresponding binary mask. The difference between the two yields a ring-shaped boundary band. This boundary band is then element-wise multiplied with the normalized grayscale patch to preserve the tumor’s peripheral voxels and suppress the interior and background. To preserve the spatial correspondence between slices, all slice patches are aligned at their centroids and then resampled to a uniform size and orientation. Zero-padding is adopted when the tumor lies close to the edge of the field of view to minimize clipping bias. The boundary-enhanced patches obtained from each slice are accumulated along the head-to-foot direction to form a single 2D shell map, reflecting the overall edge metabolism and morphological complexity of the tumor. Finally, the shell map is resized and re-normalized for intensity and used for subsequent modeling. Throughout the process, a fixed patch size, fixed morphological structuring elements, and a fixed accumulation strategy were maintained for each case, and visual review was conducted to control quality, making the feature reproducible and comparable in multi-case and multi-center conditions (Figure 3).
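The core of the shell computation (dilation minus erosion, masking, and accumulation across slices) can be sketched in pure Python. This is a simplified illustration, not the authors' code: a square structuring element of radius 1 is assumed, pixels outside the patch are treated as background, and centroid alignment, resizing, and re-normalization are omitted. Function names are hypothetical.

```python
def _morph(mask, r, op):
    """Apply `op` (max = dilation, min = erosion) over a (2r+1)^2 window."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [mask[j][i] if 0 <= j < h and 0 <= i < w else 0
                      for j in range(y - r, y + r + 1)
                      for i in range(x - r, x + r + 1)]
            out[y][x] = op(window)
    return out


def shell_band(mask, r=1):
    """Ring-shaped boundary band: dilation minus erosion of a binary mask."""
    dilated, eroded = _morph(mask, r, max), _morph(mask, r, min)
    return [[d - e for d, e in zip(drow, erow)]
            for drow, erow in zip(dilated, eroded)]


def shell_map(patches, masks, r=1):
    """Accumulate band-masked intensity patches over slices into one 2D map."""
    h, w = len(masks[0]), len(masks[0][0])
    acc = [[0.0] * w for _ in range(h)]
    for patch, mask in zip(patches, masks):
        band = shell_band(mask, r)
        for y in range(h):
            for x in range(w):
                acc[y][x] += band[y][x] * patch[y][x]  # keep periphery only
    return acc
```

For a 3×3 square mask inside a 7×7 patch, the band under these assumptions covers the 24 pixels between the 5×5 dilated square and the single eroded center pixel, so interior and background intensities do not contribute to the accumulated map.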

Figure 3 (a) ETR, early TACE refractoriness case. (b) NETR, non-early TACE refractoriness case. In each case, shell feature maps (third column) were calculated from a series of slices of the tumor (second column). The solid 3D box in the first column delineates the entire tumor on CT. The red straight dashed lines indicate that the CT images in the second column are consecutive slices of this 3D tumor. In the second column, the HCC lesion is enclosed by the red solid rectangle, and the red elliptical ring-shaped area represents the HCC-liver parenchyma interface, from which shell features are extracted. In the third column, higher red values reflect greater interslice intensity differences, indicating a more pronounced shell feature. As shown, tumors that develop early TACE refractoriness exhibit more complex morphological patterns.

Statistical Analysis

Statistical analysis was conducted in Python (version 3.8.2; www.python.org). An independent samples t-test was applied to compare normally distributed quantitative data, and the results are presented as the mean ± standard deviation. Non-normally distributed data were analyzed using the Mann–Whitney U-test and presented as median (interquartile range). The Levene test was used to determine the equality of variance. Categorical variables were presented as counts (n) and percentages (%), and were compared using chi-square tests or Fisher’s exact test. A p-value of <0.05 was considered statistically significant. The performance of the classification models was evaluated using ROC curves and the AUCs. DCA was conducted to evaluate the clinical utility of the models by estimating net benefits across a range of threshold probabilities. The DeLong test was used to compare the AUCs of different models. Model calibration was assessed using calibration curves, which show how closely the model’s predicted probabilities align with the observed event rates, and using the Hosmer–Lemeshow (HL) test, with P > 0.05 indicating a good fit. To empirically address the adequacy of our sample size, we conducted a learning curve analysis by progressively increasing training set size from 90 to 230 samples and evaluating test AUC at each step (Figure S1).
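Two of the evaluation quantities above have compact closed forms: the AUC equals the probability that a randomly chosen positive case scores higher than a randomly chosen negative case (ties counted as one half), and the DCA net benefit at threshold probability p_t is TP/n − (FP/n) · p_t/(1 − p_t). The sketch below implements both in plain Python for illustration; the function names are hypothetical and this is not the code used in the study.

```python
def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def net_benefit(scores, labels, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    n = len(labels)
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Reference strategy in DCA: treat every patient."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)
```

A decision curve plots `net_benefit` over a range of thresholds and compares it with `net_benefit_treat_all` and with the treat-none strategy, whose net benefit is zero.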

Results

Basic Characteristics

This study included 437 patients from two centers. The cohort was divided into a training set of 254 patients, a validation set of 108 patients, and an external test set of 75 patients. Table 1 summarizes the patient characteristics. The ETR rates in the training set, validation set, and external test set were 106 (41.7%), 45 (41.7%), and 41 (54.7%), respectively. There was no significant difference in ETR rates among the three groups. Statistically significant differences were found among the three groups in terms of gender, etiology, maximum tumor diameter, number of tumors, ECOG performance status score, BCLC stage, and tumor capsule (p < 0.05). No other baseline characteristics differed significantly among the three groups.

Table 1 Baseline Characteristics of Patients in Three Cohorts

Model Construction and Validation

Using univariate and multivariate logistic regression on the training set, we found that maximum tumor diameter, BCLC stage, tumor boundary, tumor capsule, and tumor enhancement were valuable clinical parameters for predicting ETR (Table 2). These clinical features were incorporated into 14 machine learning algorithms, and the names and predictive performance of the machine learning models are shown in Table S2. The model with the best predictive performance was Extremely Randomized Trees (ExtraTree).

Table 2 Univariable and Multivariable Logistic Regression Analysis for Predicting Early TACE Refractoriness in the Training Cohort

Figures 3a and b show representative 2D shell maps. The top row shows tumors with ETR (Figure 3a), and the bottom row shows tumors without ETR (Figure 3b). The 2D shell maps show that tumors with ETR exhibit more heterogeneous boundary expression than those without. Tables 3 and S3 and Figures 4 and 5 detail the performance of the ExtraTree machine learning model and the single-pathway and dual-pathway Vim models. A comparison of single-pathway Vim models across different phases of CECT images revealed that the arterial phase single-pathway Vim model had higher accuracy, AUC, sensitivity, and specificity than the portal venous phase and delayed phase models across all datasets. Therefore, tumor shell features were calculated using arterial phase CECT images.

Table 3 Performance of the Six Models in Three Cohorts

Figure 4 The top row shows the ROC curves and AUCs of the models (a–c). The bottom row shows the calibration curves of the models in all sets (d–f).

Abbreviations: AUC, area under the receiver operating characteristic curve; ROC, receiver operating characteristic; Val, validation.

Figure 5 Decision curves (a–c) and DeLong test results (d–f) of the models in the different sets.

Abbreviations: DCA, decision curve analysis; Val, validation.

Table 3 shows that among the six different models, the combined model (constructed using the dual-pathway Vim framework, combined with triphasic CECT, radiomics feature maps, clinical features, and shell features) demonstrated the best and most robust predictive performance across all datasets, with accuracy ranging from 0.867 to 0.878, AUC from 0.942 to 0.959, sensitivity from 0.854 to 0.896, and specificity from 0.865 to 0.882 (95% confidence intervals for accuracy, sensitivity, specificity, NPV, PPV, and F1 are shown in Table S3). Notably, the model maintained good predictive accuracy on both the internal validation set and the external test set. ROC curves for the six models are shown in Figure 4a–c. Furthermore, as shown in Figure 4d–f, the calibration curves show that the combined model has the best agreement between the predicted probability and the actual incidence. The degrees of freedom (df), chi-square (Chi2), and p-values (calculated via the Hosmer-Lemeshow test) for the three cohorts in Table S4 reflect the degree of calibration of the predictive models. In addition, the DeLong test and DCA indicate that the combined model has better performance and higher net clinical benefit (Figure 5). The predicted probability for each HCC patient in the six models is shown in Figure 6, with the combined model showing the best performance among all models.

Figure 6 The predicted probabilities for each patient and the optimal cut-off values (derived from the training set) in the training, validation, and external test sets for the six models, respectively. (a–c) Combined model. (d–f) Arterial phase model. (g–i) Venous phase model. (j–l) Delayed phase model. (m–o) Enhancement model. (p–r) Clinical model.

Abbreviations: ETR, early transarterial chemoembolization refractoriness; NETR, non-early transarterial chemoembolization refractoriness.

The shell-only model achieved the following performance across datasets: in the training set, AUC = 0.860, ACC = 0.760, sensitivity = 0.736, specificity = 0.777; in the validation set, AUC = 0.825, ACC = 0.769, sensitivity = 0.800, specificity = 0.746; and in the test set, AUC = 0.803, ACC = 0.747, sensitivity = 0.756, specificity = 0.735. These results demonstrate that the shell feature alone possesses meaningful discriminative capacity, confirming that peritumoral boundary heterogeneity carries independent prognostic information.

As shown in the learning curve (Figure S1), the test AUC rises substantially from ~0.52 at n=90 to ~0.89 at n=190, and begins to plateau beyond n=190–210 (test AUC stabilizing at approximately 0.90–0.92). Crucially, the training AUC remains relatively stable throughout (0.87–0.96), and the train-test AUC gap narrows progressively with increasing sample size, converging to approximately 0.03–0.04 at our full cohort size. This convergence pattern provides empirical evidence that the model is approaching a stable generalization boundary and is not severely overfitting at the final sample size used.

Model Visualization via SHAP

The SHAP value plot provides an explanation and insight into how the model predicts for each patient on CT images. In Figure 7, CT images are compared with the SHAP value plot, where the dark red areas represent regions that contribute more to the model’s predictions. In the SHAP value maps of all patients, areas showing high SHAP values include the junction of tumor fusion nodules, some areas of significant enhancement, tumor boundaries, and areas of tumor necrosis (as shown in Figures 7a–d).

Figure 7 Comparison between SHAP value maps and CT images in four cases. The darker red areas in the SHAP value maps indicate the regions that contribute more significantly to the model’s prediction. (a) The darker red area in the middle region of the tumor indicates significant SHAP values that contribute to the prediction of the model. (b) The SHAP value map emphasizes the importance of enhanced regions and tumor boundaries in model prediction. (c and d) The darker red areas, especially the necrotic areas in the middle of the tumor, highlight areas that have a significant impact on the prediction of whether early TACE refractoriness occurs.

Abbreviation: SHAP, SHapley Additive exPlanations.
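To make the idea behind a SHAP value concrete, the following minimal Python sketch computes exact Shapley values for a hypothetical three-region toy model by enumerating all region subsets. It is not the explainer used to produce Figure 7 (the specific SHAP variant is not stated here); `toy_model`, its coefficients, and the region indices are assumptions chosen only for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values by enumerating every subset of the other features.

    value_fn takes a frozenset of "present" feature indices and returns the
    model output when all remaining features are set to a baseline.
    """
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for k in range(len(others) + 1):
            # Weight of a coalition of size k in the Shapley average.
            weight = (factorial(k) * factorial(n_features - k - 1)
                      / factorial(n_features))
            for subset in combinations(others, k):
                coalition = frozenset(subset)
                gain = value_fn(coalition | {i}) - value_fn(coalition)
                phi[i] += weight * gain
    return phi


def toy_model(present):
    """Hypothetical risk score over three image regions (0, 1, 2).

    Region 2 only matters when region 0 is also present (an interaction).
    """
    x = [1 if i in present else 0 for i in range(3)]
    return 0.1 + 0.4 * x[0] + 0.2 * x[1] + 0.3 * x[0] * x[2]


phi = shapley_values(toy_model, 3)
# Efficiency property: contributions sum to f(all regions) - f(no regions).
total_effect = toy_model(frozenset({0, 1, 2})) - toy_model(frozenset())
```

In this toy case the interaction term is split equally between regions 0 and 2, which illustrates why a region can receive a high attribution even when it matters only in combination with another region.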

Discussion

This study established a deep learning-based combined model that integrates CECT images, shell features, radiomics feature maps, and clinical features to assess the risk of ETR in HCC patients undergoing cTACE. Tumor shell features were calculated from arterial phase CECT images. A single-pathway Vim model incorporating single-phase CECT, a machine learning model incorporating clinical features, and a dual-pathway Vim model incorporating three-phase CECT were also developed and compared with the combined model. All models were evaluated on the training, validation, and external test sets in terms of AUC, accuracy, sensitivity, specificity, the DeLong test, and clinical applicability. The combined model outperformed all other models. Furthermore, we provide an interpretable method to visualize the key regions of CT images involved in model prediction.
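The DeLong test mentioned above compares the AUCs of two models evaluated on the same patients. The following is a minimal pure-Python sketch of the standard two-sided test for two correlated AUCs; the function names and the toy scores are hypothetical, and it is not the authors' implementation.

```python
import math


def _psi(x, y):
    """Pairwise kernel: 1 if the positive outranks the negative, 0.5 on ties."""
    return 1.0 if x > y else 0.5 if x == y else 0.0


def _structural_components(pos, neg):
    """Per-positive (V10) and per-negative (V01) placement values."""
    v10 = [sum(_psi(x, y) for y in neg) / len(neg) for x in pos]
    v01 = [sum(_psi(x, y) for x in pos) / len(pos) for y in neg]
    return v10, v01


def _cov(a, b):
    """Sample covariance with an (n - 1) denominator."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def delong_test(y_true, score_a, score_b):
    """Return (auc_a, auc_b, z, two-sided p) for two models on the same cases."""
    pos_idx = [i for i, y in enumerate(y_true) if y == 1]
    neg_idx = [i for i, y in enumerate(y_true) if y == 0]
    m, n = len(pos_idx), len(neg_idx)

    v10_a, v01_a = _structural_components(
        [score_a[i] for i in pos_idx], [score_a[i] for i in neg_idx])
    v10_b, v01_b = _structural_components(
        [score_b[i] for i in pos_idx], [score_b[i] for i in neg_idx])

    auc_a = sum(v10_a) / m
    auc_b = sum(v10_b) / m

    # Variance of the AUC difference, accounting for the shared cases.
    var = ((_cov(v10_a, v10_a) + _cov(v10_b, v10_b) - 2 * _cov(v10_a, v10_b)) / m
           + (_cov(v01_a, v01_a) + _cov(v01_b, v01_b) - 2 * _cov(v01_a, v01_b)) / n)
    if var <= 0:
        raise ValueError("zero variance: the AUC difference cannot be tested")

    z = (auc_a - auc_b) / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return auc_a, auc_b, z, p_value
```

Because both models are scored on the same patients, the covariance terms matter: ignoring them would treat the two AUCs as independent and generally misstate the p-value.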

HCC patients with a high probability of ETR after initial TACE monotherapy should be identified promptly. For these patients who may benefit from early TACE combined with systemic therapy, the initial treatment strategy should be modified. TACE combined with sorafenib/lenvatinib or TACE combined with atezolizumab and bevacizumab can improve the objective response rate, progression-free survival, and overall survival in patients with intermediate-stage HCC.^15,16 However, there is currently no broad consensus on when and for which patients to use TACE in combination with systemic therapy. ETR following TACE is a sign of TACE treatment failure, which not only impairs liver function and misses the optimal treatment window, but also greatly shortens patient survival. Therefore, this study chose ETR as the endpoint. We developed a joint model based on a Vim deep learning framework to explore its potential utility in early identification of high-risk patients, which may offer a preliminary reference for stratification in clinical decision-making for initial treatment.

TACE is the most commonly used first-line local treatment for all stages of HCC worldwide.¹⁷ Therefore, some studies have attempted to develop scores such as the ART score and the ABCR score to predict whether patients will benefit from TACE treatment.^4,5 These scoring systems are convenient for clinical use, but they rely only on clinical information and do little to exploit imaging features that reflect tumor heterogeneity. Some studies have used radiomics to mine tumor features^3,18 or used genetic data combined with machine learning algorithms to construct predictive models.¹⁹ This multi-dimensional assessment of tumor heterogeneity significantly improves the accuracy of TACE refractoriness prediction. However, radiomics relies on tumor segmentation, which is subjective and subject to operator differences, leading to inconsistent feature extraction results.²⁰ Liver cancer biopsy may cause complications such as tumor bleeding or needle tract metastasis, making gene testing difficult. These factors hinder the clinical application of both types of predictive model. Deep learning methods, as non-invasive end-to-end models, are not limited by target delineation and preset values. They can directly extract microscopic features, revealing potential histopathological features related to clinical outcomes. Studies have shown that tumor features directly extracted by deep learning methods can effectively predict the response and prognosis of HCC TACE treatment.^6,7,21 Our model yielded AUCs of 0.959, 0.956, and 0.942 across the training, validation, and test sets, respectively. These results are comparable to or higher than those reported in previous studies in similar cohorts, such as an MRI-based radiomics nomogram (AUCs: 0.955 and 0.941)¹⁸ and a model combining immune-related genes with machine learning (AUCs: 0.887 and 0.762).¹⁹ However, cross-study comparisons require caution due to variations in study populations, imaging or genetic testing approaches, and modeling frameworks.

Previous studies have shown that the dual-pathway Vim model outperforms vision-transformer and 3D-ResNet, making it an ideal choice for 3D imaging tasks such as CT scans.¹⁴ The results of this study also show that the prediction model based on the dual-pathway Vim framework, constructed from 3D CECT images, achieved good prediction performance on the training set, validation set, and external test set. The voxel-level radiomics feature map method solves the problem of high feature compression in traditional radiomics by analyzing subtle feature distributions and combining deep learning models.¹⁴ The results of this study also suggest that the prediction model of deep learning combined with radiomics feature maps has better performance. This may be because radiomics feature maps can observe more detailed differences in feature distributions. Our results also suggest that the combined model with clinical features exhibits better prediction performance, with AUC values higher than the Vim model containing only image features or the machine learning model containing only clinical features on the training, validation, and external test set. Previous studies have shown that combined imaging and clinical features have better predictive performance in predicting TACE refractoriness than separate imaging/clinical models.¹⁸ This may be because clinical and imaging features reflect different dimensions of information about cancer patients. The results of this study suggest that tumor size, BCLC stage, tumor boundary, tumor capsule, and enhancement degree may be predictive factors for ETR. 
Studies have also shown that the larger the tumor diameter, the more likely it is to develop TACE refractoriness.^3,22 This may be because, as the tumor burden increases, the biological behavior of the tumor changes, making the tumor more likely to invade and metastasize, resulting in a worse prognosis after treatment.²³ BCLC staging is the most commonly used method for HCC treatment allocation worldwide, which includes information such as tumor burden, liver function, vascular invasion, and metastasis. The higher the stage, the worse the response to various treatments and the worse the prognosis. The results of this study also suggest that the higher the BCLC stage, the greater the likelihood of ETR. However, deciding when and for which patients to add systemic therapy remains a persistent clinical challenge. Irregular tumor margins are closely related to microvascular invasion and are important risk factors affecting the prognosis of HCC patients after treatment.^24,25 The results of this study also suggested that patients with irregular margins are more likely to develop ETR. Patients with HCC without a tumor capsule have a higher probability of developing TACE refractoriness after TACE treatment;²⁵ however, the results of logistic regression in this study indicated a contrasting finding. This may be because there is a non-linear association between the tumor capsule and ETR in this study, and logistic regression cannot reflect this non-linear relationship. Since machine learning and deep learning models can capture non-linear relationships, it is appropriate to include tumor capsule features in the model construction. Previous studies have indicated that a higher degree of tumor enhancement is associated with better TACE treatment outcomes.²⁶ Consistent with this, our results suggested that increased tumor enhancement was associated with a lower probability of ETR.
A potential explanation could be that TACE drug delivery depends on blood flow; thus, high tumor enhancement may reflect increased tumor vascularity and blood flow, which might contribute to improved TACE efficacy. However, the underlying biologic mechanisms require further validation.

Previous predictive models for TACE refractoriness^3,18,27 focused only on extracting tumor-specific features, neglecting the tumor-hepatic parenchyma junction. The tumor-hepatic parenchyma junction represents a complex biological interface where the tumor and its microenvironment are believed to interact; this region is considered critical for tumor cell morphological transformation, invasion into surrounding tissues, and potential angiogenesis. Studies have found a close correlation between the tumor-normal tissue interface and local tumor invasion and distant metastasis.^9,28 Post-treatment vascular invasion or distant metastasis is a key marker of TACE refractoriness. Hao et al successfully constructed a model for predicting distant metastasis after treatment in patients with non-small cell lung cancer (NSCLC) and cervical cancer (CC) using PET-CT images combined with tumor edge shell features extracted from radiomics.¹⁰ This suggests that shell features, after standardized feature extraction and processing, may serve as a potential proxy for the overall metabolic and morphological complexity of the tumor edge. Furthermore, they are reproducible and comparable under multi-case and multi-center conditions. Our results suggested that the enhancement model and joint model constructed by incorporating shell features into the Vim framework demonstrated improved predictive performance compared to the model containing only CECT images. A potential explanation is that shell features may provide complementary information to tumor features, thereby potentially contributing to a more comprehensive characterization of TACE refractoriness.
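The notion of an annular region at the tumor-liver interface can be illustrated with a deliberately simplified Python sketch: a band obtained by dilating and eroding a 2D binary tumor mask, followed by a summary statistic inside that band. This is not the shell feature computation used in this study or the descriptor of Hao et al; the 4-neighbourhood morphology, the `width` parameter, and all function names are assumptions made only to show the geometry.

```python
def _four_neighbours(r, c):
    """Coordinates of the up, down, left, and right neighbours."""
    return ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1))


def _is_foreground(mask, r, c):
    """True if (r, c) is inside the image and labelled tumor (1)."""
    return 0 <= r < len(mask) and 0 <= c < len(mask[0]) and mask[r][c] == 1


def dilate(mask):
    """Grow the mask by one pixel using a 4-neighbourhood."""
    out = [row[:] for row in mask]
    for r in range(len(mask)):
        for c in range(len(mask[0])):
            if mask[r][c] == 0 and any(
                _is_foreground(mask, rr, cc) for rr, cc in _four_neighbours(r, c)
            ):
                out[r][c] = 1
    return out


def erode(mask):
    """Shrink the mask by one pixel; pixels outside the image count as background."""
    out = [[0] * len(mask[0]) for _ in mask]
    for r in range(len(mask)):
        for c in range(len(mask[0])):
            if mask[r][c] == 1 and all(
                _is_foreground(mask, rr, cc) for rr, cc in _four_neighbours(r, c)
            ):
                out[r][c] = 1
    return out


def shell_band(mask, width=1):
    """Annular band: pixels within `width` steps on either side of the boundary."""
    outer, inner = mask, mask
    for _ in range(width):
        outer = dilate(outer)
        inner = erode(inner)
    return [
        [1 if o == 1 and i == 0 else 0 for o, i in zip(outer_row, inner_row)]
        for outer_row, inner_row in zip(outer, inner)
    ]


def mean_in_band(image, band):
    """Mean intensity of `image` over the pixels selected by `band`."""
    values = [
        image[r][c]
        for r in range(len(band))
        for c in range(len(band[0]))
        if band[r][c] == 1
    ]
    return sum(values) / len(values) if values else float("nan")
```

A band of this kind covers both the outermost tumor pixels and the adjacent parenchyma, which is the region the paragraph above describes as carrying information complementary to intratumoral features.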

In the present study, shell features were computed using arterial phase images, and the arterial phase model was found to perform best among single-phase models. This may be related to the fact that arterial phase images provide more information on tumor heterogeneity and clearer tumor boundaries. This hypothesis is supported by our interpretability method, namely, the visualization of SHAP value maps. We used SHAP value map visualization to attempt to address the interpretability issue of deep learning models. The original CT images contain detailed tumor information and are important features. SHAP value maps suggested that tumor enhancement areas, tumor nodule fusion areas, tumor boundary areas, and tumor necrosis areas contributed substantially to the prediction results. A potential explanation is that these areas may reflect key biological properties of the tumor, including vascular complexity, invasiveness, microenvironmental heterogeneity, and microvascular invasion, which are hypothesized to be associated with TACE efficacy and refractoriness. DCA indicated that the combined model yielded a favorable net benefit, suggesting its potential utility as a reference for informing clinical decision-making.
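For readers unfamiliar with DCA, the net benefit plotted on a decision curve follows the standard definition, net benefit = TP/N − (FP/N) × pt/(1 − pt), where pt is the threshold probability. The minimal Python sketch below implements that definition together with the "treat all" reference strategy; the function names and toy data are hypothetical and are not taken from this study.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold.

    False positives are penalised by the odds of the threshold probability,
    pt / (1 - pt), which encodes how much harm an unnecessary treatment
    carries relative to the benefit of a necessary one.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: treat every patient regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


def decision_curve(y_true, y_prob, thresholds):
    """Net benefit of the model and of 'treat all' over a grid of thresholds.

    The 'treat none' strategy has net benefit 0 at every threshold.
    """
    return [
        (pt, net_benefit(y_true, y_prob, pt), net_benefit_treat_all(y_true, pt))
        for pt in thresholds
    ]
```

A model is considered useful at a given threshold only if its net benefit exceeds both the "treat all" value and zero (the "treat none" value).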

This study has several limitations. First, the retrospective, two-center design inherently introduces potential selection bias and limits causal inference, which may compromise the generalizability and reproducibility of the prediction model. Therefore, future studies utilizing prospective, multicenter, and multinational datasets are warranted. Second, concerning external validation, although our dual-center cohort provides a degree of cross-institutional assessment, validation in fully independent, geographically distinct cohorts remains necessary and is a priority for future work. Third, despite using SHAP values for model interpretability and visualization, the identified imaging-derived features, particularly those capturing peritumoral heterogeneity via the shell representation, currently lack direct histopathological validation. Without biological corroboration, their mechanistic relevance remains unproven. Future studies are warranted to validate these features against histopathological data to elucidate the underlying mechanisms. Fourth, our study did not include biomarkers such as gene and protein expression that may predict TACE refractoriness. Because of the procedural risks and economic costs of testing for these biomarkers, such data were unavailable for most patients in our dataset. Future studies will continue to explore the feasibility of constructing predictive models using combined biomarkers. Fifth, deriving the 2D shell features from 3D shell features through computation and standardization may result in a loss of spatial complexity. Better algorithms for representing shell features are needed to obtain more comprehensive representations of the boundary region and better predictive capabilities.

Conclusion

In conclusion, the deep learning model developed in this exploratory study suggests potential for predicting ETR risk in BCLC stage A/B HCC patients ineligible for surgical resection or ablation. By integrating CECT, shell features, clinical characteristics, and voxel-level radiomics within a Vim architecture, the model may serve as a preliminary reference for stratifying initial therapies—such as TACE combined with systemic therapy for high-risk patients. However, given the retrospective design, it represents a promising step requiring validation rather than a definitive clinical tool. Future multi-center studies are warranted to validate reproducibility, while incorporating pathological or genomic data could help elucidate the underlying biological mechanisms, which will be our future focus.

Abbreviations

TACE, transarterial chemoembolization; ETR, early TACE refractoriness; HCC, hepatocellular carcinoma; CECT, contrast-enhanced CT; Vim, Vision-Mamba; AFP, alpha-fetoprotein; cTACE, conventional TACE; AUC, area under the curve; DCA, decision curve analysis; SHAP, SHapley Additive exPlanations; BCLC, Barcelona Clinic Liver Cancer; ROI, region of interest; ECOG, Eastern Cooperative Oncology Group; ExtraTree, Extremely Randomized Trees.

Data Sharing Statement

Data will be made available on request from the corresponding author.

Ethics Approval and Informed Consent

This study was conducted in accordance with the principles of the Declaration of Helsinki. This two-center retrospective study was approved by the ethics committees of the Second Affiliated Hospital of Harbin Medical University (YJSKY2024-320) and General Hospital of Beidahuang Group (KY-2025062501), and the requirement for patient informed consent was waived. To ensure confidentiality, all patient information was anonymized.

Funding

This work was supported by the National Natural Science Foundation of China (No. U22A20346).

Disclosure

The authors report no conflicts of interest in this work.

References

1. Choi J, Lee D, Shim JH, et al. Evaluation of transarterial chemoembolization refractoriness in patients with hepatocellular carcinoma. PLoS One. 2020;15(3):e0229696. doi:10.1371/journal.pone.0229696

2. Katayama K, Imai T, Abe Y, et al. Number of nodules but not size of hepatocellular carcinoma can predict refractoriness to transarterial chemoembolization and poor prognosis. J Clin Med Res. 2018;10(10):765–771. doi:10.14740/jocmr3559w

3. Niu X-K, He X-F. Development of a computed tomography-based radiomics nomogram for prediction of transarterial chemoembolization refractoriness in hepatocellular carcinoma. World J Gastroenterol. 2021;27(2):189–207. doi:10.3748/wjg.v27.i2.189

4. Sieghart W, Hucke F, Pinter M, et al. The ART of decision making: retreatment with transarterial chemoembolization in patients with hepatocellular carcinoma. Hepatology. 2013;57(6):2261–2273. doi:10.1002/hep.26256

5. Adhoute X, Penaranda G, Naude S, et al. Retreatment with TACE: the ABCR SCORE, an aid to the decision-making process. J Hepatol. 2015;62(4):855–862. doi:10.1016/j.jhep.2014.11.014

6. Sun Z, Shi Z, Xin Y, et al. Contrast-enhanced CT imaging features combined with clinical factors to predict the efficacy and prognosis for transarterial chemoembolization of hepatocellular carcinoma. Acad Radiol. 2023;30(Suppl 1):S81–S91. doi:10.1016/j.acra.2022.12.031

7. Wang H, Liu Y, Xu N, et al. Development and validation of a deep learning model for survival prognosis of transcatheter arterial chemoembolization in patients with intermediate-stage hepatocellular carcinoma. Eur J Radiol. 2022;156:110527. doi:10.1016/j.ejrad.2022.110527

8. Wang K, Xiang Y, Yan J, et al. A deep learning model with incorporation of microvascular invasion area as a factor in predicting prognosis of hepatocellular carcinoma after R0 hepatectomy. Hepatol Int. 2022;16(5):1188–1198. doi:10.1007/s12072-022-10393-w

9. Valastyan S, Weinberg RA. Tumor metastasis: molecular insights and evolving paradigms. Cell. 2011;147(2):275–292. doi:10.1016/j.cell.2011.09.024

10. Hao H, Zhou Z, Li S, et al. Shell feature: a new radiomics descriptor for predicting distant failure after radiotherapy in non-small cell lung cancer and cervix cancer. Phys Med Biol. 2018;63(9):095007. doi:10.1088/1361-6560/aabb5e

11. Zhang K, Zhang L, Li W-C, et al. Radiomics nomogram for the prediction of microvascular invasion of HCC and patients’ benefit from postoperative adjuvant TACE: a multi-center study. Eur Radiol. 2023;33(12):8936–8947. doi:10.1007/s00330-023-09824-5

12. Xia T-Y, Zhou Z-H, Meng X-P, et al. Predicting microvascular invasion in hepatocellular carcinoma using CT-based radiomics model. Radiology. 2023;307(4):e222729. doi:10.1148/radiol.222729

13. Llovet JM, Lencioni R. mRECIST for HCC: performance and novel refinements. J Hepatol. 2020;72(2):288–306. doi:10.1016/j.jhep.2019.09.026

14. Zhang Z, Luo T, Yan M, et al. Voxel-level radiomics and deep learning for predicting pathologic complete response in esophageal squamous cell carcinoma after neoadjuvant immunotherapy and chemotherapy. J Immunother Cancer. 2025;13(3). doi:10.1136/jitc-2024-011149

15. Kudo M, Ueshima K, Ikeda M, et al. Randomised, multicentre prospective trial of transarterial chemoembolisation (TACE) plus sorafenib as compared with TACE alone in patients with hepatocellular carcinoma: TACTICS trial. Gut. 2020;69(8):1492–1501. doi:10.1136/gutjnl-2019-318934

16. Wang K, Feng J, Yu H, et al. Transarterial chemoembolization plus atezolizumab and bevacizumab in patients with intermediate hepatocellular carcinoma: a single-arm, Phase 2 trial. Signal Transduct Target Ther. 2025;10(1):328. doi:10.1038/s41392-025-02427-0

17. Park J-W, Chen M, Colombo M, et al. Global patterns of hepatocellular carcinoma management from diagnosis to death: the BRIDGE Study. Liver Int. 2015;35(9):2155–2166. doi:10.1111/liv.12818

18. Dong Y, Hu J, Meng X, Yang B, Peng C, Zhao W. Development and validation of a radiomics nomogram based on magnetic resonance imaging and clinicoradiological factors to predict HCC TACE refractoriness. Cancer Manag Res. 2025;17:1441–1455. doi:10.2147/cmar.S486561

19. Xu Q, Wang C, Yin G. Immune-related gene signature to predict TACE refractoriness in patients with hepatocellular carcinoma based on artificial neural network. Front Genet. 2023;13:993509. doi:10.3389/fgene.2022.993509

20. Pascuzzo R, Garattini SK, Doniselli FM. Clinical application of radiomics in oncology: where do we stand? J Magn Reson Imaging. 2024;60(6):2745–2746. doi:10.1002/jmri.29340

21. Wang S, Zhao Y, Cai X, et al. CMT-FFNet: a CMT-based feature-fusion network for predicting TACE treatment response in hepatocellular carcinoma. Comput Med Imaging Graph. 2025;124:102577. doi:10.1016/j.compmedimag.2025.102577

22. Li H, Kang W, Rong P. Development and validation of a clinical factors and body fat distribution-based nomogram to predict refractoriness of transarterial chemoembolization in hepatocellular carcinoma. Quant Imaging Med Surg. 2024;14(1):447–461. doi:10.21037/qims-23-963

23. Ho S-Y, Liu P-H, Hsu C-Y, et al. Tumor burden score as a new prognostic marker for patients with hepatocellular carcinoma undergoing transarterial chemoembolization. J Gastroenterol Hepatol. 2021;36(11):3196–3203. doi:10.1111/jgh.15593

24. Xu W, Li R, Liu F. Novel prognostic nomograms for predicting early and late recurrence of hepatocellular carcinoma after curative hepatectomy. Cancer Manag Res. 2020;12:1693–1712. doi:10.2147/CMAR.S241959

25. Zhang L, Zhang X, Li Q, et al. Transarterial chemoembolization failure in patients with hepatocellular carcinoma: incidence, manifestation and risk factors. Clin Res Hepatol Gastroenterol. 2023;47(2):102071. doi:10.1016/j.clinre.2022.102071

26. Reis SP, Sutphin PD, Singal AG, et al. Tumor enhancement and heterogeneity are associated with treatment response to drug-eluting bead chemoembolization for hepatocellular carcinoma. J Comput Assist Tomogr. 2017;41(2):289–293. doi:10.1097/RCT.0000000000000509

27. Sheen H, Kim JS, Lee JK, Choi SY, Baek SY, Kim JY. A radiomics nomogram for predicting transcatheter arterial chemoembolization refractoriness of hepatocellular carcinoma without extrahepatic metastasis or macrovascular invasion. Abdom Radiol. 2021;46(6):2839–2849. doi:10.1007/s00261-020-02884-x

28. Hunter MV, Moncada R, Weiss JM, Yanai I, White RM. Spatially resolved transcriptomics reveals the architecture of the tumor-microenvironment interface. Nat Commun. 2021;12(1):6278. doi:10.1038/s41467-021-26614-z

Creative Commons License © 2026 The Author(s). This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at https://www.dovepress.com/terms and incorporate the Creative Commons Attribution - Non Commercial (unported, 4.0) License. By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms.