Using machine learning to develop a stacking ensemble learning model for the CT radiomics classification of brain metastases

Scientific Reports volume 14, Article number: 28575 (2024)

The objective of this study was to explore the potential of machine-learning techniques in the automatic identification and classification of brain metastases from a radiomic perspective, aiming to improve the accuracy of tumor volume assessment for radiotherapy. By using various machine-learning algorithms, including random forest, support vector machine, gradient boosting machine, XGBoost, decision tree, artificial neural network, k-nearest neighbors, LightGBM, and CatBoost algorithms, a stacking ensemble model was developed to classify gross tumor volume (GTV), brainstem, and normal brain tissue based on radiomic features. Multiple evaluation metrics, including the specificity, sensitivity, negative predictive value, positive predictive value, accuracy, Matthews correlation coefficient, and the Youden index, were used to assess the model’s performance. The stacked ensemble model integrated the strengths of the nine base models and consistently outperformed individual base models in classifying GTV (area under the curve [AUC] = 0.928), brainstem (AUC = 0.932), and normal brain tissue (AUC = 0.942). Among the base models, the support vector machine model demonstrated the best performance in the three classifications (AUC = 0.922, 0.909, and 0.928). The higher performance of the stacked ensemble model highlighted the low performance of other models, including the decision tree (AUC = 0.709, 0.706, 0.804) and k-nearest neighbors (AUC = 0.721, 0.663, 0.729) models in certain contexts, such as when faced with high-dimensional feature spaces. While machine learning shows significant promise in medical image analysis, relying solely on a single model may lead to suboptimal results. By combining the strengths of various algorithms, the stacking ensemble model offers a better solution for the classification of brain metastases based on radiomic features.

Brain metastases are common metastatic lesions of systemic cancer that pose a severe health threat to patients due to their proximity to vital organs1. Radiation therapy stands as one of the primary treatment modalities for brain metastases2. To maximize therapeutic outcomes and minimize harm to patients, it is imperative to precisely and promptly identify and segment these metastatic lesions before providing radiotherapy3. Similarly, the brainstem, responsible for many vital functions, needs accurate delineation and protection during the radiation process. Currently, gross tumor volume (GTV) target segmentation is often manually executed by radiotherapy physicists using computed tomography (CT) simulation localization images. However, this method is highly reliant on the physicist’s personal experience and subjective judgment4. Thus, there is an urgent need for an objective and efficient method to ensure the precise segmentation of GTV before radiotherapy.

Previous studies on automatic and precise segmentation of radiotherapy targets often used methods such as convolutional neural networks (CNNs)5. While CNNs can be trained end-to-end directly from raw images to the final segmentation result, the features they learn and extract from images are vast, complex, and non-interpretable, limiting their clinical applications6. In recent years, the radiomics of CT scans has garnered increasing attention. Radiomics provides an unprecedented opportunity to explore the intricate characteristics of images. Through specialized algorithms, radiomics can extract a wide range of high-dimensional features from raw images, including size, shape, first-order texture, second-order texture, and higher-order texture features, and assign these features specific radiomic names and parameters7. Extraction of these features not only deepens our understanding of the rich information conveyed by images but also, more crucially, opens new research avenues for precise segmentation of radiotherapy targets. By segmenting CT images through radiomics, a target segmentation strategy can be developed that, compared with the conventional CNN method, is more quantitative.

Advancements in artificial intelligence (AI) in recent years have proffered superior technological means for precision medicine8,9. Machine-learning methods can extract and select more representative radiomic features, facilitating the construction of classification models with excellent capabilities10,11. Building these classification models is a necessary precursor for subsequently using radiomic features in constructing automatic segmentation models. Therefore, this research used nine machine-learning methods, along with a stacking ensemble algorithm model, to amalgamate the strengths of the nine machine-learning models, thereby creating a robust CT radiomic classification tool. This tool aims to achieve efficient and precise classification of GTV, brainstem, and normal brain tissue. The goal of this study is to offer a preliminary radiomic feature selection and foundational classification model strategy for future research into constructing an automatic segmentation model for brain metastases based on radiomic features, thus laying a solid foundation for future precision radiotherapy.

For the development of the model, the radiological data of 113 patients with brain metastases (median age 61.3 years, range 35–82 years) who underwent radiation therapy from January 2016 to January 2021 were examined. The inclusion criteria were (1) age above 18 years, (2) availability of complete electronic medical records and imaging data, and (3) presence of no other brain lesions apart from brain metastases. The exclusion criteria were (1) imaging with artifacts or damage, (2) presence of brain lesions other than brain metastases, and (3) missing patient data.

The images used for analysis were obtained from CT scans for patients’ radiation therapy positioning. The enhanced CT scans were performed using the SOMATOM Definition AS 20-slice CT simulator system (Siemens, Germany) with the following scanning parameters: tube voltage, 120 kVp; tube current, 540 mAs; and scanning range, from the top of the skull to the third cervical vertebra. The acquired image pixel size was 512 × 512, and the scanning slice thickness was set at 3 mm with a field of view (FOV) of 250–400 mm. After scanning, the reconstructed CT images were transferred to the specialized radiation three-dimensional treatment planning system (TPS) Pinnacle3 version 9.16. After wavelet transform filtering preprocessing, the tissue characteristics of each were extracted.

Magnetic resonance imaging (MRI) scans in the same position were performed using the Philips 3.0T Ingenia large-bore MRI positioning system (Philips, Netherlands), which was equipped with a high-precision 3D gradient deformation correction module to reduce the geometric deformation caused by the nonlinearity of residual gradient and magnetic field inhomogeneities. Conventional MRI sequences (T1W 3D + c) were performed in the supine position. The parameters were as follows: scan mode, 3D; flip angle, 12°; TE, 2.4 ms; TR, 5.0 ms (TR/TE = 5.0/2.4); technique, FFE; voxel size, 0.95 mm × 0.96 mm × 3 mm; field of view, 280 mm × 280 mm × 186 mm; matrix, 296 × 292 × 62; and slice thickness, 3 mm. After obtaining the T1W 3D + c magnetic resonance image of the patient, we adjusted the CT and MR images to the appropriate window width and window level, manually translated the MR image to bring it close to the position of the CT image, adjusted the angle and registration range of the MR image, and finally selected the best automatic registration algorithm for image registration in the Pinnacle3 9.16 TPS.

GTV, brainstem, and normal brain tissue were used as the three regions of interest (ROIs). To facilitate image feature extraction, all images and ROIs were batch-processed and converted to a Neuroimaging Informatics Technology Initiative (NII) format12. NII is a commonly used medical image format that is capable of storing 3D and 4D neuroimaging data such as brain MRI scans and CT scans. This file format is actually a composite file that contains image data as well as associated metadata. This image data is stored in the form of 3D arrays13. All CT images were manually segmented by two senior radiologists with 10 years of experience, using MRI as a fusion reference. In case of disagreement, a third radiologist with 15 years of experience made the final decision. The primary tumor lesion GTV and brainstem were first manually delineated on the images before a formula (whole brain – [GTV + brainstem]) was used to derive the normal brain tissue, forming the final three ROIs.
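The derivation of the normal brain tissue ROI can be illustrated with a minimal sketch. It assumes binary masks of identical shape, represented here as a toy one-dimensional list rather than a 3D image array; the mask values and the helper name `derive_normal_brain` are hypothetical and are not taken from the study.

```python
# Toy 1-D "volume": each position is one voxel (real masks would be 3-D arrays).
# All values below are illustrative placeholders, not study data.
whole_brain = [0, 1, 1, 1, 1, 1, 1, 0]
gtv         = [0, 0, 1, 1, 0, 0, 0, 0]
brainstem   = [0, 0, 0, 0, 0, 1, 0, 0]

def derive_normal_brain(whole_brain, gtv, brainstem):
    """Return whole brain minus (GTV + brainstem) as a binary mask."""
    return [int(w and not (g or b))
            for w, g, b in zip(whole_brain, gtv, brainstem)]

normal_brain = derive_normal_brain(whole_brain, gtv, brainstem)
print(normal_brain)  # [0, 1, 0, 0, 1, 0, 1, 0]
```

By construction, the three resulting ROIs do not overlap, so every voxel inside the whole-brain mask belongs to exactly one of them.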

Image feature extraction was performed using Pyradiomics, a software package for the extraction of radiomic features from medical imaging based on Python version 3.9 (http://PyRadiomics.readthedocs.io/en/latest). Before feature extraction, all CT images were resampled to a voxel size of 1 × 1 × 1 mm³. Radiomic features used in this study included first-order statistics (firstorder), gray-level co-occurrence matrix (glcm), gray-level dependence matrix (gldm), gray-level run length matrix (glrlm), gray-level size zone matrix (glszm), and neighboring gray-tone difference matrix (ngtdm). The algorithms used to obtain these radiomic features mainly referenced the Image Biomarker Standardization Initiative14.

All models in the stacking ensemble model were trained using cross-validation and predicted and evaluated using the bootstrap resampling method. In this study, nine base models were incorporated, namely Random Forest (RF), Support Vector Machine (SVM), Gradient Boosting Machine (GBM), Extreme Gradient Boosting (XGBoost), Decision Tree (DT), Artificial Neural Network (ANN), K-Nearest Neighbors (KNN), Light Gradient Boosting Machine (LightGBM) and Categorical Boosting (CatBoost). The construction and training of the stacking ensemble model were performed according to the following steps:

(1) The k-fold cross-validation method was used to split the training dataset. The base models were trained on k − 1 portions of the data and then used to predict the remaining portion of the data. This process was repeated k times until each portion of the data had been used as a validation set. For each base model, the process was repeated to obtain prediction results for the entire training set. With m models and n samples, a new dataset of dimensions \(n \times m\) was obtained, where each column represents the prediction results of a base model. As there were nine base models, the new dataset was represented as

\[P = [p_1, p_2, \ldots, p_9]\]

where \(P\) denotes the new dataset, and \(p_i\) is the prediction result of the \(i\)-th model for all training samples.

(2) The new dataset \(P\) was used as features, and the target values of the original data were used as labels to train the meta-model such that

\[M = \operatorname{train}(P, y)\]

where \(M\) is the meta-model, and \(y\) represents the target values.

(3) The base models were used to predict new data, thereby forming a new prediction dataset \(P'\). The meta-model was used to predict \(P'\) to obtain the final results using the formula

\[\hat{y} = M(P')\]

where \(\hat{y}\) denotes the prediction results for the new data.

(4) Finally, random resampling with replacement was performed 1,000 times from the original training data to form new training datasets. The aforementioned stacking ensemble method (training the base models, training the meta-model, and making the prediction) was applied to each bootstrap sample. The model’s performance was evaluated for each bootstrap sample before the average of all the evaluation results was calculated to derive the final performance metrics for the model (Fig. 1).
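Steps (1) to (3) can be sketched in a few lines of plain Python. This is an illustrative sketch only: the three toy base learners (`fit_centroid` and two single-feature variants), the nearest-centroid meta-model, the synthetic two-feature data, and the choice to refit the base learners on the full training set before prediction are all assumptions made for the example. The study itself used nine base models with an SVM meta-model.

```python
import random

def fit_centroid(X, y):
    """Toy learner: score in [0, 1], higher means closer to the class-1 centroid."""
    def centroid(label):
        rows = [x for x, t in zip(X, y) if t == label]
        return [sum(col) / len(rows) for col in zip(*rows)]
    c0, c1 = centroid(0), centroid(1)
    def predict(x):
        d0 = sum((a - b) ** 2 for a, b in zip(x, c0))
        d1 = sum((a - b) ** 2 for a, b in zip(x, c1))
        return d0 / (d0 + d1) if d0 + d1 else 0.5
    return predict

def fit_on_feature(j):
    """Toy learner that only looks at feature j (stand-in for a distinct algorithm)."""
    def fit(X, y):
        model = fit_centroid([[x[j]] for x in X], y)
        return lambda x: model([x[j]])
    return fit

def out_of_fold_predictions(fitters, X, y, k=5, seed=0):
    """Step 1: build the n x m matrix P of out-of-fold base-model predictions."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    P = [[0.0] * len(fitters) for _ in X]
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        for m, fit in enumerate(fitters):
            model = fit([X[i] for i in train], [y[i] for i in train])
            for i in fold:
                P[i][m] = model(X[i])
    return P

def fit_stack(fitters, meta_fit, X, y, k=5):
    P = out_of_fold_predictions(fitters, X, y, k)   # step 1
    meta = meta_fit(P, y)                           # step 2: M trained on (P, y)
    base = [fit(X, y) for fit in fitters]           # assumption: refit on all data
    return lambda x: meta([b(x) for b in base])     # step 3: y_hat = M(P')

# Synthetic, well-separated two-class data (20 samples per class).
rng = random.Random(42)
X = [[rng.gauss(2.0 * t, 1.0), rng.gauss(2.0 * t, 1.0)]
     for t in (0, 1) for _ in range(20)]
y = [t for t in (0, 1) for _ in range(20)]

fitters = [fit_centroid, fit_on_feature(0), fit_on_feature(1)]
P = out_of_fold_predictions(fitters, X, y)
stack = fit_stack(fitters, fit_centroid, X, y)
accuracy = sum(int(stack(x) > 0.5) == t for x, t in zip(X, y)) / len(y)
print(len(P), len(P[0]))
```

Because every row of P is predicted by models that never saw that sample during training, the meta-model learns from predictions that resemble those it will receive on new data.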

Fig. 1. Schematic of the stacking ensemble model.

Finally, the performance of all the base models and the stacking ensemble model was evaluated by performing area under the receiver operating characteristic curve (AUC-ROC) analysis and calculating the sensitivity, specificity, negative predictive value (NPV), positive predictive value (PPV), and the Matthews correlation coefficient (MCC).
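For reference, the AUC of a binary classifier equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. The following sketch computes it directly from that definition; the function name `roc_auc` and the four-sample example are illustrative and unrelated to the study data.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, t in zip(scores, labels) if t == 1]
    neg = [s for s, t in zip(scores, labels) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, scores))  # 0.75
```

The pairwise form is quadratic in the number of samples, which is acceptable for a sketch; library implementations typically use a sort-based equivalent.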

After initial analysis extracted a total of 1,674 radiomic features from the ROI segmentation (Fig. 2), further analysis was performed to eliminate potentially redundant features. The Mann–Whitney U test was used to identify outcome-related features for each sequence, and only features with a significance level of P < 0.05 were included for further screening19. The results of the U test indicated that a total of 456 radiomic features for each patient were non-redundant. These features were then carried forward into the machine-learning-based feature selection process.
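The univariate screening step can be sketched as follows. The sketch assumes a two-sided Mann–Whitney U test computed with the normal approximation, without tie or continuity correction; the software and test variant used in the study are not stated here, and the feature names and values are hypothetical.

```python
import math

def mann_whitney_p(a, b):
    """Two-sided p-value via the normal approximation (no tie/continuity correction)."""
    pooled = sorted((v, g) for g, grp in enumerate((a, b)) for v in grp)
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):                      # assign average ranks to ties
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        for k in range(i, j + 1):
            ranks[k] = (i + j) / 2 + 1
        i = j + 1
    n_a, n_b = len(a), len(b)
    rank_sum_a = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u = rank_sum_a - n_a * (n_a + 1) / 2        # U statistic for group a
    mu = n_a * n_b / 2
    sigma = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    return math.erfc(abs((u - mu) / sigma) / math.sqrt(2))

# Hypothetical features: (values in the target tissue, values in the rest).
features = {
    "feature_shifted": (list(range(1, 9)), list(range(9, 17))),
    "feature_noise": ([1, 4, 5, 8, 9, 12, 13, 16], [2, 3, 6, 7, 10, 11, 14, 15]),
}
selected = [name for name, (a, b) in features.items()
            if mann_whitney_p(a, b) < 0.05]
print(selected)  # ['feature_shifted']
```

Only features whose distributions differ between the two groups at P < 0.05 pass the filter, mirroring the threshold described above.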

Fig. 2. Region of interest segmentation for brain metastases. The red shaded area is gross tumor volume, the blue shaded area is the brainstem, and the green shaded area is the result of (whole brain – [gross tumor volume + brainstem]).

For GTV, feature selection was first executed using the RF approach, with the Bayesian optimization algorithm used for automatic optimal parameter searching. After looped RF classification was conducted for 1–3,000 trees, during which the out-of-bag (OOB) error rate was calculated, an error-rate relationship graph was plotted to determine the ideal parameters ‘mtry’ and ‘ntree’ for the algorithm model. ‘mtry’ is the number of variables (or features) that the model considers as candidates at each split in a decision tree20. ‘ntree’ is the total number of decision trees that the model creates; the model builds a forest of many trees, each of which learns something slightly different21. As shown in the error rate graph (Fig. 3A), a rapid decline and significant fluctuation in error rate occurred between 0 and 250 trees, but stability was achieved between 500 and 1,000 trees. The Bayesian optimization algorithm ultimately selected 102 trees at the lowest error rate as the final model parameter and set ‘mtry’ to 21. The use of the MeanDecreaseGini method to compute importance scores identified 17 features with a MeanDecreaseGini value exceeding 2, the value deemed significant (Fig. 3B)22, that were included as candidate features for subsequent analysis. SVM-RFE was also used for essential feature selection, with the Bayesian optimization algorithm aiding in locating optimal hyperparameters for SVM-RFE. Ultimately, a 7-fold cross-validation step with 124 iterations resulted in the model’s lowest error rate of 0.137, and 12 radiomic features were shortlisted for further analysis (Fig. 3C). Analysis of the intersection between the features selected by SVM-RFE and RF identified eight crucial radiomic features that were incorporated into the ensuing model construction.
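The final consensus step, thresholding the RF importance scores and intersecting the result with the SVM-RFE shortlist, amounts to a set operation. In the sketch below, the feature names and importance values are placeholders chosen for illustration; only the threshold of 2 comes from the text.

```python
# Hypothetical MeanDecreaseGini scores and SVM-RFE survivors (placeholder names).
gini_importance = {"feat_a": 5.1, "feat_b": 2.4, "feat_c": 1.7, "feat_d": 3.0}
svm_rfe_selected = {"feat_b", "feat_c", "feat_d", "feat_e"}

# Keep RF features whose importance exceeds the threshold of 2.
rf_selected = {name for name, score in gini_importance.items() if score > 2}

# Retain only the features chosen by both selectors.
consensus = sorted(rf_selected & svm_rfe_selected)
print(consensus)  # ['feat_b', 'feat_d']
```

Requiring agreement between two selectors with different inductive biases is a conservative choice: a feature survives only if both a tree-based and a margin-based method consider it useful.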

Fig. 3. Gross tumor volume, brainstem, and normal brain tissue important feature screening based on random forest and support vector machine recursive feature elimination algorithms. (A, D, G) Out-of-bag error rate calculation. (B, E, H) Random forest importance ranking of important features. (C, F, I) Support vector machine recursive feature elimination algorithm result plot.

For the brainstem, when the out-of-bag (OOB) error rate for the RF model was first computed for 1–3,000 trees, the error rate rapidly decreased and exhibited considerable fluctuation between 0 and 400 trees before stabilizing between 500 and 1,900 trees (Fig. 3D). Using the Bayesian optimization algorithm for optimal hyperparameter searching led to the selection of 289 trees at the lowest error rate alongside ‘mtry’ set to 23 as the final model parameters, ultimately shortlisting eight features for further analysis (Fig. 3E). The use of SVM-RFE with the Bayesian optimization algorithm revealed that after 10-fold cross-validation with 71 iterations, the model achieved its lowest error rate of 0.088 with 13 radiomic features selected (Fig. 3F). Analysis of the intersection of the features chosen by both SVM-RFE and RF identified six radiomic features as essential for inclusion in subsequent model development.

For normal brain tissue, feature selection was conducted using the RF approach, with the Bayesian optimization algorithm automatically searching for the best hyperparameters. Calculation of the model’s out-of-bag (OOB) error rate showed a rapid decrease and significant fluctuation in the error rate up to 70 trees before stabilizing after 100 trees (Fig. 3G). The Bayesian optimization algorithm ultimately selected 62 trees at the lowest error rate with ‘mtry’ set to 24 as the final model parameters. A total of 20 features were subsequently chosen via the RF classifier for the next analysis phase (Fig. 3H). The Bayesian optimization algorithm was again used to identify the best hyperparameters for SVM-RFE. Eventually, a 10-fold cross-validation step with 81 iterations yielded the model’s lowest error rate of 0.199 and the selection of eight radiomic features (Fig. 3I). Analysis of the intersection of the features chosen by both SVM-RFE and RF led to the identification of seven significant radiomic features for further model construction.

All hyperparameters used for GTV classification model construction can be found in Supplementary File 2. The prediction results of all base models are presented in Table 1. Among them, the SVM model achieved the highest area under the curve (AUC = 0.922, 95% confidence interval [CI] = 0.901–0.942). In contrast, the DT (AUC = 0.709, 95% CI = 0.661–0.758) and KNN (AUC = 0.721, 95% CI = 0.675–0.766) models had relatively inferior performances. Due to the robust performance of the SVM model, it was chosen as the meta-model in the construction of the stacking ensemble model. Throughout 1,000 iterations of bootstrap resampling for training and internal validation assessment, the final stacking model demonstrated superior predictive performance (AUC = 0.928, 95% CI = 0.906–0.950) compared with the base models. Performance metrics for the base models and the stacking model are shown in Fig. S1.
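The bootstrap summary of a performance metric can be sketched as below. Two simplifications are assumed and should be read as such: the sketch resamples a fixed set of predictions rather than retraining the whole stacking pipeline on each resample, and it reports a percentile interval, whereas the method used to obtain the confidence intervals in the study is not described here. The labels, scores, and function names are illustrative.

```python
import random

def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the label."""
    return sum(int(s > threshold) == t for s, t in zip(scores, labels)) / len(labels)

def bootstrap_metric(labels, scores, metric, n_boot=1000, seed=0):
    """Mean and percentile interval of a metric over bootstrap resamples."""
    rng = random.Random(seed)
    n = len(labels)
    values = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]   # sample with replacement
        values.append(metric([labels[i] for i in idx],
                             [scores[i] for i in idx]))
    values.sort()
    mean = sum(values) / n_boot
    lower = values[int(0.025 * n_boot)]              # approx. 2.5th percentile
    upper = values[int(0.975 * n_boot) - 1]          # approx. 97.5th percentile
    return mean, lower, upper

# Illustrative predictions: 16 of 20 are correct at a threshold of 0.5.
labels = [0] * 10 + [1] * 10
scores = [0.1, 0.2, 0.3, 0.4, 0.2, 0.3, 0.1, 0.4, 0.6, 0.7,
          0.9, 0.8, 0.7, 0.6, 0.9, 0.8, 0.7, 0.6, 0.4, 0.3]

mean_acc, lower, upper = bootstrap_metric(labels, scores, accuracy)
print(round(mean_acc, 2), lower, upper)
```

Any metric with the signature `metric(labels, scores)` can be passed in place of `accuracy`; a metric such as AUC would additionally need resamples that contain both classes.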

Table 2 depicts the outcomes of all the base models for brainstem classification. Both the DT (AUC = 0.706, 95% CI = 0.642–0.771) and KNN (AUC = 0.663, 95% CI = 0.579–0.748) models yielded the lowest AUC values, whereas the SVM model yielded the highest AUC value (AUC = 0.909, 95% CI = 0.873–0.944). To harness the robust attributes of the SVM model, it was integrated as the meta-model in the ensemble. After the implementation of the bootstrap method using a rigorous 1,000-iteration train-and-validate approach, the resultant stacking model outperformed the foundational models, yielding an AUC of 0.928 (95% CI = 0.906–0.950). Additional performance metrics are illustrated in Fig. S2.

In the classification of normal brain tissue, which is shown in Table 3, the SVM model yielded the highest performance, with an AUC of 0.928 (95% CI = 0.908–0.947), whereas the KNN (AUC = 0.729, 95% CI = 0.679–0.779) and GBM (AUC = 0.695, 95% CI = 0.641–0.749) models yielded much lower performance. Given the consistent robustness of SVM, it was adopted as the meta-model for the stacked ensemble. Following the bootstrap protocol, the consolidated stacking model displayed enhanced efficacy, achieving an AUC of 0.928 (95% CI = 0.906–0.950). A comprehensive overview of metrics spanning the base to stacking models is delineated in Fig. S3.

In this study, the “one-vs-rest” approach was used to transform the CT images of three tissues—GTV, brainstem, and normal brain tissue—into separate binary classification problems for significant radiomic feature selection and classification model construction. For each binary classification problem, nine base models and one stacking ensemble model were constructed. The results showed that the stacking model is a promising classifier in the three binary classification problems.
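The one-vs-rest decomposition can be written compactly. In this sketch the tissue label strings and the five-sample label list are illustrative; each tissue in turn is treated as the positive class and the other two are merged into the negative class.

```python
TISSUES = ("GTV", "brainstem", "normal")

def one_vs_rest(labels, positive):
    """Binary targets: 1 for the chosen tissue, 0 for every other tissue."""
    return [1 if t == positive else 0 for t in labels]

labels = ["GTV", "normal", "brainstem", "normal", "GTV"]
tasks = {tissue: one_vs_rest(labels, tissue) for tissue in TISSUES}
print(tasks["GTV"])  # [1, 0, 0, 0, 1]
```

Each sample is positive in exactly one of the three binary problems, which is why the negative class in every problem is larger than the positive class whenever the tissues are roughly balanced.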

When the dimensionality of the feature space is very high, the KNN model may suffer from the so-called “curse of dimensionality.” In high-dimensional spaces, the distances between all data points tend to become relatively large and increasingly similar to one another, so the nearest neighbors become less informative, leading to a decline in the performance of KNN models26,27. SVM models, especially SVM models with kernel tricks, can work effectively in high-dimensional spaces28. For datasets that are not linearly separable, SVM models with kernel tricks can find a function that transforms the data into a high-dimensional space, making the data linearly separable in that space29. Although KNN models are theoretically capable of handling non-linear problems, in practice, finding an appropriate k value and distance metric can be challenging30.
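The effect of dimensionality on distance-based methods can be illustrated numerically. The sketch below assumes points drawn uniformly at random from the unit hypercube, which is a textbook setting rather than the radiomic feature space of the study, and measures the relative contrast between the farthest and nearest neighbor of a query point; the function name `relative_contrast` is hypothetical.

```python
import math
import random

def relative_contrast(dim, n_points=200, seed=0):
    """(farthest - nearest) / nearest distance from a random query point."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = [math.dist(query, [rng.random() for _ in range(dim)])
             for _ in range(n_points)]
    return (max(dists) - min(dists)) / min(dists)

low = relative_contrast(2)
high = relative_contrast(1000)
print(low > high)  # True
```

As the contrast shrinks, the nearest neighbor is barely closer than the farthest one, so a vote among the k nearest points carries less information about class membership.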

The ensemble approach has often been used in machine learning for its capacity to merge the strengths of individual models, thereby potentially elevating the overall performance31. The aim of using an ensemble, especially a stacking ensemble, is to harmonize predictions from various models, resulting in a reduction of model-specific biases and errors and hence improving overall prediction accuracy32. The decision to use SVM as the meta-model for the stacked ensemble in this study is rooted in its repeated demonstration of superiority in individual predictions. This decision was further validated by the higher AUC values achieved by the stacking ensemble models across all tissue classifications.

This study also used multiple evaluation metrics, including specificity, sensitivity, NPV, PPV, accuracy, MCC, and the Youden index, to assess the differences in the capabilities of each model. These metrics provide a comprehensive view of a model’s performance from various perspectives, allowing for a more nuanced understanding of each model’s strengths and weaknesses. Specificity and sensitivity offer insight into the true negative and true positive rates, respectively, which is essential for understanding a model’s performance in binary classification problems33. NPV and PPV provide perspectives on the predictive values for negative and positive classifications34; accuracy provides a broad overview of the overall correctness of a model’s predictions35. Additionally, MCC offers a balanced evaluation, considering both the true and false rates of predictions36; lastly, the Youden index serves as a single statistic that captures the performance of a diagnostic test.
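All of these metrics follow from the four cells of a binary confusion matrix, as the sketch below shows. The counts are hypothetical and were chosen only so that the sensitivity is 37% and the specificity is 90%; they are not the counts from any table in this study.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
        "youden": sensitivity + specificity - 1,
    }

# Hypothetical counts: 100 positives, 200 negatives.
m = binary_metrics(tp=37, fp=20, tn=180, fn=63)
print({name: round(value, 3) for name, value in m.items()})
```

With these counts the accuracy is about 0.72 even though nearly two thirds of the positives are missed, which is why MCC and the Youden index, both of which penalize that imbalance, are reported alongside accuracy.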

By leveraging these diverse evaluation metrics, a holistic understanding of the models’ performance was obtained, ensuring that the assessments were not biased toward any specific aspect of the predictions37. This multi-metric approach allowed for a rigorous comparison of the models, highlighting those that performed consistently well across various aspects of classification and those that might have specific areas of strength or weakness. A finding of note was that in the classification problem between the brainstem and other tissues, the sensitivity of the DT model was only 37%, while its specificity reached 90%. This might be because during its construction, the DT model prioritized avoiding the misclassification of normal tissues as the brainstem and focused less on scenarios where the brainstem was misclassified as normal tissue. When confronted with imbalanced datasets, DT models often lean toward accurately predicting the dominant class38. Furthermore, there might be biases in the model’s feature selection, enhancing its ability to distinguish non-brainstem tissues while weakening its ability to recognize the brainstem.

The stacking ensemble model exhibited favorable performance across all three problems. This success can be attributed to the core principle of the stacking method: consolidating predictions from multiple models to produce an outcome39. This approach effectively harnesses the strengths of individual base models while circumventing potential flaws or errors that arise when a model is used in isolation. Especially when dealing with complex or highly non-linear data, the ensemble method offers added robustness40. Compared with a singular model, the stacking ensemble reduces the risk of overfitting by combining predictions from several models. Moreover, its commendable generalization capabilities are evident even after 1,000 resampling iterations using the bootstrap method.

In summary, the exemplary performance of the stacking ensemble model underscores the superiority of multi-model ensemble approaches in tackling intricate classification problems, providing compelling evidence for its adoption in similar future applications. This suggests that research relying solely on a single model is less reliable, as it is challenging for one model to identify the best approach for a particular classification problem. Some models may not be well suited to the problem, hindering their practical application. In different problem-solving scenarios, various algorithms exhibit distinct suitabilities.

This study’s successful classification of GTV, brainstem, and normal brain tissue of patients with brain metastases from a radiomic perspective provides a preliminary research reference for further studies on automatic precision segmentation based on radiomics. The findings will aid physicians in more accurately assessing tumor volume and location, subsequently improving radiotherapy planning. Such advancements have the potential to enhance patient survival and quality of life.

This study has several limitations. First, the ensemble model, although superior, might be computationally intensive and require a more extended training period. Second, as a bootstrap resampling method was used for internal validation of only one dataset, the extent of the model’s generalizability needs to be assessed. External validation using a different dataset would confirm the model’s generalizability. Despite its limitations, the findings of this study undoubtedly open new possibilities for the automatic identification and classification of brain metastases, further highlighting the immense potential of machine learning in medical image analysis. The methodology described here can be further applied to research on automatic segmentation of tumor target areas and organs at risk. Demonstration of the capacity to automatically classify brain metastases based on significant radiomic features can be applied to further identification of these pivotal radiomic parameters as alternatives to pixel values. The use of the approach described here eliminates the need for image registration across different images, precisely addressing the challenges of traditional target area auto-delineation, which heavily relies on the grayscale pixel values of patient images, leading to lower accuracy due to limited reference parameters.

By using nine distinct algorithms, radiomic classification models for the brainstem, normal brain tissue, and gross tumor volume of brain metastases were constructed. The stacking ensemble model consistently exhibited the best performance across all three classification tasks. This finding not only underscores the efficacy of ensemble methods in addressing complex classification challenges but also emphasizes the significance of leveraging multiple base models to harness their strengths. The proposed model holds great importance for future endeavors in assisting with the validation of accurate radiotherapy segmentation areas and aiding in precision segmentation.

All data generated and analyzed during this study are included in this published article.

Achrol, A. S. et al. Brain metastases. Nat. Rev. Dis. Primers. 5 (1), 5 (2019).

Gondi, V. et al. Radiation Therapy for Brain metastases: an ASTRO Clinical Practice Guideline. Pract. Radiat. Oncol. 12 (4), 265–282 (2022).

Shi, F. et al. Deep learning empowered volume delineation of whole-body organs-at-risk for accelerated radiotherapy. Nat. Commun. 13 (1), 6566 (2022).

Claessens, M. et al. Machine learning-based detection of aberrant deep learning segmentations of target and organs at risk for prostate radiotherapy using a secondary segmentation algorithm. Phys. Med. Biol. 67 (11), 10 (2022).

Han, Z., Jian, M. & Wang, G. G. ConvUNeXt: an efficient convolution neural network for medical image segmentation. Knowl. Based Syst. 253, 109512 (2022).

Ali, R. et al. Structural crack detection using deep convolutional neural networks. Autom. Constr. 133, 103989 (2022).

Mayerhoefer, M. E. et al. Introduction to radiomics. J. Nucl. Med. 61 (4), 488–495 (2020).

Zwanenburg, A. et al. The image Biomarker Standardization Initiative: standardized quantitative Radiomics for High-Throughput Image-based phenotyping. Radiology 295 (2), 328–338 (2020).

Hussain, S. F. & Ashraf, M. M. A novel one-vs-rest consensus learning method for crash severity prediction. Exp. Sys Appl. 228, 120443 (2023).

Luo, X. et al. Multi-Classification Data Stream Algorithm Based on One-Vs-Rest Strategy. In 2023 3rd International Conference on Artificial Intelligence, Aut and Alg. ;21:66–72. (2023).

Mao, N. et al. Intratumoral and peritumoral radiomics for preoperative prediction of neoadjuvant chemotherapy effect in breast cancer based on contrast-enhanced spectral mammography. Eur. Radiol. 32 (5), 3207–3219 (2022).

Hou, J. et al. MRI-based radiomics nomogram for predicting temporal lobe injury after radiotherapy in nasopharyngeal carcinoma. Eur. Radiol. 32 (2), 1106–1114 (2022).

Elhadad, A., Jamjoom, M. & Abulkasim, H. Reduction of NIFTI files storage and compression to facilitate telemedicine services based on quantization hiding of downsampling approach. Sci. Rep. 14 (1), 5168 (2024).

Huang, Y. et al. Longitudinal MRI-based fusion novel model predicts pathological complete response in breast cancer treated with neoadjuvant chemotherapy: a multicenter, retrospective study. EClinicalMedicine 58, 101899 (2023).

Chen, W. et al. Screening diagnostic markers for acute myeloid leukemia based on bioinformatics analysis. Tra Can. Res. 11 (6), 1722 (2022).

Khan, S. U. et al. A machine learning-based approach for the segmentation and classification of malignant cells in breast cytology images using gray level co-occurrence matrix (GLCM) and support vector machine (SVM). Neu Com. Appl. 1, 1–8 (2022).

Uddin, S., Haque, I., Lu, H., Moni, M. A. & Gide, E. Comparative performance analysis of K-nearest neighbour (KNN) algorithm and its different variants for disease prediction. Sci. Rep. 12 (1), 6256 (2022).

Patange, A. D., Pardeshi, S. S., Jegadeeshwaran, R., Zarkar, A. & Verma, K. Augmentation of decision tree model through hyper-parameters tuning for monitoring of cutting tool faults based on vibration signatures. Jou Vib. Eng. Tec. 23, 1–9 (2022).

Peng, Y. et al. Deep learning and machine learning predictive models for neurological function after interventional embolization of intracranial aneurysms. Front. Neurol. 15, 1321923 (2024).

Probst, P., Wright, M. N. & Boulesteix, A. L. Hyperparameters and tuning strategies for random forest. Wiley Interdisciplinary Reviews: Data Min. Knowl. Discovery. 9 (3), e1301 (2019).

Van Echelpoel, W. & Goethals, P. L. M. Variable importance for sustaining macrophyte presence via random forests: data imputation and model settings. Sci. Rep. 8 (1), 14557 (2018).

Tavakoli, E. B., Beygi, A. & Yao, X. RPkNN: an OpenCL-Based FPGA implementation of the dimensionality-reduced kNN algorithm using Random Projection. IEEE Trans. Very Large Scale Integr. VLSI Syst. 30 (4), 549–552 (2022).

Jiang, X., Kong, X. & Ge, Z. Augmented Industrial Data-Driven modeling under the curse of dimensionality. IEEE/CAA J. Automatica Sinica. 10 (6), 1445–1461 (2023).

Rao, C. S. & Karunakara, K. Efficient detection and classification of brain tumor using kernel based SVM for MRI. Mul Tools Appl. (5):7393–7417. (2022).

Shi, X. et al. Application of the gaussian process regression method based on a combined Kernel function in Engine Performance Prediction. ACS Omega. 7 (45), 41732–41743 (2022).

Article PubMed PubMed Central CAS Google Scholar

Ayyad, S. M., Saleh, A. I. & Labib, L. M. Gene expression cancer classification using modified K-Nearest neighbors technique. Biosystems 176, 41–51 (2019).

Article PubMed CAS Google Scholar

Aboneh, T., Rorissa, A. & Srinivasagan, R. Stacking-based ensemble learning method for multi-spectral image classification. Technologies 10 (1), 17 (2022).

Cao, H. et al. Application of stacking ensemble learning model in quantitative analysis of biomaterial activity. Mic J. 183, 108075 (2022).

Isbel, L., Grand, R. S. & Schübeler, D. Generating specificity in genome regulation through transcription factor sensitivity to chromatin. Nat. Rev. Genet. 23 (12), 728–740 (2022).

Article PubMed CAS Google Scholar

Wong, H. B. & Lim, G. H. Measures of diagnostic accuracy: sensitivity, specificity, PPV and NPV. Pro Sin Healthc. 20 (4), 316–318 (2011).

Swets, J. A. Measuring the accuracy of diagnostic systems. Science 240 (4857), 1285–1293 (1988).

Article ADS MathSciNet PubMed CAS Google Scholar

Chicco, D. & Jurman, G. The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation. BMC Genom. 21 (1), 6 (2020).

Schisterman, E. F. et al. Optimal cut-point and its corresponding Youden Index to discriminate individuals using pooled blood samples. Epidemiology 16 (1), 73–81 (2005).

Goeminne, L. J., Gevaert, K. & Clement, L. Peptide-level Robust Ridge Regression improves estimation, sensitivity, and specificity in Data-dependent quantitative label-free Shotgun Proteomics. Mol. Cell. Proteom. 15 (2), 657–668 (2016).

Pavlyshenko, B. Using stacking approaches for machine learning models. IEEE ;255–258. (2018).

Książek, W. et al. Development of novel ensemble model using stacking learning and evolutionary computation techniques for automated hepatocellular carcinoma detection. Bio Bio Eng. 40 (4), 1512–1524 (2020).

Bhinder, B. et al. Artificial intelligence in cancer research and precision medicine. Cancer Discov. 11 (4), 900–915 (2021).

Article PubMed PubMed Central CAS Google Scholar

Mesko, B. The role of artificial intelligence in precision medicine. Expert Rev. Precision Med. Drug Dev. 2 (5), 239–241 (2017).

Mishra, A. K. et al. Breast ultrasound tumour classification: a machine learning—Radiomics based approach. Expert Syst. 38 (7), e12713 (2021).

Gitto, S. et al. CT radiomics-based machine learning classification of atypical cartilaginous tumours and appendicular chondrosarcomas. EBioMedicine 68, 103407 (2021).

Article PubMed PubMed Central Google Scholar

The graphical abstract was created with biorender.com.

We acknowledge funding from the Open Fund for Scientific Research of Jiangxi Cancer Hospital (No. 2021J15), the Sichuan Science and Technology Program (No. 2022YFS0616), the Gulin County People’s Hospital-The Affiliated Hospital of Southwest Medical University Science and Technology Strategic Cooperation Project (No. 2022GLXNYDFY05), the Sichuan Provincial Medical Research Project Plan (No. S21004), the Key-funded Project of the National College Student Innovation and Entrepreneurship Training Program (No. 202310632001), the National College Student Innovation and Entrepreneurship Training Program (Nos. 202310632028 and S202410632165X), and the Provincial University Innovation and Entrepreneurship Training Program (No. S202210632248).

These authors contributed equally: Huai-wen Zhang, Yi-ren Wang, Bo Hu and Bo Song.

Department of Radiotherapy, The Second Affiliated Hospital of Nanchang Medical College, Jiangxi Clinical Research Center for Cancer, Jiangxi Cancer Hospital, Nanchang, 330029, China

Huai-wen Zhang, Xi Wang & Xiao-ming Zhong

Department of Neurosurgery, Jingdezhen No.1 People’s Hospital, Jingdezhen, 333000, China

School of Nursing, Southwest Medical University, Luzhou, 646000, China

Yi-ren Wang, Zhong-jian Wen & Xiao-man Chen

Wound Healing Basic Research and Clinical Application Key Laboratory of Luzhou, Southwest Medical University, Luzhou, 646000, China

Key Laboratory of Nondestructive Testing (Ministry of Education), Nanchang Hang Kong University, Nanchang, 330063, China

School of Medical Information and Engineering, Southwest Medical University, Luzhou, 646000, China

Department of Nursing, The Affiliated Hospital of Southwest Medical University, Luzhou, 646000, China

Department of Radiology, The Affiliated Hospital of Southwest Medical University, Luzhou, 646000, China

Department of Oncology, The Affiliated Hospital of Southwest Medical University, Luzhou, 646000, China

Department of Oncology, Gulin County People’s Hospital, Gulin, 646500, China

Contributions: (I) Conception and design: Hao-wen Pang, You-hua Wang, Huai-wen Zhang, Yi-ren Wang; (II) Administrative support: Hao-wen Pang, Ping Zhou, Huai-wen Zhang; (III) Provision of study materials or patients: Huai-wen Zhang, Hao-wen Pang, Xiao-ming Zhong, Bo Song, Xi Wang; (IV) Collection and assembly of data: Yi-ren Wang, Zhong-jian Wen, Bo Hu, Xiao-man Chen; (V) Data analysis and interpretation: You-hua Wang, Yi-ren Wang, Zhong-jian Wen, Bo Hu; (VI) Manuscript writing: All authors; (VII) Final approval of manuscript: All authors.

Correspondence to Huai-wen Zhang, Xiao-ming Zhong, Hao-wen Pang or You-hua Wang.

The study was performed in accordance with the Declaration of Helsinki and was approved by the Ethics Committee of Jiangxi Cancer Hospital (ethics number: 2023KY082). Written informed consent was obtained from all patients prior to treatment.

Consent for publication is not applicable to this study because it contains no individual person’s data.

The authors declare no competing interests.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

Zhang, Hw., Wang, Yr., Hu, B. et al. Using machine learning to develop a stacking ensemble learning model for the CT radiomics classification of brain metastases. Sci Rep 14, 28575 (2024). https://doi.org/10.1038/s41598-024-80210-x

Scientific Reports (Sci Rep) ISSN 2045-2322 (online)
