Generative models improve radiomics performance in different tasks and different datasets: An experimental study

  • Junhua Chen
    Correspondence
    Corresponding author at: Department of Radiation Oncology (MAASTRO), GROW School for Oncology and Developmental Biology, Maastricht University Medical Centre+, Maastricht 6229 ET, Netherlands.
    Affiliations
    Department of Radiation Oncology (MAASTRO), GROW School for Oncology and Developmental Biology, Maastricht University Medical Centre+, Maastricht 6229 ET, Netherlands
  • Inigo Bermejo
    Affiliations
    Department of Radiation Oncology (MAASTRO), GROW School for Oncology and Developmental Biology, Maastricht University Medical Centre+, Maastricht 6229 ET, Netherlands
  • Andre Dekker
    Affiliations
    Department of Radiation Oncology (MAASTRO), GROW School for Oncology and Developmental Biology, Maastricht University Medical Centre+, Maastricht 6229 ET, Netherlands
  • Leonard Wee
    Affiliations
    Department of Radiation Oncology (MAASTRO), GROW School for Oncology and Developmental Biology, Maastricht University Medical Centre+, Maastricht 6229 ET, Netherlands
Open Access. Published: April 22, 2022. DOI: https://doi.org/10.1016/j.ejmp.2022.04.008

      Highlights

      • Generative models can improve radiomics performance in different tasks when radiomic features are extracted from low dose CTs.
      • Generative models trained on simulated paired low dose and high dose CTs can be used to denoise low dose CTs without re-training.
      • Generative models can improve the AUC of radiomics-based models by 0.05 in survival prediction and lung cancer diagnosis.
      • Denoising using generative models seems to be a necessary pre-processing step for radiomic features from low dose CTs.

      Abstract

      Purpose

      Radiomics is an active area of research focusing on high throughput feature extraction from medical images with a wide array of applications in clinical practice, such as clinical decision support in oncology. However, noise in low dose computed tomography (CT) scans can impair the accurate extraction of radiomic features. In this article, we investigate the possibility of using deep learning generative models to improve the performance of radiomics from low dose CTs.

      Methods

      We used two datasets of low dose CT scans – NSCLC Radiogenomics and LIDC-IDRI – as test datasets for two tasks – pre-treatment survival prediction and lung cancer diagnosis. We used encoder-decoder networks and conditional generative adversarial networks (CGANs) trained in a previous study as generative models to transform low dose CT images into full dose CT images. Radiomic features extracted from the original and improved CT scans were used to build two classifiers – a support vector machine (SVM) and a deep attention based multiple instance learning model – for survival prediction and lung cancer diagnosis respectively. Finally, we compared the performance of the models derived from the original and improved CT scans.

      Results

      Denoising with the encoder-decoder network and the CGAN improved the area under the curve (AUC) of survival prediction from 0.52 to 0.57 (p-value < 0.01). For lung cancer diagnosis, the encoder-decoder network and the CGAN improved the AUC from 0.84 to 0.88 and 0.89, respectively (p-value < 0.01). Finally, there was no statistically significant difference in AUC between the encoder-decoder network and the CGAN (p-value = 0.34) when the networks were trained for 75 and 100 epochs.

      Conclusion

      Generative models can improve the performance of low dose CT-based radiomics in different tasks. Hence, denoising using generative models seems to be a necessary pre-processing step for calculating radiomic features from low dose CTs.

      Introduction

      Recent years have seen a dramatic increase in the applications of artificial intelligence in medical imaging [
      • Avanzo M.
      • Porzio M.
      • Lorenzon L.
      • Milan L.
      • Sghedoni R.
      • Russo G.
      • et al.
      Artificial intelligence applications in medical imaging: a review of the medical physics research in Italy.
      ]. Radiomics, [
      • Aerts H.J.W.L.
      • Velazquez E.R.
      • Leijenaar R.T.H.
      • Parmar C.
      • Grossmann P.
      • Carvalho S.
      • et al.
      Decoding tumour phenotype by noninvasive imaging using a quantitative radiomics approach.
      ] for example, has been applied to clinical decision support in oncology in a range of cancers (lung cancer, [
      • Desseroit M.-C.
      • Tixier F.
      • Weber W.A.
      • Siegel B.A.
      • Rest C.C.L.
      • Visvikis D.
      • et al.
      Reliability of PET/CT shape and heterogeneity features in functional and morphologic components of non–small cell lung cancer tumors: a repeatability analysis in a prospective multicenter cohort.
      ] head and neck cancer, [
      • Bogowicz M.
      • Riesterer O.
      • Bundschuh R.A.
      • Veit-Haibach P.
      • Hüllner M.
      • Studer G.
      • et al.
      Stability of radiomic features in CT perfusion maps.
      ] rectal cancer [
      • Tixier F.
      • Hatt M.
      • Le Rest C.C.
      • Le Pogam A.
      • Corcos L.
      • Visvikis D.
      Reproducibility of tumor uptake heterogeneity characterization through textural feature analysis in 18 F-FDG PET.
      ]), multiple medical imaging modalities (computed tomography (CT), [
      • Bogowicz M.
      • Riesterer O.
      • Bundschuh R.A.
      • Veit-Haibach P.
      • Hüllner M.
      • Studer G.
      • et al.
      Stability of radiomic features in CT perfusion maps.
      ] magnetic resonance imaging (MRI), [
      • Zhang B.
      • Tian J.
      • Dong D.
      • Gu D.
      • Dong Y.
      • Zhang L.
      • et al.
      Radiomics features of multiparametric MRI as novel prognostic factors in advanced nasopharyngeal carcinoma.
      ] and positron emission tomography (PET)), [
      • Desseroit M.-C.
      • Tixier F.
      • Weber W.A.
      • Siegel B.A.
      • Rest C.C.L.
      • Visvikis D.
      • et al.
      Reliability of PET/CT shape and heterogeneity features in functional and morphologic components of non–small cell lung cancer tumors: a repeatability analysis in a prospective multicenter cohort.
      ] and applications, such as deriving prognostic models to measure therapeutic plan efficiency [
      • Comes M.C.
      • Fanizzi A.
      • Bove S.
      • Didonna V.
      • Diotaiuti S.
      • La Forgia D.
      • et al.
      Early prediction of neoadjuvant chemotherapy response by exploiting a transfer learning approach on breast DCE-MRIs.
      ,

      Comes, Maria Colomba, Daniele La Forgia, Vittorio Didonna, Annarita Fanizzi, Francesco Giotta, Agnese Latorre, Eugenio Martinelli et al., Early prediction of breast cancer recurrence for patients treated with neoadjuvant chemotherapy: a transfer learning approach on DCE-MRIs. Cancers 13, no. 10 (2021): 2298. https://doi.org/10.3390/cancers13102298.

      ,

      La Forgia, Daniele, Angela Vestito, Maurilia Lasciarrea, Maria Colomba Comes, Sergio Diotaiuti, Francesco Giotta, Agnese Latorre et al., Response predictivity to neoadjuvant therapies in breast cancer: A qualitative analysis of background parenchymal enhancement in DCE-MRI. J. Pers. Med 11, no. 4 (2021): 256. https://doi.org/10.3390/jpm11040256.

      ]. Radiomics has also garnered attention in the field of radiotherapy, where it is known as dosiomics [

      Placidi, Lorenzo, Eliana Gioscio, Cristina Garibaldi, Tiziana Rancati, Annarita Fanizzi, Davide Maestri, Raffaella Massafra et al., A multicentre evaluation of dosiomics features reproducibility, stability and sensitivity. Cancers 13, no. 15 (2021): 3835. https://doi.org/10.3390/cancers13153835.

      ].
      Following the ALARA (As Low As Reasonably Achievable) principle [

      Musolino, Stephen V., Joseph DeFranco, and Richard Schlueck. “The ALARA principle in the context of a radiological or nuclear emergency.” Health Phys. 94 (2) (2008): 109–111. https://doi.org/10.1097/01.HP.0000285801.87304.3f.

      ], low dose CT has become popular as the preferred imaging method for screening and monitoring populations at risk [

      Bi, Wenya Linda, Ahmed Hosny, Matthew B. Schabath, Maryellen L. Giger, Nicolai J. Birkbak, Alireza Mehrtash, Tavis Allison et al., Artificial intelligence in cancer imaging: clinical challenges and applications. Ca-Cancer J. Clin 69, no. 2 (2019): 127–157. https://doi.org/10.3322/caac.21552.

      ]. As a tradeoff for the low radiation exposure, the image quality of low dose CTs is inferior to that of full dose CTs because of their higher noise levels. Radiomics applied to low dose CT has already been shown to improve the accuracy of pulmonary nodule analysis for early detection during lung cancer screening [
      • Gillies R.J.
      • Schabath M.B.
      Radiomics improves cancer screening and early detection.
      ,
      • Choi W.
      • Oh J.H.
      • Riyahi S.
      • Liu C.-J.
      • Jiang F.
      • Chen W.
      • et al.
      Radiomics analysis of pulmonary nodules in low-dose CT for early detection of lung cancer.
      ]. In addition, different studies have shown the potential of radiomics on low dose CT for survival prediction [
      • Homayounieh F.
      • Yan P.
      • Digumarthy S.R.
      • Kruger U.
      • Wang G.
      • Kalra M.K.
      Prediction of coronary calcification and stenosis: role of radiomics from Low-Dose CT.
      ,
      • van Timmeren J.E.
      • Leijenaar R.T.H.
      • van Elmpt W.
      • Reymen B.
      • Oberije C.
      • Monshouwer R.
      • et al.
      Survival prediction of non-small cell lung cancer patients using radiomics analyses of cone-beam CT images.
      ,
      • Nawa T.
      • Nakagawa T.
      • Mizoue T.
      • Kusano S.
      • Chonan T.
      • Fukai S.
      • et al.
      Long-term prognosis of patients with lung cancer detected on low-dose chest computed tomography screening.
      ,
      • Ayati N.
      • Lee S.T.
      • Zakavi S.R.
      • Cheng M.
      • Lau W.F.E.
      • Parakh S.
      • et al.
      Response evaluation and survival prediction after PD-1 immunotherapy in patients with non–small cell lung cancer: comparison of assessment methods.
      ]. However, image quality and noise impact the repeatability and reproducibility of radiomic features [
      • Traverso A.
      • Wee L.
      • Dekker A.
      • Gillies R.
      Repeatability and reproducibility of radiomic features: a systematic review.
      ] as well as their robustness [
      • Bagher-Ebadian H.
      • Siddiqui F.
      • Liu C.
      • Movsas B.
      • Chetty I.J.
      On the impact of smoothing and noise on robustness of CT and CBCT radiomics features for patients with head and neck cancers.
      ]. In other words, radiomic features extracted from low dose CTs are less reliable than their counterparts extracted from full dose CTs. Therefore, prediction models or computer aided diagnosis systems based on radiomic features from low dose CTs will likely be less robust and accurate than those based on radiomic features from full dose CTs. Improving the performance of radiomics calculated from low dose CT in different tasks and datasets is therefore a timely and potentially impactful research topic.
      One approach is to denoise low dose CT scans [
      • Kelm Z.S.
      • Blezek D.
      • Bartholmai B.
      • Erickson B.J.
      Optimizing non-local means for denoising low dose CT.
      ] and to recalculate the radiomic features based on the denoised CT. The aim of this article is to answer the question: should we regard denoising as a preprocessing step for radiomic feature extraction from low dose CT? Image denoising can be regarded as a special case of domain adaptation [

      Chen, Minmin, Zhixiang Xu, Kilian Weinberger, and Fei Sha. Marginalized denoising autoencoders for domain adaptation. arXiv preprint arXiv:1206.4683 (2012).

      ], from low dose CT images to full dose style CT images [
      • Yang Q.
      • Yan P.
      • Zhang Y.
      • Yu H.
      • Shi Y.
      • Mou X.
      • et al.
      Low-dose CT image denoising using a generative adversarial network with Wasserstein distance and perceptual loss.
      ]. Many methods have been proposed to perform this transformation [
      • Sharma A.
      • Chaurasia V.
      A review on magnetic resonance images denoising techniques.
      ,

      Kollem, Sreedhar, Katta Rama Linga Reddy, and Duggirala Srinivasa Rao. “A review of image denoising and segmentation methods based on medical images.” Int. J. Mach. Learn. Comput. 9, (3) (2019): 288–295. https://doi.org/10.18178/ijmlc.2019.9.3.800.

      ], but recently deep learning [
      • LeCun Y.
      • Bengio Y.
      • Hinton G.
      Deep learning.
      ] based generative models have garnered special attention and achieved state-of-the-art results [
      • Shan H.
      • Zhang Y.
      • Yang Q.
      • Kruger U.
      • Kalra M.K.
      • Sun L.
      • et al.
      3-D convolutional encoder-decoder network for low-dose CT via transfer learning from a 2-D trained network.
      ,

      Chen, Hu, Yi Zhang, Mannudeep K. Kalra, Feng Lin, Yang Chen, Peixi Liao, Jiliu Zhou, and Ge Wang. Low-dose CT with a residual encoder-decoder convolutional neural network. IEEE Trans. Med. Imaging 36, no. 12 (2017): 2524-2535. https://doi.org/10.1109/TMI.2017.2715284.

      ,
      • Kang E.
      • Chang W.
      • Yoo J.
      • Ye J.C.
      Deep convolutional framelet denosing for low-dose CT via wavelet residual network.
      ]. We will use generative models to denoise low dose CT scans and improve the reliability of radiomic features [
      • Lucia F.
      • Visvikis D.
      • Desseroit M.-C.
      • Miranda O.
      • Malhaire J.-P.
      • Robin P.
      • et al.
      Prediction of outcome using pretreatment 18F-FDG PET/CT and MRI radiomics in locally advanced cervical cancer treated with chemoradiotherapy.
      ,

      Parmar, Chintan, Ralph TH Leijenaar, Patrick Grossmann, Emmanuel Rios Velazquez, Johan Bussink, Derek Rietveld, Michelle M. Rietbergen, Benjamin Haibe-Kains, Philippe Lambin, and Hugo JWL Aerts. Radiomic feature clusters and prognostic signatures specific for lung and head & neck cancer. Sci. Rep. 5, no. 1 (2015): 1–10. https://doi.org/10.1038/srep11044.

      ].
      In addition, we will explore whether more reliable radiomic features result in models with better performance using two real applications of radiomics: pre-treatment survival prediction [
      • Avanzo M.
      • Porzio M.
      • Lorenzon L.
      • Milan L.
      • Sghedoni R.
      • Russo G.
      • et al.
      Artificial intelligence applications in medical imaging: a review of the medical physics research in Italy.
      ] and cancer diagnosis [
      • Chen C.-H.
      • Chang C.-K.
      • Tu C.-Y.
      • Liao W.-C.
      • Wu B.-R.
      • Chou K.-T.
      • et al.
      Radiomic features analysis in computed tomography images of lung nodule classification.
      ,
      • Li H.
      • Zhu Y.
      • Burnside E.S.
      • Huang E.
      • Drukker K.
      • Hoadley K.A.
      • et al.
      Quantitative MRI radiomics in the prediction of molecular classifications of breast cancer subtypes in the TCGA/TCIA data set.
      ]. The cancer diagnosis task will be based on [

      Chen, Junhua, Zeng, Haiyan, Zhang, Cong, et al. “Lung cancer diagnosis using deep attention based multiple instance learning and radiomics.” Med. Phys.. 2022; 00: 00- 00. https://doi.org/10.1002/mp.15539.

      ], in which lung cancer diagnosis was approached as a multiple instance learning (MIL) problem [
      • Maron O.
      • Lozano-Pérez T.
      A framework for multiple-instance learning.
      ] where nodules in each CT scan were regarded as instances. The authors used radiomic features as the input and deep attention based MIL [
      • Ilse M.
      • Tomczak J.
      • Welling M.
      Attention-based deep multiple instance learning.
      ] as the MIL problem solver for the sake of interpretability. The authors reported a mean precision of 0.807 with a standard error of the mean (SEM) of 0.069, a recall of 0.870 (SEM 0.061), and an area under the curve (AUC) of 0.842 (SEM 0.074) by using this method.
      The work most closely related to this article is [
      • Chen J.
      • Zhang C.
      • Traverso A.
      • Zhovannik I.
      • Dekker A.
      • Wee L.
      • et al.
      Generative models improve radiomics reproducibility in low dose CTs: a simulation study.
      ], where the authors trained three generative models – encoder-decoder networks [

      Chen, Hu, Yi Zhang, Mannudeep K. Kalra, Feng Lin, Yang Chen, Peixi Liao, Jiliu Zhou, and Ge Wang. Low-dose CT with a residual encoder-decoder convolutional neural network. IEEE Trans. Med. Imaging 36, no. 12 (2017): 2524-2535. https://doi.org/10.1109/TMI.2017.2715284.

      ], conditional generative adversarial networks (CGANs) [
      • Isola P.
      • Zhu J.-Y.
      • Zhou T.
      • Efros A.A.
      Image-to-image translation with conditional adversarial networks.
      ] and cycle GANs [
      • Zhu J.-Y.
      • Park T.
      • Isola P.
      • Efros A.A.
      Unpaired image-to-image translation using cycle-consistent adversarial networks.
      ] – using full dose CTs and simulated paired high-noise low dose CTs. Finally, they showed that radiomic features extracted from low dose CT scans (low-noise CT and high-noise CT) denoised by the models had improved reproducibility. The main differences between [
      • Chen J.
      • Zhang C.
      • Traverso A.
      • Zhovannik I.
      • Dekker A.
      • Wee L.
      • et al.
      Generative models improve radiomics reproducibility in low dose CTs: a simulation study.
      ] and this article are that: 1) we use pre-trained generative models; 2) we use real (not simulated) low dose CTs; and 3) we focus on the improvement in radiomics-based model performance instead of feature reproducibility.
      To the best of the authors’ knowledge, this is the first effort to improve the performance of radiomics-based models built on features extracted from low dose CT scans. Source code, radiomic features, data for statistical analysis and supplementary materials for this article are available online at https://gitlab.com/UM-CDS/low-dose-ct-denoising/-/tree/Experimental_Study.

      Methods

      Institutional Review Board approval was not applicable for this study, since the primary source of data was an open access collection on The Cancer Imaging Archive (National Institutes of Health) [
      • Clark K.
      • Vendt B.
      • Smith K.
      • Freymann J.
      • Kirby J.
      • Koppel P.
      • et al.
      The cancer imaging archive (TCIA): maintaining and operating a public information repository.
      ] and all patients’ personal information had been removed from CT scans. This dataset has been used for this study in accordance with the Creative Commons Attribution-NonCommercial 3.0 Unported (CC BY-NC) conditions. The flowchart in Fig. 1 summarizes our study methodology.

      Denoising models’ development

      Based on [
      • Chen J.
      • Zhang C.
      • Traverso A.
      • Zhovannik I.
      • Dekker A.
      • Wee L.
      • et al.
      Generative models improve radiomics reproducibility in low dose CTs: a simulation study.
      ], we selected two generative models – encoder-decoder networks and CGANs – that achieved good performance in improving radiomics reproducibility as the experimental models for this study. Moreover, we used the same encoder-decoder network and CGAN architectures presented in [
      • Chen J.
      • Zhang C.
      • Traverso A.
      • Zhovannik I.
      • Dekker A.
      • Wee L.
      • et al.
      Generative models improve radiomics reproducibility in low dose CTs: a simulation study.
      ].
      Training of encoder-decoder networks and CGANs requires paired low dose and full dose versions of the same CT scan. Although there is an open access dataset containing such scans [
      • McCollough C.H.
      • Bartley A.C.
      • Carter R.E.
      • Chen B.
      • Drees T.A.
      • Edwards P.
      • et al.
      Low-dose CT for the detection and classification of metastatic liver lesions: Results of the 2016 Low Dose CT Grand Challenge.
      ], the exposure of the low dose CT scans in that dataset, 50 mA-seconds (mAs), is higher than in many low dose CT scanning situations. For example, CT scans in the non-small cell lung cancer (NSCLC) Radiogenomics dataset were acquired with exposures ranging from 1 to 400 mAs [
      • Bakr S.
      • Gevaert O.
      • Echegaray S.
      • Ayers K.
      • Zhou M.
      • Shafiq M.
      • et al.
      A radiogenomic dataset of non-small cell lung cancer.
      ] and over half of the CT images were acquired with an exposure of 5 mAs or lower. Models trained on the dataset described in [
      • McCollough C.H.
      • Bartley A.C.
      • Carter R.E.
      • Chen B.
      • Drees T.A.
      • Edwards P.
      • et al.
      Low-dose CT for the detection and classification of metastatic liver lesions: Results of the 2016 Low Dose CT Grand Challenge.
      ] may perform poorly on CT scans acquired at much lower exposures. The noise power of the high-noise images (used to train the models) in [
      • Chen J.
      • Zhang C.
      • Traverso A.
      • Zhovannik I.
      • Dekker A.
      • Wee L.
      • et al.
      Generative models improve radiomics reproducibility in low dose CTs: a simulation study.
      ] is 25 times that in [
      • McCollough C.H.
      • Bartley A.C.
      • Carter R.E.
      • Chen B.
      • Drees T.A.
      • Edwards P.
      • et al.
      Low-dose CT for the detection and classification of metastatic liver lesions: Results of the 2016 Low Dose CT Grand Challenge.
      ]. For this reason, we used trained models from [
      • Chen J.
      • Zhang C.
      • Traverso A.
      • Zhovannik I.
      • Dekker A.
      • Wee L.
      • et al.
      Generative models improve radiomics reproducibility in low dose CTs: a simulation study.
      ] without re-training to denoise low dose CT images. The source code and pre-trained models can be found at https://gitlab.com/UM-CDS/low-dose-ct-denoising/.

      Data acquisition

      As mentioned in the Introduction, we will apply pretrained generative models to improve the performance of low dose CT radiomics-based models in two tasks: pre-treatment survival prediction and lung cancer diagnosis. For this purpose, we chose the NSCLC Radiogenomics dataset [
      • Bakr S.
      • Gevaert O.
      • Echegaray S.
      • Ayers K.
      • Zhou M.
      • Shafiq M.
      • et al.
      A radiogenomic dataset of non-small cell lung cancer.
      ] for survival prediction and the Lung Image Database Consortium image collection (LIDC-IDRI) for lung cancer diagnosis [
      • Armato S.G.
      • McLennan G.
      • Bidaut L.
      • McNitt-Gray M.F.
      • Meyer C.R.
      • Reeves A.P.
      • et al.
      The lung image database consortium (LIDC) and image database resource initiative (IDRI): a completed reference database of lung nodules on CT Scans: The LIDC/IDRI thoracic CT database of lung nodules.
      ], because they contain the necessary mask of the region of interest (ROI) for calculating the radiomics features and the images were scanned with low radiation exposure.
      NSCLC Radiogenomics is a unique radiogenomic dataset from a cohort of 211 patients with NSCLC [
      • Mazurowski M.A.
      Radiogenomics: what it is and why it is important.
      ], from which we used low dose CT images, their respective segmentation masks and clinical data for survival prediction. The lung image database consortium and image database resource initiative (LIDC-IDRI) dataset contains 1018 clinical chest CT scans, along with diagnoses for 157 patients. We used the diagnoses and their respective CT scans for the lung cancer diagnosis task. Finally, 106 samples from NSCLC Radiogenomics were selected for survival prediction and 110 samples from LIDC-IDRI for lung cancer diagnosis. The indices of the selected samples can be found in Supplementary Tables 1 and 2. The average radiation exposure of the selected samples was 38.65±81.97 mAs (±=SEM) in NSCLC Radiogenomics and 145.79±174.57 mAs in LIDC-IDRI. The distributions of radiation exposure for the two datasets are shown in Supplementary Fig. 1.

      Extraction of radiomic features

      Before extracting radiomic features, we normalized the Hounsfield Unit (HU) value range of the CT images: any pixel with an HU value larger than 1000 was set to 1000, and the resulting images were then passed to feature extraction.
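      The upper clipping step described above can be sketched as follows. This is a minimal illustration in plain Python, not the authors' code: the function name `clip_hu_upper` and the toy slice are hypothetical, and the image is represented as a nested list for self-containment. With NumPy arrays, `numpy.minimum(image, 1000)` would perform the same operation.

```python
def clip_hu_upper(slice_hu, upper=1000):
    """Return a copy of a 2D CT slice with HU values above `upper` set to `upper`."""
    return [[min(value, upper) for value in row] for row in slice_hu]


# Toy 2x3 slice in Hounsfield units (illustrative values, not from the datasets).
toy_slice = [[-1000, 40, 1000],
             [1500, 3071, 999]]

# Values above 1000 are capped; everything else is left unchanged.
clipped = clip_hu_upper(toy_slice)
```

      Only the upper end of the range is capped here, matching the description in the text; values below 1000 HU, including air at around -1000 HU, pass through untouched.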
      The masks of the ROIs (tumors) are stored in DICOM format in NSCLC Radiogenomics, whilst the segmentation of each nodule is stored in an XML file in the LIDC-IDRI dataset. The 3D masks for the corresponding ROIs (tumors or nodules) were reconstructed from these files. We used pyradiomics [
      • van Griethuysen J.J.M.
      • Fedorov A.
      • Parmar C.
      • Hosny A.
      • Aucoin N.
      • Narayan V.
      • et al.
      Computational radiomics system to decode the radiographic phenotype.
      ] (version 2.2.0) to calculate 103 radiomic features for further analysis. All features included in the analyses are listed in the Supplementary Table 3.

      Radiomics based models’ development

      One of the main tasks in the seminal article on radiomics by Aerts et al. [
      • Avanzo M.
      • Porzio M.
      • Lorenzon L.
      • Milan L.
      • Sghedoni R.
      • Russo G.
      • et al.
      Artificial intelligence applications in medical imaging: a review of the medical physics research in Italy.
      ] is survival prediction. For pre-treatment prediction of survival at 4 years, we used least squares support vector machines (SVMs) [
      • Suykens J.AK.
      • Vandewalle J.
      Least squares support vector machine classifiers.
      ] with Radial Basis Function (RBF) Kernel as our classifier. SVMs use regularization to prevent overfitting when the number of input variables is high [
      • Xu H.
      • Caramanis C.
      • Mannor S.
      Robustness and regularization of support vector machines.
      ]. The input variables for the classifier were age and the 103 radiomic features extracted from the tumor.
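      To make the classifier concrete, the sketch below trains a least squares SVM with an RBF kernel by solving the linear system from the Suykens and Vandewalle formulation cited above. It is an illustration under stated assumptions, not the implementation used in this study: the function names, the hyperparameter values (`regularization`, `sigma`) and the two-feature toy data are all hypothetical, whereas the study used age plus 103 radiomic features.

```python
import math


def rbf_kernel(a, b, sigma=1.0):
    """RBF kernel exp(-||a - b||^2 / (2 * sigma^2))."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_ls_svm(X, y, regularization=10.0, sigma=1.0):
    """Solve [[0, y^T], [y, Omega + I/regularization]] [b; alpha] = [0; 1]."""
    n = len(X)
    system = [[0.0] + [float(label) for label in y]]
    for i in range(n):
        row = [float(y[i])]
        for j in range(n):
            omega = y[i] * y[j] * rbf_kernel(X[i], X[j], sigma)
            row.append(omega + (1.0 / regularization if i == j else 0.0))
        system.append(row)
    solution = solve_linear_system(system, [0.0] + [1.0] * n)
    return solution[0], solution[1:]  # bias b, multipliers alpha


def predict_ls_svm(x, X, y, bias, alpha, sigma=1.0):
    """Classify x as +1 or -1 from the sign of the kernel expansion."""
    score = bias + sum(a * yi * rbf_kernel(x, xi, sigma)
                       for a, yi, xi in zip(alpha, y, X))
    return 1 if score >= 0 else -1


# Toy XOR-like data with labels in {+1, -1} (illustrative only).
X_toy = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]]
y_toy = [1, 1, -1, -1]
bias, alpha = fit_ls_svm(X_toy, y_toy)
predictions = [predict_ls_svm(x, X_toy, y_toy, bias, alpha) for x in X_toy]
```

      Unlike a standard SVM, the least squares variant replaces the inequality constraints with equalities, so training reduces to one linear solve; the regularization term on the diagonal is what limits overfitting when the number of input variables is high.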
      For lung cancer diagnosis, we used deep attention-based MIL [
      • Ilse M.
      • Tomczak J.
      • Welling M.
      Attention-based deep multiple instance learning.
      ] as the classifier, as described in [

      Chen, Junhua, Zeng, Haiyan, Zhang, Cong, et al. “Lung cancer diagnosis using deep attention based multiple instance learning and radiomics.” Med. Phys.. 2022; 00: 00- 00. https://doi.org/10.1002/mp.15539.

      ]. The main characteristic of this classifier is that it can classify groups of samples (e.g. issue a diagnosis based on a set of CT scans from a patient) and reveal the importance of each sample in determining the diagnosis. The architecture of the method is shown in Supplementary Fig. 2. The inputs of the model are the radiomic features and the clinical diagnosis (cancer or not) is the output.
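      The attention mechanism that lets the classifier weigh each instance can be sketched as below, following the pooling formula of Ilse et al.: each instance h_k receives a weight a_k = softmax_k(w · tanh(V h_k)), and the bag representation is the weighted sum of the instances. The matrices `V` and `w`, the function name and the toy feature vectors are hypothetical placeholders, not trained parameters from this study, and the final classification layer is omitted.

```python
import math


def attention_pool(instances, V, w):
    """Attention-based MIL pooling over a bag of instance feature vectors.

    Returns the pooled bag vector and the per-instance attention weights.
    """
    scores = []
    for h in instances:
        hidden = [math.tanh(sum(v_ij * h_j for v_ij, h_j in zip(v_row, h)))
                  for v_row in V]
        scores.append(sum(w_i * u_i for w_i, u_i in zip(w, hidden)))
    # Softmax over instances, shifted by the maximum for numerical stability.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(instances[0])
    bag = [sum(a * h[d] for a, h in zip(weights, instances)) for d in range(dim)]
    return bag, weights


# Three toy "nodules", each described by two features (illustrative values).
toy_bag = [[0.2, 0.1], [0.9, 0.8], [0.2, 0.1]]
V_toy = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical attention projection
w_toy = [1.0, 1.0]                # hypothetical attention vector
bag_vector, attention = attention_pool(toy_bag, V_toy, w_toy)
```

      The attention weights sum to one across the bag, which is what allows them to be read as the relative importance of each instance in the final diagnosis.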

      Experiments

      We applied the trained generative models to denoise real low dose CT images before extracting the radiomic features. Subsequently, we trained the classification models for survival prediction and lung cancer diagnosis using radiomic features extracted from the denoised images, and we compared their performance with that of models trained using radiomic features extracted from the original low dose CT images.
      All denoising experiments for low dose CT images were executed on a Core i7-8565U CPU with 8 GB of RAM using the pre-trained generative models. Following the training specifications described in [
      • Chen J.
      • Zhang C.
      • Traverso A.
      • Zhovannik I.
      • Dekker A.
      • Wee L.
      • et al.
      Generative models improve radiomics reproducibility in low dose CTs: a simulation study.
      ], the generative models were trained for 25, 50, 75 and 100 epochs. All four trained models were used for denoising. For internal validation, 40 trials of nested cross validation [
      • Cawley G.C.
      • Talbot N.LC.
      On over-fitting in model selection and subsequent selection bias in performance evaluation.
      ] of the RBF kernel SVM were executed for survival prediction, with the number of GroupKFold splits in each trial set to 5. We adopted the minority oversampling strategy described in [
      • Zhu T.
      • Lin Y.
      • Liu Y.
      • Zhang W.
      • Zhang J.
      Minority oversampling for imbalanced ordinal regression.
      ] for the lung cancer diagnosis task to improve the model’s performance, because our dataset was small and imbalanced.
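      The idea behind group-wise splitting into 5 folds can be illustrated with the sketch below. It is an assumption-laden illustration rather than the procedure used in this study: the function `group_k_fold`, the round-robin assignment and the toy patient identifiers are hypothetical. scikit-learn provides a class named `GroupKFold`, but this sketch does not reproduce its exact fold assignment strategy. In nested cross validation, the same kind of split would be applied again inside each training set to select hyperparameters.

```python
import random


def group_k_fold(groups, n_splits=5, seed=0):
    """Yield (train_idx, test_idx) pairs so that no group is split across both."""
    unique_groups = sorted(set(groups))
    if len(unique_groups) < n_splits:
        raise ValueError("need at least as many groups as folds")
    rng = random.Random(seed)
    rng.shuffle(unique_groups)
    # Assign whole groups to folds in round-robin order.
    fold_of_group = {g: i % n_splits for i, g in enumerate(unique_groups)}
    for fold in range(n_splits):
        test_idx = [i for i, g in enumerate(groups) if fold_of_group[g] == fold]
        train_idx = [i for i, g in enumerate(groups) if fold_of_group[g] != fold]
        yield train_idx, test_idx


# Toy example: ten samples from seven hypothetical patients.
toy_groups = ["p1", "p1", "p2", "p3", "p3", "p4", "p5", "p6", "p6", "p7"]
folds = list(group_k_fold(toy_groups, n_splits=5, seed=0))
```

      Keeping all samples of a group on one side of the split prevents information from the same patient leaking from training into testing, which would otherwise inflate the validation scores.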
      We assessed the models’ performance by calculating their area under the receiver operating characteristic curve (AUC), accuracy and recall (using a probability threshold of 0.5). Finally, we used Student’s t-test, after testing the data for normality, to assess the statistical significance of the differences in model performance results.
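      For readers less familiar with these metrics, the following sketch computes AUC, accuracy and recall from predicted probabilities. It is a minimal illustration, not the evaluation code of this study: the function names and the toy labels and scores are hypothetical, and treating a probability equal to the threshold as positive is an assumption.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def accuracy_and_recall(labels, scores, threshold=0.5):
    """Accuracy and recall after thresholding the predicted probabilities."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == l for p, l in zip(predictions, labels))
    true_pos = sum(p == 1 and l == 1 for p, l in zip(predictions, labels))
    actual_pos = sum(l == 1 for l in labels)
    if actual_pos == 0:
        raise ValueError("recall is undefined without positive samples")
    return correct / len(labels), true_pos / actual_pos


# Toy labels (1 = positive class) and predicted probabilities.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(toy_labels, toy_scores)
accuracy, recall = accuracy_and_recall(toy_labels, toy_scores)
```

      AUC depends only on how the scores rank positives against negatives, whereas accuracy and recall depend on the chosen threshold, so the two kinds of metric can move in different directions for the same model.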

      Results

      An example of an original CT image from the NSCLC Radiogenomics dataset and its denoised counterparts are shown in Fig. 2.
      Fig. 2. Example of low dose CT denoising: (a) original CT image from NSCLC Radiogenomics (R01-003, radiation exposure: 7 mAs); (b) image denoised by the CGAN (100 epochs); (c) image denoised by the encoder-decoder network (100 epochs); (d) zoomed region of interest (ROI) of (a); (e) zoomed ROI of (b); and (f) zoomed ROI of (c).

      Survival prediction

      The 4-year survival prediction model based on radiomic features extracted from low dose CTs achieved an AUC of 0.524 with a standard error of the mean (SEM) of 0.042. On the other hand, the survival prediction models based on radiomic features extracted from denoised low dose CTs achieved AUC ranging between 0.54 and 0.58. As shown in Table 1 and Fig. 3, encoder-decoder networks and CGANs can improve radiomics-based models’ performance significantly. The difference between encoder-decoder network and CGAN was not significant when trained for 75 epochs and 100 epochs, similar to what was reported in reference [
      • Chen J.
      • Zhang C.
      • Traverso A.
      • Zhovannik I.
      • Dekker A.
      • Wee L.
      • et al.
      Generative models improve radiomics reproducibility in low dose CTs: a simulation study.
      ].
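      A summary of the form mean ± SEM and a two-sample comparison can be sketched as follows. The sketch assumes an unpaired Student's t-test with pooled variance, which the text does not specify, and it uses made-up AUC values rather than results from this study; the function names are hypothetical. It returns only the t statistic and degrees of freedom, since the p-value requires the t distribution (available, for example, through `scipy.stats.ttest_ind`).

```python
import math
import statistics


def mean_and_sem(values):
    """Mean and standard error of the mean (sample SD divided by sqrt(n))."""
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(len(values))


def student_t_statistic(a, b):
    """Two-sample Student t statistic with pooled variance, plus degrees of freedom."""
    n_a, n_b = len(a), len(b)
    dof = n_a + n_b - 2
    pooled_var = ((n_a - 1) * statistics.variance(a)
                  + (n_b - 1) * statistics.variance(b)) / dof
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (statistics.mean(a) - statistics.mean(b)) / standard_error, dof


# Made-up AUCs from repeated trials (illustrative only).
auc_original = [0.50, 0.55, 0.52, 0.53]
auc_denoised = [0.57, 0.60, 0.58, 0.61]

mean_original, sem_original = mean_and_sem(auc_original)
t_stat, dof = student_t_statistic(auc_original, auc_denoised)
```

      A negative t statistic here means the first group has the lower mean; its magnitude, together with the degrees of freedom, determines the p-value.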
      Table 1. Experimental results for 4-year survival prediction.

                                                      Training length
      Metrics                     Without Denoising   25 Epochs      50 Epochs      75 Epochs      100 Epochs
      Encoder-decoder network
        AUC                       0.525±0.042         0.580±0.049    0.572±0.040    0.554±0.051    0.566±0.044
        p-value*                                      <0.01          <0.01          <0.01          <0.01
      CGAN
        AUC                                           0.537±0.045    0.551±0.049    0.538±0.123    0.566±0.53
        p-value                                       0.20           0.01           0.16           <0.01
      Encoder-decoder network versus CGAN
        p-value**                                     <0.01          0.04           0.15           0.93

      * Compared with results from original radiomics.
      ** Comparing encoder-decoder network and CGAN.
      Fig. 3. Experimental results (AUC) of survival prediction task.

      Lung cancer diagnosis

      As shown in [

      Chen, Junhua, Zeng, Haiyan, Zhang, Cong, et al. “Lung cancer diagnosis using deep attention based multiple instance learning and radiomics.” Med. Phys.. 2022; 00: 00- 00. https://doi.org/10.1002/mp.15539.

      ], our method can achieve an AUC of 0.842 (SEM 0.074) based on radiomic features extracted from the original low dose CT scans from the LIDC-IDRI dataset. The AUCs of the classification models based on radiomics extracted from denoised images range between 0.84 and 0.89 as shown in Table 2 and Fig. 4(c). Models built using radiomic features calculated from denoised images outperformed models developed from the original radiomic features in most experiments. Similarly to survival prediction, the difference between encoder-decoder network and CGAN was not significant when trained for 75 and 100 epochs.
      Table 2. The AUCs of different models for lung cancer diagnosis.
      | Model | Metric | Without denoising | 25 epochs | 50 epochs | 75 epochs | 100 epochs |
      | --- | --- | --- | --- | --- | --- | --- |
      | Encoder-decoder network | AUC | 0.842±0.071 | 0.883±0.077 | 0.844±0.067 | 0.823±0.067 | 0.866±0.070 |
      | Encoder-decoder network | p-value* | | <0.01 | 0.86 | 0.07 | 0.02 |
      | CGAN | AUC | | 0.894±0.056 | 0.863±0.066 | 0.837±0.085 | 0.866±0.056 |
      | CGAN | p-value* | | <0.01 | 0.06 | 0.49 | 0.01 |
      | Encoder-decoder network versus CGAN | p-value | | 0.31 | 0.07 | 0.75 | 0.52 |
      * Compared with results from original radiomics.
      Fig. 4. Experimental results of lung cancer diagnosis: (a) accuracy, (b) recall and (c) AUC.
      Fig. 4 (a) and (b) and Table 3 show that denoising generally had a negative impact on the accuracy and recall of the lung cancer diagnosis classification models when using a threshold of 0.5.
      Table 3. Accuracy and recall for lung cancer diagnosis.
      | Model | Metric | 0 epochs | 25 epochs | 50 epochs | 75 epochs | 100 epochs |
      | --- | --- | --- | --- | --- | --- | --- |
      | Encoder-decoder network | Accuracy | 0.807±0.069 | 0.818±0.077 | 0.792±0.061 | 0.75±0.067 | 0.796±0.067 |
      | Encoder-decoder network | p-value* | | 0.70 | 0.10 | <0.01 | 0.26 |
      | Encoder-decoder network | Recall | 0.870±0.061 | 0.829±0.096 | 0.831±0.089 | 0.810±0.097 | 0.848±0.072 |
      | Encoder-decoder network | p-value* | | <0.01 | <0.01 | <0.01 | 0.02 |
      | CGAN | Accuracy | | 0.780±0.068 | 0.779±0.085 | 0.776±0.074 | 0.798±0.064 |
      | CGAN | p-value* | | 0.01 | 0.01 | <0.01 | 0.12 |
      | CGAN | Recall | | 0.802±0.091 | 0.774±0.104 | 0.827±0.079 | 0.811±0.079 |
      | CGAN | p-value* | | <0.01 | <0.01 | <0.01 | <0.01 |
      | Encoder-decoder network versus CGAN | Accuracy (p-value) | | <0.01 | 0.21 | 0.01 | 0.67 |
      | Encoder-decoder network versus CGAN | Recall (p-value) | | 0.04 | <0.01 | 0.18 | <0.01 |
      * Compared with results from original radiomics.

      Discussion

      In this study, we aimed to assess the potential of generative models to improve the performance of prediction models based on radiomic features extracted from low dose CT scans. The results show that encoder-decoder networks and CGANs can improve the AUC of radiomics for survival prediction and lung cancer diagnosis based on different low dose CT datasets. These findings imply that denoising low dose CT scans using generative models is a convenient pre-processing step before calculating radiomic features to train a predictive or diagnostic model.
      The results also show that denoising using generative models might lead to a decrease in accuracy and recall. This might be caused by a shift in the receiver operating characteristic (ROC) curve as a result of the denoising. However, a higher AUC implies that there are other thresholds for which the accuracy and recall are higher with the denoised images. The threshold will differ for each possible application of these models, and a model with a higher AUC will be more likely to have a better accuracy/recall combination.
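      The interplay between AUC and a fixed decision threshold can be made concrete with a small Python sketch. All labels and scores below are hypothetical and chosen only to illustrate the argument: the second model ranks cases better (higher AUC) but its scores are shifted downward, so it looks worse at a threshold of 0.5 and better at a lower threshold.

```python
def auc(labels, scores):
    """Rank-based AUC: chance that a positive case outscores a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy_recall(labels, scores, threshold):
    """Return (accuracy, recall) when scores >= threshold are called positive."""
    preds = [1 if s >= threshold else 0 for s in scores]
    true_pos = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    correct = sum(1 for y, p in zip(labels, preds) if y == p)
    return correct / len(labels), true_pos / sum(labels)


# Hypothetical data (illustrative only, not results from the study).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores_original = [0.90, 0.70, 0.60, 0.40, 0.55, 0.30, 0.20, 0.10]
scores_denoised = [0.60, 0.45, 0.42, 0.38, 0.30, 0.20, 0.15, 0.10]

for name, scores in (("original", scores_original), ("denoised", scores_denoised)):
    print(name, "AUC:", auc(labels, scores))
    for threshold in (0.5, 0.35):
        acc, rec = accuracy_recall(labels, scores, threshold)
        print(f"  threshold {threshold}: accuracy={acc:.3f}, recall={rec:.3f}")
```

      In practice the threshold would be selected on validation data for the intended application rather than fixed at 0.5.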
      Another interesting aspect of the results is the variability of the models’ AUCs across different numbers of training epochs. As shown in Fig. 3 and Fig. 4 (c), the performance of the models improves after the first epochs, then deteriorates when training for more epochs, and finally seems to improve again after a particular number of training epochs. This tendency is more pronounced in Fig. 4 (c) than in Fig. 3. This might be explained by a phenomenon that has attracted considerable attention in the deep learning research domain in the last few years: deep double descent [
      • Nakkiran P.
      • Kaplun G.
      • Bansal Y.
      • Yang T.
      • Barak B.
      • Sutskever I.
      Deep double descent: where bigger models and more data hurt.
      ,
      • d’Ascoli S.
      • Refinetti M.
      • Biroli G.
      • Krzakala F.
      Double trouble in double descent: Bias and variance (s) in the lazy regime.
      ]. Unfortunately, the mechanisms of this phenomenon are still unclear, and more research on this topic is needed.
      It is worth delving into the cause of the observed improvement using generative models. As mentioned previously, we think this improvement is brought on by the denoising effect of generative models on low dose CT. However, as shown in Supplementary Fig. 1 (b), 40% of the CT images in the LIDC-IDRI dataset were not noisy (since they were scanned with over 200 mAs). Denoising these images using generative models would decrease their quality. Therefore, there must be another source of improvement. Our hypothesis for this alternative source of improvement is dose normalization. In other words, generative models not only improve the image quality of the low dose CT images in the dataset but also shift the imaging exposure of the whole dataset from a wide range to a more compact but unknown range.
      One potential limitation of our study is the low AUCs achieved by the models for pre-treatment survival prediction for lung cancer based on radiomic features. However, these are in line with results reported elsewhere. For example, Isensee et al. [
      • Isensee F.
      • Kickingereder P.
      • Wick W.
      • Bendszus M.
      • Maier-Hein K.H.
      Brain tumor segmentation and radiomics survival prediction: contribution to the brats 2017 challenge.
      ] reported an accuracy of 52.6% based on the BraTS 2017 dataset [
      • Menze B.H.
      • Jakab A.
      • Bauer S.
      • Kalpathy-Cramer J.
      • Farahani K.
      • Kirby J.
      • et al.
      The multimodal brain tumor image segmentation benchmark (BRATS).
      ] for brain tumor by using radiomics; Choi et al. [
      • Choi Y.S.
      • Ahn S.S.
      • Chang J.H.
      • Kang S.-G.
      • Kim E.H.
      • Kim S.H.
      • et al.
      Machine learning and radiomic phenotyping of lower grade gliomas: improving survival prediction.
      ] reported an integrated AUC (iAUC) of 0.620 [95% CI: 0.501–0.756] in the TCGA/TCIA dataset using a random survival forest to derive a prediction model; finally, Bae et al. [
      • Bae S.
      • Choi Y.S.
      • Ahn S.S.
      • Chang J.H.
      • Kang S.-G.
      • Kim E.H.
      • et al.
      Radiomic MRI phenotyping of glioblastoma: improving survival prediction.
      ] reported an iAUC of 0.590 [95% CI: 0.502, 0.689] for overall survival prediction in glioblastoma using MRI radiomic features. These relatively low AUCs can be partly explained by the difficulty of pre-treatment survival prediction, especially over the long term (over 2 years). In addition to the information available in the medical image, many other factors can affect survival. In fact, some researchers claim that any AUC over 0.80 is suspect [
      • Bahn E.
      • Alber M.
      On the limitations of the area under the ROC curve for NTCP modelling.
      ,
      • Cook N.R.
      Use and misuse of the receiver operating characteristic curve in risk prediction.
      ]. Because radiomics is a system of hand-crafted features with higher interpretability but lower information representation ability than deep features, its relatively poor performance in survival prediction is not surprising. Some studies have proposed techniques to improve the performance of radiomics in survival prediction, such as Wu et al. [
      • Wu J.
      • Li C.
      • Gensheimer M.
      • Padda S.
      • Kato F.
      • Shirato H.
      • et al.
      Radiological tumour classification across imaging modality and histology.
      ], who managed to increase the concordance index (C-index) from 0.6 to 0.67, or Wang et al. [
      • Wang Y.
      • Shao Q.
      • Luo S.
      • Randi F.u.
      Development of a nomograph integrating radiomics and deep features based on MRI to predict the prognosis of high grade Gliomas.
      ], who combined radiomics with deep features to improve the C-index from 0.68 to 0.72. However, these improved results are still low compared to those achieved in diagnosis. Even though further developments might still drive the performance up, the performance in survival prediction will remain relatively low due to inherent uncertainty.
      Regarding future work, we first believe that generative models should be trained to keep more information from the original domain. More specifically, low level domain adaptation such as denoising for medical images should focus on keeping content information from the original domain in the target domain. This could be done, for example, by adding a content loss term to the cost function or by adjusting the training method of the generative models, as shown in [

      Yang, Heran, Jian Sun, Aaron Carass, Can Zhao, Junghoon Lee, Jerry L. Prince, and Zongben Xu. Unsupervised MR-to-CT synthesis using structure-constrained CycleGAN. IEEE Trans. Med. Imaging 39, no. 12 (2020): 4249-4261. https://doi.org/10.1109/TMI.2020.3015379.

      ]. Second, more generative models with different architectures should be tested to find better models for this task. Third, given the substantial fluctuations in performance across different numbers of training epochs, it is not possible to recommend an optimal number of epochs based on our experiments. For consistency, we reported the results from the model trained for 100 epochs as our final results. However, more studies on the optimal number of epochs are needed in the future. Finally, since the validity of the results of this study is limited to our selected datasets and tasks, further application to more datasets and tasks could reinforce or disprove our findings.
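      The idea of adding a content loss term to the cost function can be sketched as a weighted sum. This is a minimal illustration under stated assumptions, not the training objective used in this study: the content term is taken to be a pixel-wise L1 distance between the denoised output and the input image, and `content_weight`, the function names and the toy images are all hypothetical.

```python
def l1_content_loss(output, reference):
    """Mean absolute pixel difference between two equally sized 2-D images.

    Assumption: pixel-wise L1 stands in for the content term; feature-based
    (perceptual) losses are another common choice.
    """
    diffs = [
        abs(o - r)
        for row_out, row_ref in zip(output, reference)
        for o, r in zip(row_out, row_ref)
    ]
    return sum(diffs) / len(diffs)


def total_generator_loss(adversarial_loss, output, reference, content_weight=10.0):
    """Combine an adversarial term with a weighted content-preservation term."""
    return adversarial_loss + content_weight * l1_content_loss(output, reference)


# Toy 2x2 "images" (hypothetical values, for illustration only).
low_dose = [[0.0, 1.0], [2.0, 3.0]]
denoised = [[0.5, 1.0], [2.0, 2.5]]

loss = total_generator_loss(0.7, denoised, low_dose, content_weight=10.0)
print(f"total generator loss: {loss:.3f}")
```

      A larger `content_weight` pushes the generator to stay closer to the input image, at the cost of weaker denoising; the right balance would have to be tuned empirically.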

      Conclusion

      In this study, we assessed the potential of generative models (CGANs and encoder-decoder networks) to improve the performance of low dose CT scan radiomics-based models in two tasks – survival prediction and lung cancer diagnosis – and two datasets – NSCLC Radiogenomics and LIDC-IDRI. SVM and deep attention based MIL were used as classifiers in survival prediction and lung cancer diagnosis, respectively. The results support the hypothesis that generative models can improve radiomics performance in different tasks and datasets. In conclusion, denoising using generative models is an effective pre-processing step for calculating radiomic features from low dose CT.

      Declaration of Competing Interest

      The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

      Acknowledgments

      JC is supported by a China Scholarship Council scholarship (201906540036). The remaining authors acknowledge funding support from the following: STRaTegy (STW 14930), BIONIC (NWO 629.002.205), TRAIN (NWO 629.002.212), CARRIER (NWO 628.011.212) and a personal research grant by The Hanarth Funds Foundation for LW.

      Appendix A. Supplementary data

      The following are the Supplementary data to this article:

      References

        • Avanzo M.
        • Porzio M.
        • Lorenzon L.
        • Milan L.
        • Sghedoni R.
        • Russo G.
        • et al.
        Artificial intelligence applications in medical imaging: a review of the medical physics research in Italy.
        Physica Med. 2021; 83: 221-241
        • Aerts H.J.W.L.
        • Velazquez E.R.
        • Leijenaar R.T.H.
        • Parmar C.
        • Grossmann P.
        • Carvalho S.
        • et al.
        Decoding tumour phenotype by noninvasive imaging using a quantitative radiomics approach.
        Nat Commun. 2014; 5
        • Desseroit M.-C.
        • Tixier F.
        • Weber W.A.
        • Siegel B.A.
        • Rest C.C.L.
        • Visvikis D.
        • et al.
        Reliability of PET/CT shape and heterogeneity features in functional and morphologic components of non–small cell lung cancer tumors: a repeatability analysis in a prospective multicenter cohort.
        J Nucl Med. 2017; 58: 406-411 https://doi.org/10.2967/jnumed.116.180919
        • Bogowicz M.
        • Riesterer O.
        • Bundschuh R.A.
        • Veit-Haibach P.
        • Hüllner M.
        • Studer G.
        • et al.
        Stability of radiomic features in CT perfusion maps.
        Phys Med Biol. 2016; 61: 8736-8749
        • Tixier F.
        • Hatt M.
        • Le Rest C.C.
        • Le Pogam A.
        • Corcos L.
        • Visvikis D.
        Reproducibility of tumor uptake heterogeneity characterization through textural feature analysis in 18F-FDG PET.
        J Nucl Med. 2012; 53: 693-700
        • Zhang B.
        • Tian J.
        • Dong D.i.
        • Gu D.
        • Dong Y.
        • Zhang L.u.
        • et al.
        Radiomics features of multiparametric MRI as novel prognostic factors in advanced nasopharyngeal carcinoma.
        Clin Cancer Res. 2017; 23: 4259-4269
      1. Placidi, Lorenzo, Eliana Gioscio, Cristina Garibaldi, Tiziana Rancati, Annarita Fanizzi, Davide Maestri, Raffaella Massafra et al., A multicentre evaluation of dosiomics features reproducibility, stability and sensitivity. Cancers 13, no. 15 (2021): 3835. https://doi.org/10.3390/cancers13153835.

        • Comes M.C.
        • Fanizzi A.
        • Bove S.
        • Didonna V.
        • Diotaiuti S.
        • La Forgia D.
        • et al.
        Early prediction of neoadjuvant chemotherapy response by exploiting a transfer learning approach on breast DCE-MRIs.
        Sci Rep. 2021; 11
      2. Comes, Maria Colomba, Daniele La Forgia, Vittorio Didonna, Annarita Fanizzi, Francesco Giotta, Agnese Latorre, Eugenio Martinelli et al., Early prediction of breast cancer recurrence for patients treated with neoadjuvant chemotherapy: a transfer learning approach on DCE-MRIs. Cancers 13, no. 10 (2021): 2298. https://doi.org/10.3390/cancers13102298.

      3. La Forgia, Daniele, Angela Vestito, Maurilia Lasciarrea, Maria Colomba Comes, Sergio Diotaiuti, Francesco Giotta, Agnese Latorre et al., Response predictivity to neoadjuvant therapies in breast cancer: A qualitative analysis of background parenchymal enhancement in DCE-MRI. J. Pers. Med 11, no. 4 (2021): 256. https://doi.org/10.3390/jpm11040256.

      4. Musolino, Stephen V., Joseph DeFranco, and Richard Schlueck. “The ALARA principle in the context of a radiological or nuclear emergency.” Health Phys. 94 (2) (2008): 109–111. https://doi.org/10.1097/01.HP.0000285801.87304.3f.

      5. Bi, Wenya Linda, Ahmed Hosny, Matthew B. Schabath, Maryellen L. Giger, Nicolai J. Birkbak, Alireza Mehrtash, Tavis Allison et al., Artificial intelligence in cancer imaging: clinical challenges and applications. Ca-Cancer J. Clin 69, no. 2 (2019): 127–157. https://doi.org/10.3322/caac.21552.

        • Gillies R.J.
        • Schabath M.B.
        Radiomics improves cancer screening and early detection.
        Cancer Epidemiol Biomarkers Prev. 2020; 29: 2556-2567
        • Choi W.
        • Oh J.H.
        • Riyahi S.
        • Liu C.-J.
        • Jiang F.
        • Chen W.
        • et al.
        Radiomics analysis of pulmonary nodules in low-dose CT for early detection of lung cancer.
        Med Phys. 2018; 45: 1537-1549
        • Homayounieh F.
        • Yan P.
        • Digumarthy S.R.
        • Kruger U.
        • Wang G.e.
        • Kalra M.K.
        Prediction of coronary calcification and stenosis: role of radiomics from Low-Dose CT.
        Acad Radiol. 2021; 28: 972-979 https://doi.org/10.1016/j.acra.2020.09.021
        • van Timmeren J.E.
        • Leijenaar R.T.H.
        • van Elmpt W.
        • Reymen B.
        • Oberije C.
        • Monshouwer R.
        • et al.
        Survival prediction of non-small cell lung cancer patients using radiomics analyses of cone-beam CT images.
        Radiother Oncol. 2017; 123: 363-369
        • Nawa T.
        • Nakagawa T.
        • Mizoue T.
        • Kusano S.
        • Chonan T.
        • Fukai S.
        • et al.
        Long-term prognosis of patients with lung cancer detected on low-dose chest computed tomography screening.
        Lung Cancer. 2012; 75: 197-202 https://doi.org/10.1016/j.lungcan.2011.07.002
        • Ayati N.
        • Lee S.T.
        • Zakavi S.R.
        • Cheng M.
        • Lau W.F.E.
        • Parakh S.
        • et al.
        Response evaluation and survival prediction after PD-1 immunotherapy in patients with non–small cell lung cancer: comparison of assessment methods.
        J Nucl Med. 2021; 62: 926-933
        • Traverso A.
        • Wee L.
        • Dekker A.
        • Gillies R.
        Repeatability and reproducibility of radiomic features: a systematic review.
        Int J Radiat Oncol*Biol*Phys. 2018; 102: 1143-1158
        • Bagher-Ebadian H.
        • Siddiqui F.
        • Liu C.
        • Movsas B.
        • Chetty I.J.
        On the impact of smoothing and noise on robustness of CT and CBCT radiomics features for patients with head and neck cancers.
        Med Phys. 2017; 44: 1755-1770 https://doi.org/10.1002/mp.12188
        • Kelm Z.S.
        • Blezek D.
        • Bartholmai B.
        • Erickson B.J.
        Optimizing non-local means for denoising low dose CT.
        in: 2009 IEEE International Symposium on Biomedical Imaging: From Nano to Macro. 2009: 662-665 https://doi.org/10.1109/ISBI.2009.5193134
      6. Chen, Minmin, Zhixiang Xu, Kilian Weinberger, and Fei Sha. Marginalized denoising autoencoders for domain adaptation. arXiv preprint arXiv:1206.4683 (2012).

        • Yang Q.
        • Yan P.
        • Zhang Y.
        • Hengyong Y.u.
        • Shi Y.
        • Mou X.
        • et al.
        Low-dose CT image denoising using a generative adversarial network with Wasserstein distance and perceptual loss.
        IEEE Trans Med Imag. 2018; 37: 1348-1357 https://doi.org/10.1109/TMI.2018.2827462
        • Sharma A.
        • Chaurasia V.
        A review on magnetic resonance images denoising techniques.
        in: Machine Intelligence and Signal Analysis. Springer, Singapore 2019: 707-715 https://doi.org/10.1007/978-981-13-0923-6_60
      7. Kollem, Sreedhar, Katta Rama Linga Reddy, and Duggirala Srinivasa Rao. “A review of image denoising and segmentation methods based on medical images.” Int. J. Mach. Learn. Comput. 9, (3) (2019): 288–295. https://doi.org/10.18178/ijmlc.2019.9.3.800.

        • LeCun Y.
        • Bengio Y.
        • Hinton G.
        Deep learning.
        Nature. 2015; 521: 436-444
        • Shan H.
        • Zhang Y.i.
        • Yang Q.
        • Kruger U.
        • Kalra M.K.
        • Sun L.
        • et al.
        3-D convolutional encoder-decoder network for low-dose CT via transfer learning from a 2-D trained network.
        IEEE Trans Med Imaging. 2018; 37: 1522-1534 https://doi.org/10.1109/TMI.2018.2832217
      8. Chen, Hu, Yi Zhang, Mannudeep K. Kalra, Feng Lin, Yang Chen, Peixi Liao, Jiliu Zhou, and Ge Wang. Low-dose CT with a residual encoder-decoder convolutional neural network. IEEE Trans. Med. Imaging 36, no. 12 (2017): 2524-2535. https://doi.org/10.1109/TMI.2017.2715284.

        • Kang E.
        • Chang W.
        • Yoo J.
        • Ye J.C.
        Deep convolutional framelet denosing for low-dose CT via wavelet residual network.
        IEEE Trans Med Imaging. 2018; 37: 1358-1369
        • Lucia F.
        • Visvikis D.
        • Desseroit M.-C.
        • Miranda O.
        • Malhaire J.-P.
        • Robin P.
        • et al.
        Prediction of outcome using pretreatment 18F-FDG PET/CT and MRI radiomics in locally advanced cervical cancer treated with chemoradiotherapy.
        Eur J Nucl Med Mol Imaging. 2018; 45: 768-786 https://doi.org/10.1007/s00259-017-3898-7
      9. Parmar, Chintan, Ralph TH Leijenaar, Patrick Grossmann, Emmanuel Rios Velazquez, Johan Bussink, Derek Rietveld, Michelle M. Rietbergen, Benjamin Haibe-Kains, Philippe Lambin, and Hugo JWL Aerts. Radiomic feature clusters and prognostic signatures specific for lung and head & neck cancer. Sci. Rep. 5, no. 1 (2015): 1–10. https://doi.org/10.1038/srep11044.

        • Chen C.-H.
        • Chang C.-K.
        • Tu C.-Y.
        • Liao W.-C.
        • Wu B.-R.
        • Chou K.-T.
        • et al.
        Radiomic features analysis in computed tomography images of lung nodule classification.
        PLoS ONE. 2018; 13: e0192002
        • Li H.
        • Zhu Y.
        • Burnside E.S.
        • Huang E.
        • Drukker K.
        • Hoadley K.A.
        • et al.
        Quantitative MRI radiomics in the prediction of molecular classifications of breast cancer subtypes in the TCGA/TCIA data set.
        npj Breast Cancer. 2016; 2
      10. Chen, Junhua, Zeng, Haiyan, Zhang, Cong, et al. “Lung cancer diagnosis using deep attention based multiple instance learning and radiomics.” Med. Phys. 2022; 00: 00-00. https://doi.org/10.1002/mp.15539.

        • Maron O.
        • Lozano-Pérez T.
        A framework for multiple-instance learning.
        Adv Neural Inf Process Syst. 1997; 10
        • Ilse M.
        • Tomczak J.
        • Welling M.
        Attention-based deep multiple instance learning.
        in: International conference on machine learning. 2018: 2127-2136
        • Chen J.
        • Zhang C.
        • Traverso A.
        • Zhovannik I.
        • Dekker A.
        • Wee L.
        • et al.
        Generative models improve radiomics reproducibility in low dose CTs: a simulation study.
        Phys Med Biol. 2021; 66: 165002 https://doi.org/10.1088/1361-6560/ac16c0
        • Isola P.
        • Zhu J.-Y.
        • Zhou T.
        • Efros A.A.
        Image-to-image translation with conditional adversarial networks.
        in: Proceedings of the IEEE conference on computer vision and pattern recognition. 2017: 1125-1134
        • Zhu J.-Y.
        • Park T.
        • Isola P.
        • Efros A.A.
        Unpaired image-to-image translation using cycle-consistent adversarial networks.
        in: Proceedings of the IEEE international conference on computer vision. 2017: 2223-2232
        • Clark K.
        • Vendt B.
        • Smith K.
        • Freymann J.
        • Kirby J.
        • Koppel P.
        • et al.
        The cancer imaging archive (TCIA): maintaining and operating a public information repository.
        J Digit Imaging. 2013; 26: 1045-1057
        • McCollough C.H.
        • Bartley A.C.
        • Carter R.E.
        • Chen B.
        • Drees T.A.
        • Edwards P.
        • et al.
        Low-dose CT for the detection and classification of metastatic liver lesions: Results of the 2016 Low Dose CT Grand Challenge.
        Med Phys. 2017; 44: e339-e352
        • Bakr S.
        • Gevaert O.
        • Echegaray S.
        • Ayers K.
        • Zhou M.
        • Shafiq M.
        • et al.
        A radiogenomic dataset of non-small cell lung cancer.
        Sci Data. 2018; 5
        • Armato S.G.
        • McLennan G.
        • Bidaut L.
        • McNitt-Gray M.F.
        • Meyer C.R.
        • Reeves A.P.
        • et al.
        The lung image database consortium (LIDC) and image database resource initiative (IDRI): a completed reference database of lung nodules on CT Scans: The LIDC/IDRI thoracic CT database of lung nodules.
        Med Phys. 2011; 38: 915-931
        • Mazurowski M.A.
        Radiogenomics: what it is and why it is important.
        J Am Coll Radiol. 2015; 12: 862-866 https://doi.org/10.1016/j.jacr.2015.04.019
        • van Griethuysen J.J.M.
        • Fedorov A.
        • Parmar C.
        • Hosny A.
        • Aucoin N.
        • Narayan V.
        • et al.
        Computational radiomics system to decode the radiographic phenotype.
        Cancer Res. 2017; 77: e104-e107
        • Xu H.
        • Caramanis C.
        • Mannor S.
        Robustness and regularization of support vector machines.
        J Mach Learn Res. 2009; 10
        • Suykens J.AK.
        • Vandewalle J.
        Least squares support vector machine classifiers.
        Neural Process Lett. 1999; 9: 293-300
        • Cawley G.C.
        • Talbot N.LC.
        On over-fitting in model selection and subsequent selection bias in performance evaluation.
        J Mach Learn Res. 2010; 11: 2079-2107
        • Zhu T.
        • Lin Y.
        • Liu Y.
        • Zhang W.
        • Zhang J.
        Minority oversampling for imbalanced ordinal regression.
        Knowledge-Based Syst. 2019; 166: 140-155 https://doi.org/10.1016/j.knosys.2018.12.021
        • Nakkiran P.
        • Kaplun G.
        • Bansal Y.
        • Yang T.
        • Barak B.
        • Sutskever I.
        Deep double descent: where bigger models and more data hurt.
        J Stat Mech. 2021; 2021: 124003
        • d’Ascoli S.
        • Refinetti M.
        • Biroli G.
        • Krzakala F.
        Double trouble in double descent: Bias and variance (s) in the lazy regime.
        in: International Conference on Machine Learning. 2020: 2280-2290
        • Isensee F.
        • Kickingereder P.
        • Wick W.
        • Bendszus M.
        • Maier-Hein K.H.
        Brain tumor segmentation and radiomics survival prediction: contribution to the brats 2017 challenge.
        in: International MICCAI Brainlesion Workshop. Springer, Cham 2017: 287-297 https://doi.org/10.1007/978-3-319-75238-9_25
        • Menze B.H.
        • Jakab A.
        • Bauer S.
        • Kalpathy-Cramer J.
        • Farahani K.
        • Kirby J.
        • et al.
        The multimodal brain tumor image segmentation benchmark (BRATS).
        IEEE Trans Med Imaging. 2015; 34: 1993-2024
        • Choi Y.S.
        • Ahn S.S.
        • Chang J.H.
        • Kang S.-G.
        • Kim E.H.
        • Kim S.H.
        • et al.
        Machine learning and radiomic phenotyping of lower grade gliomas: improving survival prediction.
        Eur Radiol. 2020; 30: 3834-3842
        • Bae S.
        • Choi Y.S.
        • Ahn S.S.
        • Chang J.H.
        • Kang S.-G.
        • Kim E.H.
        • et al.
        Radiomic MRI phenotyping of glioblastoma: improving survival prediction.
        Radiology. 2018; 289: 797-806
        • Bahn E.
        • Alber M.
        On the limitations of the area under the ROC curve for NTCP modelling.
        Radiother Oncol. 2020; 144: 148-151 https://doi.org/10.1016/j.radonc.2019.11.018
        • Cook N.R.
        Use and misuse of the receiver operating characteristic curve in risk prediction.
        Circulation. 2007; 115: 928-935 https://doi.org/10.1161/CIRCULATIONAHA.106.672402
        • Wu J.
        • Li C.
        • Gensheimer M.
        • Padda S.
        • Kato F.
        • Shirato H.
        • et al.
        Radiological tumour classification across imaging modality and histology.
        Nat Mach Intell. 2021; 3: 787-798
        • Wang Y.
        • Shao Q.
        • Luo S.
        • Randi F.u.
        Development of a nomograph integrating radiomics and deep features based on MRI to predict the prognosis of high grade Gliomas.
        Math Biosci Eng. 2021; 18: 8084-8095 https://doi.org/10.3934/mbe.2021401
      11. Yang, Heran, Jian Sun, Aaron Carass, Can Zhao, Junghoon Lee, Jerry L. Prince, and Zongben Xu. Unsupervised MR-to-CT synthesis using structure-constrained CycleGAN. IEEE Trans. Med. Imaging 39, no. 12 (2020): 4249-4261. https://doi.org/10.1109/TMI.2020.3015379.