
Robustness and reproducibility of radiomics in T2 weighted images from magnetic resonance image guided linear accelerator in a phantom study

  • Mengdi Sun
    Affiliations
    Department of Graduate, Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, China

    Department of Radiation Oncology of the Thorax Cancer (5th Radiation Oncology) Oncology, Shandong Cancer Hospital and Institute, Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, China
  • Ahmad Baiyasi
    Affiliations
    Department of Radiology, Wayne State University, Detroit, United States
  • Xuechun Liu
    Affiliations
    Department of Radiation Oncology Physics and Technology, Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, China

    Department of Radiation Oncology, Shandong Cancer Hospital and Institute, Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, China
  • Xihua Shi
    Affiliations
    Department of Radiation Oncology, Shandong Cancer Hospital and Institute, Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, China
  • Xu Li
    Affiliations
    Department of Radiation Oncology, Shandong Cancer Hospital and Institute, Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, China
  • Jian Zhu
    Affiliations
    Department of Radiation Oncology, Shandong Cancer Hospital and Institute, Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, China
  • Yong Yin
    Affiliations
    Department of Radiation Oncology, Shandong Cancer Hospital and Institute, Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, China
  • Jiani Hu
    Affiliations
    Department of Radiology, Wayne State University, Detroit, United States
  • Zhenjiang Li
    Correspondence
    Corresponding authors at: Department of Graduate, Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, China.
    Affiliations
    Department of Radiation Oncology, Shandong Cancer Hospital and Institute, Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, China
  • Baosheng Li
    Correspondence
    Corresponding authors at: Department of Graduate, Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, China.
    Affiliations
    Department of Graduate, Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, China

    Department of Radiation Oncology of the Thorax Cancer (5th Radiation Oncology) Oncology, Shandong Cancer Hospital and Institute, Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, China
Open Access. Published: March 11, 2022. DOI: https://doi.org/10.1016/j.ejmp.2022.03.002

      Highlights

      • The Shape features are the most reproducible features in all of the tests.
      • The changes in features extracted when the beam was off were smaller than from when the beam was on.
      • Intraobserver reliability of features was high compared with other influences.
      • Motion substantially impacted the robustness of the features.

      Abstract

      Purpose

      Quantitative radiomics features extracted from medical images have been shown to provide value in predicting clinical outcomes. However, the robustness and reproducibility of radiomics features obtained with a magnetic resonance image guided linear accelerator (MR-Linac) have not been sufficiently studied. The objective of this work was to investigate the stability of radiomics features extracted from T2-weighted MR-Linac images under five common influencing factors.

      Materials and method

      In this work, ten jellies, five fruits/vegetables, and a dynamic phantom were used to evaluate the impact of test–retest, intraobserver, varied thicknesses, radiation, and motion. These phantoms were scanned on the 1.5 T MRI system of an MR-Linac. For test–retest data, the phantoms were scanned twice with repositioning within 15 min. To assess intraobserver variability, the segmentation of MR images was repeated by one observer in a double-blind manner. Three slice thicknesses (1.2 mm, 2.4 mm, and 4.8 mm) were used to select robust features that were insensitive to different thicknesses. The effect of radiation on features was studied by acquiring images when the beam was on. Typical patient motion during radiotherapy was simulated by a dynamic phantom with five motion states to study the motion effect. A total of 1409 radiomics features, including shape features, first-order features, and texture features, were extracted from the original, wavelet, square, logarithmic, exponential and gradient images. The robustness and reproducibility of the features were evaluated using the concordance correlation coefficient (CCC).

      Result

      The intraobserver group had the most robust features (936/1079, 86.7%), while the group of motion effects had the lowest robustness (56/936, 6.0%), followed by the group of different thickness cohorts (374/936, 40.0%). The stability of features in the test–retest and radiation groups was 1072 of 1312 (81.7%) and 810 of 936 (86.5%), respectively. Overall, 25 of 1409 (2.4%) radiomics features remained robust in all five tests, most of which came from wavelet-filtered images. Fewer stable features were extracted when the beam was on than when the beam was off. Shape features were the most robust of all of the features in all of the groups, excluding the motion group.

      Conclusion

      Compared with other factors, fewer features remained robust to the effect of motion. This result emphasizes the need to consider the effect of respiratory motion. Studying T2-weighted images from MR-Linac under different conditions will help to build a robust predictive model applicable to radiotherapy.

      Keywords

      Abbreviations:

      MR-Linac (magnetic resonance image guided linear accelerator), CCC (concordance correlation coefficient), CT (computed tomography), PET (positron emission tomography), MRI (magnetic resonance imaging), ART (adaptive radiation therapy), ROI (region of interest), IBSI (Image biomarker standardization initiative), GLRLM (gray-level run-length matrix), GLCM (gray level cooccurrence matrix), GLSZM (gray level size zone matrix), NGTDM (neighboring gray tone difference matrix), GLDM (gray level dependence matrix), DSC (Dice similarity coefficient)

      Introduction

      On medical images, tumors are commonly described with qualitative descriptors (microcalcification, burr, heterogeneous, cavitated) [
      • Giménez A.
      • Franquet T.
      • Prats R.
      • Estrada P.
      • Villalba J.
      • Bagué S.
      Unusual primary lung tumors: a radiologic-pathologic overview.
      ]. However, there has been a recent increase in quantitative, precise methods for describing the changes in a developing or radiated tumor. One-dimensional size, density, speed of growth, and so on are traditional quantitative methods for describing tumors[
      • Therasse P.
      • Arbuck S.G.
      • Eisenhauer E.A.
      • et al.
      New guidelines to evaluate the response to treatment in solid tumors. European Organization for Research and Treatment of Cancer, National Cancer Institute of the United States, National Cancer Institute of Canada.
      ]; radiomics has emerged as a newer quantitative method that allows for a more thorough extraction of large amounts of radiographic features from images [
      • Papadimitroulas P.
      • Brocki L.
      • Christopher Chung N.
      • et al.
      Artificial intelligence: deep learning in oncological radiomics and challenges of interpretability and data harmonization.
      ,
      • Avanzo M.
      • Stancanello J.
      • El Naqa I.
      Beyond imaging: the promise of radiomics.
      ]. This process has been studied as a means of providing specific imaging biomarkers and has contributed to disease classification in patients. Some studies have shown that extracted radiomics features from computed tomography (CT) [
      • Zhao B.
      • Tan Y.
      • Tsai W.Y.
      • Schwartz L.H.
      • Lu L.
      Exploring variability in CT characterization of tumors: a preliminary phantom study.
      ,
      • Buch K.
      • Li B.
      • Qureshi M.M.
      • Kuno H.
      • Anderson S.W.
      • Sakai O.
      Quantitative assessment of variation in CT parameters on texture features: pilot study using a nonanatomic phantom.
      ,
      • Mackin D.
      • Fave X.
      • Zhang L.
      • et al.
      Measuring computed tomography scanner variability of radiomics features.
      ], positron emission tomography (PET) [
      • El Naqa I.
      • Grigsby P.
      • Apte A.
      • et al.
      Exploring feature-based approaches in PET images for predicting cancer treatment outcomes.
      ,
      • Cook G.J.R.
      • et al.
      Radiomics in PET: principles and applications.
      ], and magnetic resonance imaging (MRI) [
      • Baeßler B.
      • Weiss K.
      • Pinto Dos Santos D.
      Robustness and reproducibility of radiomics in magnetic resonance imaging: a phantom study.
      ] can be used to predict the survival rate [
      • van Timmeren J.E.
      • Leijenaar R.T.H.
      • van Elmpt W.
      • et al.
      Survival prediction of non-small cell lung cancer patients using radiomics analyses of cone-beam CT images.
      ], tumor toxic reaction [
      • Qin Q.
      • Shi A.
      • Zhang R.
      • et al.
      Cone-beam CT radiomics features might improve the prediction of lung toxicity after SBRT in stage I NSCLC patients.
      ], early assessment treatment response [
      • Shi L.
      • Rong Y.
      • Daly M.
      • et al.
      Cone-beam computed tomography-based delta-radiomics for early response assessment in radiotherapy for locally advanced lung cancer.
      ,
      • van Timmeren J.E.
      • Leijenaar R.T.H.
      • van Elmpt W.
      • Reymen B.
      • Lambin P.
      Feature selection methodology for longitudinal cone-beam CT radiomics.
      ,
      • Horvat N.
      • Veeraraghavan H.
      • Khan M.
      • et al.
      MR imaging of rectal cancer: radiomics analysis to assess treatment response after neoadjuvant therapy.
      ] and lymph node metastasis [
      • Li Z.
      • Li H.
      • Wang S.
      • et al.
      MR-based radiomics nomogram of cervical cancer in prediction of the lymph-vascular space invasion preoperatively.
      ,
      • Coroller T.P.
      • Grossmann P.
      • Hou Y.
      • et al.
      CT-based radiomic signature predicts distant metastasis in lung adenocarcinoma.
      ]. Recently, the development of radiomics has contributed to adaptive radiation therapy (ART) [
      • Shi L.
      • Rong Y.
      • Daly M.
      • et al.
      Cone-beam computed tomography-based delta-radiomics for early response assessment in radiotherapy for locally advanced lung cancer.
      ,
      • Bosetti D.G.
      • Ruinelli L.
      • Piliero M.A.
      • et al.
      Cone-beam computed tomography-based radiomics in prostate cancer: a mono-institutional study.
      ].
      ART is intended to provide essential insight into the real dose delivered to patients and to provide valuable information for clinical decisions. Commonly used images in ART include cone-beam CT (CBCT) and megavoltage computed tomographic (MVCT) images, which can be used for position verification during radiotherapy. In recent years, many studies have been published in the context of CBCT and MVCT imaging, emphasizing the challenges of reproducibility of radiomic features when using different scanner acquisition parameters [
      • Fave X.
      • Mackin D.
      • Yang J.
      • et al.
      Can radiomics features be reproducibly measured from CBCT images for patients with non-small cell lung cancer?.
      ,

      Gu J, Zhu J, Qiu Q, et al. The Feasibility Study of Megavoltage Computed Tomographic (MVCT) Image for Texture Feature Analysis. Front Oncol. 2018;8:586. Published 2018 Dec 5. doi:10.3389/fonc.2018.00586.

      ].
      Conversely, because MRI-guided radiotherapy is an emerging technology, the robustness of radiomics features for MRI in radiotherapy has been discussed less often, and previous related studies have focused almost entirely on conventional diagnostic MRI [
      • Boulanger M.
      • Nunes J.C.
      • Chourak H.
      • et al.
      Deep learning methods to generate synthetic CT from MRI in radiotherapy: a literature review.
      ]. Several experiments have studied the feature robustness of MR images in test–retest, interobserver and intraobserver, acquisition parameters, scanners, and so on [
      • Baeßler B.
      • Weiss K.
      • Pinto Dos Santos D.
      Robustness and reproducibility of radiomics in magnetic resonance imaging: a phantom study.
      ,
      • Lee J.
      • Steinmann A.
      • Ding Y.
      • et al.
      Radiomics feature robustness as measured using an MRI phantom.
      ,
      • Mazzoni L.N.
      • Bock M.
      • Levesque I.R.
      • Lurie D.J.
      • Palma G.
      New developments in MRI: system characterization, technical advances and radiotherapy applications.
      ,
      • Bernatz S.
      • Zhdanovich Y.
      • Ackermann J.
      • et al.
      Impact of rescanning and repositioning on radiomic features employing a multi-object phantom in magnetic resonance imaging.
      ,
      • Bianchini L.
      • Botta F.
      • Origgi D.
      • et al.
      PETER PHAN: an MRI phantom for the optimisation of radiomic studies of the female pelvis.
      ,
      • Bianchini L.
      • Santinha J.
      • Loução N.
      • et al.
      A multicenter study on radiomic features from T2 -weighted images of a customized MR pelvic phantom setting the basis for robust radiomic models in clinics.
      ,
      • Dreher C.
      • Kuder T.A.
      • König F.
      • et al.
      Radiomics in diffusion data: a test-retest, inter- and intra-reader DWI phantom study.
      ]. However, unlike a conventional MR scanner, the MR-Linac integrates a 1.5 T closed-bore MRI scanner (Philips, Best, The Netherlands) with a 7 MV linear accelerator (Elekta AB, Stockholm, Sweden) mounted on a ring-shaped gantry. Recently, several groups have compared the image quality of MR-Linac systems with that of conventional MRI systems [
      • Wang J.
      • Yung J.
      • Kadbi M.
      • Hwang K.
      • Ding Y.
      • Ibbott G.S.
      Assessment of image quality and scatter and leakage radiation of an integrated MR-LINAC system.
      ,
      • Tijssen R.H.N.
      • Philippens M.E.P.
      • Paulson E.S.
      • et al.
      MRI commissioning of 1.5T MR-linac systems - a multi-institutional study.
      ]. However, no one has investigated the factors that commonly affect the reliability of radiomics features in MR images from MR-Linac during radiotherapy.
      Thus, the main objective of this phantom study was to investigate the robustness and reproducibility of radiomic features for the frequently used MRI sequences in MR-Linac under five influences that are commonly encountered during radiotherapy, and to propose robust features that can be dependably used in future clinical studies.

      Methods and materials

      The workflow of the study is shown in Fig. 1.
      Fig. 1. Workflow of the study. Here, 0° and 90° refer to the gantry angles of the beam; 0 A, 5 A, 8 A and 15 A in part ⑤ indicate the motion amplitude of the dynamic phantom. 0 vs. 5, 0 vs. 8, and 0 vs. 15 denote the comparisons used to calculate the consistency between the 0 mm amplitude and the 5, 8, and 15 mm amplitudes, respectively. ① to ⑤ refer to the five influences studied in this work: ① test–retest effect; ② intraobserver effect; ③ thickness effect; ④ radiation effect; ⑤ motion effect, including the study of motion combined with irradiation. F1, F2, F3 refer to the features selected under different conditions.

      Phantom and MR parameters

      Ten jellies, five fruits/vegetables, and a dynamic motion phantom (CIRS 008Z) served as our radiomics phantoms. The different jellies and fruits/vegetables were used to reflect tumors with different shapes and textures and allowed T2-weighted (T2w) image analysis and the extraction of radiomics features from images with different signal intensities (Fig. 2). Each jelly was produced from a 0.1% aqueous solution of agar (chembase, China). Since the jellies/fruits/vegetables are too small to image on their own, a water phantom was designed in-house to assist in acquiring MR images (the volume of each jelly and fruit/vegetable is shown in Supplementary Material Table 1). The water phantom is a thorax-shaped container filled with a solution of 0.4 mM MnCl2 to simulate the T2 relaxation time of muscle tissue surrounding a thoracic tumor. Five fruits/vegetables and ten jellies were placed in the cavity (Fig. 2 b) to provide MR signal and texture consistent with a set of representative thorax lesions.
      Fig. 2. (a) Images of the radiomics phantoms chosen for the study. A ∼ J are the ten jellies, and K ∼ O are the fruits/vegetables, including one mangosteen (K), one tomato (L), one lemon (M), one onion (N), and one carrot (O). (b) The water phantom used in our work; the red box is the cavity for the jellies and fruits/vegetables. (c) An exemplary 3-dimensional slice after segmentation with colored segmentation label. (d) The 2-dimensional segmentation label.
      The details of the dynamic phantom are shown in Fig. 3. The dynamic phantom body has a width of 32 cm, a height of 18 cm, and a length of 25.6 cm (Fig. 3 a, d). The dynamic phantom has a moving rod (length of 22 cm, diameter of 6.3 cm, Fig. 3 b) made of liver-equivalent material that can be installed to move cyclically with varied breath-simulated curves. Because the “tumor” in the moving rod is not replaceable, the rod was replaced by a bottle (length of 20 cm, diameter of 6.3 cm, Fig. 3 c) that had the same diameter as the moving rod and was fixed to the actuator using adhesive tape. To study the robustness of features influenced by different motions, we glued the jellies and fruits/vegetables in the bottle one after another and moved them at varying amplitudes (Fig. 3 d).
      Fig. 3. (a) Picture of the dynamic phantom. (b) Picture of the moving rod. The “tumor” is fixed in the center of the rod. (c) The bottle chosen to replace the moving rod in the dynamic phantom. The ten jellies and the fruits/vegetables were each glued in the center of the bottle in turn, and the bottle was attached to the actuator using adhesive tape. (d) Picture of the dynamic phantom body.
      MR images of the phantoms were acquired with the default thorax protocol using a T2 3D Tra sequence with the following parameters: TR/TE = 2100 ms/206 ms, voxel = 2.0 × 2.0 × 2.4 mm, FOV = 320 × 448 × 300 mm, scan time = 200 to 228 s, matrix = 160 × 224 × 250 slices, thickness = 2.4 mm, and NSA = 2. The thickness parameter was varied in parts of the experiment to investigate whether it affected the robustness of radiomics features. In this work, radiomics features were extracted from 3D images. Because the ratio of the slice thickness to the in-plane pixel spacing is not 1, the image data were resampled and corrected according to the actual ratio to prevent deviations in the radiomics analysis, thereby eliminating the anisotropy of the data format itself.
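      As a rough illustration of the resampling step, the sketch below computes the per-axis scaling factors and the resulting grid size when an anisotropic volume is brought to cubic voxels. The target spacing of 2.0 mm, the example grid size, and both function names are assumptions made for illustration; the source does not state the target spacing or the interpolation method used.

```python
def zoom_factors(spacing_mm, target_mm):
    """Per-axis scaling needed to reach cubic voxels of edge target_mm."""
    if target_mm <= 0:
        raise ValueError("target spacing must be positive")
    return tuple(s / target_mm for s in spacing_mm)


def isotropic_grid_shape(shape, spacing_mm, target_mm):
    """Voxel count per axis after resampling; the physical extent is preserved."""
    factors = zoom_factors(spacing_mm, target_mm)
    return tuple(max(1, round(n * f)) for n, f in zip(shape, factors))


# Hypothetical grid: 160 x 224 in-plane voxels at 2.0 mm, 125 slices at 2.4 mm.
factors = zoom_factors((2.0, 2.0, 2.4), 2.0)
new_shape = isotropic_grid_shape((160, 224, 125), (2.0, 2.0, 2.4), 2.0)
```

      Only the slice axis changes here, because the in-plane spacing already equals the assumed target.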

      The definition of CCC and selected tumor volume dependence features

      The concordance correlation coefficient (CCC) is a commonly used method to evaluate the agreement between paired data [
      • Lin L.I.
      A concordance correlation coefficient to evaluate reproducibility.
      ]. In this work, the CCC was determined pairwise for features extracted from the paired images according to formula (1). We chose 0.9 as our cutoff value following the McBride criteria, which state that a correlation of 0.9 reflects medium consistency and that all correlations < 0.9 are poor [

      McBride GB, “A proposal for strength-of-agreement criteria for Lin’s concordance correlation coefficient,” NIW A Client Report No. HAM2005–062 (2005), pp. 1–14.

      ]. Robust features with CCC > 0.9 were then enrolled in the following experiments. The classification of CCC values is shown in Table 1.
      Table 1. The classification of CCC.
      Robustness | Threshold values of CCC
      Excellent  | CCC > 0.9
      Good       | 0.75 < CCC ≤ 0.9
      Medium     | 0.5 < CCC ≤ 0.75
      Bad        | CCC ≤ 0.5
      For a given feature, let α be the vector of that feature’s values across the phantoms’ first scan, and let β be the vector of that feature’s values across the phantoms’ second scan; μα and μβ are the means of α and β, and σα² and σβ² are their variances. The CCC is then defined as
      $$\mathrm{CCC} = 1 - \frac{\overline{(\alpha-\beta)^{2}}}{\sigma_{\alpha}^{2}+\sigma_{\beta}^{2}+(\mu_{\alpha}-\mu_{\beta})^{2}} \tag{1}$$
      where the numerator is the mean squared difference between the paired values.
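      The CCC and the classification in Table 1 can be sketched as follows. This is a minimal illustration using Lin's definition with population variances, not the code used in the study; the function names are hypothetical.

```python
from statistics import fmean, pvariance


def ccc(alpha, beta):
    """Lin's concordance correlation coefficient for paired feature values."""
    if len(alpha) != len(beta) or len(alpha) < 2:
        raise ValueError("need two paired vectors with at least two values")
    mean_sq_diff = fmean([(a - b) ** 2 for a, b in zip(alpha, beta)])
    denominator = (pvariance(alpha) + pvariance(beta)
                   + (fmean(alpha) - fmean(beta)) ** 2)
    if denominator == 0:
        raise ValueError("CCC is undefined for two identical constant vectors")
    return 1.0 - mean_sq_diff / denominator


def classify_ccc(value):
    """Robustness class following the thresholds listed in Table 1."""
    if value > 0.9:
        return "Excellent"
    if value > 0.75:
        return "Good"
    if value > 0.5:
        return "Medium"
    return "Bad"
```

      With population variances, this form is algebraically equal to 2·cov(α, β) divided by the same denominator, so perfectly reversed pairs give a value of -1 and identical pairs give 1.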


      The Spearman correlation coefficient (ρ) between radiomics features and the region of interest (ROI) volume was calculated, since it has been shown that some radiomics features embed volume information [
      • Welch M.L.
      • McIntosh C.
      • Haibe-Kains B.
      • et al.
      Vulnerabilities of radiomic signature development: the need for safeguards.
      ,
      • Scalco E.
      • Belfatto A.
      • Mastropietro A.
      • et al.
      T2w-MRI signal normalization affects radiomics features reproducibility.
      ]. The ROI was contoured in the volume outside the jellies and fruits/vegetables in the acquired images; Fig. 2 (c) and (d) show the 3D surface representation and cross section of the ROI, respectively. We calculated the Spearman correlation between each feature and the ROI volume for the test and retest images individually. Any feature with ρ > 0.85 in both images was removed because of that feature’s strong correlation with volume [
      • Fave X.
      • Mackin D.
      • Yang J.
      • et al.
      Can radiomics features be reproducibly measured from CBCT images for patients with non-small cell lung cancer?.
      ,
      • Zou K.H.
      • Warfield S.K.
      • Bharatha A.
      • et al.
      Statistical validation of image segmentation quality based on a spatial overlap index.
      ]. The cutoff value of 0.85 was referenced by Fave et al [
      • Fave X.
      • Mackin D.
      • Yang J.
      • et al.
      Can radiomics features be reproducibly measured from CBCT images for patients with non-small cell lung cancer?.
      ] who considered the ρ > 0.85 to be highly correlated and Zou et al [
      • Zou K.H.
      • Warfield S.K.
      • Bharatha A.
      • et al.
      Statistical validation of image segmentation quality based on a spatial overlap index.
      ] who reported that ρ > 0.8 represented a strong correlation. This step aims to reduce the number of false-positive features in the following analysis. We chose a high cutoff to remove features that are sensitive to different conditions rather than ROI volume.
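      The volume-dependence filter can be illustrated with a small, self-contained sketch. It computes Spearman's ρ as the Pearson correlation of average ranks and flags a feature only when the cutoff is exceeded in both the test and the retest images. The function names are hypothetical, and the use of the absolute value of ρ follows the wording of the Results section.

```python
from statistics import fmean


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant vector carries no rank information
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    return pearson(average_ranks(x), average_ranks(y))


def is_volume_dependent(feat_test, vol_test, feat_retest, vol_retest, cutoff=0.85):
    """True when |rho| exceeds the cutoff in both the test and retest images."""
    return (abs(spearman_rho(feat_test, vol_test)) > cutoff
            and abs(spearman_rho(feat_retest, vol_retest)) > cutoff)
```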

      Phantom Test-Retest MR images

      To assess the effect of test–retest, each jelly and fruit/vegetable was scanned twice with repositioning. The interval was within 15 min, producing two images of each jelly and fruit/vegetable. After acquiring all T2w images, an experienced oncologist contoured the ROI using the commercial software AccuContour V3.0. To test the short-term robustness on MR images, the CCC was calculated pairwise for features extracted from the twice scanned images.

      Feature extraction

      A total of 1409 radiomics features were extracted from the ROIs of each MR image using the commercial software AccuContour V3.0 (http://www.manteiatech.com/index_en.html), which was chosen for the radiomics calculation because it allows standardized preprocessing of the medical imaging data and extraction of radiomics features from various image types. This software was developed using the open-source Python package Pyradiomics version 3.0.1, which allows for the extraction of the majority of the features defined by the Image biomarker standardization initiative (IBSI) [
      • Zwanenburg A.
      • Vallières M.
      • Abdalah M.A.
      • et al.
      The image biomarker standardization initiative: standardized quantitative radiomics for high-throughput image-based phenotyping.
      ]. DICOM images were loaded into AccuContour V3.0 and the ROI was contoured in the volume outside the jellies and fruits/vegetables. Defined features from original, wavelet, square, square-root, logarithmic, exponential, and Laplacian of Gaussian-filtered images were extracted.
      The main groupings of radiomics features that were selected included shape features describing the shape and geometric properties of the ROI such as the volume, maximum diameter along different orthogonal directions, maximum surface, tumor compactness, and sphericity, for a total of 14 features. First-order features were used to describe the distribution of individual voxel values without concern for spatial relationships, for a total of 18 features; second-order features provided a measure of the spatial arrangement of the voxel intensities and hence of intralesion heterogeneity and included the gray-level run-length matrix (GLRLM), gray level cooccurrence matrix (GLCM), gray level size zone matrix (GLSZM), neighboring gray tone difference matrix (NGTDM), and gray level dependence matrix (GLDM), for a total of 1377 features. Higher-order statistical features were obtained by statistical methods after applying filters or mathematical transforms to the images such as fractal analysis, Minkowski functionals, wavelet transform, and Laplacian transforms of Gaussian-filtered images. These mathematical transforms help to extract features from areas with increasingly coarse texture patterns [
      • Rizzo S.
      • Botta F.
      • Raimondi S.
      • et al.
      Radiomics: the facts and the challenges of image analysis.
      ,
      • Ergen B.
      • Baykara M.
      Texture based feature extraction methods for content based medical image retrieval systems.
      ]. The bin-width has a strong relationship with reproducibility in MRI intraobserver sensitivity [
      • Scalco E.
      • Belfatto A.
      • Mastropietro A.
      • et al.
      T2w-MRI signal normalization affects radiomics features reproducibility.
      ]. Therefore, in this work, we used a fixed bin width of 25, which has commonly been used in the literature with high reproducibility in MRI studies [
      • Welch M.L.
      • McIntosh C.
      • Haibe-Kains B.
      • et al.
      Vulnerabilities of radiomic signature development: the need for safeguards.
      ,
      • Scalco E.
      • Belfatto A.
      • Mastropietro A.
      • et al.
      T2w-MRI signal normalization affects radiomics features reproducibility.
      ,

      Aerts HJ, Velazquez ER, Leijenaar RT, et al. Decoding tumour phenotype by noninvasive imaging using a quantitative radiomics approach [published correction appears in Nat Commun. 2014;5:4644. Cavalho, Sara [corrected to Carvalho, Sara]]. Nat Commun. 2014;5:4006. Published 2014 Jun 3. doi:10.1038/ncomms5006.

      ].
      To increase MRI repeatability, we normalized the image pixel values to the same range as follows:
      $$P(x) = \mathrm{Range} \times \frac{I(x) - \min_{i \in n} I(i)}{\max_{i \in n} I(i) - \min_{i \in n} I(i) + 1} \tag{2}$$


      Range refers to the number of discrete values (16, 32, 64, 128), I represents the intensity of the original image, and n is the set of pixels in the ROI.
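      A minimal sketch of this normalization is shown below, assuming the ROI intensities are given as a flat list of gray values. The function name is hypothetical, and no rounding to integer levels is applied because the source does not describe one.

```python
def normalize_roi(intensities, value_range):
    """Rescale ROI intensities with Range * (I - min) / (max - min + 1)."""
    if not intensities:
        raise ValueError("the ROI must contain at least one pixel")
    lowest, highest = min(intensities), max(intensities)
    span = highest - lowest + 1  # the +1 keeps every result below value_range
    return [value_range * (value - lowest) / span for value in intensities]


# Hypothetical ROI with gray values between 0 and 99, mapped to 16 levels.
scaled = normalize_roi([0, 50, 99], 16)
```

      Because of the +1 in the denominator, the brightest pixel maps to a value just below Range rather than to Range itself.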
      The types of features and their subclass features are shown in Supplementary Material Table 2. The description of feature extraction and definition are listed at https://pyradiomics.readthedocs.io/en/latest/features.html.

      Intraobserver effect

      Segmentation of images into ROIs such as a tumor, normal tissue, and other anatomical structures is a crucial step for subsequent informatics analysis. Manual segmentation by expert readers is often treated as ground truth. However, it suffers from high intraobserver variability [
      • Brouwer C.L.
      • Steenbakkers R.J.
      • van den Heuvel E.
      • et al.
      3D Variation in delineation of head and neck organs at risk.
      ,
      • Senan S.
      • van Sörnsen de Koste J.
      • Samson M.
      • et al.
      Evaluation of a target contouring protocol for 3D conformal radiotherapy in non-small cell lung cancer.
      ]. To investigate the effect of contouring error during radiotherapy, the ROIs were contoured twice by an experienced radiation oncologist. Furthermore, to select features robust to the effect of contours, the Dice coefficient was used to calculate the spatial overlap accuracy of the two manual contours for the images in the intraobserver cohort. The value of the Dice similarity coefficient (DSC) ranges from 0 to 1. According to formula (3), a Dice coefficient of 1 indicates exact overlap, and a value of 0 corresponds to no overlap (Ref. [
      • Thada V.
      • Jaglan V.
      Comparison of Jaccard, Dice, Cosine similarity coefficient to find best fitness value for web retrieved documents using genetic algorithm.
      ]). This coefficient allowed us to control any reliability bias and accurately test the effect of intraobserver analysis on the extraction of radiomics features.
      $$\mathrm{DSC}(A,B) = \frac{2\,|A \cap B|}{|A| + |B|} \tag{3}$$


      Let A be the set of voxels enclosed by the first contour, and B be the set of voxels enclosed by the repeat contour.
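      The overlap measure in formula (3) can be sketched as follows, treating each contour as a set of voxel indices. This representation and the function name are assumptions made for illustration.

```python
def dice_coefficient(contour_a, contour_b):
    """Dice similarity coefficient between two sets of voxel indices."""
    a, b = set(contour_a), set(contour_b)
    if not a and not b:
        raise ValueError("DSC is undefined when both contours are empty")
    return 2 * len(a & b) / (len(a) + len(b))


# Two hypothetical contours that share one of their two voxels.
first = {(0, 0, 0), (0, 0, 1)}
repeat = {(0, 0, 1), (0, 0, 2)}
overlap = dice_coefficient(first, repeat)
```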

      Thickness effect

      Different thicknesses could impact fractality and lacunarity. Decreasing or increasing the slice thickness also changes the noise in the images [
      • Zhao B.
      • Tan Y.
      • Tsai W.Y.
      • Schwartz L.H.
      • Lu L.
      Exploring variability in CT characterization of tumors: a preliminary phantom study.
      ]. To investigate whether thickness affects features acquired from MR images, we scanned ten jellies and five fruits/vegetables using slice thicknesses of 1.2 mm, 2.4 mm (default value), and 4.8 mm in the same position. Then, the concordance correlation coefficient was calculated separately for 1.2 mm vs. 2.4 mm and for 4.8 mm vs. 2.4 mm.
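      One way to combine several pairwise comparisons, such as the two thickness comparisons, is to keep only the features whose CCC exceeds the cutoff in every comparison. The sketch below assumes the CCC values are already available; the feature names and values are hypothetical, and requiring all comparisons to pass is an assumption about how the comparisons were combined.

```python
def robust_in_all(ccc_by_comparison, cutoff=0.9):
    """Names of features with CCC above the cutoff in every comparison."""
    tables = list(ccc_by_comparison.values())
    if not tables:
        return set()
    shared = set(tables[0])
    for table in tables[1:]:
        shared &= set(table)  # ignore features missing from any comparison
    return {name for name in shared
            if all(table[name] > cutoff for table in tables)}


# Hypothetical CCC values for three features in the two thickness comparisons.
example = {
    "1.2 vs 2.4 mm": {"shape_Sphericity": 0.99, "glcm_Contrast": 0.93, "ngtdm_Busyness": 0.95},
    "2.4 vs 4.8 mm": {"shape_Sphericity": 0.98, "glcm_Contrast": 0.71, "ngtdm_Busyness": 0.90},
}
stable = robust_in_all(example)
```

      A value exactly equal to the cutoff is rejected here, matching the strict inequality CCC > 0.9 used throughout the study.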

      Radiation effect

      The MR-Linac combines a 1.5 T MRI system with an MV Linac system that can introduce interference, especially when the radiation beam is on (irradiation) [
      • Liney G.P.
      • Dong B.
      • Begg J.
      • et al.
      Technical Note: experimental results from a prototype high-field inline MRI-linac.
      ]. Furthermore, the interaction between radiation and the phantoms increases the number of scattered electrons at the surface. Thus, it was necessary to determine which features are susceptible to beam conditions. We acquired MR images before, during, and after radiation delivery by the accelerator at gantry angles of 0° and 90° (Fig. 4). CCC values were calculated for each image at both the 0° and 90° gantry angles with the beam on. Robust features were identified as those that remained stable regardless of the beam condition. The Wilcoxon test was used to investigate whether the differences between images under the above influencing factors were statistically significant.
      Fig. 4. The front view of MR-Linac showing the position of the beam and phantom when the beam was on in the RF room.

      Motion effect

      The respiratory motion and heartbeats of patients are the main sources of uncertainty in texture values during radiotherapy. Because MR scan times are longer, MR images show more serious motion artifacts than CT images. To analyze the effects of motion, a dynamic phantom (CIRS 008Z) was used. For this work, the dynamic part was moved following a cos^4(t) waveform with amplitudes of 0, 5, 8, and 15 mm; the details of these waves are shown in Supplementary Fig. 1. According to the dynamic phantom’s reference, the cos^4(t) waveform approximates the normal respiratory function of patients. The images acquired at different amplitudes are shown in Supplementary Fig. 2(a ∼ d), which demonstrates that the degree of motion artifact increased with increasing motion amplitude. Because the influences on radiomics features during radiotherapy are complex, we also acquired images with a combination of motion and irradiation (Supplementary Fig. 2 e ∼ h). AccuContour V3.0 was used to manually contour ROIs on MR images obtained from the phantom, and subsequently, features were extracted from the ROIs. The CCC values were calculated for the extracted features. Robust features were those that were repeatable between the 0 mm amplitude and the 5 mm, 8 mm, or 15 mm amplitude.
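      For illustration, a cos^4 breathing trace can be generated as in the sketch below. The parameterization A·cos^4(πt/T), the 4 s period, and the reading of the amplitude as peak displacement are all assumptions; the source states only the waveform family and the amplitudes.

```python
import math


def displacement_mm(time_s, amplitude_mm, period_s=4.0):
    """Phantom displacement for a cos^4 breathing trace (assumed 4 s period)."""
    if period_s <= 0:
        raise ValueError("period must be positive")
    return amplitude_mm * math.cos(math.pi * time_s / period_s) ** 4


# Sample one assumed breathing cycle for each amplitude used in the study.
amplitudes_mm = (0, 5, 8, 15)
traces = {
    amplitude: [displacement_mm(step * 0.5, amplitude) for step in range(9)]
    for amplitude in amplitudes_mm
}
```

      The fourth power makes the trace dwell near zero displacement and peak briefly, which is why this family of curves is often used to mimic breathing.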

      Results

      Phantom Test-Retest MR images

      A total of 1409 features were extracted from the phantom images. Of these, 97 features were excluded because the absolute value of their correlation coefficient ρ with ROI volume exceeded 0.85 in the test and retest images; removing them avoids retaining features that are volume dependent. Thus, 1312 features were included in the subsequent analysis. Of these 1312 features, 1079 (F1) had a CCC > 0.9 in the test–retest analysis. These included 224/270 (88.5%) first-order features, 312/360 (86.7%) GLCM features, 154/210 (80.6%) GLDM features, 183/240 (84.7%) GLRLM features, 153/240 (68.6%) GLSZM features, 43/75 (72.9%) NGTDM features, and 10/10 (100%) shape features across all image types, including original, wavelet, square, square-root, exponential, gradient, and logarithm (Supplementary Material Table 3). The shape features showed the highest robustness in the test–retest analysis, followed by the first-order and GLCM features. GLSZM features, which characterize the consistency of texture, had the lowest proportion of robust features.
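      The volume-dependence filter can be sketched as below. This assumes that ρ denotes Spearman's rank correlation, which is not stated in this section; the helper names, feature names, and values are hypothetical and serve only to show the |ρ| > 0.85 exclusion rule.

```python
import numpy as np

def spearman_rho(x, y):
    """Spearman rank correlation; this simple ranking assumes no tied values."""
    rank_x = np.argsort(np.argsort(x)).astype(float)
    rank_y = np.argsort(np.argsort(y)).astype(float)
    return np.corrcoef(rank_x, rank_y)[0, 1]

def drop_volume_dependent(features, volumes, threshold=0.85):
    """Keep only features whose |rho| with ROI volume is at most the threshold."""
    return {name: values for name, values in features.items()
            if abs(spearman_rho(values, volumes)) <= threshold}

# Hypothetical ROI volumes and two features measured on the same six ROIs.
volumes = np.array([10.0, 25.0, 40.0, 55.0, 70.0, 85.0])
features = {
    "TotalEnergy": volumes ** 2 + 1.0,                    # rises with volume
    "Entropy": np.array([3.1, 2.4, 3.6, 2.2, 3.3, 2.9]),  # unrelated to volume
}

kept = drop_volume_dependent(features, volumes)
```

      In this toy case the feature that rises monotonically with volume is removed and the unrelated one is kept, mirroring the reduction from 1409 to 1312 features.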

      Intraobserver effect

      Features that were robust in the test–retest analysis were included in this comparison. Dice coefficients were calculated for the contours on each jelly or fruit/vegetable. The mean ± standard deviation Dice similarity coefficient between the two contouring sessions was 0.96 ± 0.03.
      Robustness was consistently high in the intraobserver analysis, with a median value of 0.982 (range for overall pooled analysis 0.319–1.000). Overall, radiomics features showed excellent intraobserver robustness (n = 936/1079, 86.7%, F3). Shape and first-order features were the most robust (n = 10/10, 100%; and n = 208/224, 92.9%), with all of their subclasses remaining stable, whereas NGTDM features (n = 29/59, 67.4%) were the least robust under the intraobserver effect.
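      For reference, the Dice similarity coefficient reported above is twice the overlap of two contours divided by the sum of their sizes. A minimal sketch on binary masks follows; the masks are synthetic squares, not the study's contours, and treating two empty masks as perfect agreement is a convention chosen here.

```python
import numpy as np

def dice_coefficient(mask_a, mask_b):
    """Dice similarity coefficient between two binary masks."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return 2.0 * np.logical_and(a, b).sum() / total

# Synthetic example: the second contour misses one row of the first.
first = np.zeros((10, 10), dtype=bool)
first[2:8, 2:8] = True    # 36 pixels
second = np.zeros((10, 10), dtype=bool)
second[3:8, 2:8] = True   # 30 pixels, all inside the first contour

dsc = dice_coefficient(first, second)  # 2 * 30 / (36 + 30)
```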

      Thickness effect

      The number of robust features differed substantially between the 1.2 vs. 2.4 mm and the 2.4 vs. 4.8 mm slice thickness comparisons. Features that were robust in the test–retest and intraobserver analyses were included in this comparison. For all feature types, more features were robust in the 1.2 vs. 2.4 mm comparison than in the 2.4 vs. 4.8 mm comparison. Overall robustness was lower for 2.4 vs. 4.8 mm (n = 416/936, 44.4%) than for 1.2 vs. 2.4 mm (n = 591/936, 63.1%). Among all feature classes, shape features were the most robust across all slice thicknesses (n = 10/10, 100%). GLCM features were more robust in the 1.2 vs. 2.4 mm comparison than in the 2.4 vs. 4.8 mm comparison (1.2 vs. 2.4 mm: n = 168/360, 46.7%; 2.4 vs. 4.8 mm: n = 107/360, 29.7%).
      A total of 374/936 (40.0%) features showed excellent robustness across the different thicknesses. The CCC values for 1.2 vs. 2.4 mm were mainly > 0.9, whereas those for 2.4 vs. 4.8 mm were concentrated both above 0.9 and in the range of 0.5 to 0.75 (Fig. 3 (a) in the Supplement).

      Radiation effect

      Across all feature types, the number of stable features was mostly higher for features calculated from the beam-off images than for those calculated from the beam-on images (Fig. 3 (b) in the Supplement). The images acquired with the beam off provided a higher percentage of robust features at both gantry angles. Features that were robust in the test–retest and intraobserver analyses were included in this comparison. For a 0° gantry angle, 883/936 (94.3%) features were robust when the beam was on and 896/936 (95.4%) when the beam was off. With the gantry positioned at 90°, 876/936 (93.6%) features were robust when the beam was on, and 878/936 (93.8%) features were robust when the beam was off. In addition to the effect of radiation, the robustness of features can change with beam position: 793/936 (84.7%) features remained robust when the beam position was changed from 0° to 90° with the beam on. The details are shown in Supplementary Material Table 3. The shape and NGTDM features were the most stable with the beam both on and off, regardless of the beam position. The intersecting set of features under the different conditions is shown in Fig. 3 (c) in the Supplement. A total of 810/936 (86.5%) features maintained robustness with the beam both on and off, with the fewest stable features seen in the GLSZM group. Among the NGTDM features, complexity features formed the largest subclass.
      After the concordance correlation analysis, the Wilcoxon test was used to perform statistical analysis on the features obtained under the different beam conditions (Supplementary Material Table 4). The P-values between the beam-on and beam-off conditions were mostly less than 0.0001.

      Motion effect

      In this part, we examined whether changing the amplitude of motion changed the robustness of radiomic features. Features that were robust in the test–retest and intraobserver tests were included in this comparison. Across all features, the number of robust radiomics features decreased with increasing motion amplitude (Fig. 5(a)). Three features, DependenceNonUniformity, GrayLevelNonUniformity, and RunLengthNonUniformity, were robust for motion amplitudes of 5–15 mm. At an amplitude of 5 mm, n = 158/936 (16.9%) features were stable; at 8 mm, n = 112/936 (12.0%) features were robust; and at 15 mm, n = 73/936 (7.8%) features were repeatable. When radiation was combined with motion, 131/936 (13.9%), 90/936 (9.6%), and 30/936 (3.2%) features remained robust at amplitudes of 5, 8, and 15 mm, respectively; fewer features were robust under the combined effect of radiation and motion than under motion alone (Fig. 5(a); details are shown in Table 3 in the Supplement). Under the combined condition, four features, GrayLevelNonUniformity, RunLengthNonUniformity, Energy, and TotalEnergy, remained stable for amplitudes of 5–15 mm.
      Fig. 5. (a) Results of the motion test and of motion combined with irradiation. M refers to the effect of motion; RM refers to the combined effect of motion and irradiation. Larger values indicate better consistency with the 0 mm amplitude, at which the phantom was stationary. The heatmap illustrates how consistency changes with increasing amplitude, with or without irradiation. (b) CCC values between pairs of amplitudes, plotted by the groups being compared. Here, 0 vs. 5, 0 vs. 8, and 0 vs. 15 denote the consistency between the 0 mm amplitude and the 5, 8, and 15 mm amplitudes, respectively.
      Compared with other perturbations, motion most affected the stability of radiomics features, which emphasizes the importance of reducing motion during radiotherapy to improve feature robustness. Fifty-six features were reproducible under the effect of motion. Fourteen features remained robust to both motion alone and motion combined with radiation, and these were concentrated in the wavelet image type. GrayLevelNonUniformity, which is used to measure changes in gray intensity and run length, was the most robust of these features. The results are shown in Fig. 5(b), where the CCC values for GrayLevelNonUniformity from GLDM and GLRLM support this conclusion. Only the coarseness features, which represent the unevenness of images, remained robust in the NGTDM group.
      Multivariate robust features are defined as features that have excellent robustness under different types of outside influences. The changes in features under the five factors are shown in Fig. 6 (a). Features were most robust to the intraobserver effect (936/1079, 86.7%), followed by the effect of radiation (810/936, 86.5%). Fewer features were robust under the influence of motion than under any other factor. The trend for GLSZM features was similar to that for GLDM features. The GLCM feature type, which describes spatial correlation, was the most sensitive to these factors. The intersecting set of robust features across all factors is shown in Fig. 6 (b). A total of 1.9% (25/1312) of features remained robust in all tests, mainly in the original and wavelet image types. The details are shown in Table 5 in the Supplement. Motion was the most influential factor on the robustness of radiomics features (Fig. 6 c).
      Fig. 6. (a) Numbers of robust features in each feature type under five influencing factors. TF refers to the total number of features of each type; T-R refers to the effect of test–retest; I-O refers to the intraobserver effect. (b) A Venn diagram illustrating the intersection of stable features under the different influences. (c) Heatmap of the change in radiomics features under the five influences. The closer the color is to red, the more robust the feature; the closer the color is to green, the less robust the feature. A refers to the intraobserver effect; B refers to the test–retest effect; C refers to the consistency between before and after irradiation at the 90° gantry position; D refers to the consistency between before and during irradiation at the 90° gantry position; E refers to the consistency between before and during irradiation at the 0° gantry position; F refers to the consistency between before and after irradiation at the 0° gantry position; G refers to the consistency between the 1.2 and 2.4 mm thicknesses; H refers to the consistency between the 4.8 and 2.4 mm thicknesses; I refers to the consistency between the 0 and 15 mm amplitudes with irradiation; J refers to the consistency between the 0 and 15 mm amplitudes; K refers to the consistency between the 0 and 5 mm amplitudes; L refers to the consistency between the 0 and 8 mm amplitudes; M refers to the consistency between the 0 and 5 mm amplitudes with irradiation; and N refers to the consistency between the 0 and 8 mm amplitudes with irradiation.
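      The multivariate selection shown in the Venn diagram amounts to intersecting the sets of features that passed each test. The sketch below uses invented set memberships purely for illustration; they are not the study's results, and the function and factor names are hypothetical.

```python
def multivariate_robust(robust_by_factor):
    """Return the features that are robust under every factor (set intersection)."""
    sets = [set(names) for names in robust_by_factor.values()]
    return set.intersection(*sets) if sets else set()

# Invented memberships for illustration only (not the study's feature lists).
robust_by_factor = {
    "test_retest": {"Sphericity", "GrayLevelNonUniformity", "Energy", "Coarseness"},
    "intraobserver": {"Sphericity", "GrayLevelNonUniformity", "Energy"},
    "thickness": {"Sphericity", "GrayLevelNonUniformity"},
    "radiation": {"Sphericity", "GrayLevelNonUniformity", "Energy"},
    "motion": {"GrayLevelNonUniformity", "Energy"},
}

common = multivariate_robust(robust_by_factor)
```

      Because a feature must pass every test, the intersection can be no larger than the smallest set, which is why the motion test dominates the final count of robust features.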

      Discussion

      As new and advanced radiotherapy equipment, MR-Linac makes daily MRI scanning possible for treatment response monitoring [
      • Cusumano D.
      • Boldrini L.
      • Dhont J.
      • et al.
      Artificial intelligence in magnetic resonance guided radiotherapy: medical and physical considerations on state of art and future perspectives.
      ]. However, the magnetic field intensity (1.5 T) and the linear accelerator render it different from the conventional diagnostic magnetic resonance system [
      • Mutic S.
      • Dempsey J.F.
      The ViewRay system: magnetic resonance-guided and controlled radiotherapy.
      ]. Therefore, the robust radiomics features selected from conventional diagnostic MRI may not be suitable for MR-Linac. In this work, we systematically investigated the effects of test–retest, intraobserver, slice thickness, radiation, and motion on the robustness of radiomics features derived from T2w images, which are commonly used in clinical practice, in a well-controlled phantom setup; the results could help establish a reliable clinical prediction model.
      The test–retest cohort provides a controlled environment for identifying the radiomics features most likely to reflect the inherent characteristics of the tumor [
      • Fiset S.
      • Welch M.L.
      • Weiss J.
      • et al.
      Repeatability and reproducibility of MRI-based radiomic features in cervical cancer.
      ]. As in other studies, we observed a significant effect of test–retest variability on the extracted radiomics features. Although we used the same scanning parameters and repeated the positioning, many radiomic features appeared to be highly sensitive to machine influence. Therefore, removing the features sensitive to test–retest variability is necessary for subsequent radiomics investigations and lays a foundation for the establishment of clinical models. In this study, 82.2% of the features were robust in the test–retest analysis, which is higher than in other studies. For example, Baeßler et al. [
      • Baeßler B.
      • Weiss K.
      • Pinto Dos Santos D.
      Robustness and reproducibility of radiomics in magnetic resonance imaging: a phantom study.
      ] showed that 46.0% and 54.0% of features maintained robustness on low-resolution and high-resolution T2w images, respectively. Shiri et al. [

      Shiri, I., Abdollahi, H., Shaysteh, S. & Rabi Mahdavi, S. Test-Retest Reproducibility and Robustness Analysis of Recurrent Glioblastoma MRI Radiomics Texture Features. Iranian Journal of Radiology Special iss, doi:10.5812/iranjradiol.48035.

      ] demonstrated that 74.0% of the assessed MRI radiomics features had high test–retest stability in thirteen glioblastoma patients.
      Delineation, one of the most essential steps in the radiomics workflow, is easily biased by the subjectivity of observers. In this work, 86.7% of features maintained excellent robustness in the intraobserver cohort, the highest robustness among all factors, which is similar to the result of Dreher et al. [
      • Dreher C.
      • Kuder T.A.
      • König F.
      • et al.
      Radiomics in diffusion data: a test-retest, inter- and intra-reader DWI phantom study.
      ], who indicated that intraobserver and interobserver reproducibility was high for all sequences that they studied. Similarly, a previous MRI phantom study reported that robustness to intraobserver variation was higher than robustness to other factors [
      • Baeßler B.
      • Weiss K.
      • Pinto Dos Santos D.
      Robustness and reproducibility of radiomics in magnetic resonance imaging: a phantom study.
      ]. Reducing the number of features sensitive to subjective bias through automatic image segmentation is a major step for radiomics research and motivates the development of automated image segmentation and artificial intelligence [
      • Kalendralis P.
      • Sloep M.
      • van Soest J.
      • Dekker A.
      • Fijten R.
      Making radiotherapy more efficient with FAIR data.
      ].
      Previous studies have investigated the effects of thickness on tumor size measurements (e.g., volume) in cancer screening programs and therapy response assessment [
      • Sahin B.
      • Ergur H.
      Assessment of the optimum section thickness for the estimation of liver volume using magnetic resonance images: a stereological gold standard study.
      ]. To our knowledge, our study is the first to use a manageable phantom to investigate the effects of MR slice thickness on an MR-Linac on quantitative radiomics features that describe the tumor’s volume, shape, and density. Slice thickness had one of the largest effects on radiomics features in our work, with only 40.0% of the features remaining stable across thicknesses. Our results showed that with increasing thickness, fewer features retained stability. This result is similar to that of Zhao et al. [
      • Zhao B.
      • Tan Y.
      • Tsai W.Y.
      • Schwartz L.H.
      • Lu L.
      Exploring variability in CT characterization of tumors: a preliminary phantom study.
      ], who showed that a thinner slice thickness was better than a thicker slice thickness for density mean, density 3D gray-level cooccurrence matrix (GLCM) energy, and homogeneity. In our view, a larger slice thickness leaves too few voxels within the ROI, which could explain the low robustness found for this factor. The thickness of daily images is constant and easy to control; therefore, keeping the slice thickness consistent remains an indispensable measure to ensure the stability of features during radiotherapy.
      Despite changes in the number of robust radiomic features between the various beam conditions, we showed that 810 features were robust to variations between the beam-on and beam-off states and between gantry angles. The accelerator is the main difference between conventional MR and MR-Linac; consequently, the effect of beam conditions should be considered. Electrons from radiation cause a larger amount of noise in the images, as well as artifacts that degrade image quality [
      • Kajikawa T.
      • Kadoya N.
      • Tanaka S.
      • et al.
      Dose distribution correction for the influence of magnetic field using a deep convolutional neural network for online MR-guided adaptive radiotherapy.
      ]. As a result, fewer reproducible features were extracted from the beam-on images than from the beam-off images with the gantry at the 0° and 90° positions. Our results in this case differ from those of Wang et al. [
      • Wang J.
      • Yung J.
      • Kadbi M.
      • Hwang K.
      • Ding Y.
      • Ibbott G.S.
      Assessment of image quality and scatter and leakage radiation of an integrated MR-LINAC system.
      ], who demonstrated that there were no significant differences in image robustness or quality with or without radiation.
      Motion, originating from respiration and heartbeat, is the leading source of bias during radiotherapy, and its effect should be the first consideration when investigating the robustness of radiomics features extracted from tumors in the thorax and abdomen [
      • Kajikawa T.
      • Kadoya N.
      • Tanaka S.
      • et al.
      Dose distribution correction for the influence of magnetic field using a deep convolutional neural network for online MR-guided adaptive radiotherapy.
      ]. The imaging time for MRI is longer than that for CBCT. Thus, for thoracic and abdominal tumors, respiratory and heartbeat motion is expected to reduce the robustness of radiomics features extracted from MRI more than that of features extracted from CBCT images during radiotherapy. Fave et al. [
      • Fave X.
      • Mackin D.
      • Yang J.
      • et al.
      Can radiomics features be reproducibly measured from CBCT images for patients with non-small cell lung cancer?
      ] reported that patients with NSCLC had tumor motion ranging from 5 mm to 10 mm, and they also claimed that most features changed substantially with increasing motion amplitude of the ROI. In this study, a larger range of motion was chosen to identify features that were less affected by movement. The larger motion amplitude might filter out some features that could be useful in the future, but the features that remain should be more robust for clinical applications in chest or abdominal tumors.
      From this study, we can draw four key conclusions. First, shape features are the most stable and reproducible features in all tests except for the motion test. This finding has also been reported in several works in the literature, based on both patient data and phantom studies [
      • Baeßler B.
      • Weiss K.
      • Pinto Dos Santos D.
      Robustness and reproducibility of radiomics in magnetic resonance imaging: a phantom study.
      ,
      • Fiset S.
      • Welch M.L.
      • Weiss J.
      • et al.
      Repeatability and reproducibility of MRI-based radiomic features in cervical cancer.
      ,
      • van Timmeren J.E.
      • Leijenaar R.T.H.
      • van Elmpt W.
      • et al.
      Test-retest data for radiomics feature stability analysis: generalizable or study-specific?
      ]. Second, considerable differences exist in the number of robust features between different beam conditions: more robust features were extracted from the beam-off images than from the beam-on images. Third, the intraobserver test yielded the highest percentage of robust features in this phantom study, followed by the beam condition test. Fourth, 25 of 1409 features remained robust to the variations among all five tests in our study (Supplement Table 5). Therefore, we suggest that these features can be used to design radiomics signatures within clinical studies.
      Our study has several limitations. First, although we used jellies and fruits/vegetables of various shapes to mimic patient tumors, these objects are fundamentally different from real tumors. Their edges are smooth and do not mimic common tumor features such as microcalcifications, burrs (spiculations), and lobular configurations, which are all common in malignant lesions that would require radiotherapy in the first place. This tumor-surrogate approach can show how phantom textures vary under five common factors during radiotherapy, but because actual tumors have different inherent textures, the features that we extracted cannot fully reflect the true tumor situation. Second, the radiomics features were extracted from phantoms whose composition differs from that of patients’ tissues, and this was a single-center phantom study whose findings have not been verified on other machines. Furthermore, we investigated only the most commonly used MRI sequence and did not consider other MRI sequences, such as T1-weighted imaging and fluid-attenuated inversion recovery (FLAIR) imaging. In the future, we plan to verify the robust features in patients using different sequences.
      In conclusion, the robustness and reproducibility of radiomics features from MR-Linac T2w images were tested in this work under five common influencing factors, and the features identified as robust are candidates for use in clinical studies. Shape features were the most robust features except under the influence of motion. The motion cohort had the fewest robust features, emphasizing the need to reduce the amplitude of respiratory motion. The 25 features with excellent robustness under all five influences could be the most suitable for the design of future radiomics signatures, although these results should be validated in clinical studies of cancer patients.

      Declaration of Competing Interest

      The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

      Acknowledgements

      This work was supported by Shandong Cancer Hospital and Institute, Shandong First Medical University and Shandong Academy of Medical Sciences. Support was also provided by the National Key Research and Development Program of China (2016YFC0105106), the Academic Promotion Program of Shandong First Medical University (2019LJ004), and the National Natural Science Foundation of China (82102173). The contents are solely the responsibility of the authors and do not necessarily represent the official views of the funding agencies.

      Appendix A. Supplementary data

      The following are the Supplementary data to this article:

      References

        • Giménez A.
        • Franquet T.
        • Prats R.
        • Estrada P.
        • Villalba J.
        • Bagué S.
        Unusual primary lung tumors: a radiologic-pathologic overview.
        Radiographics. 2002; 22: 601-619 https://doi.org/10.1148/radiographics.22.3.g02ma25601
        • Therasse P.
        • Arbuck S.G.
        • Eisenhauer E.A.
        • et al.
        New guidelines to evaluate the response to treatment in solid tumors. European Organization for Research and Treatment of Cancer, National Cancer Institute of the United States, National Cancer Institute of Canada.
        J Natl Cancer Inst. 2000; 92: 205-216 https://doi.org/10.1093/jnci/92.3.205
        • Papadimitroulas P.
        • Brocki L.
        • Christopher Chung N.
        • et al.
        Artificial intelligence: deep learning in oncological radiomics and challenges of interpretability and data harmonization.
        Phys Med. 2021; 83: 108-121 https://doi.org/10.1016/j.ejmp.2021.03.009
        • Avanzo M.
        • Stancanello J.
        • El Naqa I.
        Beyond imaging: the promise of radiomics.
        Phys Med. 2017; 38: 122-139 https://doi.org/10.1016/j.ejmp.2017.05.071
        • Zhao B.
        • Tan Y.
        • Tsai W.Y.
        • Schwartz L.H.
        • Lu L.
        Exploring variability in CT characterization of tumors: a preliminary phantom study.
        Transl Oncol. 2014; 7 (Published 2014 Feb 1): 88-93 https://doi.org/10.1593/tlo.13865
        • Buch K.
        • Li B.
        • Qureshi M.M.
        • Kuno H.
        • Anderson S.W.
        • Sakai O.
        Quantitative assessment of variation in CT parameters on texture features: pilot study using a nonanatomic phantom.
        AJNR Am J Neuroradiol. 2017; 38: 981-985 https://doi.org/10.3174/ajnr.A5139
        • Mackin D.
        • Fave X.
        • Zhang L.
        • et al.
        Measuring computed tomography scanner variability of radiomics features.
        Invest Radiol. 2015; 50: 757-765 https://doi.org/10.1097/RLI.0000000000000180
        • El Naqa I.
        • Grigsby P.
        • Apte A.
        • et al.
        Exploring feature-based approaches in PET images for predicting cancer treatment outcomes.
        Pattern Recognit. 2009; 42: 1162-1171 https://doi.org/10.1016/j.patcog.2008.08.011
        • Cook G.J.R.
        • et al.
        Radiomics in PET: principles and applications.
        Clin Transl Imaging. 2014; 2: 269-276 https://doi.org/10.1007/s40336-014-0064-0
        • Baeßler B.
        • Weiss K.
        • Pinto Dos Santos D.
        Robustness and reproducibility of radiomics in magnetic resonance imaging: a phantom study.
        Invest Radiol. 2019; 54: 221-228 https://doi.org/10.1097/RLI.0000000000000530
        • van Timmeren J.E.
        • Leijenaar R.T.H.
        • van Elmpt W.
        • et al.
        Survival prediction of non-small cell lung cancer patients using radiomics analyses of cone-beam CT images.
        Radiother Oncol. 2017; 123: 363-369 https://doi.org/10.1016/j.radonc.2017.04.016
        • Qin Q.
        • Shi A.
        • Zhang R.
        • et al.
        Cone-beam CT radiomics features might improve the prediction of lung toxicity after SBRT in stage I NSCLC patients.
        Thorac Cancer. 2020; 11: 964-972 https://doi.org/10.1111/1759-7714.13349
        • Shi L.
        • Rong Y.
        • Daly M.
        • et al.
        Cone-beam computed tomography-based delta-radiomics for early response assessment in radiotherapy for locally advanced lung cancer.
        Phys Med Biol. 2020; 65: 015009 https://doi.org/10.1088/1361-6560/ab3247
        • van Timmeren J.E.
        • Leijenaar R.T.H.
        • van Elmpt W.
        • Reymen B.
        • Lambin P.
        Feature selection methodology for longitudinal cone-beam CT radiomics.
        Acta Oncol. 2017; 56: 1537-1543 https://doi.org/10.1080/0284186X.2017.1350285
        • Horvat N.
        • Veeraraghavan H.
        • Khan M.
        • et al.
        MR imaging of rectal cancer: radiomics analysis to assess treatment response after neoadjuvant therapy.
        Radiology. 2018; 287: 833-843 https://doi.org/10.1148/radiol.2018172300
        • Li Z.
        • Li H.
        • Wang S.
        • et al.
        MR-based radiomics nomogram of cervical cancer in prediction of the lymph-vascular space invasion preoperatively.
        J Magn Reson Imaging. 2019; 49: 1420-1426 https://doi.org/10.1002/jmri.26531
        • Coroller T.P.
        • Grossmann P.
        • Hou Y.
        • et al.
        CT-based radiomic signature predicts distant metastasis in lung adenocarcinoma.
        Radiother Oncol. 2015; 114: 345-350 https://doi.org/10.1016/j.radonc.2015.02.015
        • Bosetti D.G.
        • Ruinelli L.
        • Piliero M.A.
        • et al.
        Cone-beam computed tomography-based radiomics in prostate cancer: a mono-institutional study.
        Strahlenther Onkol. 2020; 196: 943-951 https://doi.org/10.1007/s00066-020-01677-x
        • Fave X.
        • Mackin D.
        • Yang J.
        • et al.
        Can radiomics features be reproducibly measured from CBCT images for patients with non-small cell lung cancer?
        Med Phys. 2015; 42: 6784-6797 https://doi.org/10.1118/1.4934826
        • Gu J.
        • Zhu J.
        • Qiu Q.
        • et al.
        The feasibility study of megavoltage computed tomographic (MVCT) image for texture feature analysis.
        Front Oncol. 2018; 8 (Published 2018 Dec 5): 586 https://doi.org/10.3389/fonc.2018.00586

        • Boulanger M.
        • Nunes J.C.
        • Chourak H.
        • et al.
        Deep learning methods to generate synthetic CT from MRI in radiotherapy: a literature review.
        Phys Med. 2021; 89: 265-281 https://doi.org/10.1016/j.ejmp.2021.07.027
        • Lee J.
        • Steinmann A.
        • Ding Y.
        • et al.
        Radiomics feature robustness as measured using an MRI phantom.
        Sci Rep. 2021; 11 (Published 2021 Feb 17): 3973 https://doi.org/10.1038/s41598-021-83593-3
        • Mazzoni L.N.
        • Bock M.
        • Levesque I.R.
        • Lurie D.J.
        • Palma G.
        New developments in MRI: system characterization, technical advances and radiotherapy applications.
        Phys Med. 2021; 90: 50-52 https://doi.org/10.1016/j.ejmp.2021.09.001
        • Bernatz S.
        • Zhdanovich Y.
        • Ackermann J.
        • et al.
        Impact of rescanning and repositioning on radiomic features employing a multi-object phantom in magnetic resonance imaging.
        Sci Rep. 2021; 11: 14248 https://doi.org/10.1038/s41598-021-93756-x
        • Bianchini L.
        • Botta F.
        • Origgi D.
        • et al.
        PETER PHAN: an MRI phantom for the optimisation of radiomic studies of the female pelvis.
        Phys Med. 2020; 71: 71-81 https://doi.org/10.1016/j.ejmp.2020.02.003
        • Bianchini L.
        • Santinha J.
        • Loução N.
        • et al.
        A multicenter study on radiomic features from T2-weighted images of a customized MR pelvic phantom setting the basis for robust radiomic models in clinics.
        Magn Reson Med. 2021; 85: 1713-1726 https://doi.org/10.1002/mrm.28521
        • Dreher C.
        • Kuder T.A.
        • König F.
        • et al.
        Radiomics in diffusion data: a test-retest, inter- and intra-reader DWI phantom study.
        Clin Radiol. 2020; 75: 798.e13-798.e22 https://doi.org/10.1016/j.crad.2020.06.024
        • Wang J.
        • Yung J.
        • Kadbi M.
        • Hwang K.
        • Ding Y.
        • Ibbott G.S.
        Assessment of image quality and scatter and leakage radiation of an integrated MR-LINAC system.
        Med Phys. 2018; 45: 1204-1209 https://doi.org/10.1002/mp.12767
        • Tijssen R.H.N.
        • Philippens M.E.P.
        • Paulson E.S.
        • et al.
        MRI commissioning of 1.5T MR-linac systems - a multi-institutional study.
        Radiother Oncol. 2019; 132: 114-120 https://doi.org/10.1016/j.radonc.2018.12.011
        • Lin L.I.
        A concordance correlation coefficient to evaluate reproducibility.
        Biometrics. 1989; 45: 255-268
        • McBride G.B.
        A proposal for strength-of-agreement criteria for Lin’s concordance correlation coefficient.
        NIWA Client Report No. HAM2005-062. 2005: 1-14

        • Welch M.L.
        • McIntosh C.
        • Haibe-Kains B.
        • et al.
        Vulnerabilities of radiomic signature development: the need for safeguards.
        Radiother Oncol. 2019; 130: 2-9 https://doi.org/10.1016/j.radonc.2018.10.027
        • Scalco E.
        • Belfatto A.
        • Mastropietro A.
        • et al.
        T2w-MRI signal normalization affects radiomics features reproducibility.
        Med Phys. 2020; 47: 1680-1691 https://doi.org/10.1002/mp.14038
        • Zou K.H.
        • Warfield S.K.
        • Bharatha A.
        • et al.
        Statistical validation of image segmentation quality based on a spatial overlap index.
        Acad Radiol. 2004; 11: 178-189 https://doi.org/10.1016/s1076-6332(03)00671-8
        • Zwanenburg A.
        • Vallières M.
        • Abdalah M.A.
        • et al.
        The image biomarker standardization initiative: standardized quantitative radiomics for high-throughput image-based phenotyping.
        Radiology. 2020; 295: 328-338 https://doi.org/10.1148/radiol.2020191145
        • Rizzo S.
        • Botta F.
        • Raimondi S.
        • et al.
        Radiomics: the facts and the challenges of image analysis.
        Eur Radiol Exp. 2018; 2: 36 https://doi.org/10.1186/s41747-018-0068-z
        • Ergen B.
        • Baykara M.
        Texture based feature extraction methods for content based medical image retrieval systems.
        Biomed Mater Eng. 2014; 24: 3055-3062 https://doi.org/10.3233/BME-141127
      3. Aerts HJ, Velazquez ER, Leijenaar RT, et al. Decoding tumour phenotype by noninvasive imaging using a quantitative radiomics approach [published correction appears in Nat Commun. 2014;5:4644. Cavalho, Sara [corrected to Carvalho, Sara]]. Nat Commun. 2014;5:4006. Published 2014 Jun 3. doi:10.1038/ncomms5006.

        • Brouwer C.L.
        • Steenbakkers R.J.
        • van den Heuvel E.
        • et al.
        3D Variation in delineation of head and neck organs at risk.
        Radiat Oncol. 2012; 7: 32 https://doi.org/10.1186/1748-717X-7-32
        • Senan S.
        • van Sörnsen de Koste J.
        • Samson M.
        • et al.
        Evaluation of a target contouring protocol for 3D conformal radiotherapy in non-small cell lung cancer.
        Radiother Oncol. 1999; 53: 247-255 https://doi.org/10.1016/s0167-8140(99)00143-7
        • Thada V.
        • Jaglan V.
        Comparison of Jaccard, Dice, Cosine similarity coefficient to find best fitness value for web retrieved documents using genetic algorithm.
        Int J Innov Eng Technol. 2013; 2: 202-205
        • Liney G.P.
        • Dong B.
        • Begg J.
        • et al.
        Technical Note: experimental results from a prototype high-field inline MRI-linac.
        Med Phys. 2016; 43: 5188 https://doi.org/10.1118/1.4961395
        • Cusumano D.
        • Boldrini L.
        • Dhont J.
        • et al.
        Artificial intelligence in magnetic resonance guided radiotherapy: medical and physical considerations on state of art and future perspectives.
        Phys Med. 2021; 85: 175-191 https://doi.org/10.1016/j.ejmp.2021.05.010
        • Mutic S.
        • Dempsey J.F.
        The ViewRay system: magnetic resonance-guided and controlled radiotherapy.
        Semin Radiat Oncol. 2014; 24: 196-199 https://doi.org/10.1016/j.semradonc.2014.02.008
        • Fiset S.
        • Welch M.L.
        • Weiss J.
        • et al.
        Repeatability and reproducibility of MRI-based radiomic features in cervical cancer.
        Radiother Oncol. 2019; 135: 107-114 https://doi.org/10.1016/j.radonc.2019.03.001
      4. Shiri, I., Abdollahi, H., Shaysteh, S. & Rabi Mahdavi, S. Test-Retest Reproducibility and Robustness Analysis of Recurrent Glioblastoma MRI Radiomics Texture Features. Iranian Journal of Radiology Special iss, doi:10.5812/iranjradiol.48035.

        • Kalendralis P.
        • Sloep M.
        • van Soest J.
        • Dekker A.
        • Fijten R.
        Making radiotherapy more efficient with FAIR data.
        Phys Med. 2021; 82: 158-162 https://doi.org/10.1016/j.ejmp.2021.01.083
        • Sahin B.
        • Ergur H.
        Assessment of the optimum section thickness for the estimation of liver volume using magnetic resonance images: a stereological gold standard study.
        Eur J Radiol. 2006; 57: 96-101 https://doi.org/10.1016/j.ejrad.2005.07.006
        • Kajikawa T.
        • Kadoya N.
        • Tanaka S.
        • et al.
        Dose distribution correction for the influence of magnetic field using a deep convolutional neural network for online MR-guided adaptive radiotherapy.
        Phys Med. 2020; 80: 186-192 https://doi.org/10.1016/j.ejmp.2020.11.002
        • van Timmeren J.E.
        • Leijenaar R.T.H.
        • van Elmpt W.
        • et al.
        Test-retest data for radiomics feature stability analysis: generalizable or study-specific?
        Tomography. 2016; 2: 361-365 https://doi.org/10.18383/j.tom.2016.00208