Research Article | Volume 107, 102542, March 2023

Updating a clinical Knowledge-Based Planning prediction model for prostate radiotherapy

Open Access | Published: February 11, 2023 | DOI: https://doi.org/10.1016/j.ejmp.2023.102542

      Highlights

      • Three-year-long continuous update and maintenance process of a KBP model for prostate treatments.
      • Two different approaches of KBP update.
      • Practical suggestions for conducting a maintenance process of KBP models to ensure quality of outcomes and model robustness.
      • The proposed processes are intended for use in the everyday clinical routine of medical physicists.

      Abstract

      Background and purpose

      Clinical knowledge-based planning (KBP) models dedicated to prostate radiotherapy treatment may require periodical updates to remain relevant and to adapt to possible changes in the clinic. This study proposes a paired comparison of two different update approaches through a longitudinal analysis.

      Materials and methods

      A clinically validated KBP model for moderately hypofractionated prostate therapy was periodically updated using two approaches: one was targeted at achieving the biggest library size (Mt), while the other one at achieving the highest mean sample quality (Rt). Four subsequent updates were accomplished. The goodness, robustness and quality of the outcomes were measured and compared to those of the common ancestor. Plan quality was assessed through the Plan Quality Metric (PQM) and plan complexity was monitored.

      Results

      Both update procedures increased OAR sparing by between +3.9 % and +19.2 % compared with plans generated by a human planner. Target coverage and homogeneity were slightly reduced [−0.2 %; −14.7 %], while plan complexity showed only minor changes.
      Increasing the sample size resulted in more reliable predictions and improved goodness-of-fit, while increasing the mean sample quality improved the outcomes but slightly reduced the models’ reliability.

      Conclusions

      Repeated updates of clinical KBP models can enhance their robustness, reliability and the overall quality of automatically generated plans. Periodically expanding the model sample while removing plans of unacceptably low quality should maximize the benefits of the updates while limiting the associated workload.

      1. Introduction

      The performance of all prediction systems based on prior knowledge strongly depends on the consistency, quality, and vastness of such knowledge. When these systems are adopted in a clinical environment, their possible benefits should be weighed against the amount of human resources required for their set-up, maintenance and updating. In fact, such systems require an intense workload of data farming and mining, which may last even beyond their initial clinical implementation.
      A substantial body of literature has demonstrated the effectiveness of Knowledge-Based Planning (KBP) systems in assisting and improving the clinical management of external beam radiation therapy treatment plans. Improvements in the quality of treatments, increased consistency, reduced variability, and reduction in planners’ workloads are just a few of the major benefits associated with the implementation of such systems in a clinical setting [
      • Chang A.T.Y.
      • Hung A.W.M.
      • Cheung F.W.K.
      • Lee M.C.H.
      • Chan O.S.H.
      • Philips H.
      • et al.
      Comparison of planning quality and efficiency between conventional and knowledge-based algorithms in nasopharyngeal cancer patients using intensity modulated radiation therapy.
      ,
      • Cornell M.
      • Kaderka R.
      • Hild S.J.
      • Ray X.J.
      • Murphy J.D.
      • Atwood T.F.
      • et al.
      Noninferiority study of automated knowledge-based planning versus human-driven optimization across multiple disease sites.
      ,
      • Hussein M.
      • Heijmen B.J.M.
      • Verellen D.
      • Nisbet A.
      Automation in intensity modulated radiotherapy treatment planning—a review of recent innovations.
      ,
      • Kaderka R.
      • Hild S.J.
      • Bry V.N.
      • Cornell M.
      • Ray X.J.
      • Murphy J.D.
      • et al.
      Wide-scale clinical implementation of knowledge-based planning: an investigation of workforce efficiency, need for post-automation refinement, and data-driven model maintenance.
      ,
      • Li N.
      • Carmona R.
      • Sirak I.
      • Kasaova L.
      • Followill D.
      • Michalski J.
      • et al.
      Highly efficient training, refinement, and validation of a knowledge-based planning quality-control system for radiation therapy clinical trials.
      ,
      • Momin S.
      • Fu Y.
      • Lei Y.
      • Roper J.
      • Bradley J.D.
      • Curran W.J.
      • et al.
      Knowledge-based radiation treatment planning: A data-driven method survey.
      ,
      • Panettieri V.
      • Ball D.
      • Chapman A.
      • Cristofaro N.
      • Gawthrop J.
      • Griffin P.
      • et al.
      Development of a multicentre automated model to reduce planning variability in radiotherapy of prostate cancer.
      ,
      • Wortel G.
      • Eekhout D.
      • Lamers E.
      • van der Bel R.
      • Kiers K.
      • Wiersma T.
      • et al.
      Characterization of automatic treatment planning approaches in radiotherapy.
      ]. On the other hand, many research groups have reported labour-intensive, iterative and time-consuming processes associated with KBP systems, such as population selection, model revision, model refinement and model validation [
      • Li N.
      • Carmona R.
      • Sirak I.
      • Kasaova L.
      • Followill D.
      • Michalski J.
      • et al.
      Highly efficient training, refinement, and validation of a knowledge-based planning quality-control system for radiation therapy clinical trials.
      ,
      • Boutilier J.J.
      • Craig T.
      • Sharpe M.B.
      • Chan T.C.Y.
      Sample size requirements for knowledge-based treatment planning.
      ,
      • Ge Y.
      • Wu Q.J.
      Knowledge-based planning for intensity-modulated radiation therapy: A review of data-driven approaches.
      ,
      • Tamura M.
      • Monzen H.
      • Matsumoto K.
      • Kubo K.
      • Ueda Y.
      • Kamima T.
      • et al.
      Influence of cleaned-up commercial knowledge-based treatment planning on volumetric-modulated arc therapy of prostate cancer.
      ]. In general, some maintenance is necessary to ensure the clinical usability, relevance and quality of KBP models. The aforementioned processes must therefore be repeated periodically, especially when clinical practice changes significantly [
      • Tol J.P.
      • Doornaert P.
      • Witte B.I.
      • Dahele M.
      • Slotman B.J.
      • Verbakel W.F.A.R.
      A longitudinal evaluation of improvements in radiotherapy treatment plan quality for head and neck cancer patients.
      ].
      Different groups have recently reported their endeavours to reduce this maintenance effort: the performance of KBP models could be improved with an iterative learning process for head and neck (HN) and prostate cases [
      • Fogliata A.
      • Cozzi L.
      • Reggiori G.
      • Stravato A.
      • Lobefalo F.
      • Franzese C.
      • et al.
      RapidPlan knowledge based planning: iterative learning process and model ability to steer planning strategies.
      ,
      • Hundvin J.A.
      • Fjellanger K.
      • Pettersen H.E.S.
      • Nygaard B.
      • Revheim K.
      • Sulen T.H.
      • et al.
      Clinical iterative model development improves knowledge-based plan quality for high-risk prostate cancer with four integrated dose levels.
      ,
      • Nakamura K.
      • Okuhata K.
      • Tamura M.
      • Otsuka M.
      • Kubo K.
      • Ueda Y.
      • et al.
      An updating approach for knowledge-based planning models to improve plan quality and variability in volumetric-modulated arc therapy for prostate cancer.
      ,
      • Wang M.
      • Li S.
      • Huang Y.
      • Yue H.
      • Li T.
      • Wu H.
      • et al.
      An interactive plan and model evolution method for knowledge-based pelvic VMAT planning.
      ]; data-driven methods can be employed to periodically improve KBP automated planning routines and KBP model performance could be incrementally improved with supervised machine learning [
      • Kaderka R.
      • Hild S.J.
      • Bry V.N.
      • Cornell M.
      • Ray X.J.
      • Murphy J.D.
      • et al.
      Wide-scale clinical implementation of knowledge-based planning: an investigation of workforce efficiency, need for post-automation refinement, and data-driven model maintenance.
      ,
      • Monzen H.
      • Tamura M.
      • Ueda Y.
      • Fukunaga J.-I.
      • Kamima T.
      • Muraki Y.
      • et al.
      Dosimetric evaluation with knowledge-based planning created at different periods in volumetric-modulated arc therapy for prostate cancer: a multi-institution study.
      ].
      However, regardless of the strategy used to update KBP models, a clinical validation should be carried out to prove the benefit associated with their introduction in a real-world clinical environment where clinical requirements, patient management and planning practices can change over time. None of the aforementioned studies conducted a long-term longitudinal evaluation with a large number of cases, and only some of them reported multiple updates of the same model. Furthermore, to our knowledge, no study has explicitly compared different updating strategies.
      In this study, we performed a paired comparison of two antithetical KBP model update strategies and tested how they affect the quality of prostate treatment plans through three years of continuous observation. In particular, we performed multiple subsequent updates of the first clinically adopted prostate model to assess whether an updating procedure should prioritize a larger sample size or a higher mean sample quality.

      2. Materials and methods

      2.1 Enrolled patient sample

      Within the period from 01/01/2018 to 31/12/2020, a total of one hundred and one patient records were recruited for this work. Inclusion criteria were: men with localized histologically confirmed prostate adenocarcinoma (all risk groups with clinical stage T1b-T3a, N0, M0 with no clinical evidence of pelvic lymphadenopathy, and an estimated risk of nodal involvement <15 % based on the Roach formula [
      • Roach M.
      • Marquez C.
      • Yuo H.-S.
      • Narayan P.
      • Coleman L.
      • Nseyo U.O.
      • et al.
      Predicting the risk of lymph node involvement using the pre-treatment prostate specific antigen and gleason score in men with clinically localized prostate cancer.
      ]); radical external beam radiotherapy conducted with a moderate hypofractionation scheme to the prostate only. Clinical Target Volume (CTV) was contoured following the NCCN v.3.2016 prostate guidelines. The penile bulb was added whenever a recent magnetic resonance imaging study was available. The contouring procedure was completed by two dedicated radiation oncologists. The planning goals were to cover the PTV with 95 % of the prescribed dose (V95% > 98 %) while limiting overdosage to 107 % of the prescribed dose (D1cc < 107 %).
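The two PTV planning goals above can be checked directly on a list of voxel doses. The following is a minimal sketch, not the evaluation used in the study: it assumes equal-volume voxels, a hypothetical voxel volume passed in by the caller, and that D1cc is read as the minimum dose received by the hottest 1 cc of the evaluated structure. All function names are illustrative.

```python
def v_at_dose(doses_gy, threshold_gy):
    """Percentage of (equal-volume) voxels receiving at least threshold_gy."""
    if not doses_gy:
        raise ValueError("empty dose list")
    covered = sum(1 for d in doses_gy if d >= threshold_gy)
    return 100.0 * covered / len(doses_gy)


def d_at_volume_cc(doses_gy, voxel_volume_cc, volume_cc):
    """Minimum dose received by the hottest `volume_cc` of the structure."""
    n_voxels = max(1, round(volume_cc / voxel_volume_cc))
    hottest = sorted(doses_gy, reverse=True)[:n_voxels]
    return hottest[-1]


def meets_ptv_goals(doses_gy, voxel_volume_cc, prescription_gy):
    """Check V95% > 98 % and D1cc < 107 % of the prescribed dose."""
    v95 = v_at_dose(doses_gy, 0.95 * prescription_gy)
    d1cc = d_at_volume_cc(doses_gy, voxel_volume_cc, 1.0)
    return v95 > 98.0 and d1cc < 1.07 * prescription_gy
```

In practice these quantities would be read from the DVH exported by the treatment planning system rather than recomputed from raw voxels.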
      Until the end of 2017, all prostate treatments were delivered on a Unique linac (Varian Medical Systems, Palo Alto, CA) with the VMAT technique, using an IGRT scheme based on MV imaging and X-ray triggering of gold fiducials based on stereoscopic 2D images [
      • Scaggion A.
      • Negri A.
      • Rossato M.A.
      • Roggio A.
      • Simonato F.
      • Bacco S.
      • et al.
      Delivering RapidArc®: a comprehensive study on accuracy and long term stability.
      ]. The standard treatment schedule was 70 Gy in 28 fractions at the beginning of the study [
      • Lee W.R.
      • Dignam J.J.
      • Amin M.B.
      • Bruner D.W.
      • Low D.
      • Swanson G.P.
      • et al.
      Randomized phase III noninferiority study comparing two radiotherapy fractionation schedules in patients with low-risk prostate cancer.
      ]. During this period the Planning Target Volume (PTV) was obtained by expanding the CTV with a margin of 5 mm posteriorly and 7 mm in all other directions.
      At the beginning of 2018, a new TrueBeam STX unit (Varian Medical Systems, Palo Alto, CA) with CBCT capability started clinical operation. Its introduction led to a change of the IGRT scheme for prostate treatments to daily CBCT and to a gradual change of the treatment schedule to 60 Gy in 20 fractions, following recent evidence [
      • Catton C.N.
      • Lukka H.
      • Gu C.-S.
      • Martin J.M.
      • Supiot S.
      • Chung P.W.M.
      • et al.
      Randomized trial of a hypofractionated radiation regimen for the treatment of localized prostate cancer.
      ,
      • Dearnaley D.
      • Syndikus I.
      • Mossop H.
      • Khoo V.
      • Birtle A.
      • Bloomfield D.
      • et al.
      Conventional versus hypofractionated high-dose intensity-modulated radiotherapy for prostate cancer: 5-year outcomes of the randomised, non-inferiority, phase 3 CHHiP trial.
      ,
      • Incrocci L.
      • Wortel R.C.
      • Alemayehu W.G.
      • Aluwini S.
      • Schimmel E.
      • Krol S.
      • et al.
      Hypofractionated versus conventionally fractionated radiotherapy for patients with localised prostate cancer (HYPRO): final efficacy results from a randomised, multicentre, open-label, phase 3 trial.
      ]. From September 2019, X-ray triggering of gold fiducials based on OBI X-ray imaging was introduced in the clinical routine for a fraction of patients. The PTV margin recipe remained unaltered for patients undergoing daily CBCT, while it was reduced to 4 mm posteriorly and 5 mm in all other directions for patients implanted with gold fiducials.
      For all the patients included in the study, the rectum, bladder and femoral heads were delineated as Organs-at-Risk (OARs).
      The planning goals were maintained constant over the entire period of inquiry. All plans were optimized according to our department's prostate radical treatment protocol which is based on the ASTRO/ASCO/AUA guidelines [
      • Morgan S.C.
      • Hoffman K.
      • Loblaw D.A.
      • Buyyounouski M.K.
      • Patton C.
      • Barocas D.
      • et al.
      Hypofractionated radiation therapy for localized prostate cancer: an ASTRO, ASCO, and AUA evidence-based guideline.
      ].

      2.2 Treatment plans

      All patients were treated with Volumetric Modulated Arc Therapy (VMAT) using two complete arcs with either 6X or 10X beams, on a pair of matched TrueBeam STX machines equipped with HD Millennium MLC (Varian Medical Systems, Palo Alto, CA). 10X was only used for overweight patients when a planning comparison against 6X was considered significantly favourable. All treatment plans were obtained with Eclipse TPS, optimized with the PO algorithm v15.5.11 and computed with Acuros XB v15.5.11 (Varian Medical Systems, Palo Alto, CA) with dose-to-medium reporting. All treatment plans resulted from a specific strategy adopted to limit plan complexity, which was developed within our institution. This strategy has proven to guarantee high levels of plan deliverability without compromising plan quality or clinical acceptability [
      • Morgan S.C.
      • Hoffman K.
      • Loblaw D.A.
      • Buyyounouski M.K.
      • Patton C.
      • Barocas D.
      • et al.
      Hypofractionated radiation therapy for localized prostate cancer: an ASTRO, ASCO, and AUA evidence-based guideline.
      ]. In detail, the plans were optimized by setting the Aperture Shape Controller (ASC) priority to Very High and using a monitor unit (MU) limit objective that aims to keep the total number of MU close to an MU/cGy value of 3 [
      • Scaggion A.
      • Fusella M.
      • Agnello G.
      • Bettinelli A.
      • Pivato N.
      • Roggio A.
      • et al.
      Limiting treatment plan complexity by applying a novel commercial tool.
      ,

      Varian Medical System. Eclipse Photon and Electron Reference Guide v15.5 2017.

      ]. ASC is a leaf sequencer that simplifies MLC configuration by minimizing the curvature of the beam-eye-view shapes. ASC is implemented as a multiplicative penalty term in the optimizer cost function and can be modified by the user by choosing among six discrete penalty levels. In rare cases when clinical goals were not met, the MU limit was adjusted until clinical requirements were achieved.

      2.3 Plan quality assessment

      The Plan Quality Metric (PQM) was adopted as a global measure of quality in order to assess overall plan quality and simplify the plan comparison process [
      • Hernandez V.
      • Hansen C.R.
      • Widesott L.
      • Bäck A.
      • Canters R.
      • Fusella M.
      • et al.
      What is plan quality in radiotherapy? The importance of evaluating dose metrics, complexity, and robustness of treatment plans.
      ]. PQM is a user-defined metric designed to compare the quality of competing treatment plans and it is useful to limit the subjectivity of judgment. It gathers into a single number the judgment of quality expressed by a clinical team on the basis of its knowledge and experience. It is built through a list of sub-metrics (e.g. Dose-Volume Histogram (DVH) metrics), which should summarize the treatment’s specific goals. Each metric is associated with a numerical scoring function to model clinician’s judgment criteria as accurately as possible. The PQM is the sum of the scores obtained by each sub-metric and measures the extent to which the plan adheres to the list of identified goals. First introduced by Nelms, it is now adopted in many studies [
      • Nelms B.E.
      • Robinson G.
      • Markham J.
      • Velasco K.
      • Boyd S.
      • Narayan S.
      • et al.
      Variation in external beam treatment plan quality: An inter-institutional study of planners and planning systems.
      ,
      • Ahmad I.
      • Chufal K.S.
      • Bhatt C.P.
      • Miller A.A.
      • Bajpai R.
      • Chhabra A.
      • et al.
      Plan quality assessment of modern radiotherapy delivery techniques in left-sided breast cancer: an analysis stratified by target delineation guidelines.
      ,
      • Landers A.
      • O’Connor D.
      • Ruan D.
      • Sheng K.
      Automated 4π radiotherapy treatment planning with evolving knowledge-base.
      ,
      • Sasaki M.
      • Nakaguuchi Y.
      • Kamomae T.
      • Tsuzuki A.
      • Kobuchi S.
      • Kuwahara K.
      • et al.
      Analysis of prostate intensity- and volumetric-modulated arc radiation therapy planning quality with PlanIQTM.
      ,
      • Cilla S.
      • Deodato F.
      • Romano C.
      • Ianiro A.
      • Macchia G.
      • Re A.
      • et al.
      Personalized automation of treatment planning in head-neck cancer: A step forward for quality in radiation therapy?.
      ]. The PQM algorithm used in this work is described elsewhere and reported in detail in the Supplementary Material [
      • Fusella M.
      • Scaggion A.
      • Pivato N.
      • Rossato M.A.
      • Zorz A.
      • Paiusco M.
      Efficiently train and validate a RapidPlan model through APQM scoring.
      ,
      • Scaggion A.
      • Fusella M.
      • Roggio A.
      • Bacco S.
      • Pivato N.
      • Rossato M.A.
      • et al.
      Reducing inter- and intra-planner variability in radiotherapy plan output with a commercial knowledge-based planning solution.
      ]. The PQM% algorithm was initially developed for treatments delivering 70 Gy in 28 fractions. For use with treatments delivering 60 Gy in 20 fractions, all the sub-metrics and scoring functions were linearly scaled down with the total dose.
      In this study, the PQM% is used in conjunction with its adjusted version (APQM%) to allow for plan quality ranking across a cohort of patients [
      • Ahmed S.
      • Nelms B.
      • Gintz D.
      • Caudell J.
      • Zhang G.
      • Moros E.G.
      • et al.
      A method for a priori estimation of best feasible DVH for organs-at-risk: Validation for head and neck VMAT planning.
      ]. APQM% tailors the PQM algorithm to the anatomical characteristics of each patient. In order to assess whether plans were obtained with different trade-offs between target coverage and OARs sparing, the PQM% was also split into two complementary measures: the PQMtarget% which gathers together all of the sub-metrics related to target coverage, homogeneity and conformance; and the PQMOARs% which represents the sum of the sub-metrics related to OARs sparing. The same subdivision was applied to the APQM%.
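The structure of the PQM described in this section (sub-metrics, one scoring function per sub-metric, a sum of scores, and a target/OAR split) can be sketched as follows. This is an illustration under stated assumptions, not the algorithm of the Supplementary Material: the linear-ramp scoring function, the `SubMetric` fields, the normalization of each partial score by the total available points (so that the two parts add up to the PQM%), and the helper `scale_dose_thresholds` are all hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SubMetric:
    name: str
    group: str        # "target" or "oar"
    max_score: float  # points awarded when the goal is fully met
    zero_at: float    # metric value that earns 0 points
    full_at: float    # metric value that earns max_score points

    def score(self, value):
        """Linear ramp between zero_at and full_at, clamped to [0, max_score]."""
        fraction = (value - self.zero_at) / (self.full_at - self.zero_at)
        return self.max_score * min(1.0, max(0.0, fraction))


def pqm_percent(sub_metrics, values, group=None):
    """Sum of sub-metric scores as a percentage of all available points.

    With group="target" or group="oar", only that group's scores are summed,
    so the two partial values add up to the overall PQM% (an assumption).
    """
    available = sum(m.max_score for m in sub_metrics)
    earned = sum(
        m.score(values[m.name])
        for m in sub_metrics
        if group is None or m.group == group
    )
    return 100.0 * earned / available


def scale_dose_thresholds(metric, factor):
    """Rescale a dose-valued sub-metric, e.g. factor = 60 / 70."""
    return replace(
        metric,
        zero_at=metric.zero_at * factor,
        full_at=metric.full_at * factor,
    )
```

The ramp works for both "higher is better" metrics (`full_at > zero_at`) and "lower is better" metrics (`full_at < zero_at`), which keeps coverage and sparing goals in a single representation.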

      2.4 Generation of models

      The first KBP model for prostate treatments was adopted in our clinical practice at the beginning of 2018. It was based on a prior sample of 73 historical prostate plans (gathered from January 2015 to December 2016) treated with a Unique machine equipped with Millennium MLC with dose prescription of either 78 Gy/39 fx or 70 Gy/28 fx [
      • Fusella M.
      • Scaggion A.
      • Pivato N.
      • Rossato M.A.
      • Zorz A.
      • Paiusco M.
      Efficiently train and validate a RapidPlan model through APQM scoring.
      ,
      • Scaggion A.
      • Fusella M.
      • Roggio A.
      • Bacco S.
      • Pivato N.
      • Rossato M.A.
      • et al.
      Reducing inter- and intra-planner variability in radiotherapy plan output with a commercial knowledge-based planning solution.
      ]. It was generated with RapidPlan v15.1 (Varian Medical Systems, Palo Alto, CA) and will be referred to as M0. A continuous program of updates was implemented after its clinical introduction, with a model update performed approximately every 25 new prostate cases. To fulfil Wang and colleagues’ recommendation, the initial model was updated in an attempt to enlarge the model sample size [
      • Wang M.
      • Li S.
      • Huang Y.
      • Yue H.
      • Li T.
      • Wu H.
      • et al.
      An interactive plan and model evolution method for knowledge-based pelvic VMAT planning.
      ].
      The investigation lasted approximately 36 months and ended with four subsequent updates performed at the end of four time intervals referred to as Pt with t = 1,…,4. The whole process is visually summarized in Fig. 1. A description of the patient sample accrued in the four periods is given in the Supplementary Material.
      Fig. 1. a) The workflow for plan generation and model upgrade for period Pt. b) The timetable for the entire study, along with its subdivision into periods, the models in use and the models obtained as a result of the updating procedures.
      During the entire investigation, the clinical planning was conducted by human planners assisted by the predictions of the first clinical KBP model M0 (Human plans). During this planning effort, the planner started from the KBP predictions and was free to manually refine the optimization objectives and iterate the optimization process until a satisfactory result was reached. In the case of an unreliable DVH prediction, the planner was free to perform a completely manual optimization without relying on the KBP predictions. This is the usual clinical approach to planning and would remain unaltered through time if the KBP model were not updated. To compare the effectiveness of the two updating strategies in challenging conditions, each Human plan was compared to a plan generated through a fully automated optimization based on the prediction of the most recently updated model M(t-1) (AutoRP plans). The updated model outcome therefore did not benefit from the human interaction devoted to refining the plan, making the comparison with the human-generated plans more challenging [
      • Chang A.T.Y.
      • Hung A.W.M.
      • Cheung F.W.K.
      • Lee M.C.H.
      • Chan O.S.H.
      • Philips H.
      • et al.
      Comparison of planning quality and efficiency between conventional and knowledge-based algorithms in nasopharyngeal cancer patients using intensity modulated radiation therapy.
      ,
      • Hussein M.
      • South C.P.
      • Barry M.A.
      • Adams E.J.
      • Jordan T.J.
      • Stewart A.J.
      • et al.
      Clinical validation and benchmarking of knowledge-based IMRT and VMAT treatment planning in pelvic anatomy.
      ]. The set of optimization objectives automatically generated by RapidPlan remained unaltered throughout the study and is reported in the Supplementary Material.
      The two competing plans (Human and AutoRP), sharing the same geometry and energy, were reviewed by a clinician to judge their clinical soundness and were also ranked through the PQM%. The clinically appropriate plan with the higher quality (i.e. the higher PQM%) was added to the sample forming the M(t-1) model population, yielding an expanded library that became the population of the next update, i.e. model Mt. At the end of each period, after approximately 25 cases, the new Mt model was trained (see Fig. 1a). This updating procedure aimed to improve the model’s generality by increasing the number of cases the model was trained on. After the training, each Mt model was cleaned and refined to remove plans or single OARs identified as outliers or largely influential points, and underwent a single round of internal and external validation [
      • Hussein M.
      • South C.P.
      • Barry M.A.
      • Adams E.J.
      • Jordan T.J.
      • Stewart A.J.
      • et al.
      Clinical validation and benchmarking of knowledge-based IMRT and VMAT treatment planning in pelvic anatomy.
      ,

      Varian Medical System. Eclipse Photon and Electron Reference Guide v13.7 2015.

      ,
      • Fogliata A.
      • Belosi F.
      • Clivio A.
      • Navarria P.
      • Nicolini G.
      • Scorsetti M.
      • et al.
      On the pre-clinical validation of a commercial model-based optimisation engine: Application to volumetric modulated arc therapy for patients with lung or prostate cancer.
      ].
      The second updating procedure was based on the efficient selection method proposed by Fusella et al. [
      • Fusella M.
      • Scaggion A.
      • Pivato N.
      • Rossato M.A.
      • Zorz A.
      • Paiusco M.
      Efficiently train and validate a RapidPlan model through APQM scoring.
      ]. This procedure was designed to increase the mean quality of model samples while keeping the same library size as the ancestor. The plans comprising the Mt library were sorted according to the APQM% and only the 73 top-ranked plans (i.e. those with the highest quality) were retained. A new model was trained on this reduced sample and is referred to as Rt. Rt models were subjected to the same internal and external validation as Mt models, with outlier removal limited to cases that were highly influential on the model’s predictions (those classified as outliers by Cook’s distance) [
      • Fusella M.
      • Scaggion A.
      • Pivato N.
      • Rossato M.A.
      • Zorz A.
      • Paiusco M.
      Efficiently train and validate a RapidPlan model through APQM scoring.
      ].
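The library bookkeeping behind the two update strategies can be summarized in a short sketch. It covers only the selection of plans, not model training, outlier cleaning or validation, which take place inside RapidPlan. The `Plan` record and the function names are hypothetical; the library size of 73 and the selection rules (higher PQM% among clinically acceptable plans for Mt, top-ranked APQM% for Rt) are taken from the text.

```python
from dataclasses import dataclass

ANCESTOR_LIBRARY_SIZE = 73  # size of the M0 library, as stated in the text


@dataclass(frozen=True)
class Plan:
    plan_id: str
    pqm: float    # PQM%, used to choose between the Human and AutoRP plan
    apqm: float   # APQM%, used to rank plans across patients
    clinically_acceptable: bool = True


def pick_plan_for_library(human, auto_rp):
    """Return the clinically acceptable plan with the higher PQM%, or None."""
    candidates = [p for p in (human, auto_rp) if p.clinically_acceptable]
    return max(candidates, key=lambda p: p.pqm, default=None)


def update_m_library(previous_library, new_pairs):
    """Mt strategy: grow the library with one selected plan per new patient."""
    selected = (pick_plan_for_library(h, a) for h, a in new_pairs)
    return list(previous_library) + [p for p in selected if p is not None]


def derive_r_library(m_library, size=ANCESTOR_LIBRARY_SIZE):
    """Rt strategy: keep only the top-`size` plans of the Mt library by APQM%."""
    return sorted(m_library, key=lambda p: p.apqm, reverse=True)[:size]
```

Because each Rt library is derived from the corresponding Mt library, the two strategies see the same candidate plans at every update, which is what makes the comparison paired.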

      2.5 Models comparison through open-loop test

      A thorough comparison of model performance was accomplished by means of an open-loop test conducted on an external validation set of 30 historical patients not included in any of the model libraries [
      • Fusella M.
      • Scaggion A.
      • Pivato N.
      • Rossato M.A.
      • Zorz A.
      • Paiusco M.
      Efficiently train and validate a RapidPlan model through APQM scoring.
      ,
      • Scaggion A.
      • Fusella M.
      • Roggio A.
      • Bacco S.
      • Pivato N.
      • Rossato M.A.
      • et al.
      Reducing inter- and intra-planner variability in radiotherapy plan output with a commercial knowledge-based planning solution.
      ]. This group was first used to validate the M0 model. All patients in this group were treated with 70 Gy in 28 fractions in late 2016. A set of plans generated by an experienced human planner, without any support from the KBP predictions, was taken as a reference. For each patient, a completely automated planning procedure was carried out with each of the KBP models, and the resulting plans were compared to the respective reference plan.
      The RapidPlan DVH estimation algorithm generates an alert if the geometrical characteristics of the patient whose DVH is to be estimated fall outside, or at the extreme borders of, the library sample. Each alert is an indication that the DVH estimation might not be reliable [

      Varian Medical System. Eclipse Photon and Electron Reference Guide v13.7 2015.

      ]. To assess the models’ generality, the number of alerts returned by RapidPlan was collected for each patient and each involved OAR. The number of alerts was used as a rough measure of model reliability, with fewer alerts indicating a more reliable model.
      To assess the KBP models’ general quality, the goodness-of-fit statistics (R² and χ²) and the goodness-of-estimation parameter (mean square error, MSE) were collected and compared. R² represents the coefficient of determination of regression model parameters and χ² is the average chi-square of regression model parameters. Better results are expected as the former approaches 1 while the latter approaches 0. The MSE describes how well the model is able to estimate the original DVH in a training plan, and the closer it is to 0, the better the model’s estimation capability for plans that are not part of the training set [
      • Fogliata A.
      • Cozzi L.
      • Reggiori G.
      • Stravato A.
      • Lobefalo F.
      • Franzese C.
      • et al.
      RapidPlan knowledge based planning: iterative learning process and model ability to steer planning strategies.
      ,
      • Fusella M.
      • Scaggion A.
      • Pivato N.
      • Rossato M.A.
      • Zorz A.
      • Paiusco M.
      Efficiently train and validate a RapidPlan model through APQM scoring.
      ,

      Varian Medical System. Eclipse Photon and Electron Reference Guide v13.7 2015.

      ].
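For readers less familiar with these statistics, the sketch below computes the coefficient of determination and the mean square error from paired observed and predicted values, using their textbook definitions. It is an illustration only: how RapidPlan computes the reported values internally is not described here, and the function names are hypothetical. The average chi-square is omitted because its exact definition is not given in the text.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mean_square_error(observed, predicted):
    """Average squared difference between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return sum(squared) / len(squared)
```

A perfect prediction gives R² = 1 and MSE = 0, matching the direction of "better" stated above.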

      2.6 Evaluation of plan complexity

      In line with the existing literature, plan complexity was evaluated using 4 complexity metrics, computed with a Matlab routine developed in-house [
      • Scaggion A.
      • Fusella M.
      • Agnello G.
      • Bettinelli A.
      • Pivato N.
      • Roggio A.
      • et al.
      Limiting treatment plan complexity by applying a novel commercial tool.
      ]. These metrics were chosen on the basis of their complementarity and their reported correlation with plan deliverability [
      • Scaggion A.
      • Fusella M.
      • Agnello G.
      • Bettinelli A.
      • Pivato N.
      • Roggio A.
      • et al.
      Limiting treatment plan complexity by applying a novel commercial tool.
      ,
      • Hernandez V.
      • Saez J.
      • Pasler M.
      • Jurado-Bruggeman D.
      • Jornet N.
      Comparison of complexity metrics for multi-institutional evaluations of treatment plans in radiotherapy.
      ]. These complexity metrics consist of: the ratio of the total number of Monitor Units to the dose per fraction (MU/cGy); the Edge Metric (EM), which measures the complexity of the MLC aperture shapes as the ratio of MLC side edge length to aperture area [
      • Younge K.C.
      • Matuszak M.M.
      • Moran J.M.
      • McShan D.L.
      • Fraass B.A.
      • Roberts D.A.
      Penalization of aperture complexity in inversely planned volumetric modulated arc therapy.
      ]; the VMAT adapted Modulation Complexity Score (vMCS), which represents modulation complexity taking into account the relative variation on leaf positions, beam aperture area, and MUs between control points [
      • Masi L.
      • Doro R.
      • Favuzza V.
      • Cipressi S.
      • Livi L.
      Impact of plan parameters on the dosimetric accuracy of volumetric modulated arc therapy.
      ]; and the total modulation index (MIt) which reflects the speed and acceleration of modulating parameters such as MLC movements, dose-rate, and gantry speed [
      • Park J.M.
      • Park S.-Y.
      • Kim H.
      • Kim J.H.
      • Carlson J.
      • Ye S.-J.
      Modulation indices for volumetric modulated arc therapy.
      ].
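As a rough illustration of the two simplest metrics, the sketch below computes MU/cGy and a simplified, single-aperture ratio of exposed leaf-side length to open area. It is not the in-house routine used in the study, nor the published Edge Metric, which is evaluated over all apertures of a plan. The function names, the leaf-position layout and the example numbers are assumptions made for illustration only.

```python
def mu_per_cgy(total_mu, dose_per_fraction_gy):
    """Total monitor units divided by the dose per fraction expressed in cGy."""
    return total_mu / (dose_per_fraction_gy * 100.0)


def side_edge_to_area(left, right, leaf_width):
    """Simplified ratio of exposed leaf-side length to open area for ONE aperture.

    left[i], right[i]: positions (cm) of the two leaves of pair i along the
    direction of leaf travel; leaf_width: leaf width (cm), assumed uniform.
    """
    openings = [max(r - l, 0.0) for l, r in zip(left, right)]
    area = leaf_width * sum(openings)
    if area == 0.0:
        raise ValueError("aperture is fully closed")
    # The outer sides of the first and last leaf pair are fully exposed.
    edge = openings[0] + openings[-1]
    for i in range(len(openings) - 1):
        # Between neighbouring pairs, only the non-shared part of each opening is exposed.
        shared = max(min(right[i], right[i + 1]) - max(left[i], left[i + 1]), 0.0)
        edge += openings[i] + openings[i + 1] - 2.0 * shared
    return edge / area


# Hypothetical plan: 930 MU delivered for a 3 Gy fraction (60 Gy in 20 fractions).
print(mu_per_cgy(930.0, 3.0))
# Hypothetical 2 cm x 1.5 cm rectangular aperture made of three 0.5 cm leaves.
print(side_edge_to_area([-1.0] * 3, [1.0] * 3, 0.5))
```

A more irregular aperture with the same open area exposes more leaf sides and therefore yields a larger ratio, which is the behaviour an aperture-complexity metric is meant to capture.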

      2.7 Statistical analysis

      Statistical differences arising from the anatomical and dosimetric features, quality and complexity of plans generated using the subsequent updates of the initial model M0 were assessed through the two-tailed Student’s t-test when the tested variables were normally distributed. Data normality was checked through the Shapiro-Wilk test. The significance level was set to 0.05 throughout the whole study, with Bonferroni’s correction when multiple comparisons were performed.
      The whole study has been conducted following the recommendations of the Radiotherapy Treatment plannINg study Guidelines (RATING) [
      • Hansen C.R.
      • Crijns W.
      • Hussein M.
      • Rossi L.
      • Gallego P.
      • Verbakel W.
      • et al.
      Radiotherapy Treatment plannINg study Guidelines (RATING): A framework for setting up and reporting on scientific treatment planning studies.
      ]. The completed RATING scoresheet is given as Supplementary Material.
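The paired comparison described in this section can be sketched with the Python standard library alone. The example computes the paired t statistic and the Bonferroni-adjusted significance level. The scores, the number of comparisons (four) and the function names are hypothetical. The normality check and the conversion of t to a p-value are left out, since they would normally come from a statistics package.

```python
import math
import statistics


def paired_t_statistic(a, b):
    """Return (t, degrees of freedom) for the paired differences b - a."""
    diffs = [y - x for x, y in zip(a, b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    return mean_diff / (sd_diff / math.sqrt(n)), n - 1


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance level after Bonferroni's correction."""
    return alpha / n_comparisons


# Hypothetical APQM% scores for the same five patients planned twice.
human = [85.0, 88.0, 80.0, 90.0, 84.0]
auto = [89.0, 90.0, 86.0, 91.0, 88.0]
t, dof = paired_t_statistic(human, auto)
print(f"t = {t:.2f} with {dof} degrees of freedom")

# Assumed case: four updated models each compared with M0 at an overall level of 0.05.
print(bonferroni_alpha(0.05, 4))
```

With a Bonferroni correction, each individual comparison is judged against the reduced level rather than against 0.05, which keeps the overall false-positive rate under control when several models are tested against the same ancestor.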

      3. Results

      3.1 Model characteristics

      The average plan quality of Rt models, as measured by the PQM%, increased with each model update, and the PQM% distribution of the Rt model samples shifted towards 100 % as a direct consequence of the update approach. The average quality of Mt models increased while their spread remained relatively constant. The detailed characteristics of all obtained KBP models in terms of plan numerosity and plan quality are given in Table 1. A Bonferroni-corrected two-tailed Student’s t-test showed that the APQM% of Rt models increased significantly by approximately 2 % after every update, while for Mt models a statistically significant increase of approximately 3 % was reached at the third update. For Mt models, such statistically significant differences were almost entirely the result of increased OARs sparing, while target coverage and homogeneity remained relatively unchanged.
      Table 1. Numerosity and quality characteristics of models’ library population. Statistically significant differences in comparison to M0 are marked with asterisks.

      | Model | Sample | APQM% Mean ± SD | APQM% Range [min;max] | APQMtarget% Mean ± SD | APQMtarget% Range [min;max] | APQMOARs% Mean ± SD | APQMOARs% Range [min;max] |
      |---|---|---|---|---|---|---|---|
      | M0 | 73 | 87.3 ± 5.3 | [75.8;96.6] | 88.2 ± 6.3 | [64.2;95.0] | 86.9 ± 11.9 | [56.0;100.0] |
      | M1 | 94 | 88.2 ± 5.5 | [76.1;97.8] | 87.1 ± 7.7 | [64.2;96.4] | 89.8 ± 11.3 | [60.0;100.0] |
      | M2 | 117 | 88.9 ± 5.7 | [76.1;97.8] | 87.5 ± 7.0 | [64.8;96.4] | 90.9 ± 11.2 | [60.0;100.0] |
      | M3 | 141 | 90.1 ± 5.6* | [76.1;97.8] | 87.7 ± 7.1 | [64.8;96.4] | 93.2 ± 10.2* | [60.0;100.0] |
      | M4 | 164 | 90.6 ± 5.5* | [76.1;97.8] | 87.4 ± 7.4 | [61.7;96.4] | 94.4 ± 9.6* | [60.0;100.0] |
      | R1 | 73 | 90.4 ± 4.4* | [83.3;97.8] | 87.3 ± 7.0 | [69.2;96.4] | 94.3 ± 7.3* | [75.2;100.0] |
      | R2 | 73 | 92.5 ± 3.4* | [84.2;97.8] | 89.3 ± 4.4 | [75.8;96.4] | 96.3 ± 5.5* | [77.3;100.0] |
      | R3 | 73 | 94.7 ± 2.3* | [88.5;97.8] | 90.9 ± 3.7* | [78.1;96.4] | 99.0 ± 2.2* | [88.9;100.0] |
      | R4 | 73 | 95.7 ± 1.4* | [92.5;97.8] | 92.2 ± 2.5* | [86.2;96.4] | 99.6 ± 0.9* | [93.9;100.0] |

      The overall quality of Mt models increased monotonically (an increase in R² and a reduction in χ² and MSE) for all OARs, while it fluctuated for Rt models, with overall indicators that were not always better than those of the M0 model. The anatomical characteristics of patient samples varied across the different models: PTV volume decreased monotonically across updates for both Mt and Rt models. In more detail, it was 131 ± 42 cc for M0 and became 111 ± 39 cc and 106 ± 37 cc for R3 and R4, respectively, a difference that was statistically significant according to a Bonferroni-corrected two-tailed paired Student’s t-test. PTV shrinkage was closely coupled with a reduction in PTV-bladder overlap, which was significant for R3 and R4. Rectum and bladder volumes were constant across all the models. Details are reported in the Supplementary Material.
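For readers less familiar with the indicators quoted above, the sketch below computes the textbook coefficient of determination (R²) and mean square error (MSE) for a set of observed and predicted values. The exact statistics reported by the KBP software, including its χ², are not reproduced here, and the numbers are hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mean_squared_error(observed, predicted):
    """Average squared difference between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical dose-volume values (in %) for four plans: achieved vs model estimate.
achieved = [20.0, 30.0, 40.0, 50.0]
estimated = [22.0, 29.0, 41.0, 48.0]
print(r_squared(achieved, estimated))
print(mean_squared_error(achieved, estimated))
```

A higher R² together with a lower MSE indicates that the estimates track the achieved values more closely, which is the sense in which the overall model quality is said to improve.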

      3.2 Model outcomes

      AutoRP plans produced with Mt models were compared to Human plans on the set of newly enrolled patients in each specific period. The quality of Human plans remained constant over the entire duration of the investigation. On the other hand, the quality of AutoRP plans increased as the model was updated and the associated variability (here represented in terms of sample standard deviation) tended to decrease (Table 2). The average quality difference was null in period P1 and became positive and statistically significant according to a two-tailed paired Student’s t-test in P3 and P4.
      Table 2. Comparison of Human vs AutoRP plans along the four periods of investigation. Differences in plan quality (APQM%) between Human and AutoRP plans are reported (mean ± standard deviation) and marked with an asterisk if statistically significant. Complexity metrics are also reported (mean ± standard deviation); significant paired differences are marked with an asterisk. EM: Edge Metric; vMCS: VMAT adapted Modulation Complexity Score; MIt: Total Modulation Index.

      | Period | P1 | P1 | P2 | P2 | P3 | P3 | P4 | P4 |
      |---|---|---|---|---|---|---|---|---|
      | Plan generation | Human | AutoRP | Human | AutoRP | Human | AutoRP | Human | AutoRP |
      | Model | M0 | M0 | M0 | M1 | M0 | M2 | M0 | M3 |
      | APQM% | 85.7 ± 7.0 | 85.1 ± 9.1 | 85.7 ± 7.6 | 89.6 ± 6.4 | 84.3 ± 8.2 | 90.8 ± 6.5 | 86.1 ± 6.1 | 91.4 ± 5.5 |
      | Difference (per period) | −0.6 ± 10.7 | | 3.9 ± 9.3 | | 6.5 ± 8.8* | | 5.3 ± 6.2* | |
      | In the next model | 13 | 12 | 9 | 14 | 4 | 20 | 2 | 23 |
      | MU/cGy | 3.73 ± 0.76 | 3.34 ± 0.65 | 3.42 ± 0.87 | 3.54 ± 1.04 | 3.18 ± 0.68 | 3.18 ± 0.59 | 2.89 ± 0.32 | 2.97 ± 0.34 |
      | EM | 0.23 ± 0.12 | 0.19 ± 0.11 | 0.17 ± 0.05 | 0.17 ± 0.06 | 0.15 ± 0.04 | 0.16 ± 0.04 | 0.14 ± 0.03 | 0.15 ± 0.03** |
      | vMCS | 0.40 ± 0.07 | 0.38 ± 0.04 | 0.44 ± 0.08 | 0.42 ± 0.07 | 0.44 ± 0.07 | 0.41 ± 0.06** | 0.44 ± 0.04 | 0.44 ± 0.04 |
      | MIt | 25.5 ± 6.0 | 29.2 ± 4.8 | 25.9 ± 7.3 | 26.3 ± 6.6 | 23.2 ± 5.9 | 25.5 ± 6.4 | 22.9 ± 3.8 | 23.5 ± 4.2 |

      Similarly, the number of Human plans winning the comparison against the AutoRP ones decreased monotonically from P1 to P4, resulting in 2 Human plans and 23 AutoRP (M3) plans accepted in P4. There was no consistent discernible difference in plan complexity. EM, vMCS and MIt remained nearly constant throughout the whole investigation period at approximately 0.16, 0.42 and 25, respectively, with only sporadic differences between Human and AutoRP plans. MU/cGy showed a slight but steady reduction for both classes of plans throughout the whole investigation period, passing from ∼3.5 cGy⁻¹ in P1 to ∼3.1 cGy⁻¹ in P4. A detailed graph can be found in the Supplementary Material.

      3.3 Models comparison through open-loop test

      To compare model performances, we measured the differences in plan quality (ΔPQM%) of an automatic optimization guided by the models’ prediction against a set of reference plans on the external validation set. The results are shown in Fig. 2. The mean ΔPQMOARs% ranged between 3.9 and 19.2 (Fig. 2c): all models showed a significantly higher OAR sparing than M0 when compared to the reference plans, with the exception of M1. Target coverage and homogeneity were slightly reduced: mean ΔPQMtarget% was between −0.2 % and −14.7 % (Fig. 2b). None of these reductions was statistically significant according to a paired Student’s t-test.
      Fig. 2. Plan quality of automated plan model outcomes. a) Overall quality for all the models compared to the reference plan set. b) Target coverage quality difference (model-reference). c) OARs sparing quality difference (model-reference). The central line marks the median, the cross marks the mean, the edges of the box are the 25th and 75th percentiles, the whiskers extend to the adjacent values, which are the most extreme data values that are not outliers, and the circles represent the outliers. The asterisks mark models that are significantly different to the reference set.
      The number of reliable DVH predictions, reported in Fig. 3, increased with updates for Mt models but decreased for Rt models.
      Fig. 3. Number of robust predictions for the open-loop comparison set: a) M1–M4 models; b) R1–R4 models. The M0 model is always given as a reference.
      In general, with subsequent model upgrades, plans tended to be more complex: larger MU/cGy, more complex MLC shaping and movement and larger modulation of gantry speed and dose rate, but significant differences were consistently observed only for MIt when compared to the reference set of plans (see Fig. 4). The increased complexity is thus mainly related to an increased degree of modulation of gantry speed and dose rate.
      Fig. 4. Plan complexity of AutoRP plans for the validation sample. Differences are taken with respect to the set of reference plans (model – reference). Panels report: a) MU/cGy, b) EM, c) vMCS, d) MIt. The central line marks the median, the edges of the box are the 25th and 75th percentiles, the whiskers extend to the adjacent values, which are the most extreme data values that are not outliers, and the circles represent the outliers. No statistically significant differences appear. EM: Edge Metric; vMCS: VMAT adapted Modulation Complexity Score; MIt: Total Modulation Index.

      4. Discussion

      In this study, we compared two antithetical approaches to periodically update a KBP model dedicated to prostate radiotherapy and tested how they perform on a longitudinal clinical validation characterized by changes in the clinical settings. In particular, compared to the sample of patients used to train the ancestor KBP model (M0), three main clinical differences were introduced in the case sample considered during the investigation period: a high-definition MLC was used for all treatments, the fraction scheduling was gradually changed from 70 Gy/28 fx to 60 Gy/20 fx, and the CTV-PTV margin recipe was reduced for a fraction of the cases towards the end of the investigation period. KBP updates were performed approximately every 25 new treatments and the benefits of such periodic updates, based on two different strategies, are reported.
      Expanding the population of plans used to train a KBP model (Mt models) monotonically increased the overall model quality (an increase in R² and a reduction in χ² and MSE), improved the reliability and robustness of its predictions, and increased the quality of its outcomes over the four time periods. The latter should not be considered a consequence of the increasing sample size only. In fact, it was also due to the plan selection procedure followed in this work, which was particularly challenging for the automatically generated plans because of the lack of human intervention for plan refinement. On the other hand, training a KBP model on a fixed number of more recent high-quality plans (Rt models) can at most improve the quality of its outcomes while undermining the model’s quality as well as the robustness of its predictions.
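The contrast between the two update strategies can be sketched as follows. This is a deliberately simplified illustration: plans are reduced to a hypothetical `pqm` score, the fixed-size update ranks on that score alone, and the function names are invented here. The selection rules actually applied in the study are those given in its Methods and may differ.

```python
def update_growing(library, new_plans):
    """Mt-like update: keep every plan already in the library and append the new ones."""
    return library + new_plans


def update_fixed_size(library, new_plans, size):
    """Rt-like update: pool old and new plans, keep only the `size` highest-scoring ones."""
    pooled = library + new_plans
    return sorted(pooled, key=lambda plan: plan["pqm"], reverse=True)[:size]


def mean_pqm(plans):
    """Average quality score of a library."""
    return sum(plan["pqm"] for plan in plans) / len(plans)


# Hypothetical library of three plans and two newly accepted plans.
library = [{"id": "a", "pqm": 80.0}, {"id": "b", "pqm": 85.0}, {"id": "c", "pqm": 90.0}]
new_plans = [{"id": "d", "pqm": 95.0}, {"id": "e", "pqm": 70.0}]

grown = update_growing(library, new_plans)
refined = update_fixed_size(library, new_plans, size=3)
print(len(grown), mean_pqm(grown))
print(len(refined), mean_pqm(refined))
```

In this toy case the growing library gains samples but keeps a low-scoring plan, whereas the fixed-size library raises its mean score at the cost of discarding cases, which mirrors the trade-off between robustness and outcome quality discussed above.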
      The plan selection procedure adopted to populate the Rt models did not preserve the sample’s anatomical homogeneity: the PTV average volume and the percentage of in-field bladder volume decreased across the updates (Supplementary material). Thus, selecting the population of plans only on the basis of their quality, as measured by the PQM%, leads to the discarding of unfavourable cases, e.g. patients with larger prostate glands and higher OARs volumes involved in the treatment field. If such a selection procedure is to be adopted for KBP model update, attention should be paid to minimizing this artificial reduction of the geometrical clinical domain of the patients’ characteristics. On the other hand, the narrower distributions of the predictors used by the model to generate a prediction may increase the probability of classifying a plan as a geometric outlier, which may also explain the lower reliability of the predictions of Rt models compared to Mt ones.
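The effect of a narrower predictor distribution can be illustrated with the PTV volumes reported in Section 3.1 (131 ± 42 cc for M0, 106 ± 37 cc for R4). The sketch below flags a case as a geometric outlier when its PTV volume lies more than a chosen number of standard deviations from the library mean. The 2 SD threshold, the 200 cc patient and the function names are assumptions for illustration and do not represent the criterion implemented in the KBP software.

```python
def z_score(value, mean, sd):
    """Distance of a value from the library mean, in units of standard deviation."""
    return (value - mean) / sd


def is_geometric_outlier(value, mean, sd, threshold=2.0):
    """Flag the case when it lies beyond `threshold` SDs (hypothetical rule)."""
    return abs(z_score(value, mean, sd)) > threshold


# Hypothetical new patient with a large PTV.
ptv_volume_cc = 200.0

# PTV volume (mean, SD) of the training libraries, as reported in Section 3.1.
libraries = {"M0": (131.0, 42.0), "R4": (106.0, 37.0)}

for name, (mean, sd) in libraries.items():
    z = z_score(ptv_volume_cc, mean, sd)
    print(name, round(z, 2), is_geometric_outlier(ptv_volume_cc, mean, sd))
```

Under these assumptions the same patient falls inside the M0 library domain but outside the R4 one, which is the mechanism suggested above for the lower number of reliable predictions of Rt models.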
      Both the model generation phase (Table 2) and the open-loop comparison (Fig. 2) revealed an increase in the quality of model outcomes when compared to human plans supported by the ancestor model M0. For Mt models, consistent increases were seen after the second update. For Rt models, one round of updates was already enough to induce consistent improvements. The quality increase was twofold. On the one hand, the model upgrade seemed to be beneficial in our evolving clinical routine. In fact, in P4, when all the patients were treated with the new treatment schedule and a fraction of them were treated with reduced CTV-PTV margin, the KBP ancestor model was proven not to be as performant as the automatically generated one. From Table 2 we observed that AutoRP (M3) plans had a significantly higher quality than Human ones, with 23 out of 25 automatically generated plans winning the selection procedure, even under the challenging conditions of lack of human refinement. On the other hand, the open-loop comparison showed that the updated models outperformed the ancestor one even on a set of cases which dates back to the same period as the M0 population.
      The average increase in the quality of the outcomes of Rt models is not necessarily in contrast with the reduction of the reliability of their predictions. In fact, the raw measure of model reliability introduced in this study is a good surrogate for the extent of knowledge each model contains, but the quality of the model should be measured on the basis of goodness-of-fit statistics (R² and χ²) and the goodness-of-estimation parameter (mean square error, MSE). In other words, a good algebraic relationship between the geometric and dosimetric characteristics of a reduced amount of good plans representing only a limited portion of the clinical geometric domain can still give reasonable predictions outside that limited geometric domain [
      • Ge Y.
      • Wu Q.J.
      Knowledge-based planning for intensity-modulated radiation therapy: A review of data-driven approaches.
      ,
      • Chatterjee A.
      • Serban M.
      • Faria S.
      • Souhami L.
      • Cury F.
      • Seuntjens J.
      Novel knowledge-based treatment planning model for hypofractionated radiotherapy of prostate cancer patients.
      ].
      The automated planning procedure, with a fixed set of optimization objectives, drove an increased organ sparing with a slight loss in target coverage and homogeneity (Fig. 2). Our set of objectives probably tends to favour OARs sparing rather than PTV coverage and conformality. To correct for such a trend, the set of optimization objectives should have been updated along with the model. This action was outside the aim of this work and has already proved useful in the work of Kaderka and colleagues [
      • Kaderka R.
      • Hild S.J.
      • Bry V.N.
      • Cornell M.
      • Ray X.J.
      • Murphy J.D.
      • et al.
      Wide-scale clinical implementation of knowledge-based planning: an investigation of workforce efficiency, need for post-automation refinement, and data-driven model maintenance.
      ].
      The real improvement induced by the updating procedure might be underestimated, as a certain amount of skilled manual interventions are needed to achieve the highest quality results, even when KBP-generated objectives drive the optimization [
      • Chang A.T.Y.
      • Hung A.W.M.
      • Cheung F.W.K.
      • Lee M.C.H.
      • Chan O.S.H.
      • Philips H.
      • et al.
      Comparison of planning quality and efficiency between conventional and knowledge-based algorithms in nasopharyngeal cancer patients using intensity modulated radiation therapy.
      ,
      • Hussein M.
      • South C.P.
      • Barry M.A.
      • Adams E.J.
      • Jordan T.J.
      • Stewart A.J.
      • et al.
      Clinical validation and benchmarking of knowledge-based IMRT and VMAT treatment planning in pelvic anatomy.
      ,
      • Tol J.P.
      • Delaney A.R.
      • Dahele M.
      • Slotman B.J.
      • Verbakel W.F.A.R.
      Evaluation of a Knowledge-Based Planning solution for head and neck cancer.
      ,
      • Fogliata A.
      • Wang P.-M.
      • Belosi F.
      • Clivio A.
      • Nicolini G.
      • Vanetti E.
      • et al.
      Assessment of a model based optimization engine for volumetric modulated arc therapy for patients with advanced hepatocellular cancer.
      ,
      • Wu B.
      • Kusters M.
      • Kunze-busch M.
      • Dijkema T.
      • McNutt T.
      • Sanguineti G.
      • et al.
      Cross-institutional knowledge-based planning (KBP) implementation and its performance comparison to Auto-Planning Engine (APE).
      ]. This work confirmed previous results by other groups: expanding KBP libraries induces an increase in model quality, while it does not guarantee the improved clinical quality of the model outcomes. On the other hand, reducing the model sample only to increase its mean quality might result in a less robust and less general model [
      • Hussein M.
      • Heijmen B.J.M.
      • Verellen D.
      • Nisbet A.
      Automation in intensity modulated radiotherapy treatment planning—a review of recent innovations.
      ,
      • Hundvin J.A.
      • Fjellanger K.
      • Pettersen H.E.S.
      • Nygaard B.
      • Revheim K.
      • Sulen T.H.
      • et al.
      Clinical iterative model development improves knowledge-based plan quality for high-risk prostate cancer with four integrated dose levels.
      ,
      • Wang M.
      • Li S.
      • Huang Y.
      • Yue H.
      • Li T.
      • Wu H.
      • et al.
      An interactive plan and model evolution method for knowledge-based pelvic VMAT planning.
      ,
      • Fusella M.
      • Scaggion A.
      • Pivato N.
      • Rossato M.A.
      • Zorz A.
      • Paiusco M.
      Efficiently train and validate a RapidPlan model through APQM scoring.
      ].
      With regard to plan complexity, several studies have reported that the improved plan quality due to KBP planning also resulted in increased plan complexity [
      • Monzen H.
      • Tamura M.
      • Ueda Y.
      • Fukunaga J.-I.
      • Kamima T.
      • Muraki Y.
      • et al.
      Dosimetric evaluation with knowledge-based planning created at different periods in volumetric-modulated arc therapy for prostate cancer: a multi-institution study.
      ,
      • Tamura M.
      • Monzen H.
      • Matsumoto K.
      • Kubo K.
      • Otsuka M.
      • Inada M.
      • et al.
      Mechanical performance of a commercial knowledge-based VMAT planning for prostate cancer.
      ,
      • Wall P.D.H.
      • Fontenot J.D.
      Evaluation of complexity and deliverability of prostate cancer treatment plans designed with a knowledge-based VMAT planning technique.
      ]. Our results demonstrated that when the plan complexity is appropriately constrained, the improvement in plan quality induced by KBP model predictions does not necessarily imply an unnecessary increase in plan complexity. Only one out of four metrics (MIt) showed significant increases which are mostly related to larger variations in the instantaneous dose rate and gantry speed.
      The results of this work strictly apply only to VMAT prostate treatments, but the methods proposed herein are easily applicable to any other treatment sites and treatment techniques. The prostate case limits the need of trade-offs between OAR sparing and target coverage, but a well prepared quality metric should help the clinical staff to rank the plans on the basis of their quality. The PQM% can be replaced by any other measure of overall plan quality or by a detailed clinical evaluation by a team of clinicians [
      • Biston M.C.
      • Costea M.
      • Gassa F.
      • Serre A.A.
      • Voet P.
      • Larson R.
      • et al.
      Evaluation of fully automated a priori MCO treatment planning in VMAT for head-and-neck cancer.
      ,
      • Akpati H.
      • Kim C.
      • Kim B.
      • Park T.
      • Meek A.
      Unified dosimetry index (UDI): a figure of merit for ranking treatment plans.
      ].
      In conclusion, the longitudinal evaluation proposed in the present study showed the benefit associated with the update of a KBP model in a real clinical scenario characterized by changes happening over time. This work suggests how the KBP periodical update should be carried out in order to be both clinically and cost-effective. Pros and cons associated with the two updating strategies proposed were discussed.
      In general, the update routine should not only focus on increasing the sample library, but also on defining the criteria for the inclusion of new plans. Such criteria should focus on selecting high-quality plans and removing unacceptably low-quality ones.
      The growing number of cases in the model sample and the elimination of clearly underperforming plans should result in: an increase in plan quality, an improved model reliability and also an improvement in the model’s general quality [
      • Wang M.
      • Li S.
      • Huang Y.
      • Yue H.
      • Li T.
      • Wu H.
      • et al.
      An interactive plan and model evolution method for knowledge-based pelvic VMAT planning.
      ]. In comparison to iterative approaches, this approach should save human resources and does not require the use of an external validation tool [
      • Tamura M.
      • Monzen H.
      • Matsumoto K.
      • Kubo K.
      • Ueda Y.
      • Kamima T.
      • et al.
      Influence of cleaned-up commercial knowledge-based treatment planning on volumetric-modulated arc therapy of prostate cancer.
      ,
      • Fogliata A.
      • Cozzi L.
      • Reggiori G.
      • Stravato A.
      • Lobefalo F.
      • Franzese C.
      • et al.
      RapidPlan knowledge based planning: iterative learning process and model ability to steer planning strategies.
      ,
      • Hundvin J.A.
      • Fjellanger K.
      • Pettersen H.E.S.
      • Nygaard B.
      • Revheim K.
      • Sulen T.H.
      • et al.
      Clinical iterative model development improves knowledge-based plan quality for high-risk prostate cancer with four integrated dose levels.
      ,
      • Wang M.
      • Li S.
      • Huang Y.
      • Yue H.
      • Li T.
      • Wu H.
      • et al.
      An interactive plan and model evolution method for knowledge-based pelvic VMAT planning.
      ,
      • Monzen H.
      • Tamura M.
      • Ueda Y.
      • Fukunaga J.-I.
      • Kamima T.
      • Muraki Y.
      • et al.
      Dosimetric evaluation with knowledge-based planning created at different periods in volumetric-modulated arc therapy for prostate cancer: a multi-institution study.
      ,
      • Hussein M.
      • South C.P.
      • Barry M.A.
      • Adams E.J.
      • Jordan T.J.
      • Stewart A.J.
      • et al.
      Clinical validation and benchmarking of knowledge-based IMRT and VMAT treatment planning in pelvic anatomy.
      ].

      Declaration of Competing Interest

      The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

      Appendix A. Supplementary data

      The following are the Supplementary data to this article:

      References

        • Chang A.T.Y.
        • Hung A.W.M.
        • Cheung F.W.K.
        • Lee M.C.H.
        • Chan O.S.H.
        • Philips H.
        • et al.
        Comparison of planning quality and efficiency between conventional and knowledge-based algorithms in nasopharyngeal cancer patients using intensity modulated radiation therapy.
        Int J Radiat Oncol. 2016; 95: 981-990
        • Cornell M.
        • Kaderka R.
        • Hild S.J.
        • Ray X.J.
        • Murphy J.D.
        • Atwood T.F.
        • et al.
        Noninferiority study of automated knowledge-based planning versus human-driven optimization across multiple disease sites.
        Int J Radiat Oncol. 2020; 106: 430-439
        • Hussein M.
        • Heijmen B.J.M.
        • Verellen D.
        • Nisbet A.
        Automation in intensity modulated radiotherapy treatment planning—a review of recent innovations.
        Br J Radiol. 2018; 91: 20180270https://doi.org/10.1259/bjr.20180270
        • Kaderka R.
        • Hild S.J.
        • Bry V.N.
        • Cornell M.
        • Ray X.J.
        • Murphy J.D.
        • et al.
        Wide-scale clinical implementation of knowledge-based planning: an investigation of workforce efficiency, need for post-automation refinement, and data-driven model maintenance.
        Int J Radiat Oncol. 2021; 111: 705-715
        • Li N.
        • Carmona R.
        • Sirak I.
        • Kasaova L.
        • Followill D.
        • Michalski J.
        • et al.
        Highly efficient training, refinement, and validation of a knowledge-based planning quality-control system for radiation therapy clinical trials.
        Int J Radiat Oncol. 2017; 97: 164-172
        • Momin S.
        • Fu Y.
        • Lei Y.
        • Roper J.
        • Bradley J.D.
        • Curran W.J.
        • et al.
        Knowledge-based radiation treatment planning: A data-driven method survey.
        J Appl Clin Med Phys. 2021; 22: 16-44
        • Panettieri V.
        • Ball D.
        • Chapman A.
        • Cristofaro N.
        • Gawthrop J.
        • Griffin P.
        • et al.
        Development of a multicentre automated model to reduce planning variability in radiotherapy of prostate cancer.
        Phys Imaging Radiat Oncol. 2019; 11: 34-40
        • Wortel G.
        • Eekhout D.
        • Lamers E.
        • van der Bel R.
        • Kiers K.
        • Wiersma T.
        • et al.
        Characterization of automatic treatment planning approaches in radiotherapy.
        Phys Imaging Radiat Oncol. 2021; 19: 60-65
        • Boutilier J.J.
        • Craig T.
        • Sharpe M.B.
        • Chan T.C.Y.
        Sample size requirements for knowledge-based treatment planning.
        Med Phys. 2016; 43: 1212-1221https://doi.org/10.1118/1.4941363
        • Ge Y.
        • Wu Q.J.
        Knowledge-based planning for intensity-modulated radiation therapy: A review of data-driven approaches.
        Med Phys. 2019; 46: 2760-2775https://doi.org/10.1002/mp.13526
        • Tamura M.
        • Monzen H.
        • Matsumoto K.
        • Kubo K.
        • Ueda Y.
        • Kamima T.
        • et al.
        Influence of cleaned-up commercial knowledge-based treatment planning on volumetric-modulated arc therapy of prostate cancer.
        J Med Phys. 2020; 45: 71
        • Tol J.P.
        • Doornaert P.
        • Witte B.I.
        • Dahele M.
        • Slotman B.J.
        • Verbakel W.F.A.R.
        A longitudinal evaluation of improvements in radiotherapy treatment plan quality for head and neck cancer patients.
        Radiother Oncol. 2016; 119: 337-343https://doi.org/10.1016/j.radonc.2016.04.011
        • Fogliata A.
        • Cozzi L.
        • Reggiori G.
        • Stravato A.
        • Lobefalo F.
        • Franzese C.
        • et al.
        RapidPlan knowledge based planning: iterative learning process and model ability to steer planning strategies.
        Radiat Oncol. 2019; 14https://doi.org/10.1186/s13014-019-1403-0
        • Hundvin J.A.
        • Fjellanger K.
        • Pettersen H.E.S.
        • Nygaard B.
        • Revheim K.
        • Sulen T.H.
        • et al.
        Clinical iterative model development improves knowledge-based plan quality for high-risk prostate cancer with four integrated dose levels.
        Acta Oncol. 2021; 60: 237-244
        • Nakamura K.
        • Okuhata K.
        • Tamura M.
        • Otsuka M.
        • Kubo K.
        • Ueda Y.
        • et al.
        An updating approach for knowledge-based planning models to improve plan quality and variability in volumetric-modulated arc therapy for prostate cancer.
        J Appl Clin Med Phys. 2021; 22: 113-122
        • Wang M.
        • Li S.
        • Huang Y.
        • Yue H.
        • Li T.
        • Wu H.
        • et al.
        An interactive plan and model evolution method for knowledge-based pelvic VMAT planning.
        J Appl Clin Med Phys. 2018; 19: 491-498
        • Monzen H.
        • Tamura M.
        • Ueda Y.
        • Fukunaga J.-I.
        • Kamima T.
        • Muraki Y.
        • et al.
        Dosimetric evaluation with knowledge-based planning created at different periods in volumetric-modulated arc therapy for prostate cancer: a multi-institution study.
        Radiol Phys Technol. 2020; 13: 327-335
        • Roach M.
        • Marquez C.
        • Yuo H.-S.
        • Narayan P.
        • Coleman L.
        • Nseyo U.O.
        • et al.
        Predicting the risk of lymph node involvement using the pre-treatment prostate specific antigen and gleason score in men with clinically localized prostate cancer.
        Int J Radiat Oncol. 1994; 28: 33-37
        • Scaggion A.
        • Negri A.
        • Rossato M.A.
        • Roggio A.
        • Simonato F.
        • Bacco S.
        • et al.
        Delivering RapidArc®: a comprehensive study on accuracy and long term stability.
        Phys Med. 2016; 32: 866-873https://doi.org/10.1016/j.ejmp.2016.05.056
        • Lee W.R.
        • Dignam J.J.
        • Amin M.B.
        • Bruner D.W.
        • Low D.
        • Swanson G.P.
        • et al.
        Randomized phase III noninferiority study comparing two radiotherapy fractionation schedules in patients with low-risk prostate cancer.
        J Clin Oncol. 2016; 34: 2325-2332
        • Catton C.N.
        • Lukka H.
        • Gu C.-S.
        • Martin J.M.
        • Supiot S.
        • Chung P.W.M.
        • et al.
        Randomized trial of a hypofractionated radiation regimen for the treatment of localized prostate cancer.
        J Clin Oncol. 2017; 35: 1884-1890
        • Dearnaley D.
        • Syndikus I.
        • Mossop H.
        • Khoo V.
        • Birtle A.
        • Bloomfield D.
        • et al.
        Conventional versus hypofractionated high-dose intensity-modulated radiotherapy for prostate cancer: 5-year outcomes of the randomised, non-inferiority, phase 3 CHHiP trial.
        Lancet Oncol. 2016; 17: 1047-1060
        • Incrocci L.
        • Wortel R.C.
        • Alemayehu W.G.
        • Aluwini S.
        • Schimmel E.
        • Krol S.
        • et al.
        Hypofractionated versus conventionally fractionated radiotherapy for patients with localised prostate cancer (HYPRO): final efficacy results from a randomised, multicentre, open-label, phase 3 trial.
        Lancet Oncol. 2016; 17: 1061-1069
        • Morgan S.C.
        • Hoffman K.
        • Loblaw D.A.
        • Buyyounouski M.K.
        • Patton C.
        • Barocas D.
        • et al.
        Hypofractionated radiation therapy for localized prostate cancer: an ASTRO, ASCO, and AUA evidence-based guideline.
        J Clin Oncol. 2018; 36: 3411-3430
        • Scaggion A.
        • Fusella M.
        • Agnello G.
        • Bettinelli A.
        • Pivato N.
        • Roggio A.
        • et al.
        Limiting treatment plan complexity by applying a novel commercial tool.
        J Appl Clin Med Phys. 2020; 21: 27-34
        Varian Medical System. Eclipse Photon and Electron Reference Guide v15.5 2017.

        • Hernandez V.
        • Hansen C.R.
        • Widesott L.
        • Bäck A.
        • Canters R.
        • Fusella M.
        • et al.
        What is plan quality in radiotherapy? The importance of evaluating dose metrics, complexity, and robustness of treatment plans.
        Radiother Oncol. 2020; 153: 26-33
        • Nelms B.E.
        • Robinson G.
        • Markham J.
        • Velasco K.
        • Boyd S.
        • Narayan S.
        • et al.
        Variation in external beam treatment plan quality: An inter-institutional study of planners and planning systems.
        Pract Radiat Oncol. 2012; 2: 296-305
        • Ahmad I.
        • Chufal K.S.
        • Bhatt C.P.
        • Miller A.A.
        • Bajpai R.
        • Chhabra A.
        • et al.
        Plan quality assessment of modern radiotherapy delivery techniques in left-sided breast cancer: an analysis stratified by target delineation guidelines.
        BJR|Open. 2020; 2: 20200007
        • Landers A.
        • O’Connor D.
        • Ruan D.
        • Sheng K.
        Automated 4π radiotherapy treatment planning with evolving knowledge-base.
        Med Phys. 2019; 46: 3833-3843 https://doi.org/10.1002/mp.13682
        • Sasaki M.
        • Nakaguchi Y.
        • Kamomae T.
        • Tsuzuki A.
        • Kobuchi S.
        • Kuwahara K.
        • et al.
        Analysis of prostate intensity- and volumetric-modulated arc radiation therapy planning quality with PlanIQ™.
        J Appl Clin Med Phys. 2021; 22: 132-142
        • Cilla S.
        • Deodato F.
        • Romano C.
        • Ianiro A.
        • Macchia G.
        • Re A.
        • et al.
        Personalized automation of treatment planning in head-neck cancer: A step forward for quality in radiation therapy?
        Phys Med. 2021; 82: 7-16
        • Fusella M.
        • Scaggion A.
        • Pivato N.
        • Rossato M.A.
        • Zorz A.
        • Paiusco M.
        Efficiently train and validate a RapidPlan model through APQM scoring.
        Med Phys. 2018; 45: 2611-2619 https://doi.org/10.1002/mp.12896
        • Scaggion A.
        • Fusella M.
        • Roggio A.
        • Bacco S.
        • Pivato N.
        • Rossato M.A.
        • et al.
        Reducing inter- and intra-planner variability in radiotherapy plan output with a commercial knowledge-based planning solution.
        Phys Med. 2018; 53: 86-93
        • Ahmed S.
        • Nelms B.
        • Gintz D.
        • Caudell J.
        • Zhang G.
        • Moros E.G.
        • et al.
        A method for a priori estimation of best feasible DVH for organs-at-risk: Validation for head and neck VMAT planning.
        Med Phys. 2017; 44: 5486-5497
        • Hussein M.
        • South C.P.
        • Barry M.A.
        • Adams E.J.
        • Jordan T.J.
        • Stewart A.J.
        • et al.
        Clinical validation and benchmarking of knowledge-based IMRT and VMAT treatment planning in pelvic anatomy.
        Radiother Oncol. 2016; 120: 473-479
      2. Varian Medical Systems. Eclipse Photon and Electron Reference Guide v13.7 2015.

        • Fogliata A.
        • Belosi F.
        • Clivio A.
        • Navarria P.
        • Nicolini G.
        • Scorsetti M.
        • et al.
        On the pre-clinical validation of a commercial model-based optimisation engine: Application to volumetric modulated arc therapy for patients with lung or prostate cancer.
        Radiother Oncol. 2014; 113: 385-391
        • Hernandez V.
        • Saez J.
        • Pasler M.
        • Jurado-Bruggeman D.
        • Jornet N.
        Comparison of complexity metrics for multi-institutional evaluations of treatment plans in radiotherapy.
        Phys Imaging Radiat Oncol. 2018; 5: 37-43 https://doi.org/10.1016/j.phro.2018.02.002
        • Younge K.C.
        • Matuszak M.M.
        • Moran J.M.
        • McShan D.L.
        • Fraass B.A.
        • Roberts D.A.
        Penalization of aperture complexity in inversely planned volumetric modulated arc therapy.
        Med Phys. 2012; 39: 7160-7170 https://doi.org/10.1118/1.4762566
        • Masi L.
        • Doro R.
        • Favuzza V.
        • Cipressi S.
        • Livi L.
        Impact of plan parameters on the dosimetric accuracy of volumetric modulated arc therapy.
        Med Phys. 2013; 40: 071718
        • Park J.M.
        • Park S.-Y.
        • Kim H.
        • Kim J.H.
        • Carlson J.
        • Ye S.-J.
        Modulation indices for volumetric modulated arc therapy.
        Phys Med Biol. 2014; 59: 7315-7340 https://doi.org/10.1088/0031-9155/59/23/7315
        • Hansen C.R.
        • Crijns W.
        • Hussein M.
        • Rossi L.
        • Gallego P.
        • Verbakel W.
        • et al.
        Radiotherapy Treatment plannINg study Guidelines (RATING): A framework for setting up and reporting on scientific treatment planning studies.
        Radiother Oncol. 2020; 153: 67-78
        • Chatterjee A.
        • Serban M.
        • Faria S.
        • Souhami L.
        • Cury F.
        • Seuntjens J.
        Novel knowledge-based treatment planning model for hypofractionated radiotherapy of prostate cancer patients.
        Phys Med. 2020; 69: 36-43 https://doi.org/10.1016/j.ejmp.2019.11.023
        • Tol J.P.
        • Delaney A.R.
        • Dahele M.
        • Slotman B.J.
        • Verbakel W.F.A.R.
        Evaluation of a Knowledge-Based Planning solution for head and neck cancer.
        Int J Radiat Oncol. 2015; 91: 612-620 https://doi.org/10.1016/j.ijrobp.2014.11.014
        • Fogliata A.
        • Wang P.-M.
        • Belosi F.
        • Clivio A.
        • Nicolini G.
        • Vanetti E.
        • et al.
        Assessment of a model based optimization engine for volumetric modulated arc therapy for patients with advanced hepatocellular cancer.
        Radiat Oncol. 2014; 9: 236 https://doi.org/10.1186/s13014-014-0236-0
        • Wu B.
        • Kusters M.
        • Kunze-Busch M.
        • Dijkema T.
        • McNutt T.
        • Sanguineti G.
        • et al.
        Cross-institutional knowledge-based planning (KBP) implementation and its performance comparison to Auto-Planning Engine (APE).
        Radiother Oncol. 2017; 123: 57-62
        • Tamura M.
        • Monzen H.
        • Matsumoto K.
        • Kubo K.
        • Otsuka M.
        • Inada M.
        • et al.
        Mechanical performance of a commercial knowledge-based VMAT planning for prostate cancer.
        Radiat Oncol. 2018; 13 https://doi.org/10.1186/s13014-018-1114-y
        • Wall P.D.H.
        • Fontenot J.D.
        Evaluation of complexity and deliverability of prostate cancer treatment plans designed with a knowledge-based VMAT planning technique.
        J Appl Clin Med Phys. 2020; 21: 69-77 https://doi.org/10.1002/acm2.12790
        • Biston M.C.
        • Costea M.
        • Gassa F.
        • Serre A.A.
        • Voet P.
        • Larson R.
        • et al.
        Evaluation of fully automated a priori MCO treatment planning in VMAT for head-and-neck cancer.
        Phys Med. 2021; 87: 31-38 https://doi.org/10.1016/j.ejmp.2021.05.037
        • Akpati H.
        • Kim C.
        • Kim B.
        • Park T.
        • Meek A.
        Unified dosimetry index (UDI): a figure of merit for ranking treatment plans.
        J Appl Clin Med Phys. 2008; 9: 99-108 https://doi.org/10.1120/jacmp.v9i3.2803