Plan robustness and RBE influence for proton dose painting by numbers for head and neck cancers

Purpose: To investigate the feasibility of dose painting by numbers (DPBN) with respect to robustness for proton therapy for head and neck cancers (HNC), and to study the influence of variable RBE on the TCP and OAR dose burden. Methods and materials: Data for 19 patients who have been scanned pretreatment with PET-FDG and subsequently treated with photon therapy were used in the study. A dose response model developed for photon therapy was implemented in a TPS, allowing DPBN plans to be created. Conventional homogeneous dose and DPBN plans were created for each patient, optimized with either fixed RBE = 1.1 or a variable RBE model. Robust optimization was used to create clinically acceptable plans. To estimate the maximum potential loss in TCP due to actual SUV variations from the pre-treatment imaging, we applied a test case with randomized SUV distribution. Results: Regardless of the use of variable RBE for optimization or evaluation, a statistically significant increase ( p < 0.001) in TCP was found for DPBN plans as compared to homogeneous dose plans. Randomizing the SUV distribution decreased the TCP for all plans. A correlation between TCP increase and variance of the SUV distribution and target volume was also found. Conclusion: DPBN for protons and HNC is feasible and could lead to a TCP gain. Risks associated with the temporal variation of SUV distributions could be mitigated by imposing minimum doses to targets. The correlation found between TCP increase and SUV variance and target volume may be used for patient selection.


Introduction
Head and neck cancer (HNC) caused more than 900 000 deaths worldwide in 2020 [1].Radiotherapy (RT) is a common treatment modality for HNC, and recent advances in its implementation have led to increased patient survival [2].However, the large number of organs at risk (OAR) in the anatomical region makes RT prone to side effects which can significantly reduce patient quality of life [3,4], supporting the search for more gentle radiotherapy techniques.
It has been common practice to deliver a homogeneous dose to the target, which for a given mean energy imparted ("integral dose") maximizes the tumor control probability (TCP) under the assumption that the dose response for different parts of the target also is homogeneous [5].However, for a non-uniform target dose response there is a corresponding non-uniform dose distribution that maximizes the TCP.Non-uniform dose distributions are prescribed in several clinically employed RT techniques, such as the usage of boost volumes [6] or in Gamma knife radiosurgery [7].A technique yet to be realized clinically is dose painting by numbers (DPBN), which makes use of voxel specific dose prescriptions [8] based on functional image sets.These can be PET-FDG or MRI, whose signal values have been shown to correlate with tumor response in radiotherapy [9][10][11][12][13].
Dose prescriptions from functional image sets can be generated in different ways, varying from a simple linear function to more elaborated methods based on TCP maximization [14].Theoretical studies based both on the former [15][16][17][18] and the latter approach [18][19][20] have shown potential to increase TCP.A promising method is the derivation of TCP parameters based on failures in clinical patients, i.e. tumor recurrence distributions in a population of post-RT patients [12,13,21].
Due to the physical properties of proton beams, their interaction in matter creates advantages that may be utilized in the treatment delivery, and which in turn has potential to reduce some of the toxicities associated with photon RT whilst still maintaining or even increasing local tumor control [22,23].Although results from proton-specific clinical trials are still lacking, several studies, such as ARTSCAN V [24] and DAHANCA 35 [25], are underway comparing PT to conventional photon RT for HNC.One further characteristic of proton beams that might be explored for dose painting is the variation in relative biological effect (RBE) [26][27][28][29][30], albeit the current clinical standard is to assume a fixed RBE of 1.1.[31].
Clinical studies of dose painting have shown mixed results, with findings of elevated toxicity [32][33][34] and with limited support of elevated local control [35].Although there are several studies that are suggestive of PT's greater dose painting potential as compared to photon RT [36][37][38], and its possibility of reducing doses to OARs which may help reduce potential toxicity increases in dose painting [38], we are not aware of any study that has explored a TCP model based on tumor recurrence frequencies.Using such a model, developed in earlier work for photon RT dose painting [13,21], in this planning study we aim to investigate the potential of PT dose painting to increase the TCP for a set of HNC patients.We will also investigate the effect of using a variable RBE model [39] and compare it to the standard RBE = 1.1 model.

Patient data
We used data for 20 HNC patients that were treated with conventional photon RT and for which CT and pretreatment PET-FDG image sets were available [21].Data for one patient had to be excluded due to data corruption, leaving data for 19 patients to be included in the study.CTVs and OARs were delineated in accordance with the ARTSCAN V study protocol [24] by an experienced radiation oncologist.Artefacts due to dental implants were present in some CT slices for most patients.These CT artefact regions were manually delineated and set to be interpreted as water-equivalent to minimize the impact on proton range calculations.Besides the primary target ROI (CTVT), most patients also had CTVs in the form of bilateral lymph node targets, which had been defined to be treated as either therapeutic or prophylactic with a lower dose.

Dose painting
We used a standard uptake value (SUV) based dose painting method derived in earlier work [13,21] which uses recurrence ("failure") frequencies in retrospective treatment outcome data for homogeneous dose photon radiotherapy.Briefly, the TCP for a voxel is parameterized with a logistic dose-response function [40] as where D is the voxel's physical dose, D 50 (SUV) and γ 50,eff are described below, and the EQD 2 function gives the equivalent dose based on 2 Gy fractions assuming an α/β-ratio of 10 Gy for the primary HNC targets [41].We assume independence of the voxel dose responses.The TCP for a patient is thus given as TCP patient (D) = ∏ vox∈target TCP vox (D,SUV, RBE).Since the parameters of Eq. (1) were derived for photons, we included the RBE-factor in the expression for application to protons.To derive D 50 (SUV) from the retrospective treatment outcome data, the TCP function needs for consistency to equal the voxel based local control ratio as a function of SUV, i.e.
The TCP objective function was then implemented in a research version of the RayStation (RaySearch Laboratories AB, Stockholm) treatment planning system (TPS), version 8.99.30.141, seeking to maximize the TCP based on Eq. ( 1) for a treatment plan as a function of the dose and SUV distribution.However, since the maximal TCP occurs as the limit when the dose approaches infinity, other objectives and/or constraints limiting the dose to organs at risk or the tumor bed are needed to create realistic treatment plans (see sect.2.4).
To make DPBN plans comparable to homogeneous dose plans, we imposed a constraint such that all plans have the same RBE-weighted average dose to the target of 70 Gy RBE, i.e. the integral RBEweighted dose to the target is constant.

Variable RBE
To investigate the influence of using a model for variable RBE, we implemented the nanoCluE model [39] in RayStation.In short, the model starts from the identity where α Q,T is the linear LQ-parameter for the radiation quality of interest Q and tissue T, α R,T is the corresponding parameter the reference radiation quality R, and (α/β) R,T is the ratio of the linear to the quadric terms for the reference radiation.The ratio α Q,T /β R,T is modelled as a function of three parameters; a nanodosimetric quantity n, a microdosimetric quantity μ, and the "tissue parameter" b = (α/β) R,T .As described in our earlier work [43 44], n derives from the cell inactivation potential of energy deposition clusters in the biomolecule scale and μ derives from the cavity size for which the microdosimetric mean specific energy Z F equals the mean inactivation dose.The functional form used to describe α Q,T /β R,T was a low order polynomial with coefficients derived by testing against a dataset of experimental LQ parameters (Section 4, Table 3 in appendix to [39]) to find the form that yielded the best goodness of fit.
For each voxel in the patient geometry, a dose weighted RBE max,T was calculated, and the resulting RBT T for a given tissue T calculated by means of the standard LQ formalism using where d is the dose deposited in the voxel per fraction, and assuming β Q,T = β R,T .The nanoClue approach thus combines a heuristic way of including nanodosimetric mechanisms such as DNA strand breaks from ionization clusters with microdosimetric effects of hitting sufficients number of cells with suffient energy, together with a purely empirical number fit to measured LQ data.We assumed (α/β) R,T = 10 Gy for the primary target volumes and the therapeutic lymph node regions [41], and (α/β) R,T = 3 Gy for all other tissues.Following earlier work [45], only the primary protons were considered when computing a dose weighted RBE max,T .

Treatment planning
Treatment planning was performed in a research version of the RayStation TPS (version 8.99.30.141).Pencil beam scanning using a beam model based on commissioning data from the IBA Proteus Plus site at the Skandion Clinic (Uppsala, Sweden) site was used.Energies in the interval 60-230 MeV were commissioned; spot sizes (2D Gaussian standard deviations) at isocenter were set in the interval of 3-7 mm, depending on energy.To allow for treatment of superficial targets a 3.5 cm water equivalent thickness (WET) range shifter was applied when appropriate.
Two sets of plans were created for each patient.The first set were plans optimized to give a homogeneous dose to the CTVT, and the second set were DPBN plans.In each set two types of plans were made; one optimized with fixed RBE = 1.1, and one optimized with variable RBE.Each plan optimized with RBE = 1.1 was evaluated twice, once also using RBE = 1.1 and once using nanoCluE.(see Table 1 for an overview).This yields a grand total of six RBE-weighted dose distributions for each patient, which can be organized as three pairs, each pair consisting of one homogeneous and one DPBN dose distribution.
Planning objectives used as baseline for all patients are shown in.Table 2.As a basis for the criteria, the ARTSCAN V study protocol was used.To keep in line with [21], we used 70 Gy (RBE) as a dose to the primary target, as opposed to 68 Gy (RBE) as specified in the ARTSCAN V study protocol.All treatment plans were multi-field optimized (MFO) in which all fields are simultaneously optimized, such that each field can deliver an inhomogeneous dose distribution [42], a technique known to be advantageous for HNC patients [43].
A uniform treatment planning approach was adopted for all 19 patients, with four coplanar fields (gantry angles 60 • , 120 • , 240 • , 300 • ).The feature of automatic range shifter selection of the TPS was used for all fields with the corresponding air gap set to 2 cm, which was kept small to reduce the spot size for maximal dose resolution [44,45].Spot and layer spacing were chosen automatically, allowing the TPS to vary the spacing between layers and spots to account for variations in spot size, as shown by [46] to be effective.The Monte Carlo (MC) dose calculation engine was used since the RBE model needed the energy of the particles depositing dose in a voxel.For consistency, MC was used both for optimization and final dose calculation for all patients and plans.
We used robust optimization (RO) based on the CTV for all plans, since a PTV-margin does not apply for dose painting plans [46,47].To that end the minimax method [48] in RayStation was utilized for the robust optimization of all plans.Briefly, the method works by generating a set of treatment scenarios in the form of isocenter displacements and density changes.Following the ARTSCAN V study protocol we used mm displacements and 3 % change in density as perturbation input values.Each isocenter displacement was performed parallel and antiparallel to the cardinal axes, which along with the nominal scenario, in which there is no isocenter displacement, leads to 7 displacement scenarios.Each of these scenarios was then subjected to − 3%, 0 % (nominal) and +3 % density changes in each voxel, for a total of scenarios.The minimax algorithm then seeks to optimize plan parameters to yield the best plan according to the planning objectives for the worst performing of these scenarios.
In a few cases, some of these objectives had to be relaxed to ensure sufficient target coverage.Two patients suffered from laryngeal cancer, and for these the larynx was considered as CTVT and not an OAR.For each, the resultant dose distribution was normalized such that the mean dose to the primary target in the nominal geometrical scenario was Gy (RBE).

Evaluation of plan differences and robustness
The TCP difference was the primary metric for comparison of each pair of dose distributions.We performed two-tailed Wilcoxon signed rank tests to test for significance in the differences in TCP between the homogeneous and the DPBN-plans.
The robustness of the plans due to geometrical uncertainties was evaluated through perturbing the dose distribution by 14 isocenter displacements (six along the cardinal axes as during optimization and eight such that the absolute value of the displacement is equal along all cardinal axes and the distance from the nominal case is 4 mm), once for − 3% density and once for + 3 % density, for a total of 28 scenarios; with the addition of the nominal geometrical scenario (i.e.no isocenter displacement and density change) the total is 29 scenarios.For the primary targets, robustness was evaluated through the TCP distribution for each patient and plan for the 29 scenarios.
To conservatively assess the potential loss of TCP due to changes in the SUV distribution while the treatment progresses, we also simulated a set of treatments using the originally planned dose distributions but acting on random scrambled SUV distributions.For each of the geometrical robustness scenarios explained above, 20 SUV distributions were created by randomly shuffling the SUV values such that the number of different SUV values were conserved but their locations within the tumor varied, and hence also their locations relative hot and cold dose spots were changed.For each of these 20 SUV distributions we calculated the TCP from the planned dose distribution using Eq. ( 1).

Table 1
The combinations of plan optimization and evaluation, used for each of the homogenous and DPBN plan sets.For labeling of results the TCP values can be calculated using either a fixed (subscript "eval-fix") or variable RBE (subscript "eval-var"), starting with the physical dose distribution for each plan.

TCP
The robustness test distributions of TCP increases are shown in Fig. 1.Ten of the 19 the patients had scenarios in which the TCP increase was greater than 3 %.Although the TCP gains varied between the patients, the Wilcoxon two tailed signed rank test yielded p-values < 0.001 for all considered cases of RBE optimization and evaluation.The variance of the TCP gains for the cases in which RBE was optimized using nanoCluE was larger, indicative of a diminished robustness to geometrical uncertainties.For some patients, in particular patient 9, and to a lesser extent, patient 18 and 19, the proximity of the critical OARs (brain stem and spinal cord) to the target volume led to difficulties for the optimizer to ensure geometrical robustness of the plans in the Eval eval-var case, leading to a loss in TCP increase.A different set of gantry angles may have mitigated the loss.
Fig. 2 shows the TCP increase versus the product of the standard deviation of the SUV distribution and the tumor volume for the nominal geometrical scenario.In line with earlier work [21] a strong correlation between the TCP gain for the nominal geometrical case and the product of the standard deviation of the SUV distribution and the tumor volume was found, as shown by the Pearson correlation coefficients r in Fig. 2.
Also shown in Fig. 2 is the case where TCP was computed using the randomized SUV distribution that yielded the lowest TCP.The curve indicates that the potential loss using the conservative estimate is smaller than the potential gain.
Fig. 3 shows the TCP distributions for the nominal SUV cases for both  homogeneous and DPBN-plans, as well as for the randomized SUV distributions for the DPBN-plans.The median loss in TCP due to the randomized SUV as compared to the homogeneous case with nominal SUV is at most 4.5 %.In contrast, the median gain from homogeneous dose with nominal SUV to DPBN with nominal SUV is at most 12 %.

OARs
Avoidance probability plots for the brain stem and spinal cord are shown in Fig. 4. The avoidance probability is defined as the fraction of scenarios in which the dose given to a certain OAR volume is less than the tolerance doses (45 Gy for brain stem, 48 Gy for spinal cord).In general, the Fix eval-fix case conforms best to the DVH criteria set in the sense that more scenarios are given less than the tolerance doses, while the Fix eval-var case performs the worst, and Var eval-var lies somewhere in the middle.For the brain stem, the ARTSCAN V protocol does not require robust optimization, instead using a rather low maximum dose criterion of 45 Gy; this was fulfilled for all nominal cases, but as evident from the plot, failed for a number of the geometrical robustness scenarios.
For the OARs in which the ARTSCAN V criteria were defined in terms of mean dose, Fig. 5 illustrates the mean dose distributions in a boxplot, with the ARTSCAN V criteria for each OAR as a horizontal magenta line.
The rather large spread in mean doses to certain OARs, such as the oral cavity and the parotid glands is due to the need to sacrifice these OARs for same patients in order to ensure a sufficient target coverage.

Discussion
For the patients and DPBN model we have considered, our results indicate that there is a statistically significant gain in TCP with proton therapy.Although the geometrical robustness and magnitude may have varied depending on whether a variable RBE was assumed or not, significant p-values < 0.001 were seen in all cases.The relatively small span of the TCP distributions as shown in Fig. 3 also indicate that the plans with respect to target coverage are all robust to geometrical uncertainties.
Under the assumption of the current clinical standard of RBE = 1.1 during plan optimization, the TCP increase is preserved even if RBE actually varies as predicted by the nanoCluE model.On the other hand, the OARs in general received higher biologically-weighted absorbed doses for the Fix eval-var case.This was indeed expected, since the nano-CluE model, like a number of other proton RBE models (see [49] for a review), predicts higher RBE values for lower α/β-ratios, and indeed we have assumed a low α/β = 3 Gy for all tissues except for the primary and therapeutic target volumes.
For the Eval eval-var case, this excess biologically weighted absorbed dose as seen had to be decreased to conform to the planning criteria.This reduced the target coverage and thus the predicted TCP as compared to the Fix eval-var case.Optimizing using nanoCluE in this case may thus be  considered a gentler approach in that the OAR burden is reduced, although the geometrical uncertainty robustness is also reduced.In general, optimizing using a variable RBE model may be viewed as performing PT with modified Bragg curves (a higher peak to entrance ratio, due to a lower RBE in the entrance channel as compared to the Bragg peak).Since a TCP increase could not be observed when comparing the Fix eval-fix and Eval eval-var case, we conclude that under the conditions of this study, these modified Bragg peaks were not beneficial for dose painting.
The temporal stability of the SUV distribution is a potential issue that may interfere with dose painting approaches.Repeated FDG-PET scans are made difficult by the RT-induced inflammation, for instance of the mucosal tissue, whose glucose uptake may be difficult to distinguish from a tumour-specific signal [50].There is some evidence of temporal stability of PET-FDG distributions prior to the commencing of radiotherapy [51].The model which we have used to maximize TCP is based on the tumor recurrence distributions of a cohort of 59 patients [13], and no account was taken of variations in the SUV distributions during treatment.
Since SUV changes were likely during the treatment of this original cohort [50], by applying the model on other patients, an assumption is made that the SUV distribution, and by extension, the radiosensitivity distribution changes similarly during the course of PT.By creating randomized SUV distributions, we may conservatively assess the TCP in cases where the dose distribution in the DPBN case no longer matches the SUV distribution.The possible loss in TCP with the randomized distributions was in percentage points smaller than the gain.This was due to the planning constraint of 60 Gy to 98 % of the CTVT.Without this constraint, the dose to 98 % volume would be closer to 40 Gy for some patients.
Given the statistically significant increase in TCP by using dose painting, further studies are warranted, possibly culminating in a clinical study.A pre-clinical study of interest would be to study whether the TCP model is accurate in predicting HNC RT outcomes.Once this is established, a clinical study could be performed under cautious conditions by setting minimum and maximum doses within the dose painting target.The stricter such constraints are set, however, the smaller the TCP gains are likely to be, requiring a larger patient cohort to ensure sufficient statistical power.To prove a TCP increase of 10 percentage points, a randomized trial would need more than 100 patients in each arm for a power of 80 % and a significance level of 5 %.In our cohort, only one case yielded such a large TCP increase.This may help explain the lack of findings of increase local control in clinical studies thus far, which have been based on fewer patients.
We have, in line with earlier work [12,13,21] assumed a fixed value for γ 50 .This implies a fixed number of clonogens per voxel [52].By extension, the D 50 , in our model a function of SUV, is thus exclusively a function of radioresistance.This is difficult to reconcile with the commonly held view that FDG uptake correlates to cell density and proliferation.However, it also correlates to hypoxia and other adverse biologic features [53].Mechanistically the assumption of a fixed γ 50 may thus be questionable, but the model may still be empirically valid.
Comparison to the results presented in [21] would suggest a slightly smaller gain in terms of TCP in PT DPBN as compared to photon VMAT DPBN.It should be mentioned however that the planning criteria used in our study, based on ARTSCAN V, are different and in general stricter.Furthermore, the nature of PT dose planning warrants the selection of a finite number of gantry angles, which is less of an issue in VMAT treatment planning.To facilitate this study, a uniform treatment planning strategy with the same gantry angles were used for all patients, with only removal of some OARs differing from the treatment planning for some patients.This may have been suboptimal for some patients.

Conclusion
We have showed that DPBN with PT for HNC is feasible.The TCP increase by using DPBN instead of conventional, homogenous dose distributions is statistically significant, and this holds even if one optimizes with fixed RBE and evaluates with variable RBE.Optimizing with variable RBE does not yield higher TCP gains than with fixed RBE but the OAR dose burden is reduced.Hence, from the OAR dose burden perspective, optimizing with nanoCluE constitutes as more gentle approach which is worthwhile to consider for a clinical study.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Fig. 1 .
Fig. 1.The TCP increase distributions for the 19 patients.Bottom edges of the boxes represents the 25th percentile, while the top edges represents the 75th percentile; the central mark indicates the median.Whiskers extend to encompass all data not deemed as outliers.Patient numbers on the x-axis were assigned according to the TCP gain for the nominal geometrical scenario in the Fix eval-fix case.

Fig. 2 .
Fig. 2. The difference in TCP through DPBN compared to homogeneous dose, shown as a function of the product of the target volume and the standard deviation of the SUV-distribution.The lines are linear fits; dashed lines represent fits for the randomized SUV distribution, using the case which yielded the lowest TCP.

Fig. 3 .
Fig. 3. TCP distributions for DPBN and homogeneous plans with nominal SUV distributions, along with the randomized distributions.The bottom edges of the boxes represents the 25th percentile, while the top edges represents the 75th percentile; central mark indicates median.Whiskers extend to encompass all data not deemed as outliers.a) Optimized and evaluated using RBE = 1.1.b) Optimized using RBE = 1.1, evaluated using nanoCluE.c) Optimized and evaluated using nanoCluE.

Fig. 4 .
Fig. 4. Avoidance probability plots for (left) brain stem and (right) spinal cord.Solid curves represent the homogeneous case and dashed curves the DPBN case.The vertical magenta line marks 2 % volume for which the ARTSCAN V DVH criteria are defined for brain stem and spinal cord.The curves yield the fraction of scenarios for which 45 Gy in case of the brain stem and 48 Gy in case of the spinal cord are given to a certain volume.(For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Fig. 5 .
Fig. 5. OAR mean dose distributions pooled for all patients.The bottom edges of the boxes represents the 25th percentile, while the top edges represents the 75th percentile; central mark indicates median.Whiskers extend to encompass all data not deemed as outliers, which are represented as circles.Horizontal magenta lines mark the ARTSCAN V criteria.(left column) Homogeneous case.(right column) DPBN case.(For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Table 2
Dose planning criteria as baseline for all patients and plans.In practice, for some patients some of OAR ROIs that were not constraints were removed to ensure sufficient target coverage.