1. Introduction
With lung cancer the leading cause of cancer-related deaths worldwide, innovative and more effective therapies, such as proton therapy (PT), are urgently needed. Improved tumor targeting with greater sparing of healthy tissue compared to conventional photon beams is especially critical in the treatment of inoperable lung tumors [1]. Accordingly, interest in PT is on the rise for lung cancer, one of several disease sites where proton beams show promise in improving clinical efficacy [2,3], given the potential for healthy lung sparing and/or reduction of the dose to the heart [4].
In practice, several dosimetric limitations and sources of uncertainty arise between treatment planning (TP) and delivery with proton beams for lung cancer. The favorable physical characteristics themselves, e.g. the Bragg peak with reduced lateral penumbra, make PT an ideal candidate for high-precision treatment delivery in heterogeneous anatomical sites such as the lung, but may also make these treatments more prone to range uncertainties and temporal effects due to breathing/organ motion [5,6]. Clinical countermeasures such as increased target margins, 4D-robust TP, gating and/or rescanning delivery techniques have shown promising results in reducing sensitivity to delivery uncertainties and/or interplay effects [7,8]. Nonetheless, significant discrepancies between planned and delivered dose may occur if dose calculation is not properly handled by the clinical TP systems (TPSs). For instance, recent works highlight relevant dosimetric differences between measured dose and the dose predicted by conventional approaches in most commercial TPSs [9], in part due to improper modeling of radiation transport in highly heterogeneous tissues such as the lung, which feature complex bone-air-tissue interfaces, anatomic/geometric complexities and sub-voxel Hounsfield unit (HU) variations. “Monte Carlo (MC)-versus-analytical algorithm” remains a common topic of debate and, correspondingly, thoracic subcommittees aim to develop urgently needed consensus and guidelines for quality assurance (QA), TP and delivery of particle therapy for thoracic malignancies [10].
Excluding uncertainties attributed to the breathing cycle and the dose delivery system, this work focuses specifically on the inherent accuracy of an analytical dose algorithm in thoracic regions, where such algorithms are under scrutiny even though in most other clinical scenarios they provide acceptable plan calculations within well-established tolerances. The pencil beam algorithm (PBA) in PT offers fast calculation at the potential expense of accuracy in complex tissue inhomogeneities. The gold standard for accuracy is MC simulation; related codes have only recently been introduced into the clinic, fostered by past and ongoing efforts to develop clinical dose engines [11,12].
In the case of thoracic treatment sites, several studies demonstrate limitations of clinical TPSs where accuracy is critical [13,14]. Most notably, investigations with anthropomorphic lung phantoms suggest that commercial TPSs using analytical algorithms should be deemed unfit for clinical use, or used with extreme caution, in the treatment of lung tumors [15]. The lateral dose distribution computed by PBA in the lung and at bone interfaces may be inaccurate by around 30% [16,17], while commercial MC-based TPSs can improve dose calculation accuracy at lung/bone interfaces or through inhomogeneities to within ~5% [12,17]. Another source of uncertainty in proton therapy of the lung, albeit one considered clinically tolerable, originates from the heterogeneous structure of the lung itself, which leads to a degradation of the Bragg peak and a wider distal fall-off [18].
Several other works have investigated the accuracy of analytical algorithms and/or MC codes in clinically relevant scenarios, especially in head-and-neck phantoms [16,17,19,20,21]; however, few studies perform comprehensive testing for thorax-based treatments beyond commercial approaches.
In this work, dose calculation for lung cancer patients is investigated using the FRoG system, a GPU-accelerated dose calculation platform for particle therapy that combines an enhanced physics engine with rapid computation, made possible via task parallelization, for multiple particle species [22,23]. The capabilities and limits of the FRoG approach are tested via dose calculation for thoracic malignancies. Through extensive benchmarking against in-silico references (clinical TPS and general purpose (gp)-MC simulation), as well as experimental validation through end-to-end QA tests in an in-house built heterogeneous phantom equipped with ionization chambers (ICs), we verify whether analytical methods for PT dose calculation are indeed unsuitable for clinical activity.
Furthermore, we explore dose-averaged linear energy transfer (LETD) in the context of lung cancer patients, to assess the flexibility and feasibility of relating innovative bio-effect quantities to clinical efficacy. This analysis has yet to be presented in the literature. Further efforts are made here to validate FRoG as a secondary dose engine for supporting clinical decision-making at CNAO, the Heidelberg Ion-beam Therapy-center (HIT) and the Aarhus Danish Center for Proton Therapy, where the platform has been installed for rapid physical, LETD, and bio-dose prediction. State-of-the-art GPU-accelerated analytical and CPU-based MC dose engines are rigorously tested in scenarios where conventional approaches to clinical dose calculation often fail.
4. Discussion
Through systematic evaluation of PT dose calculation in thoracic treatment sites, we provide benchmarks for an innovative PBA implementation as well as for both clinical and research-based MC systems. Compared with results available in the recent literature on dosimetric investigations using lung phantoms, this study demonstrates substantially higher PRs and lower dose deviations for an analytical system, despite the relatively strict γ-criteria. We believe several points regarding lung dose calculation need formal clarification; here we shed light on potential causes of the deviations commonly observed with commercial systems and suggest next steps for collaboration with industry towards improved TP for thoracic cancers.
First, a brief technical aside: higher deviations in D98 and D2 were observed for both FRoG and RS-MC in the DVH analysis against the reference gp-MC. Apart from possible explanations such as the statistical fluctuations exhibited by MC engines [35,36], the discrepancy in the dose covering 98% of the GTV (D98) affected the smallest target geometry in the selected database. This patient case in particular also exhibited higher GTV D50 ratios (FRoG = 1.029, RS-MC = 1.032), i.e. overestimations of ~3% by both dose engines (see Fig. 2, bottom panels). This supports attributing the deviations to challenges arising from small target volumes (~6 cc), which may amplify initial discrepancies in the parameterization of the beam lateral spread in air along the beamline. In fact, thoracic treatment sites often host small targets with densities higher than the surrounding normal tissues, and the low-density thoracic tissues may prevent the beam from spreading enough to smear out potential differences inherited from the patient entrance surface.
Overall, FRoG matches the gp-MC predictions well. Considering all values extracted from the DVHs (Table 1), FRoG reproduces gp-MC within 2.2%, on average. Limiting the analysis to the GTV, ΔHI was consistent between the two dose engines with respect to the gp-MC baseline: D̅ to the GTV was 1.8% and 2.3% greater for FRoG and the RS-MC TPS, respectively.
For both dose engines, the average Dlung agreed within 0.4%, with a relative standard deviation of the Dlung ratio < 2%.
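The DVH quantities compared above (D98, D50, D2 and their ratios to a reference engine) can be illustrated with a minimal sketch. It assumes equal-volume voxels inside the structure, and the function names and the synthetic dose values are hypothetical, introduced only for illustration; this is not the analysis code used in this work.

```python
import numpy as np

def dose_at_volume(dose_in_roi, volume_percent):
    """Return D_x: the minimum dose received by the hottest x% of the ROI.

    Assumes all voxels have the same volume, so that D98 is the 2nd
    percentile of the voxel doses and D2 is the 98th percentile.
    """
    return float(np.percentile(dose_in_roi, 100.0 - volume_percent))

def dvh_ratios(test_dose, ref_dose, levels=(98, 50, 2)):
    """Ratio of D_x between a tested engine and a reference engine."""
    return {
        f"D{v}": dose_at_volume(test_dose, v) / dose_at_volume(ref_dose, v)
        for v in levels
    }

# Synthetic GTV voxel doses in Gy (illustrative values only).
rng = np.random.default_rng(0)
ref = rng.normal(60.0, 1.0, size=5000)
test = ref * 1.03  # a uniform 3% overestimation by the tested engine

ratios = dvh_ratios(test, ref)
print(ratios)
```

A uniform scaling of the dose shifts every D_x ratio by the same factor; differences between the D98, D50 and D2 ratios therefore point to changes in the shape of the dose distribution inside the target rather than to a global offset.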
Considering the challenging nature of thoracic treatments, a Gγ-analysis with distance-to-agreement DTA = 1 mm, global dose-difference GDD = 3% and dose-threshold DT = 5% may be considered sufficiently strict for validation purposes, given the 5 mm/7% criteria applied in recent works [15]. In this regard, Fig. 3(A) shows that all patient plans in the investigated cohort recalculated by FRoG satisfy the clinically optimal 95%-tolerance level. No failures were detected for the RS-MC TPS either, with all cases above the 95%-tolerance level. FRoG and the RS-MC TPS thus exhibited comparable dose calculation performance.
Investigations were additionally conducted by adopting local dose normalization for the γ-analysis (Fig. 3(B)), a stringent method to validate the dosimetric performance of FRoG specifically in lower-dose areas. With the same evaluation criteria, the RS-MC TPS performed slightly worse against the reference gp-MC: the observed Lγ-PR drops to a mean value of (92.7 ± 2.0)%, with only a single patient case above the 95%-tolerance level. FRoG performs better against the gp-MC, even locally, with a mean Lγ-PR of (95.8 ± 1.8)%. Interestingly, for FRoG, all patients exceed an Lγ-PR of 90%, with 2 patients (among the larger GTVs in the cohort, and with a shorter beam path through heterogeneous lung tissue in the entrance region) above 98% γ-PR.
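To make the difference between global and local normalization concrete, the following is a simplified one-dimensional sketch of a γ passing-rate calculation. It is a brute-force search without interpolation on a synthetic profile; the function name, the grid spacing and the dose offset are assumptions made for illustration, and the analysis in this work was performed on 3D dose distributions with dedicated tools.

```python
import numpy as np

def gamma_pass_rate(ref, test, spacing_mm, dta_mm=1.0, dd=0.03,
                    threshold=0.05, local=False):
    """Percentage of reference points with gamma <= 1 (1D, no interpolation).

    dd is the dose-difference criterion as a fraction of the maximum
    reference dose (global) or of the local reference dose (local).
    Points below threshold * max(ref) are excluded from the evaluation.
    """
    ref = np.asarray(ref, dtype=float)
    test = np.asarray(test, dtype=float)
    x = np.arange(ref.size) * spacing_mm
    d_max = ref.max()
    evaluated = np.flatnonzero(ref >= threshold * d_max)
    gammas = []
    for i in evaluated:
        norm = ref[i] if local else d_max
        dose_term = (test - ref[i]) / (dd * norm)
        dist_term = (x - x[i]) / dta_mm
        # Gamma is the minimum combined distance over all test points.
        gammas.append(np.sqrt(dose_term**2 + dist_term**2).min())
    return 100.0 * np.mean(np.array(gammas) <= 1.0)

# Synthetic profile on a 1 mm grid: the test dose carries a constant
# offset of 2% of the maximum reference dose.
positions = np.arange(-40, 41)
ref = np.exp(-0.5 * (positions / 15.0) ** 2)
test = ref + 0.02

global_pr = gamma_pass_rate(ref, test, spacing_mm=1.0)
local_pr = gamma_pass_rate(ref, test, spacing_mm=1.0, local=True)
print(global_pr, local_pr)
```

In this toy case the offset passes everywhere under global normalization but fails in the low-dose tails under local normalization, which is why locally normalized passing rates are the more demanding test in lower-dose regions.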
For the SFUD plan, the 3 systems under study were highly comparable, with a similar average deviation from experimental data (<0.1%) and nearly identical minimum/maximum discrepancies of ~2.5%/~2%. For the IMPT plan, FRoG exhibited the predictions closest to measurements, on average, of the 3 engines, while both the gp-MC and the RS-MC TPS underestimated by ~1%. With respect to |%Δ|, the performance of gp-MC, FRoG and the RS-MC TPS was in agreement, with values of (3.2 ± 2.3)%, (3.3 ± 2.7)% and (3.4 ± 2.4)%, respectively. With reference to the routine patient-specific QA protocol at our facility, these results are clinically acceptable, within the 5%-tolerance level on the mean |%Δ| and standard deviation.
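The mean absolute percent deviation |%Δ| and its comparison against a 5% tolerance can be sketched as follows. The chamber readings are invented for illustration, and normalizing each deviation to the measured value is an assumption; the QA protocol referenced above may normalize differently (e.g. to the prescribed dose).

```python
import numpy as np

def percent_deviation(calculated, measured):
    """Signed deviation of calculated from measured dose, in percent.

    Assumption: each deviation is normalized to the measured value.
    """
    calculated = np.asarray(calculated, dtype=float)
    measured = np.asarray(measured, dtype=float)
    return 100.0 * (calculated - measured) / measured

# Hypothetical ionization chamber readings and engine predictions in Gy.
measured = np.array([2.00, 1.95, 2.05, 1.80])
calculated = np.array([2.04, 1.90, 2.05, 1.89])

delta = percent_deviation(calculated, measured)
mean_abs = np.abs(delta).mean()
std_abs = np.abs(delta).std(ddof=1)  # sample standard deviation
within_tolerance = mean_abs <= 5.0

print(f"|%Δ| = ({mean_abs:.1f} ± {std_abs:.1f})%, pass = {within_tolerance}")
```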
Regarding LETD prediction, FRoG was within a few tenths of a keV/µm of the reference gp-MC for both the average and the near-maximum LETD. Moreover, results verified that for the selected beam configuration (2 orthogonal ports), the LETD was, on average, ~3 keV/µm in the target and in nearby tissue, reducing to ~1 keV/µm in the remaining portions of the lung. Maximum LETD values within and outside the target volume were ~4 keV/µm and < 8 keV/µm, respectively.
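For a plan with several ports, the dose-averaged LET in a voxel is the dose-weighted mean of the per-field values, i.e. the sum of dose times LETD over fields divided by the total dose. A minimal sketch of this combination is given below; the function name and the per-field arrays are hypothetical and serve only to illustrate the weighting, not the FRoG implementation.

```python
import numpy as np

def combine_letd(doses, letds):
    """Dose-weighted combination of per-field LETd maps.

    doses, letds: arrays of shape (n_fields, n_voxels).
    Voxels receiving no dose are assigned an LETd of zero.
    """
    doses = np.asarray(doses, dtype=float)
    letds = np.asarray(letds, dtype=float)
    total = doses.sum(axis=0)
    weighted = (doses * letds).sum(axis=0)
    return np.divide(weighted, total, out=np.zeros_like(total),
                     where=total > 0)

# Two hypothetical fields, three voxels (dose in Gy, LETd in keV/µm).
doses = [[1.0, 1.0, 0.0],
         [1.0, 3.0, 0.0]]
letds = [[2.0, 2.0, 0.0],
         [4.0, 6.0, 0.0]]

total_letd = combine_letd(doses, letds)
print(total_letd)
```

Because the weighting is by dose, a field that contributes little dose to a voxel has little influence on the combined LETD there, even if its own LETD is high.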
In this work, FRoG was rigorously benchmarked against dosimetric measurements and gp-MC prediction, as well as in the context of state-of-the-art MC clinical TPSs, via end-to-end tests in a thoracic treatment scenario. Recent studies strongly suggest that MC-based calculation may be necessary for IMPT in the thoracic region to maintain consistent and acceptable clinical practice in treating lung lesions. Here, we demonstrate that a well-designed analytical PB proton dose engine can effectively predict dose in lung tumors, comparable with gp-MC algorithms, consistent with MC-based clinical TPSs and in good agreement with experimental measurements. This has been verified despite approximations made during calculation, e.g. Dw, and despite neglecting PB degradation in lung, the latter recently deemed clinically tolerable in most circumstances [18]. FRoG is an analytical PB-class dose engine with clinically acceptable performance in dose calculation for lung lesions. FRoG predictions are in line with the conclusions reported in [37], which promote the use of analytical calculation methods for rapid plan adaptation in NSCLC treatments.
Furthermore, FRoG uniquely offers MC-validated predictions of patient-specific LETD distributions. FRoG currently supports clinical activity at select Varian© PT facilities as an auxiliary dose engine, as well as investigations that aim to establish novel beams and multi-ion treatment strategies [38,39]. Future efforts involve multi-institutional collaborations to investigate treatment delivery with lung phantoms using heavier ions.
A root cause of the relatively poor performance of PT dose engines in lung phantom studies is the design and execution of the TPS beam model. Lung treatments can be considered a pinnacle of complexity in PT; proper dose calculation with analytical methods in heterogeneous anatomy therefore requires sophisticated PB deformation procedures, e.g. high-order PB subdivision, and in turn careful propagation and handling of the lateral dose penumbra. More specifically, to model PB distortion in lateral heterogeneities, RS v.8, for example, decomposes each spot into 19 beamlets, whereas FRoG reconstructs the PB with ~350 unique beamlets. The issue at hand is simply one of computational cost, which is most efficiently addressed via GPU acceleration; similarly, other recently developed systems further demonstrate the potential of accelerated codes, performing both optimization and calculation procedures in well under a minute [40].
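The idea behind PB subdivision can be illustrated in one dimension: a Gaussian lateral profile is represented as a weighted sum of narrower Gaussian beamlets, each of which could then be transported along its own path through the heterogeneous anatomy. The sketch below is a generic textbook-style decomposition and not the scheme implemented in FRoG or RS; the widths, the number of beamlets and the function names are assumptions chosen for illustration.

```python
import numpy as np

def gaussian(x, mu, sigma):
    """Normalized 1D Gaussian."""
    return (np.exp(-0.5 * ((x - mu) / sigma) ** 2)
            / (sigma * np.sqrt(2.0 * np.pi)))

def split_beam(sigma_beam, sigma_sub, n_sub, extent=4.0):
    """Split a Gaussian beam into n_sub narrower Gaussian beamlets.

    The beamlet centers are weighted by a Gaussian of width sigma_w with
    sigma_beam**2 = sigma_sub**2 + sigma_w**2, so that the weighted sum
    of beamlets approximates the original profile (a discretized
    convolution of two Gaussians).
    """
    if sigma_sub >= sigma_beam:
        raise ValueError("beamlets must be narrower than the beam")
    sigma_w = np.sqrt(sigma_beam**2 - sigma_sub**2)
    centers = np.linspace(-extent * sigma_w, extent * sigma_w, n_sub)
    weights = gaussian(centers, 0.0, sigma_w)
    weights /= weights.sum()  # beamlet weights sum to one
    return centers, weights

# Hypothetical widths in mm: a 5 mm beam split into 19 beamlets of 3 mm.
sigma_beam, sigma_sub = 5.0, 3.0
centers, weights = split_beam(sigma_beam, sigma_sub, n_sub=19)

x = np.linspace(-25.0, 25.0, 501)
rebuilt = sum(w * gaussian(x, c, sigma_sub) for c, w in zip(centers, weights))
original = gaussian(x, 0.0, sigma_beam)

# Largest reconstruction error relative to the central-axis value.
max_err = np.max(np.abs(rebuilt - original)) / original.max()
print(f"max relative error: {max_err:.2e}")
```

In a homogeneous medium the sum reproduces the original profile; the benefit of a finer subdivision appears only once each beamlet sees its own radiological path, at a cost that grows with the number of beamlets per spot.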
Results from the study in [15] draw attention to important issues regarding mainstream commercial systems which, for thoracic sites, insufficiently describe lateral dose evolution and substantially under-sample the necessary PB decomposition. Not all approaches to dose calculation (analytical or simulation) will be suitable for treating anatomically complex and sensitive cases. Other non-commercial systems using analytical or MC approaches, however, have demonstrated accuracy similar to FRoG, with significant reductions in calculation time for various anatomic treatment sites [41,42], and their developers may consider performing end-to-end tests for TP of lung lesions as done here.
The use in clinical routine of benchmarked fast analytical dose engines, such as FRoG, together with log files reporting the spots actually delivered, could be a valid alternative to QA programs relying solely on dose measurements in a homogeneous water-like phantom [43], or to time-consuming measurements with thermoluminescent dosimeters and radiochromic film in anthropomorphic phantoms.
The findings of this work, along with key results from the literature, suggest that, to improve confidence in TP for lung cancer, prompt redesign of current commercial analytical approaches is warranted. The authors suggest that adoption of FRoG or similar approaches may be necessary to balance speed and accuracy for clinical viability, most feasibly through GPU-accelerated architectures.