MBST-driven 4D-CBCT reconstruction: Leveraging Swin Transformer and masking for robust performance.
Objective: This study developed a Mask-based Swin Transformer network (MBST) to enhance the quality of 4D cone-beam computed tomography (4D-CBCT) reconstruction. The network is trained on 4D-CBCT images reconstructed under limited scanning conditions, so that it can be applied to a broad range of 4D-CBCT reconstruction scenarios, including those with high scanning speeds.
Methods: 4D imaging data from 20 patients with thoracic tumors were used to train and evaluate the deep learning model: 15 cases were used for training and 5 for simulation testing. To mitigate the uncertainties associated with respiratory motion between treatment fractions, 4D-CBCT was simulated from downsampled 4D-CT data with the Feldkamp-Davis-Kress algorithm, and the 4D-CT data served as the ground truth for training. 4D-CBCT images were reconstructed under 11 projection acquisition schemes: full-angle acquisition at intervals of 1°, 2°, 3°, 4°, 5°, 6°, 12°, 18°, and 24°, and one-third full-angle acquisition at intervals of 5° and 10°. The test results were evaluated quantitatively using the structural similarity index measure (SSIM), peak signal-to-noise ratio (PSNR), mean error (ME), and mean absolute error (MAE), and image quality was assessed qualitatively. Data from real clinical patients who were not included in the training set were tested to evaluate the network's ability to generalize. Moreover, the proposed method was compared with other deep learning approaches, and statistical analyses were performed.
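As an illustration of the intensity-based metrics named above, the following is a minimal sketch of ME, MAE, and PSNR computed over flattened voxel values. It is not the authors' evaluation code. The function names, the sign convention for ME (prediction minus reference), and the `data_range` argument used for PSNR are assumptions made for this example. SSIM is omitted because it requires windowed local statistics.

```python
import math


def mean_error(pred, ref):
    """Signed mean difference (pred - ref); the sign convention is an assumption."""
    return sum(p - r for p, r in zip(pred, ref)) / len(ref)


def mean_absolute_error(pred, ref):
    """Mean of absolute voxel-wise differences."""
    return sum(abs(p - r) for p, r in zip(pred, ref)) / len(ref)


def psnr(pred, ref, data_range):
    """Peak signal-to-noise ratio in dB for a given intensity range (assumed)."""
    mse = sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(data_range ** 2 / mse)


# Toy example: four voxel values standing in for a reference and an estimate.
ref = [0.0, 100.0, -1000.0, 40.0]
pred = [10.0, 90.0, -990.0, 40.0]

me = mean_error(pred, ref)
mae = mean_absolute_error(pred, ref)
score = psnr(pred, ref, data_range=2000.0)  # hypothetical intensity range
```

ME keeps the sign of the error and therefore exposes systematic bias in CT values, whereas MAE measures the overall magnitude of the deviation. This is why the two are reported together.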
Results: Simulation data assessment revealed that with a small projection acquisition interval, such as the 4° interval, the 4D-CBCT images optimized by MBST showed a considerable improvement over the original 4D-CBCT images in terms of SSIM (42.3% increase) and PSNR (10.8 dB increase), and the ME and MAE values approached 0. The improvements were statistically significant (P < 0.001). Compared with other deep learning methods, MBST demonstrated superior performance, with improvements of 1.4% in SSIM and 1.21 dB in PSNR and a reduction of 0.94 in MAE. With large projection intervals, such as the 24° interval, MBST also outperformed other deep learning methods. Specifically, its SSIM, PSNR, and MAE improved by 3.8%, 0.81 dB, and 10.34, respectively, relative to those of other deep learning methods, and the improvements were statistically significant (P < 0.01). In addition, MBST could reconstruct bone tissue and optimize the quality of 4D-CBCT images even when the number of projections was small (12°, 18°, and 24° intervals). Clinical data evaluation revealed that after optimization by MBST, the SSIM, PSNR, ME, and MAE of 4D-CBCT, computed against the registered 4D-CT, improved from 22.8%, 15.49 dB, -345.5, and 432.2 to 81.5%, 27.93 dB, -53.79, and 73.77, respectively. Moreover, MBST exhibited the most pronounced improvement among all the compared methods. MBST could accurately recover high-density structures, lung structures, and tracheal walls.
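To give a sense of how sparse the large-interval settings are, the sketch below converts an angular interval into a projection count. It assumes that "full angle" denotes a 360° arc and that one-third full angle denotes 120°; neither value is stated in the abstract. The function name `projection_angles` is hypothetical, and phase sorting of projections in 4D-CBCT is not modeled.

```python
def projection_angles(arc_deg, step_deg, start_deg=0.0):
    """Return gantry angles covering arc_deg at a fixed step (endpoint excluded)."""
    n = int(round(arc_deg / step_deg))
    return [start_deg + i * step_deg for i in range(n)]


FULL_ARC = 360.0          # assumption: full-angle acquisition spans 360 degrees
THIRD_ARC = FULL_ARC / 3  # assumption: one-third full angle spans 120 degrees

# Projection counts for the sparse full-angle settings named in the results.
sparse_counts = {
    step: len(projection_angles(FULL_ARC, step)) for step in (12, 18, 24)
}

# Projection counts for the two one-third-arc settings.
third_counts = {
    step: len(projection_angles(THIRD_ARC, step)) for step in (5, 10)
}
```

Under these assumptions, the 24° setting leaves only 15 view angles over the full arc, which illustrates why recovering bone and airway structures at that sampling density is a demanding test.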
Conclusions: This study demonstrated the ability of MBST to reconstruct 4D-CBCT images under various scanning conditions. When tested on clinical patient datasets, the method yielded satisfactory CT values and image quality. Thus, MBST can serve as a highly generalizable reconstruction network for improving the quality of 4D-CBCT images.