HD-RDS-UNet: Leveraging Spatial-Temporal Correlation between the Decoder Feature Maps for Lymphoma Segmentation. IEEE Journal of Biomedical and Health Informatics (IF 5.772), Pub Date: 2021-08-05, DOI: 10.1109/jbhi.2021.3102612. Meng Wang, Huiyan Jiang, Tianyu Shi, Yu-Dong Yao
Lymphoma is a group of malignant tumors originating in the lymphatic system. Automatic and accurate lymphoma segmentation in PET/CT volumes is critical yet challenging in clinical practice. Recently, UNet-like architectures have been widely used for medical image segmentation. Pure UNet-like architectures model the spatial correlation between feature maps very well, but they discard the critical temporal correlation. Some prior work combines UNet with recurrent neural networks (RNNs) to exploit spatial and temporal correlation simultaneously. However, it is inconvenient to carry advanced UNet techniques over to RNNs, which hampers further improvement. In this paper, we propose a recurrent dense siamese decoder architecture, which simulates RNNs and densely utilizes the spatial-temporal correlation between the decoder feature maps while following a UNet approach. We combine it with a modified hyper dense encoder; the proposed model is therefore a UNet with a hyper dense encoder and a recurrent dense siamese decoder (HD-RDS-UNet). To stabilize the training process, we propose a weighted Dice loss with a stable gradient and self-adaptive parameters. We perform patient-independent fivefold cross-validation on 3D volumes collected from whole-body PET/CT scans of patients with lymphoma. The experimental results show a volume-wise average Dice score and sensitivity of 85.58% and 94.63%, respectively, and a patient-wise average Dice score and sensitivity of 85.85% and 95.01%, respectively. The different configurations of HD-RDS-UNet consistently show superior results in the performance comparison. In addition, a trained HD-RDS-UNet can be easily pruned, which significantly reduces inference time and memory usage while maintaining very good segmentation performance.
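For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how a Dice score and a sensitivity are commonly computed for a pair of binary segmentation masks. It is not the authors' evaluation code, and it does not reproduce the weighted Dice loss proposed in the paper. The function name `dice_and_sensitivity`, the flattened-list mask representation, and the convention for empty masks are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def dice_and_sensitivity(pred: Sequence[int], truth: Sequence[int]) -> Tuple[float, float]:
    """Return (Dice, sensitivity) for two flattened binary masks (hypothetical helper)."""
    if len(pred) != len(truth):
        raise ValueError("masks must contain the same number of voxels")

    # Count voxel-level agreement between prediction and ground truth.
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)  # true positives
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)  # false positives
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)  # false negatives

    # Assumed convention: if there is nothing to segment and nothing was
    # predicted, both metrics are reported as 1.0.
    dice = 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) > 0 else 1.0
    sensitivity = tp / (tp + fn) if (tp + fn) > 0 else 1.0
    return dice, sensitivity


# Toy example: six voxels, 2 true positives, 1 false positive, 1 false negative.
pred_mask = [1, 1, 0, 0, 1, 0]
true_mask = [1, 0, 0, 0, 1, 1]
dice, sens = dice_and_sensitivity(pred_mask, true_mask)
print(f"Dice = {dice:.4f}, sensitivity = {sens:.4f}")
```

In this toy case both values equal 2/3. Averaging such per-case values over volumes or over patients would presumably yield the volume-wise and patient-wise figures quoted above, though the abstract does not spell out the exact aggregation.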