Authors

Yanmei Luo, Yan Wang, Chen Zu, Bo Zhan, Xi Wu, Jiliu Zhou, Dinggang Shen, Luping Zhou

Abstract

To obtain high-quality positron emission tomography (PET) images at low dose, this study proposes an end-to-end 3D generative adversarial network embedded with a transformer, namely Transformer-GAN, to reconstruct the standard-dose PET (SPET) image from the corresponding low-dose PET (LPET) image. Specifically, since a convolutional neural network (CNN) describes local spatial features well, while a transformer is good at capturing long-range semantic information thanks to its global information extraction ability, our generator network takes advantage of both and is designed as an EncoderCNN-Transformer-DecoderCNN architecture. In particular, the EncoderCNN extracts compact feature representations with rich spatial information, while the Transformer captures the long-range dependencies between the features learned by the EncoderCNN. Finally, the DecoderCNN restores the reconstructed PET image. Moreover, to ensure the similarity of both the voxel-level intensities and the data distributions between the reconstructed image and the real image, we use both the voxel-wise estimation error and the adversarial loss to train the generator network. Validation on real human brain PET data shows that our proposed method outperforms state-of-the-art methods in both qualitative and quantitative measures.
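The abstract states that the generator is trained with both a voxel-wise estimation error and an adversarial loss, but this page does not give the exact formulation. The following is a minimal sketch of one common way to combine the two terms. The L1 form of the voxel-wise error, the non-saturating adversarial term, the weight `lam`, and the function name `generator_loss` are all assumptions made for illustration, not details confirmed by the source.

```python
import math


def generator_loss(pred, target, d_fake, lam=100.0):
    """Hypothetical generator objective: adversarial term + weighted voxel-wise L1.

    pred, target: flat sequences of voxel intensities of equal length.
    d_fake: discriminator's probability that the generated image is real, in (0, 1].
    lam: assumed weight on the voxel-wise term (not taken from the paper).
    """
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("pred and target must be non-empty and equal in length")
    # Mean absolute voxel-wise error between reconstruction and ground truth.
    voxel_l1 = sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)
    # Non-saturating adversarial term (an assumption): small when D is fooled.
    adversarial = -math.log(d_fake)
    return adversarial + lam * voxel_l1
```

With this shape, the voxel-wise term pulls intensities toward the real SPET image, while the adversarial term rewards reconstructions that the discriminator cannot tell apart from real ones.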

Link to paper

DOI: https://doi.org/10.1007/978-3-030-87231-1_27

SharedIt: https://rdcu.be/cyhVa

Link to the code repository

N/A

Link to the dataset(s)

N/A

Reviews

Review #1

Please describe the contribution of the paper

In this paper, the authors propose a 3D GAN embedded with a transformer (Transformer-GAN) for low-dose PET denoising, taking advantage of the transformer’s ability to capture long-range semantic information. Improved results were obtained compared with other state-of-the-art methods.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

It is innovative to combine a transformer with the GAN encoder to capture long-range semantic information.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

The dataset used in the evaluation is too small, with only 16 subjects in total (including training and testing). The authors did not provide any details about the PET dataset, for example, the acquisition protocol and dataset description. There is no information about how the low-dose images were generated or how low the doses are. Without these details, it is hard to evaluate the effectiveness of the proposed method.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

Not enough details about the dataset.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
1. Expanding the dataset would make the paper stronger. Please also provide more details about the dataset, as noted under the weaknesses.
2. In Section 2.1, the description of the EncoderCNN states: “The extracted feature maps are further processed by a convolutional layer to reduce the channel numbers.” However, the reduced channel number does not appear to be reported in the paper.
3. I suggest also adding the PSNR, SSIM and NMSE for LPET in Table 1.
Please state your overall opinion of the paper

reject (3)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

Small evaluation dataset. Not sufficient details about the dataset.
What is the ranking of this paper in your review stack?

5
Number of papers in your stack

5
Reviewer confidence

Very confident

Review #2

Please describe the contribution of the paper

The paper proposes a 3D Transformer-GAN to generate the standard-dose PET image from the low-dose PET image, using EncoderCNN-Transformer-DecoderCNN as the architecture of the generator.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The paper combines the transformer and the CNN in an EncoderCNN-Transformer-DecoderCNN structure to generate the standard-dose PET image, which can capture both the global semantic dependencies and the low-level spatial information of local connectivity.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

In Fig. 2, there are artifacts below the arrow for the proposed method.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The paper provides sufficient details about the models/algorithms, datasets, and evaluation.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

The picture is too small to show the differences in detail.
Please state your overall opinion of the paper

accept (8)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The paper is well-organized and the method is novel.
What is the ranking of this paper in your review stack?

4
Number of papers in your stack

5
Reviewer confidence

Very confident

Review #3

Please describe the contribution of the paper

The authors propose an EncoderCNN-Transformer-DecoderCNN network model for improving the image quality of low-dose PET. The main contributions are 1) extending the network to 3D data; 2) combining a CNN with a transformer so that both local and global information can be utilized; and 3) using the adversarial loss to generate realistic images.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The authors incorporated a transformer into their image reconstruction model, which has not been done in other studies. The results look promising, especially since the proposed method outperforms several other CNN-based methods. The authors showed the contributions of the transformer component and the adversarial component in the ablation study.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

The authors provide few details about the implementation of the other methods. Were their hyper-parameters tuned on the dataset used in this paper? How many parameters does each network have?
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

This paper can be easily reproduced since the authors described their method very clearly.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

The authors should consider adding more details about the implementation of the other methods. It is also suggested to list the number of parameters of each network in Table 1 for a fair comparison. It is hard to see the differences between the results in Fig. 3. A clearer presentation of the results should be considered.
Please state your overall opinion of the paper

Probably accept (7)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The main factors were the novelty of the proposed method, the experiment design, and the result analysis.
What is the ranking of this paper in your review stack?

2
Number of papers in your stack

5
Reviewer confidence

Very confident

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

This paper proposes Transformer-GAN for generating standard-dose PET from low-dose PET. The entire work is straightforward to understand. There are several issues regarding parameters, etc., which should be addressed in the rebuttal. Meanwhile, R1 has raised concerns about the dataset. The authors need to provide more information about their data and clarify accordingly.
What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

7

Author Feedback

We thank all the reviewers for acknowledging our methodological contribution and for their constructive comments, which call for further clarification.

To Reviewer 1:

Q1: Details about the PET data
A1: The subjects in our PET dataset were administered an average of 203 MBq of [18F]FDG. All data were acquired on a Siemens Biograph mMR PET-MR system. Standard-dose (SPET) and low-dose PET (LPET) images were acquired consecutively based on standard imaging protocols. Specifically, the SPET images were obtained in a 12-minute period within one hour of tracer injection. The LPET scans were acquired in a short 3-minute period to simulate acquisition at a reduced dose of radioactive tracer. The simulation is equivalent to a quarter of the standard dose. PET reconstruction was carried out iteratively with the ordered subsets expectation maximization (OSEM) method with 3 iterations and 21 subsets, followed by post-reconstruction filtering with a 3D Gaussian kernel with an FWHM of 2 mm. The SPET and LPET images of the same subject used the same attenuation map, computed by the Dixon fat-water method provided by the scanner manufacturer.

Q2: The scale of the PET dataset
A2: Please note that the number of subjects (16) in our dataset is comparable to that of similar works in this field, e.g., the 9 subjects used in [6] at MICCAI 2020. More importantly, despite having only 16 subjects in total, we can provide a sufficient number of samples to train a good model by extracting 729 large patches of size 64×64×64 from each whole image of size 128×128×128. Also, to enhance the stability of the model with limited samples, we used a leave-one-subject-out cross-validation (LOOCV) strategy: the training-test procedure is repeated 16 times, each time using one subject in turn for testing and the other 15 subjects for training, and the averaged performance is reported to avoid potential bias. In this manner, the number of training samples per fold is increased from 15 to 10,935 (15 subjects × 729 patches), which is sufficient to train our model. It is also noteworthy that, despite having only 16 subjects, our dataset covers variation across both pathological and healthy subjects.
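The sample counts in A2 can be checked with a short sketch. The rebuttal does not state the patch stride; a stride of 8 voxels is an assumption chosen here because it yields 9 start positions per axis and therefore 9³ = 729 patches. The function names and subject identifiers are hypothetical.

```python
from itertools import product


def patch_starts(image_size=128, patch_size=64, stride=8):
    """Start indices along one axis for overlapping patches (stride is assumed)."""
    return list(range(0, image_size - patch_size + 1, stride))


def patch_corners(image_size=128, patch_size=64, stride=8):
    """All (z, y, x) corner positions of cubic patches in a cubic volume."""
    starts = patch_starts(image_size, patch_size, stride)
    return list(product(starts, repeat=3))


def loocv_folds(subjects):
    """Yield (training subjects, held-out test subject) for each fold."""
    for i, test_subject in enumerate(subjects):
        yield subjects[:i] + subjects[i + 1:], test_subject


corners = patch_corners()
subjects = [f"subject_{i:02d}" for i in range(16)]  # placeholder IDs
folds = list(loocv_folds(subjects))

# Training patches available in each fold: 15 subjects x 729 patches.
train_patches_per_fold = len(folds[0][0]) * len(corners)
print(len(corners), len(folds), train_patches_per_fold)
```

Note that patches drawn with such a small stride overlap heavily, so the 10,935 training samples are not statistically independent; this is consistent with the meta-reviewers still listing dataset scale as a limitation.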

Q3: Report the reduced channel number
A3: To reduce the computational overhead and match the channel dimension of the fixed position encoding, the extracted feature maps are further processed by a convolutional layer that reduces the number of channels from 512 to 192. We will report the reduced channel number in the final paper.
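To give a rough sense of why the 512-to-192 reduction lowers the overhead, the sketch below counts parameters. The rebuttal does not specify the kernel size of the reducing convolution, so a 1×1×1 kernel is an assumption, and the Q/K/V projection count assumes a standard self-attention layer with square projections; neither is confirmed by the source, and both helper names are hypothetical.

```python
def conv3d_param_count(in_channels, out_channels, kernel_size=1, bias=True):
    """Parameter count of a 3D convolution with a cubic kernel."""
    weights = in_channels * out_channels * kernel_size ** 3
    return weights + (out_channels if bias else 0)


def qkv_projection_weights(channels):
    """Weights in the query, key, and value projections of one attention layer,
    assuming each projection is a square channels x channels matrix."""
    return 3 * channels * channels


# Cost of the reducing layer itself, under the assumed 1x1x1 kernel.
reduce_params = conv3d_param_count(512, 192, kernel_size=1)

# Relative size of the attention projections after versus before the reduction.
ratio = qkv_projection_weights(192) / qkv_projection_weights(512)
print(reduce_params, ratio)
```

Under these assumptions the projection weights shrink to about 14% of their size at 512 channels, because they scale with the square of the channel width.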

Q4: Add the PSNR, SSIM and NMSE for LPET in Table 1.
A4: As suggested, we have calculated the three metrics (i.e., PSNR, SSIM, and NMSE) for LPET. On average, the LPET images achieve a PSNR of 20.684, an SSIM of 0.979, and an NMSE of 0.053 for normal control (NC) subjects, and a PSNR of 21.541, an SSIM of 0.976, and an NMSE of 0.058 for mild cognitive impairment (MCI) subjects. We will add these results to Table 1 in the final paper.
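For readers unfamiliar with the metrics in A4, the sketch below implements PSNR and NMSE using their common textbook definitions. The paper's exact normalization (for example, which intensity range is used for PSNR) is not given on this page, so `data_range` is an assumed parameter; SSIM is omitted because it requires windowed local statistics.

```python
import math


def mse(pred, target):
    """Mean squared error between two equal-length voxel sequences."""
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("pred and target must be non-empty and equal in length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def psnr(pred, target, data_range):
    """Peak signal-to-noise ratio in dB; data_range is the assumed peak intensity."""
    err = mse(pred, target)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(data_range ** 2 / err)


def nmse(pred, target):
    """Squared error normalized by the energy of the reference image."""
    energy = sum(t ** 2 for t in target)
    if energy == 0:
        raise ValueError("target has zero energy; NMSE is undefined")
    return mse(pred, target) * len(target) / energy


# Toy example: here `pred` would be an LPET (or reconstructed) image
# and `target` the real SPET image, both flattened.
pred = [0.0, 0.5, 1.0, 1.0]
target = [0.0, 0.5, 1.0, 0.5]
print(psnr(pred, target, data_range=1.0), nmse(pred, target))
```

Higher PSNR and lower NMSE indicate a closer match to the real SPET image, which is why the LPET values serve as a baseline for the reconstruction methods in Table 1.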

To Reviewer 2:

Q5: Artifacts below the arrow in Fig. 2
A5: The artifacts below the arrow in Fig. 2 may be caused by the low resolution of the image in the current version of the paper. We will improve the quality of Fig. 2 in the final paper.

Q6: The picture is too small to see the difference in detail.
A6: For a clearer display, we will zoom in on the regions with salient differences in the final paper.

To Reviewer 3:

Q7: Details about the implementation of other methods.
A7: For a fair comparison, we carefully tuned the hyper-parameters of the comparison methods on the same dataset and trained the networks with the same LOOCV strategy. The number of parameters is 41M for auto-context CNN, 127M for 3D-cGANs, 127M for LA-GANs, and 76M for our method. We will provide these details in Table 1 in the final paper.

Q8: The differences between different results in Fig. 3
A8: To show the differences among the results in Fig. 3 clearly, we will zoom in on the regions with salient differences and provide the error maps between the reconstructed images and the corresponding real images in the final paper.

Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

This paper proposes Transformer-GAN for reconstructing standard-dose PET from low-dose PET. It is a typical Transformer-based work and quite straightforward. The reviewers raised several issues regarding parameters, etc. Meanwhile, R1 raised questions about the dataset and the implementation details of this paper. In the rebuttal, the authors responded properly.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept
What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

5

Meta-review #2

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

This article proposes a method to generate the standard-dose PET image from the low-dose PET image. The method is based on a 3D Transformer-GAN, which takes advantage of the transformer’s ability to capture long-range semantic information. The proposed framework is interesting. The weakness of the work is the size of the dataset, which is small for a 3D architecture composed of a 3D encoder, a 3D decoder, a transformer, and a discriminator. Despite this weak point, I propose to accept this paper.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept
What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

4

Meta-review #3

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

The idea of using a transformer architecture in a 3D GAN for reconstructing the standard-dose PET image from the low-dose PET image seems interesting and promising. In the rebuttal, the authors addressed the reviewers’ concerns regarding dataset details and promised to include higher-resolution figures in the final version. Although the scale of the dataset may still be a limitation, I tend to accept this paper after the authors’ rebuttal.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept
What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

11

3D Transformer-GAN for High-quality PET Reconstruction