
Authors

Junshen Xu, Elfar Adalsteinsson

Abstract

Image denoising is of great importance for medical imaging systems, since it can improve image quality for disease diagnosis and downstream image analyses. In a variety of applications, dynamic imaging techniques are utilized to capture the time-varying features of the subject, where multiple images are acquired for the same subject at different time points. Although the signal-to-noise ratio of each time frame is usually limited by the short acquisition time, the correlation among different time frames can be exploited to improve denoising results with shared information across time frames. With the success of neural networks in computer vision, supervised deep learning methods show prominent performance in single-image denoising, but they rely on large datasets with clean-vs-noisy image pairs. Recently, several self-supervised deep denoising models have been proposed, achieving promising results without needing the pairwise ground truth of clean images. In the field of multi-image denoising, however, very little work has been done on extracting correlated information from multiple slices for denoising using self-supervised deep learning methods. In this work, we propose Deformed2Self, an end-to-end self-supervised deep learning framework for dynamic imaging denoising. It combines single-image and multi-image denoising to improve image quality and uses a spatial transformer network to model motion between different slices. Further, it only requires a single noisy image with a few auxiliary observations at different time frames for training and inference. Evaluations on phantom and in vivo data with different noise statistics show that our method has comparable performance to other state-of-the-art unsupervised or self-supervised denoising methods and outperforms them under high noise levels.

Link to paper

DOI: https://doi.org/10.1007/978-3-030-87196-3_3

SharedIt: https://rdcu.be/cyl1t

Link to the code repository

https://github.com/daviddmc/Deform2Self

Link to the dataset(s)

https://www.creatis.insa-lyon.fr/Challenge/acdc/databasesTraining.html

Reviews

Review #1

Please describe the contribution of the paper

The paper proposed an end-to-end deep framework for denoising dynamic imaging. The proposed Deformed2Self model consists of three modules and has shown effective performance.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1. The paper proposes an end-to-end framework for dynamic imaging denoising, a task that is challenging in terms of image size and motion information.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
1. The parameter estimation and the training procedure need to be discussed in more detail.
2. The method still operates on 2D images, but it could be extended to make use of 3D data where available.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
1. The paper needs to give more insight into the parameter estimation and training so that readers can reproduce the work in the future.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
1. The paper presents an interesting idea, Deformed2Self, for denoising.
2. A more detailed analysis of the STN framework and its role in the pipeline is required.
3. It is mentioned that the first module is a UNet that denoises individual frames. The authors should show how much improvement is gained from the output of the first module to the output of the final multi-image denoising module.
4. For a more reliable measure, the experiments should be repeated several more times, and Table 1 should report standard deviations.
5. Can temporal information about the noise be used in the model? The work still boils down to single-image denoising plus motion information to some extent.
6. Dynamic imaging was the focus of the paper. It would be great if the author(s) could highlight how to proceed for single-image or 3D imaging.
Please state your overall opinion of the paper

Probably accept (7)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
1. The idea is good and well suited to the interests of the conference.
2. The experiments are sufficient for demonstration. However, some information about the training procedure needs to be highlighted.
What is the ranking of this paper in your review stack?

2
Number of papers in your stack

5
Reviewer confidence

Confident but not absolutely certain

Review #2

Please describe the contribution of the paper

This paper presents a framework for removing the noise in dynamic medical images. In general, the proposed framework consists of three major modules, including a single-image denoising module for coarse noise reduction, a spatial transformer network for warping the frames to the target one, and a multi-image denoising network for fine noise reduction of the target frame with the information from all frames. The proposed framework is evaluated using two datasets, PINCAT and ACDC, and is compared with different methods including DIP, Self2Self, BM3D, and VBM4D.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The main strengths of this work lie in:
1. The proposed framework, Deformed2Self, is totally self-supervised, avoiding the need for paired noisy-and-clean images.
2. Deformed2Self is able to exploit the valuable information in dynamic images for removing the noise in a target frame.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
The main weaknesses of this work lie in:
1. A registration part, spatial transformer network, is involved in the proposed framework, which raises the concern regarding the noise characteristics.
2. The multi-frame images are correlated with each other, providing useful information for removing the noise. However, they are not exactly the same and possess appearance differences from each other.
3. It is a mistake to add the noise to the acquired images, which are already noisy.
4. The proposed framework is unable to provide unbiased noise reduction.
5. Deformed2Self is a large framework with three stages and multiple sub-networks. It should be very challenging to train the whole framework.
Please rate the clarity and organization of this paper

Satisfactory
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The whole framework is very complex, raising my concern about the reproducibility of this work.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
1. In general, registration involves interpolation, which can change the noise distribution and induce bias in the denoising result. This raises my concern regarding the change of noise characteristics and unbiased noise reduction.
2. The multi-frame images are not exactly the same and possess appearance differences from each other. How can their correlation be exploited without being influenced by the inter-image differences? Please clarify.
3. Fig. 1 is not clear. What are f_s and f_m? Figures should be self-contained. More information should be provided.
4. In general, it is fine to add noise to the synthetic ground truth data, but not the real data. Adding noise to real data will change the noise distribution and induce significant bias in the results.
5. The single-image denoising network, designed based on Noise2Void and Self2Self, is suitable for removing Gaussian noise. I am not sure whether it is able to remove Rician noise, which has a much more complex distribution than Gaussian noise.
6. The authors should provide more details on how they train the whole framework. Is the training performed in an end-to-end or a stage-by-stage manner?
Please state your overall opinion of the paper

borderline reject (5)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

There are some technical contributions in this work. However, both methods and experiments suffer from several major drawbacks, reducing the practical use of this work.
What is the ranking of this paper in your review stack?

4
Number of papers in your stack

3
Reviewer confidence

Very confident

Review #3

Please describe the contribution of the paper

The paper proposed an end-to-end self-supervised deep learning framework for dynamic image denoising. The method outperformed BM3D, VBM4D, DIP and Self2Self based on phantom and in vivo datasets with different noise statistics.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

A spatial transformer network was included in the proposed method, which can estimate the deformation field between frame k and the target frame. This is the main strength, as the proposed method can compensate for the motion between different frames while denoising based on the spatial information from the other frames.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

The authors performed ablation studies to evaluate different components in their framework. It seems that without single-image denoising or without registration the performance didn’t drop much. What happens without both of them?
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The paper provides sufficient details about the models/algorithms, datasets, and evaluation.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

It would be better to make it clear which dataset Fig. 2 and Fig. 3 belong to. The verb is missing in this sentence: “In all the experiments, we another four noisy images from the same sequences as auxiliary observations”
Please state your overall opinion of the paper

accept (8)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The paper was well-organized. The method is novel and easy to apply.
What is the ranking of this paper in your review stack?

3
Number of papers in your stack

5
Reviewer confidence

Very confident

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

This paper proposed a self-supervised denoising network by considering the correlation among different time frames. Overall, research on self-supervised networks in medical imaging is interesting. The paper got scores of 7, 5, 8. The major concerns of the reviewers are the extension to 3D data instead of 2D images, the change in noise distribution caused by adding noise to real data, practical usefulness, etc. The authors are invited to clarify these concerns.
What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

2

Author Feedback

First of all, we would like to thank the reviewers for their constructive feedback. In the rebuttal, we focus on the major concerns raised by the reviewers.

Training process of the proposed method. (R1, R2) Since our method is fully self-supervised, it can be trained on a single sample (the target frame with N other frames). The network is optimized end-to-end with the Adam optimizer by minimizing the loss function L = lambda_s * L_s + lambda_r * L_r + L_m, where the details of each loss term can be found in Eqs. 3-5. The hyperparameter choices are listed in Sec. 3.1. As for the complexity of our architecture, it is worth noting that the networks of the three submodules are all UNet-based, which is common in deep learning. The architecture of the whole method is a stacked UNet with a grid sample operator in the middle, which can be easily trained end-to-end with optimizers such as Adam.
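
As a rough illustration of the objective stated above, the following Python sketch combines three scalar loss terms into the total loss. The function name `total_loss` and the numeric values are hypothetical. Reading L_s, L_r, and L_m as the single-image, registration, and multi-image terms is an assumption based on the three modules; the actual definitions are in Eqs. 3-5 of the paper and the weights in Sec. 3.1.

```python
def total_loss(l_s, l_r, l_m, lambda_s, lambda_r):
    """Weighted sum L = lambda_s * L_s + lambda_r * L_r + L_m.

    l_s: assumed single-image denoising loss (scalar)
    l_r: assumed registration loss (scalar)
    l_m: assumed multi-image denoising loss (scalar)
    """
    return lambda_s * l_s + lambda_r * l_r + l_m


# Hypothetical values, chosen only to show the arithmetic:
# 1.0 * 0.5 + 2.0 * 0.25 + 1.0 = 2.0
print(total_loss(l_s=0.5, l_r=0.25, l_m=1.0, lambda_s=1.0, lambda_r=2.0))
```

Because the three terms are summed into one scalar, a single optimizer step can update all three sub-networks at once, which is what end-to-end training means here.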

Appearance differences in different frames. (R2) To address the appearance differences in different frames, two strategies are used in our method. First, instead of using the whole image sequence, we only use a few adjacent frames to help denoise the target frame, i.e., we use a small N, since image contrast is stable over a relatively short period. Second, in the multi-image denoising stage, the network also takes the original noisy target frame, y_0, as input, so that the network is conditioned on the contrast of the target frame.
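
The first strategy can be sketched as a small index-selection helper. This is only an illustration: the function name `select_auxiliary_frames` and the tie-breaking rule are assumptions, and the sequence is treated as non-periodic. The value of four auxiliary frames matches the sentence quoted by Reviewer #3.

```python
def select_auxiliary_frames(num_frames, target, n_aux):
    """Return the indices of the n_aux frames closest in time to `target`.

    The target frame itself is excluded. Ties in distance are broken in
    favour of the earlier frame (an arbitrary, assumed rule).
    """
    if not 0 <= target < num_frames:
        raise ValueError("target index out of range")
    if n_aux > num_frames - 1:
        raise ValueError("not enough frames in the sequence")
    candidates = [i for i in range(num_frames) if i != target]
    candidates.sort(key=lambda i: (abs(i - target), i))
    return sorted(candidates[:n_aux])


print(select_auxiliary_frames(num_frames=20, target=10, n_aux=4))  # [8, 9, 11, 12]
print(select_auxiliary_frames(num_frames=20, target=0, n_aux=4))   # [1, 2, 3, 4]
```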

Unbiased noise reduction. (R2) The output of the network is trained with the blind-spot technique with dropout, similar to Self2Self. According to [21], when the noise has zero mean, the expectation of the masked MSE loss is the MSE between the network output and the underlying clean image. Hence, the expectation of the output is E(x|y) = x. For more complex noise models, such as Rician noise, other techniques like a variance-stabilizing transformation can be applied before using our proposed denoising method.
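
The zero-mean argument can be checked numerically on a toy signal. The sketch below is not the authors' implementation: the helper names, the keep probability of 0.7, and the constant "prediction" are all assumptions made for illustration. It relies on the standard identity that, for zero-mean noise independent of the prediction, the expected squared error against the noisy pixel equals the squared error against the clean pixel plus the noise variance.

```python
import random


def bernoulli_mask(n, keep_prob, rng):
    """1 = pixel shown to the network, 0 = pixel hidden (blind spot)."""
    return [1 if rng.random() < keep_prob else 0 for _ in range(n)]


def masked_mse(pred, target, mask):
    """Mean squared error evaluated only on the hidden pixels (mask == 0)."""
    hidden = [(p - t) ** 2 for p, t, m in zip(pred, target, mask) if m == 0]
    if not hidden:
        raise ValueError("mask hides no pixels")
    return sum(hidden) / len(hidden)


rng = random.Random(0)
clean, sigma, n = 1.0, 0.5, 50000
noisy = [clean + rng.gauss(0.0, sigma) for _ in range(n)]
mask = bernoulli_mask(n, keep_prob=0.7, rng=rng)  # hypothetical keep probability
pred = [0.8] * n  # a fixed guess, independent of the noise at hidden pixels

loss_noisy = masked_mse(pred, noisy, mask)
loss_clean = masked_mse(pred, [clean] * n, mask)
# The gap is close to sigma**2 = 0.25, a constant that does not depend on pred.
print(loss_noisy - loss_clean)
```

Since the gap is a constant, minimizing the loss against noisy pixels ranks predictions the same way as minimizing it against the clean image, which is the property the rebuttal appeals to.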

The concern regarding the noise characteristics in the registration part. (R2) First, the registration module may be sensitive to the noise in the input images. Therefore, we introduce a single-image denoising stage to improve the registration accuracy by reducing noise using internal information in each image. Second, the grid sample operator in registration may change the pdf of the noise in the output image. However, our method does not make assumptions about the pdf of the noise. Besides, as we mentioned before, the multi-image denoising network is conditioned on the original noisy images, which have the original noise statistics.
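
The effect of interpolation on noise statistics can be seen in a one-dimensional toy example. The sketch below is an assumed stand-in for the 2D grid sample operator, not the paper's code: it resamples white Gaussian noise at a half-pixel shift with linear interpolation, which averages two independent neighbours and therefore roughly halves the variance.

```python
import random
import statistics


def warp_linear(signal, shift):
    """Resample a 1D signal at positions i + shift with linear interpolation.

    Positions outside the signal are clamped to the border.
    """
    n = len(signal)
    out = []
    for i in range(n):
        pos = min(max(i + shift, 0.0), n - 1.0)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        w = pos - lo
        out.append((1.0 - w) * signal[lo] + w * signal[hi])
    return out


rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(20000)]
warped = warp_linear(noise, shift=0.5)
ratio = statistics.pvariance(warped) / statistics.pvariance(noise)
print(ratio)  # roughly 0.5 for a half-pixel shift
```

An integer shift would leave the variance unchanged, so the warped frames can carry spatially varying noise levels that depend on the estimated deformation.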

Add noise to real data. (R2) Since our method is self-supervised, the real data is only used as a reference when computing the quantitative metrics and is not seen during training. To avoid the noise problem in the quantitative results, we also provide visual results for the real dataset, and we ran experiments on a numerical phantom dataset, which is completely noise-free.
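
For readers unfamiliar with how synthetic corruption of a reference image differs between noise models, the sketch below contrasts additive Gaussian noise with Rician noise on a single intensity value. The paper's exact noise-generation procedure is not given on this page, so the functions here are assumptions based on the standard model in which a Rician sample is the magnitude of a complex value with independent Gaussian noise in each channel.

```python
import math
import random


def add_gaussian(x, sigma, rng):
    """Additive zero-mean Gaussian noise."""
    return x + rng.gauss(0.0, sigma)


def add_rician(x, sigma, rng):
    """Magnitude of a complex value with i.i.d. Gaussian noise in both channels."""
    real = x + rng.gauss(0.0, sigma)
    imag = rng.gauss(0.0, sigma)
    return math.hypot(real, imag)


rng = random.Random(0)
n, sigma = 50000, 1.0
for x in (0.0, 5.0):
    g = sum(add_gaussian(x, sigma, rng) for _ in range(n)) / n
    r = sum(add_rician(x, sigma, rng) for _ in range(n)) / n
    # The Gaussian mean stays near x; the Rician mean is biased upward,
    # most strongly in dark regions (about sigma * sqrt(pi / 2) at x = 0).
    print(f"clean={x}: Gaussian mean={g:.3f}, Rician mean={r:.3f}")
```

The upward bias at low intensity is why the zero-mean condition does not hold directly for Rician noise without an additional correction step.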

Extensions to 3D. (R1) Although 3D data is out of the scope of this paper, we would like to give a brief discussion about extending Deformed2Self to 3D in the rebuttal, since we are working on applying the Deformed2Self method to other applications where 3D data are common, such as fetal imaging. On the architecture level, UNets can be extended to 3D UNets to handle volumetric data. As for the registration part, many works have demonstrated the feasibility of performing self-supervised registration with 3D neural networks (e.g., Voxelmorph [1]).

Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

This paper proposed a self-supervised dynamic medical image denoising network that jointly performs single-image denoising, deformation estimation, and multi-frame denoising. The major concerns of the reviewers are computational complexity, extension to 3D data, changes in the noise distribution, adding noise to real data, etc. The authors clarified these concerns, and the responses are mostly convincing, except, in my view, for the reasons given for adding noise to real data. It would be more convincing if the paper provided results on medical images with purely real noise. Overall, the basic idea of unsupervised learning for denoising is interesting, and may provide a methodological basis for further investigation of real noise removal in medical applications.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept
What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

5

Meta-review #2

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

The authors did provide more details on open questions from the reviewers in their rebuttal. For this reviewer, enthusiasm for this paper is still somewhat limited by major questions on noise characteristics (interpolation, registration) and on the complexity of the approach with respect to ensuring reproducibility. This paper may not be fully ready for MICCAI, as some major questions, in particular from Reviewer #2, remain.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Reject
What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

10

Meta-review #3

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

The authors have addressed the major concerns from reviewers including noise distribution, 3D extension, and complexity of training the three-stage model.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept
What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

7


Deformed2Self: Self-Supervised Denoising for Dynamic Medical Imaging