
Authors

Pujin Cheng, Li Lin, Yijin Huang, Junyan Lyu, Xiaoying Tang

Abstract

Fundus image quality is crucial for screening various ophthalmic diseases. In this paper, we proposed and validated a novel fundus image enhancement method, named importance-guided semi-supervised contrastive constraining (I-SECRET). Specifically, our semi-supervised framework consists of an unsupervised component, a supervised component, and an importance estimation component. The unsupervised part makes use of a large publicly-available dataset of unpaired high-quality and low-quality images via contrastive constraining, whereas the supervised part utilizes paired images through degrading pre-selected high-quality images. The importance estimation provides a pixel-wise importance map to guide both unsupervised and supervised learning. Extensive experiments on both authentic and synthetic data identify the superiority of our proposed method over existing state-of-the-art ones, both quantitatively and qualitatively.

Link to paper

DOI: https://doi.org/10.1007/978-3-030-87237-3_9

SharedIt: https://rdcu.be/cyl9O

Link to the code repository

https://github.com/QtacierP/ISECRET

Link to the dataset(s)

N/A

Reviews

Review #1

Please describe the contribution of the paper

The paper proposes a novel semi-supervised approach for image enhancement, specifically for fundus images. The approach provides a pixel-wise importance map to guide both supervised and unsupervised learning.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

A novel semi-supervised method for fundus enhancement, with pixel-wise image translation using a supervised loss. Good use of an adversarial loss to distinguish between enhanced and authentic images. A very well-implemented and elaborate study.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

In some places, the approach is too complicated to understand. For example, in Section 2.2, it is difficult to follow the study step by step. The framework is a combination of several prior approaches, but it addresses a very important clinical problem.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

In some places, it is difficult to follow the framework; otherwise, the paper seems to be reproducible.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
1. Good literature survey.
2. Novel implementation of semi-supervised approach for an important clinical application.
3. In Section 2.1, the use of the decoder branch needs more elaboration, as do its implementation and associated parameters.
4. Section 2.2 is complicated to understand. Please divide it in steps.
5. Use of LS-GAN based objective function is interesting.
6. Fonts of images should be increased, especially in Fig 1.
7. The proposed metric FIQA is nicely implemented by using MCF-Net. I am not sure about the quality and resemblance of the fundus images used in MCF-Net.
8. Results are promising.
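
For context on comment 5, the following is a minimal sketch of the standard least-squares GAN (LS-GAN) objective, using target labels 1 for authentic images and 0 for enhanced ones. It is not taken from the paper: the function names are hypothetical, the scores are plain Python floats standing in for discriminator outputs, and the exact form used in I-SECRET may differ.

```python
def lsgan_discriminator_loss(real_scores, fake_scores):
    """Least-squares discriminator loss.

    Pushes scores of authentic high-quality images towards 1 and
    scores of enhanced (generated) images towards 0.
    """
    real_term = sum((s - 1.0) ** 2 for s in real_scores) / len(real_scores)
    fake_term = sum(s ** 2 for s in fake_scores) / len(fake_scores)
    return 0.5 * real_term + 0.5 * fake_term


def lsgan_generator_loss(fake_scores):
    """Least-squares generator loss.

    Pushes the discriminator's scores of enhanced images towards 1.
    """
    return 0.5 * sum((s - 1.0) ** 2 for s in fake_scores) / len(fake_scores)


if __name__ == "__main__":
    # A discriminator that is undecided (0.5 everywhere) on both sets.
    print(lsgan_discriminator_loss([0.5, 0.5], [0.5, 0.5]))
    print(lsgan_generator_loss([0.5, 0.5]))
```

Compared with the usual cross-entropy GAN loss, the squared error keeps penalising samples that are classified correctly but lie far from the target label.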
Please state your overall opinion of the paper

Probably accept (7)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

Very well structured paper, addressing an important issue in medical domain. The framework is novel and well implemented.
What is the ranking of this paper in your review stack?

1
Number of papers in your stack

3
Reviewer confidence

Somewhat confident

Review #2

Please describe the contribution of the paper

This paper introduces I-SECRET, a semi-supervised method for enhancing retinal fundus images, which involves two stages: an unsupervised stage using unpaired high-quality and low-quality images (from a public source), and a supervised stage using paired high-quality images and their artificially-degraded low-quality image counterparts. The large (unpaired) EyeQ dataset is used as the unsupervised input, while the smaller (high-quality) DRIVE dataset is used (with artificially degraded pairs) as the supervised input. Improvements over previous GAN-based methods were observed on various full-reference (PSNR, SSIM, VSD) and non-reference (FIQA) metrics.
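
As a reminder of what the full-reference metrics measure, the sketch below computes PSNR from its standard definition, 10 * log10(MAX^2 / MSE), on flattened pixel lists. The function name, the flattened-list input, and the 8-bit maximum of 255 are assumptions made for illustration; the paper's evaluation code may be implemented differently.

```python
import math


def psnr(reference, enhanced, max_value=255.0):
    """Peak signal-to-noise ratio (in dB) between two flattened images."""
    if not reference or len(reference) != len(enhanced):
        raise ValueError("images must be non-empty and of equal size")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, enhanced)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    clean = [100, 120, 140, 160]
    restored = [102, 119, 143, 158]
    print(f"PSNR: {psnr(clean, restored):.2f} dB")
```

A paired metric like this requires the high-quality ground truth, which is why it applies only to the synthetically degraded pairs and not to the authentic low-quality images.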
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The main strength of the paper is its leveraging of both abundant & authentic data for unsupervised learning, and paired but synthetic and limited data for supervised learning, within a single framework. In particular, this in theory allows authentic degradations to be generalized/taken into account despite being unpaired, which is a potential improvement over purely-unpaired approaches such as CycleGAN. The empirical evaluation was also fairly comprehensive.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

Some weaknesses include that the presentation of the method could possibly be improved/clarified, and that perhaps the most interesting comparison - against a plausible fully-supervised approach involving artificial degradation of the abundant large dataset currently used for unsupervised training - was unexplored. Details are given in the comments.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The datasets used are publicly available, and the hyperparameters used for the degradation process are provided in the appendix, but the model architecture is relatively complex. The authors appear to have hosted the code on GitHub, judging from a placeholder link at the end of the Introduction section, but it is not available for inspection as yet.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
1. In general, the methodology and functioning of the proposed I-SECRET model could have been more clearly described. In particular:
   a) The framework figure (Figure 1) might be adjusted to correspond to notation provided in the text. For example, enhancement network G and adversarial discriminator D are prominently featured in the textual description; such labels might thus also be used in the figure (e.g. is G the “Encoder+Enhancement Decoder+Importance Decoder(s)” in Figure 1?)
   b) The architecture of enhancement network G (as described in Section 2.2) might moreover be explained in greater detail (i.e. how many feature maps per layer, etc.)
   c) For G, it is stated that pairs of randomly selected patches of different sizes were chosen for contrastive loss computation. It might be explicitly stated as to how many pairs, and of what sizes.
   d) While three stages (unsupervised, supervised, adversarial) are described in the framework, with an overall loss function (Equation 6) comprising loss elements from all three stages, it is not very clear whether and how these stages take place: simultaneously (as suggested by the overall loss function), or sequentially (as with each of the stages and its encoders/decoders/models being trained separately, or even in a cycle such as the unsupervised stage for X iterations, then the supervised/adversarial stage for Y iterations)?
   e) If the training of the stages is simultaneous, the training inputs to the full I-SECRET model might be specified. In particular, would each training instance include one (unpaired) low-quality image and one (paired) artificially-degraded image, with the (paired) high-quality image as ground truth?
   f) While the contrastive IS-loss computation (Equation 1) is placed within the importance-guided contrastive constraining stage in Figure 1, that stage appears to be unsupervised with the (unpaired) low-quality image (as also supported by the patches being drawn from the low-quality image). In this case, how might the IS-loss be calculated, since the true ground truth (i.e. the high-quality version of the low-quality image) for MSE computation purposes appears unavailable? This might be clarified.
   g) It does not seem clear as to whether the authentic (unpaired) high-quality/good images are utilized within the framework. Are these high-quality images not used during training, or are they used alongside the low-quality images in the unsupervised training? This might be clarified.
   h) Two importance decoders for producing the importance map are shown in Figure 1. It might be clarified as to whether there is an independent decoder for each of the unsupervised/supervised stages/paths, or whether they are actually the same decoder (i.e. single importance decoder), and if so, whether the importance decoder is trained during only the (unsupervised) contrastive constraining stage.
2. While the trade-off coefficients are stated to be set to 1 for simplicity in Section 2.3, these hyperparameters could plausibly be optimized further.
3. It might be considered to apply the FIQA to the (degraded) DRIVE dataset after enhancement by I-SECRET as well, to provide a point of comparison against the non-reference EyeQ dataset.
4. The enhancement of retinal images with relatively rare features (e.g. microaneurysms, haemorrhages, exudates) might be commented on; would I-SECRET be able to preserve these (rare but true) features after enhancement, or would they be incorrectly identified as artifacts, and removed?
5. For Section 4, the title “Conlusion” might be “Conclusion”
Please state your overall opinion of the paper

Borderline reject (5)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The semi-supervised method is innovative, but concerns remain over its presentation as detailed in the comments. The paper would probably deserve a higher recommendation if these concerns could be clarified.
What is the ranking of this paper in your review stack?

1
Number of papers in your stack

6
Reviewer confidence

Confident but not absolutely certain

Review #3

Please describe the contribution of the paper

The research presents a framework to enhance low-quality fundus photographs using a semi-supervised approach consisting of a supervised component, an unsupervised component and an importance estimation component. The approach is compared with other state-of-the-art techniques both quantitatively and qualitatively, and it shows better performance.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

This research is the first semi-supervised approach for the enhancement of fundus photographs. The enhanced fundus photographs can be used in future work to determine the degree and severity of ophthalmic diseases. Various evaluation metrics have been used to compare the performance against other approaches, which shows superior performance using the proposed approach.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

Implementation details of the other methods, CycleGAN and CutGAN, are not provided. What kind of network was used for the generator/discriminator of CycleGAN? Was it based on U-Net or ResNet? Was the result from the proposed approach statistically significant in comparison with other approaches? It would be interesting to at least qualitatively visualize the performance on the “Reject” quality images of the EyeQ dataset, which have not been used.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The training code for the research has been provided, the hyper-parameters for the proposed approach have been described, and the details of the training/testing sets have been provided.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

This work provides a novel approach to enhance low quality fundus photographs. In future work, it would be interesting to see the actual application of the work to determine the severity of ophthalmic diseases, how this work helps to accurately diagnose the cause even from low quality images.
Please state your overall opinion of the paper

Probably accept (7)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The research has potential to assist in future work for better diagnosis from low quality fundus images. In emergency situations, the fundus photographs may be more readily available, and the novel enhancement approach may enable proper diagnosis from degraded fundus images.
What is the ranking of this paper in your review stack?

1
Number of papers in your stack

4
Reviewer confidence

Confident but not absolutely certain

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

The paper received positive comments. The whole paper is interesting and verifies the effectiveness and novelty of the proposed method. The major concerns of reviewers are details of implementation and experiment. The authors should address these issues in the final version. Overall, it reaches the minimum requirement for publication.
What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

1

Author Feedback

#1: Q3 & Q5: Due to the space limitation of MICCAI, we cannot present all details in the current version. We will release all our source code in the final version so that readers can easily reproduce all our experimental results and fully understand the proposed pipeline. We will also provide more details in our future journal version.

#2: Q6.1.(a): We will modify Fig. 1 accordingly.
Q6.1.(b): Due to the space limitation of MICCAI, we cannot present all details of the network architecture in the current version. We will release all our source code in the final version so that readers can access the entire network architecture, which mainly follows CycleGAN.
Q6.1.(c): We choose 256 patches, including 255 negative samples and one positive sample. The patch size is related to the down-sampling. We recommend that readers follow the details in CutGAN (taesungp/contrastive-unpaired-translation) or our officially published GitHub code in the final version of this work.
Q6.1.(d): The aim of these three stages is to optimize the enhancement network. Therefore, we train these stages simultaneously for better co-optimization.
Q6.1.(e): Yes. There are three images in one batch: an authentic poor-quality image, an authentic high-quality image, and the artificially-degraded version of the authentic high-quality one.
Q6.1.(f): The IS-loss is only calculated in the supervised stage. In the unsupervised stage, we only predict the importance but do not calculate the IS-loss, since there is no ground truth. We utilize the estimated importance to get the ICC loss, which is an unsupervised contrastive loss.
Q6.1.(g): The authentic high-quality images are used in the adversarial training stage. We train the discriminator to distinguish authentic high-quality images from enhanced ones in an unpaired fashion.
Q6.1.(h): They are the same decoder. The importance decoder is only trained in the supervised stage. In the unsupervised stage, we freeze the gradient in the importance decoder (the detach operation in PyTorch). We only need to predict the importance map in the unsupervised stage and use it to assign a weight to the contrastive loss.
Q6.2: We have tried to fine-tune these trade-off coefficients, and found there is no significant performance gap between different selections. Therefore, we choose the simplest 1:1 combination.
Q6.3: The number of images in the DRIVE dataset is limited (only 40). Therefore, we choose VSD as the evaluation metric, which treats each pixel as an instance. It may be much fairer and more significant than FIQA when evaluating the performance improvement on the DRIVE dataset.
Q6.4: Honestly, it is the most challenging issue in the fundus enhancement task. Although our proposed I-SECRET is more likely to avoid such information modification compared to other methods and works much better, it may still fail in specific cases. Most fundus image enhancement methods don’t seem to have a positive impact on diagnosing diabetic retinopathy, which depends on the rare features pointed out by the reviewer. There is still a long way to go, and we hope to have a better solution in the future.
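
To illustrate the answers to Q6.1.(c) and Q6.1.(h), the sketch below computes a patch-wise contrastive (InfoNCE-style) loss in which each query patch has one positive and all remaining sampled patches as negatives, so 256 sampled patches would give 255 negatives per query. It is only a sketch under stated assumptions: the function names, the temperature of 0.07, and the simple per-patch multiplicative weighting are hypothetical choices, and the importance values are plain constants here, which mirrors the detached importance map that receives no gradient. The released code should be consulted for the actual ICC loss.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def info_nce(query, positive, negatives, temperature=0.07):
    """Cross-entropy of selecting the positive among 1 + len(negatives) candidates.

    Feature vectors are assumed to be L2-normalised already.
    """
    logits = [_dot(query, positive) / temperature]
    logits += [_dot(query, n) / temperature for n in negatives]
    # Numerically stable log-sum-exp.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum_exp - logits[0]


def importance_weighted_contrastive_loss(queries, keys, importance,
                                         temperature=0.07):
    """Average of per-patch InfoNCE terms, each scaled by its importance.

    queries[i] is the feature of patch i in the enhanced image, keys[i] the
    feature of the same location in the input image (the positive); all
    other keys act as negatives. importance[i] is treated as a constant.
    """
    total = 0.0
    for i, query in enumerate(queries):
        negatives = [k for j, k in enumerate(keys) if j != i]
        total += importance[i] * info_nce(query, keys[i], negatives, temperature)
    return total / len(queries)


if __name__ == "__main__":
    feats = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
    weights = [1.0, 0.5, 0.5, 1.0]
    print(importance_weighted_contrastive_loss(feats, feats, weights))
```

Because the weights are constants, patches with low importance contribute less to the contrastive gradient, while the weighting itself is not updated by this loss.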

#4 Q3.1: Due to the space limitation of MICCAI, we cannot provide all details of CycleGAN/CutGAN in the current version. We refer readers to their original papers provided in the references. We choose the ResNet backbone in CycleGAN with some modifications. We will release all our source code in the final version so that readers can access all our implementation details.
Q3.2: The proposed approach is significantly better than the other compared ones in terms of each metric (p-value < 1e-5). We will provide more results on GitHub and in our future journal extension.
Q3.3: Due to the space limitation of MICCAI, we cannot provide such visualization results. The rejected images are mainly those that lack key anatomical information (e.g., the optic cup) or are overexposed.


I-SECRET: Importance-guided fundus image enhancement via semi-supervised contrastive constraining