
Authors

Ruifei Zhang, Sishuo Liu, Yizhou Yu, Guanbin Li

Abstract

Biomedical image segmentation plays a significant role in computer-aided diagnosis. However, existing CNN-based methods rely heavily on massive manual annotations, which are expensive and labor-intensive to obtain. In this work, we adopt a coarse-to-fine strategy and propose a self-supervised correction learning paradigm for semi-supervised biomedical image segmentation. Specifically, we design a dual-task network with a shared encoder and two independent decoders for segmentation and lesion region inpainting, respectively. In the first phase, only the segmentation branch is used to obtain a relatively rough segmentation result. In the second phase, we mask the detected lesion regions on the original image based on the initial segmentation map and feed it, together with the original image, into the network again to perform inpainting and segmentation simultaneously. For labeled data, this process is supervised by the segmentation annotations; for unlabeled data, it is guided by the inpainting loss of the masked lesion regions. Since the two tasks rely on similar feature information, the unlabeled data effectively enhances the network's representation of the lesion regions and further improves segmentation performance. Moreover, a gated feature fusion (GFF) module is designed to incorporate the complementary features from the two tasks. Experiments on three medical image segmentation datasets for different tasks, including polyp, skin lesion and fundus optic disc segmentation, demonstrate the outstanding performance of our method compared with other semi-supervised approaches. The code is available at https://github.com/ReaFly/SemiMedSeg.

Link to paper

DOI: https://doi.org/10.1007/978-3-030-87196-3_13

SharedIt: https://rdcu.be/cyl1D

Link to the code repository

https://github.com/ReaFly/SemiMedSeg

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper introduces the unsupervised inpainting task and designs a GFF module to enhance the network’s representation learned from unlabelled data.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    (1) This paper introduces the unsupervised inpainting task and designs a GFF module to enhance the network’s representation learned from unlabelled data. (2) The proposed method outperformed several semi-supervised methods on three different tasks.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    (1) The motivations for using the inpainting task and the GFF module are not clear, especially for the GFF module. (2) It is hard to understand the results in Fig. 4.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Most key experiment details are included so that it is possible to replicate this work.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

    (1) The motivations for using the inpainting task and the GFF module are not clear, especially for the GFF module. The GFF module includes many operation steps, and it is hard to understand the principle of each step.

    (2) The inpainting decoder and the segmentation decoder have different goals, so it would be more reasonable to design different GFF modules for the two branches instead of using the same GFF module on both.

    (3) There are questions about the results in Fig. 4. What is the difference between the red line and the blue line?

    Why is the performance gain between the green line and the blue line close to 0 when using 100% labeled data?

    (4) How does the amount of unlabelled data impact performance?

    (5) The paper claims “We propose a novel self-supervised semi-supervised learning paradigm for general lesion region segmentation of medical imaging, and verify that the pretext self-supervised learning task of reconstructing the lesion region at the pixel level can effectively enhance the feature learning and greatly reduce the algorithm’s dependence on large-scale dense annotation.” This statement is overclaimed, since the superiority of pixel-level reconstruction for semi-supervised learning has already been demonstrated in reference [2].

    (6) It would be better for this paper to employ extra unlabelled data instead of only using part of the labelled data as unlabelled data, which would contribute to more accurate segmentation performance. For example, for the skin lesion segmentation task, it is easy to collect a large amount of unlabelled data from the ISIC official website (https://challenge.isic-archive.com/).

  • Please state your overall opinion of the paper

    borderline reject (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Although some parts of this paper are interesting, I still have some concerns:

    (1) The motivations for using the inpainting task and the GFF module are not clear, especially for the GFF module. The GFF module includes many operation steps, and it is hard to understand the principle of each step. (2) The inpainting decoder and the segmentation decoder have different goals, so it would be more reasonable to design different GFF modules for the two branches instead of using the same GFF module on both. (3) There are questions about the results in Fig. 4. What is the difference between the red line and the blue line? Why is the performance gain between the green line and the blue line close to 0 when using 100% labeled data? (4) How does the amount of unlabelled data impact performance? (5) Some statements in this paper are overclaimed.

    The authors should address these comments during feedback.

  • What is the ranking of this paper in your review stack?

    4

  • Number of papers in your stack

    6

  • Reviewer confidence

    Very confident



Review #2

  • Please describe the contribution of the paper

    This paper proposes a multi-task framework for addressing semi-supervised biomedical image segmentation. Concretely, a dual-task network is introduced for segmentation and lesion region inpainting by a coarse-to-fine strategy, and a gated feature fusion module is designed to enhance representation learning between the two tasks. Experimental results on three benchmark segmentation datasets demonstrate the effectiveness of the proposed method.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    1) This paper is well-written and easy to understand. 2) In semi-supervised learning, lesion segmentation is one of the most challenging tasks, and this paper proposes an efficient framework for it. 3) Although cascaded architectures are common in deep learning, the Gated Feature Fusion module in this paper is interesting. 4) Quantitative experiments convincingly validate the core idea.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    1) My main concern is about the loss functions. The authors adopt a deep supervision strategy to train the network; what if only one loss at the output layer is used? In my experience, intermediate supervision (i.e., the deep supervision strategy used in this paper) is a very effective trick for semantic segmentation. However, there is no ablation study on this in the paper. 2) There are some unclear descriptions in the formulations. For example, how is the original image masked in Eq. (1)? Is y_hat a binary image or a probability map? In Eqs. (2)-(11), the y_hat’s seem to be probability maps.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    1) The proposed method is validated on three public datasets. 2) The authors have promised to release the source code once the paper is accepted. 3) The implementation details, including the hyper-parameter configuration, are sufficient for reproducibility.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

    1) An ablation analysis of the deep supervision strategy should be conducted in the experiments. 2) Some implementation details should be added, such as the mask operation in Eq. (2) and the ambiguous notations in the equations.

  • Please state your overall opinion of the paper

    Probably accept (7)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The basic idea is interesting, and the experiments are sufficient.

  • What is the ranking of this paper in your review stack?

    1

  • Number of papers in your stack

    5

  • Reviewer confidence

    Somewhat confident



Review #3

  • Please describe the contribution of the paper

    They proposed a novel self-supervised semi-supervised learning framework for lesion segmentation. Their framework uses a dual-task objective (lesion segmentation, inpainting) in which the two tasks can mutually enrich feature learning. They evaluated their method on three datasets (Kvasir-SEG, ISBI 2016, RIM-ONE r1). They examined the effectiveness of their self-supervised learning framework under varying ratios of labeled data, as well as the effectiveness of their GFF module.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Their novel framework uses a dual-task objective (lesion segmentation, inpainting) whose tasks rely on similar feature information. This dual-task objective can mutually enhance feature learning, and they use this mutual learning to tackle the lack of annotated data. This concept could be applied to other pairs of related tasks that rely on similar feature information in self-supervised learning; for example, on fundus images, vessel segmentation and future lesion prediction.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The authors evaluated their method on three datasets, but they only provide a detailed comparison on the Kvasir-SEG dataset (Table 1). Moreover, on Kvasir-SEG it is hard to say that their method is superior to other methods, because the performance differences are small. They evaluated the effectiveness of their self-supervised learning under varying ratios of labeled data, but they did not compare against other self-supervised learning methods (Fig. 4), so it is impossible to compare their method with others across varying ratios of labeled data.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    They mention the code will be available online.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

    (1) In Fig. 2, a sign or label for the convolution operations W_r, W_s would aid understanding. (2) In the Gated Feature Fusion (GFF) module, a detailed explanation such as the kernel size of the convolutions W_r, W_s would aid understanding. (3) Example feature maps for r_i, s_i, e^(~i) and e_seg^i would help show how the GFF module works. (4) For the ISBI 2016 and RIM-ONE r1 datasets, a table-style comparison showing the advantage of the proposed method would be more convincing. (5) In Table 1, the performance differences from other methods are too small. (6) In Fig. 4, including the performance of other methods would make the comparison more convincing across varying ratios of labeled data.

  • Please state your overall opinion of the paper

    Probably accept (7)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    They proposed a novel framework that uses a dual-task objective (lesion segmentation, inpainting) to tackle the lack of annotated data. This framework looks versatile enough to apply to other related tasks in a self-supervised setting. Although the performance improvement on Kvasir-SEG is not dramatic, if the authors attach further comparison results on the ISBI 2016 skin lesion dataset and the RIM-ONE r1 dataset and emphasize the advantages of their method, the work would be more valuable.

  • What is the ranking of this paper in your review stack?

    1

  • Number of papers in your stack

    5

  • Reviewer confidence

    Confident but not absolutely certain




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This paper presents a dual-task framework for semi-supervised biomedical image segmentation. The main strength of the proposed method lies in introducing an unsupervised inpainting task and designing a GFF module to enhance the network’s representation learned from unlabelled data. The experimental results on three datasets demonstrate consistent improvements. The authors are required to address the concerns raised by the reviewers in the rebuttal, such as the motivation for using the inpainting task, the reasonableness of the GFF module design, ablation studies on the loss functions, and the marginal performance improvement over other methods.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    4




Author Feedback

R1: Concerns about the motivations of our network. The inpainting branch acts as a bridge connecting labeled and unlabeled data. For labeled data, the inpainting encoder can help exploit more residual lesion features to aid segmentation, based on our designed two-stage framework. Meanwhile, the lesion region inpainting task is also able to create supervision signals for massive unlabeled data, further enhancing the feature representation learning of the lesion regions. We believe that lesion region inpainting and segmentation rely heavily on similar features and can be processed simultaneously by attaching different decoders to a shared encoder.

To better integrate features from the two tasks in the second stage, tailor-designed GFF modules are incorporated to adaptively filter the information of non-lesion regions and select more beneficial lesion features from the inpainting encoder for fusion with the features from the segmentation encoder. The reset gate and the select gate are responsible for filtering and fusing, respectively. The principle of each step in GFF is to ensure that the integrated features are more significant than, or at least no worse than, the original features. As mentioned above, we believe that the inference of the two tasks depends on similar features, so we design shared GFF modules, which also reduces the number of model parameters and the risk of over-fitting.

R1: Questions about Fig. 4. To verify that our proposed framework can mine residual lesion features and enhance the lesion representation via GFF modules in the second stage, we conduct experiments and draw the blue line in Fig. 4. The blue line denotes that our method uses the same labeled data as the baseline (the red line) to perform the two-stage process, without utilizing any unlabeled data; note that we only calculate the segmentation loss for the labeled data. The performance gains over the baseline show that our network mines useful lesion information in the second stage. The green line means that our method introduces the remaining data as unlabeled data for the inpainting task, further enhancing the feature representation learning of the lesion regions and improving segmentation performance. When using 100% labeled data, the green line is equivalent to the blue line, since no additional unlabeled data is utilized for the inpainting task, thus yielding the same results.

R1/R3: Concerns about performance under different ratios of data. The performance of our method under different ratios of labeled and unlabeled data is shown in Fig. 4. We conduct further comparison experiments with other methods. Our approach shows superior performance and outperforms the second-best approach (MASSL) by 2.72%, 0.69%, 0.92%, 0.74% and 0.62% in terms of Dice score when using 10%, 20%, 30%, 50% and 80% of the training set as labeled data and the remainder as unlabeled data, respectively. We conclude that our method is most advantageous with extremely limited labeled data.

R1: Suggestions about collecting extra unlabeled data. We collect additional unlabeled data (1000 images in total) for the skin lesion segmentation task. Using the original training set as labeled data and the collected images as unlabeled data, we obtain a Dice of 92.14%, improving performance by 0.76% over the baseline that uses only the original training set in a fully-supervised manner.

R2: Concerns about deep supervision. We provide an ablation study of deep supervision, obtaining Dice scores of 87.14% and 86.83% with and without deep supervision, respectively.

R3: Concerns about experiments on the other two datasets. The detailed experimental results are listed in the supplementary material.

R1/R2/R3: Concerns about unclear descriptions and statements. The initial segmentation map used to mask the original image is a binary image. We will revise the claimed statements to highlight the contribution of our introduced inpainting task rather than general reconstruction.
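The reset/select gating described in the rebuttal can be illustrated with a toy, per-element sketch. The parameterization below (scalar weights `w_r`, `w_s` and the exact mixing formula) is an assumption for illustration only, not the paper's implementation, which operates on convolutional feature maps:

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def gated_fuse(e_seg, e_inp, w_r, w_s):
    """Toy gated feature fusion over per-element features.

    e_seg: segmentation-branch features; e_inp: inpainting-branch features.
    w_r, w_s: (assumed) scalar gate weights standing in for the paper's
    convolutions W_r, W_s.
    """
    fused = []
    for a, b in zip(e_seg, e_inp):
        r = sigmoid(w_r[0] * a + w_r[1] * b)   # reset gate: filters the inpainting feature
        b_filtered = r * b
        s = sigmoid(w_s[0] * a + w_s[1] * b)   # select gate: mixes the two branches
        fused.append(s * a + (1.0 - s) * b_filtered)
    return fused
```

With a strongly positive select gate the fused feature falls back to the segmentation feature, matching the rebuttal's claim that the integrated features are "at least not worse than" the originals.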




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors have addressed major concerns in the rebuttal.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    2



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors responded to the major concerns, e.g., the motivation and the experimental settings. Overall, the paper meets the minimum requirements for publication. I lean toward acceptance.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    4



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This paper introduces the unsupervised inpainting task and designs a GFF module to enhance the network’s representation learned from unlabelled data. Two reviewers give positive comments, while one reviewer raises concerns about the motivation for using the inpainting task and the reasonableness of the GFF module design.

    In the rebuttal, the author addressed these issues. Therefore, acceptance is recommended.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    12


