
Authors

Xiangyun Zeng, Rian Huang, Yuming Zhong, Dong Sun, Chu Han, Di Lin, Dong Ni, Yi Wang

Abstract

Semi-supervised learning has recently been employed to solve problems in medical image segmentation, where acquiring sufficient manual annotations, an important prerequisite for building high-performance deep learning methods, is challenging. Since unlabeled data are generally abundant, most existing semi-supervised approaches focus on how to make full use of both the limited labeled data and the abundant unlabeled data. In this paper, we propose a novel semi-supervised strategy called reciprocal learning for medical image segmentation, which can be easily integrated into any CNN architecture. Concretely, reciprocal learning works with a pair of networks, one as a student and one as a teacher. The student model learns from pseudo labels generated by the teacher. Furthermore, the teacher updates its parameters autonomously according to the reciprocal feedback signal of how well the student performs on the labeled set. Extensive experiments on two public datasets show that our method outperforms current state-of-the-art semi-supervised segmentation methods, demonstrating the potential of our strategy for challenging semi-supervised problems. The code is publicly available at https://github.com/XYZach/RLSSS.
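
As a reading aid, the loop below illustrates the strategy the abstract describes, assuming the teacher's feedback update is implemented as a one-step meta-gradient of the student's labeled-set loss. It is a minimal sketch with toy stand-in networks, data, and learning rates, not the authors' released implementation; see the linked repository for that.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.func import functional_call

torch.manual_seed(0)
student, teacher = nn.Linear(16, 2), nn.Linear(16, 2)      # stand-ins for the CNNs
s_opt = torch.optim.SGD(student.parameters(), lr=0.1)
t_opt = torch.optim.SGD(teacher.parameters(), lr=0.1)

x_l, y_l = torch.randn(8, 16), torch.randint(0, 2, (8,))   # labeled batch
x_u = torch.randn(32, 16)                                  # unlabeled batch

for step in range(100):
    # 1) The teacher generates soft pseudo labels on unlabeled data.
    pseudo = teacher(x_u).softmax(dim=-1)

    # 2) A virtual one-step student update on the pseudo labels, kept
    #    differentiable w.r.t. the teacher via a functional forward pass.
    params = dict(student.named_parameters())
    s_loss = F.cross_entropy(functional_call(student, params, (x_u,)), pseudo)
    grads = torch.autograd.grad(s_loss, list(params.values()), create_graph=True)
    updated = {n: p - 0.1 * g for (n, p), g in zip(params.items(), grads)}

    # 3) Feedback signal: the virtually updated student's loss on the labeled
    #    set; its gradient flows back through step 2 into the teacher.
    fb_loss = F.cross_entropy(functional_call(student, updated, (x_l,)), y_l)
    t_opt.zero_grad()
    fb_loss.backward()
    t_opt.step()

    # 4) Real student update: supervised loss plus pseudo-label loss.
    s_opt.zero_grad()
    loss = F.cross_entropy(student(x_l), y_l) \
         + F.cross_entropy(student(x_u), teacher(x_u).softmax(dim=-1).detach())
    loss.backward()
    s_opt.step()
```

The key design point is step 3: the teacher receives gradients only through how its pseudo labels changed the student's performance on the labeled set, which is the "reciprocal feedback" of the abstract.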

Link to paper

DOI: https://doi.org/10.1007/978-3-030-87196-3_33

SharedIt: https://rdcu.be/cyl2D

Link to the code repository

https://github.com/XYZach/RLSSS

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper addresses an important problem for medical image analysis, i.e., semi-supervised learning under insufficient manual annotations. The authors develop a reciprocal learning strategy with a pair of student and teacher networks. Experimental results on two public datasets are presented to validate the proposed framework.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. Semi-supervised learning under insufficient manual annotations, as addressed in this paper, is of great significance for downstream image analysis.
    2. Experimental results demonstrate its effectiveness.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    I think the proposed framework is somewhat interesting, but as scientific research, the experimental results in this paper are inadequate to validate its effectiveness. The writing and the architecture design of the paper need to be polished. Therefore, I think this paper is below the bar of MICCAI.

    1. Lack of comparisons to previous baseline methods for semi-supervised learning, such as Li et al. [5].
    2. An ablation study of the main steps of the proposed framework and of the hyper-parameters should also be included.
  • Please rate the clarity and organization of this paper

    Poor

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Almost reproducible

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
    1. Lack of comparisons to previous baseline methods for semi-supervised learning, such as Li et al. [5].
    2. An ablation study of the main steps of the proposed framework and of the hyper-parameters should also be included.
    3. The writing and the architecture design of the paper need to be polished.
  • Please state your overall opinion of the paper

    reject (3)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Little innovation compared with the previous works Li et al. [5], MT [11], and Yu et al. [15].

  • What is the ranking of this paper in your review stack?

    5

  • Number of papers in your stack

    5

  • Reviewer confidence

    Very confident



Review #2

  • Please describe the contribution of the paper

    The main contribution of the paper is to fully utilize the limited labeled data by updating the parameters of the teacher and the student model in a reciprocal learning manner.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The idea is innovative, and the writing of the paper is good.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    I wonder whether the value of using labeled data is greater than that of using unlabeled data. On the one hand, in actual semi-supervised scenarios, the unlabeled data far outnumber the labeled data; on the other hand, the student network already makes great use of the information in the labeled data.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Although the code is not yet public, the results should be easy to reproduce.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
    1. In the paper, all methods are trained with 20% labeled images and 80% unlabeled images. Have the authors tried other ratios?
    2. In the traditional architecture, the feedback loss is used to update the student model, and the teacher model is also derived from the student model; therefore, the teacher in the traditional method already incorporates optimization according to the segmentation performance. Moreover, the design with two networks greatly increases the number of parameters. Can the authors show that the improvement in performance is due to the new framework design rather than to the increase in the number of parameters?
    3. Is it unnecessary for the teacher model to use the information of the labeled data again?
  • Please state your overall opinion of the paper

    Probably accept (7)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The idea is sound; therefore, the paper could be accepted with some modifications.

  • What is the ranking of this paper in your review stack?

    1

  • Number of papers in your stack

    5

  • Reviewer confidence

    Very confident



Review #3

  • Please describe the contribution of the paper

    The authors propose a novel reciprocal learning strategy for medical image segmentation, which achieves better performance than some existing semi-supervised SOTA methods.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The description of the optimization process is clear.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    1. I think the experimental results on the cropped images used by the authors have no clinical guiding significance and are not convincing, because for data without ground truth you cannot crop centered at the regions of interest (ROIs). The authors should train their model and compare results on the original images rather than on cropped images.
    2. In Figure 1, what is the difference between the output for the unlabeled set and the unlabeled set used as input to the Student Model? The authors should describe this clearly.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    According to the description in the paper, I think it is difficult to reproduce the proposed approach. Although the architecture is simple, how to train this teacher-student model is not clear. However, it is a good thing that the authors state they will release the code.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

    1. I think the experimental results on the cropped images used by the authors have no clinical guiding significance and are not convincing, because for data without ground truth you cannot crop centered at the regions of interest (ROIs). The authors should train their model and compare results on the original images rather than on cropped images.
    2. In Figure 1, what is the difference between the output for the unlabeled set and the unlabeled set used as input to the Student Model? The authors should describe this clearly.

  • Please state your overall opinion of the paper

    Probably accept (7)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The manuscript is well-organized. The proposed strategy is simple yet efficient.

  • What is the ranking of this paper in your review stack?

    1

  • Number of papers in your stack

    3

  • Reviewer confidence

    Confident but not absolutely certain




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The paper is well written, with a good explanation of the novelty and the algorithm. The concerns from the reviewers are mainly about the experimental design (R3-4-1, R2-7-1, R2-7-2, R1-7-2). Please try to address them in the rebuttal.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    10




Author Feedback

We cordially thank you for your time and effort in reviewing this submission. We have carefully studied your valuable suggestions and addressed your main concerns in this rebuttal. To sum up, we present a series of new comparison experiments to corroborate the efficacy of our reciprocal learning strategy.

1) R3-4-1: ROI issue
This study mainly focuses on the challenging problem of semi-supervised learning with insufficient annotations. Several semi-supervised segmentation studies used cropped images for validation, e.g., UAMT [15] used cropped left atrium images and [r1] used cropped pancreas images; we followed their experimental settings. In our experiments, we directly used the left atrium images from [15]. As for pancreas segmentation, it is already a challenging task even for fully supervised models, because the ROI occupies only a small percentage of the whole abdominal CT; most studies [r2-r5] used detection networks to roughly locate ROIs for further segmentation. To directly show the efficacy of our semi-supervised strategy, we cropped pancreas images for validation. Following your suggestion, we also tried the whole abdominal CT: V-Net1 (20% labels) and V-Net2 (100% labels) obtained 51.58% Dice with 35.23% Jaccard and 76.86% Dice with 63.15% Jaccard, respectively; UAMT got 67.74% Dice and 51.91% Jaccard; our method had 72.51% Dice and 57.33% Jaccard. All the above results are worse than those on cropped data, but our method still outperformed V-Net1 and UAMT, which shows the efficacy of our semi-supervised strategy.
[r1] AAAI2021, arXiv:2009.04448
[r2] MICCAI2018, 10.1007/978-3-030-00937-3_55
[r3] MedIA2018, 10.1016/j.media.2018.01.006
[r4] MICCAI2019, 10.1007/978-3-030-32245-8_23
[r5] TMI2019, 10.1109/TMI.2019.2911588

2) R2-7-1: the ratio of labeled images in training
We followed the typical semi-supervised experimental setting and reported results obtained using 20% annotations. We also evaluated all methods using 10% and 30% annotations; our method consistently outperformed the other SOTA methods.

Pancreas (10%/30% labels): Dice, Jaccard, ASD, 95HD
V-Net1: 58.67/79.46, 42.48/66.38, 13.53/2.55, 40.00/8.73
MT: 64.43/80.35, 48.48/67.49, 6.81/3.93, 23.16/13.72
SASS: 63.47/78.73, 47.06/65.48, 9.33/2.71, 28.30/9.64
UMCT: 65.77/79.41, 49.66/66.25, 12.80/6.76, 38.02/24.00
UAMT: 68.16/79.78, 52.52/66.66, 7.91/4.24, 23.89/14.68
TCSM [5]: 68.47/80.05, 52.61/67.10, 6.91/4.79, 22.87/18.25
Ours: 71.31/81.17, 56.41/68.58, 5.36/2.67, 17.73/10.29

Left Atrium (10%/30% labels): Dice, Jaccard, ASD, 95HD
V-Net1: 68.45/89.32, 55.03/80.85, 8.90/2.24, 31.69/7.77
MT: 84.84/89.97, 74.20/81.85, 4.67/2.50, 17.40/8.63
SASS: 85.70/90.28, 75.19/82.36, 5.50/1.77, 21.38/6.68
UMCT: 85.27/90.63, 75.47/82.93, 3.93/1.69, 13.45/5.42
UAMT: 84.69/89.93, 74.02/81.79, 3.90/2.01, 13.64/7.05
TCSM: 84.14/90.02, 74.57/81.96, 2.56/1.89, 10.69/6.63
Ours: 86.23/90.34, 76.47/82.46, 3.23/2.84, 11.94/8.62
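
For reference, the Dice and Jaccard scores listed above follow their standard overlap definitions; below is a generic sketch for boolean masks, not the authors' evaluation code.

```python
import torch

def dice(pred: torch.Tensor, gt: torch.Tensor, eps: float = 1e-8) -> float:
    # Dice = 2|P & G| / (|P| + |G|), for boolean masks pred and gt.
    inter = (pred & gt).sum().item()
    return (2.0 * inter + eps) / (pred.sum().item() + gt.sum().item() + eps)

def jaccard(pred: torch.Tensor, gt: torch.Tensor, eps: float = 1e-8) -> float:
    # Jaccard (IoU) = |P & G| / |P | G|.
    inter = (pred & gt).sum().item()
    union = (pred | gt).sum().item()
    return (inter + eps) / (union + eps)
```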

3) R2-7-2: network parameters
We increased the number of filters in the backbone of UAMT [15] so that it has the same order of parameters as ours. This enlarged UAMT obtained 80.37%/87.21% Dice, 67.63%/77.86% Jaccard, 3.19/2.51 ASD, and 10.24/9.41 95HD on the pancreas/left atrium images; our method consistently outperformed it. Furthermore, UMCT [13] uses three different views and its parameter count is 1.5 times larger than ours, yet our method performed better. All the above results show that the performance improvement is due to our strategy rather than to the increase in the number of parameters.
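
The capacity matching described here can be checked with a simple trainable-parameter count; a minimal, generic utility (the model names in the comment are hypothetical, not from the paper's code):

```python
import torch.nn as nn

def count_trainable_params(model: nn.Module) -> int:
    # Total number of trainable parameters, used to compare model capacity.
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

# e.g. widen a baseline until the counts are of the same order:
# assert count_trainable_params(widened_uamt) >= count_trainable_params(ours_model)
```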

4) R1-7-2: ablation study
For the ablation study, we discarded our reciprocal learning strategy by fixing the teacher model after it was well pretrained. The results were 73.82%/86.82% Dice, 59.38%/77.27% Jaccard, 4.62/3.69 ASD, and 17.78/12.29 95HD on the pancreas/left atrium images, which shows that our reciprocal learning contributes to the performance improvement. We also added a comparison with TCSM [5] by Li et al.: TCSM obtained 78.17%/86.26% Dice, 64.95%/76.56% Jaccard, 5.06/2.35 ASD, and 17.55/9.67 95HD on the pancreas/left atrium images. Our method consistently outperformed TCSM.
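
The ablation variant described above (teacher frozen after pretraining) can be sketched as follows; a minimal illustration with toy stand-in networks and data, not the released code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

student, teacher = nn.Linear(16, 2), nn.Linear(16, 2)     # stand-in networks
s_opt = torch.optim.SGD(student.parameters(), lr=0.1)

teacher.eval()
for p in teacher.parameters():
    p.requires_grad_(False)                                # teacher stays fixed

x_l, y_l = torch.randn(8, 16), torch.randint(0, 2, (8,))  # labeled batch
x_u = torch.randn(32, 16)                                  # unlabeled batch
for step in range(100):
    with torch.no_grad():
        pseudo = teacher(x_u).softmax(dim=-1)              # fixed pseudo labels
    loss = F.cross_entropy(student(x_l), y_l) + F.cross_entropy(student(x_u), pseudo)
    s_opt.zero_grad()
    loss.backward()
    s_opt.step()
```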




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    In this paper, the authors propose a semi-supervised method that adds a feedback loop from the student network to the teacher network. The motivation and the explanation of the algorithm are easy to understand, and the experimental results are promising. The reviewers raised questions on the details of the experimental settings, which are well addressed in the rebuttal. I recommend acceptance and ask the authors to add the experimental details to the supplementary material if the paper is finally accepted.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    8



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors have addressed the concerns raised by the reviewers regarding the empirical validation of their method. In particular, they have added experiments without pre-computed ROIs, varied the ratio of labeled/unlabeled data, and included the method TCSM suggested by one of the reviewers. Although the technical contribution is rather limited, the experimental validation (before and after the rebuttal) is comprehensive, demonstrating superior performance over existing approaches.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    1



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The reviewers are generally positive about the idea of the proposed method. Major concerns include insufficient evaluations and comparisons with other methods, missing ablation studies on major components and label ratios, and the data used in the experiments (ROIs). Therefore, the current manuscript is not sufficiently convincing to readers. In the rebuttal the authors clarified the ROI question and also added additional experiments on label ratios and other ablations (which is not appropriate for a rebuttal). Although ROIs are used in some other works, I think they still somewhat conflict with the idea of training without labeling, since the detection effort for organs like the pancreas can be significant for annotators. The major part of the authors' rebuttal consists of adding extra experiments, so I would not call it a good rebuttal. Hence, considering the overall quality of the current manuscript, I suggest rejection.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Reject

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    11


