Authors
Reda Abdellah Kamraoui, Vinh-Thong Ta, Nicolas Papadakis, Fanny Compaire, Jose V Manjon, Pierrick Coupé
Abstract
Semi-supervised learning (SSL) uses unlabeled data to compensate for the scarcity of annotated images and the lack of method generalization to unseen domains, two usual problems in medical segmentation tasks. In this work, we propose POPCORN, a novel method combining consistency regularization and pseudo-labeling designed for image segmentation. The proposed framework uses high-level regularization to constrain our segmentation model to use similar latent features for images with similar segmentations. POPCORN estimates a proximity graph to select data from easiest ones to more difficult ones, in order to ensure accurate pseudo-labeling and to limit confirmation bias. Applied to multiple sclerosis lesion segmentation, our method demonstrates competitive results compared to other state-of-the-art SSL strategies.
Link to paper
DOI: https://doi.org/10.1007/978-3-030-87196-3_35
SharedIt: https://rdcu.be/cyl2F
Link to the code repository
N/A
Link to the dataset(s)
N/A
Reviews
Review #1
- Please describe the contribution of the paper
In this paper (POPCORN: Progressive Pseudo-labeling with Consistency Regularization and Neighboring), the authors propose a method combining consistency regularization and pseudo labeling for image segmentation.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
None.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- The limitations of previous work that motivated this study are not clearly described.
- The proximity graph used to select new unlabeled data at each selection step relies on latent distances updated from the previous selection, which is a limitation. In particular, the included pseudo-labels are not of the same quality as the ground truth and can limit further improvement on the unlabeled data.
- It is unclear how confirmation bias is limited as the difficulty of the selected data increases (especially when the pseudo-labeled data from the previous selection is included in the updated training set).
- The authors state that, to exploit the interesting properties of CR on the learned features, using the same input data under different perturbations is not sufficient and is not adapted to segmentation. However, this paper also uses similar kinds of perturbations for data augmentation.
- The only slight improvement in the reported results over state-of-the-art strategies also supports the above limitations.
- Please rate the clarity and organization of this paper
Poor
- Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
None
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
- Lines 1-4 and 8-10 of the abstract should be rephrased.
- (Introduction) Among SSL works … three main categories: (this needs rephrasing).
- (Section 2.3) Thus, pseudo-labeling using the trained model is more accurate for .. training data (incomplete).
- Please go through the entire manuscript and double-check that it is free of English grammar errors.
- Please state your overall opinion of the paper
reject (3)
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
Reject. The article has some limitations, especially in its method. Also, the objective of the paper has not been clearly described. From a technical and results point of view, the impact is limited. Moreover, the paper has several linguistic errors. To sum up, there are major issues that need more attention in this particular manuscript.
- What is the ranking of this paper in your review stack?
2
- Number of papers in your stack
3
- Reviewer confidence
Very confident
Review #2
- Please describe the contribution of the paper
The authors propose a novel way of combining two known strategies of semi-supervised learning: consistency regularization and pseudo-labeling. For the former, a new strategy is introduced that helps to ensure that images with similar segmentation results are also close in the latent feature representation space. For the latter, a new approach is introduced that could help to reduce confirmation bias, which is a known issue for such strategies.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- I think the paper is very well written and easy to follow
- this topic is definitely of interest and this approach could potentially be used not only in medical imaging but in other domains too
- proper ablation studies and a comparison with the state of the art make this submission very strong
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- In the case of consistency regularization, it is not really clear whether it is correct to assume that features should be close for two different images of the same region. Anatomies could differ significantly, especially in the case of anomalies.
- It is not completely clear how the pseudo-labeling procedure would handle the situation where, after a number of steps, only very different anomalous patches are left and they are quite different from the extended training set.
It would be helpful and strengthen the submission if the authors could clarify these two issues a bit more.
- Please rate the clarity and organization of this paper
Very Good
- Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
I think most information needed for reproducibility of the method is provided.
The only thing that would make this submission stronger is the following:
“An analysis of statistical significance of reported differences in performance between methods.”
it could be useful to add this check in the ablation studies especially to study differences for the cases of the presented approach “without CR” and “without proximity graph”
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
Only a few minor remarks:
- Fig. 1 is a bit difficult to follow. If space in the paper allows, splitting the figure could help.
- It would be nice if the authors added a bit more clarity on the issues mentioned in the “weaknesses” section.
- Please state your overall opinion of the paper
accept (8)
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
First, I think the paper is very well written and easy to follow. Second, this topic is definitely of interest and this approach could potentially be used not only in medical imaging but in other domains too. What is especially nice is that the authors provided most or all of what is needed to reproduce the training scenario on the same or other datasets. Third, the approach is properly evaluated against state-of-the-art approaches. Ablation studies are also provided.
- What is the ranking of this paper in your review stack?
2
- Number of papers in your stack
5
- Reviewer confidence
Very confident
Review #3
- Please describe the contribution of the paper
The authors propose a novel semi-supervised method for image segmentation that combines consistency regularization with pseudo-labeling. In the framework described, consistency regularization ensures proximity in the latent space for similar images, whereas pseudo-labeling selects unlabeled samples of increasing difficulty, limiting the confirmation bias. Applied to multiple sclerosis lesion segmentation, POPCORN obtains superior performance compared to other semi-supervised state-of-the-art approaches.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The semi-supervised framework proposed is novel and could be applied to a variety of image segmentation tasks.
- Intriguing idea of selecting the unlabeled samples, through the latent space, with increasing difficulty, which follows intuition, and shows significant improvements in the ablation studies.
- Strong evaluation comparing POPCORN with three other semi-supervised methods.
- The paper is well-written and easy to follow, with a specific paragraph per each contribution proposed in the framework.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- Although the ablation study evaluates each contribution added to the framework well, it is not clear how the performance scales with the number of unlabeled samples. A large dataset of 2901 unlabeled images is considered in the study; it would be interesting to see how the performance varies when only a fraction of this dataset is used. This would also make it possible to speculate on the applicability of POPCORN to other image segmentation tasks with fewer unlabeled samples available.
- The hyperparameter tuning or the use of a validation dataset is not specified.
- Please rate the clarity and organization of this paper
Excellent
- Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
Apart from some details on hyperparameter optimization, the framework is very well explained and the authors will make the code publicly available.
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
- It is not clear whether the hyperparameters K (number of unlabeled samples) and p (closest samples selected) were chosen empirically or optimised. This should be specified.
- What was the threshold applied to the output of the 3D U-Net to obtain the binary predictions?
- In the ablation study a model could be added where only half of the unlabeled samples are used.
- If space allows, acknowledge the limitations of the study in the Conclusions.
- Please state your overall opinion of the paper
accept (8)
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The semi-supervised framework proposed is novel, has a strong evaluation and achieves competitive performance. As it can be easily applied to different segmentation tasks, it could be of interest to many researchers in the field. Therefore, I recommend accepting this manuscript.
- What is the ranking of this paper in your review stack?
1
- Number of papers in your stack
5
- Reviewer confidence
Confident but not absolutely certain
Primary Meta-Review
- Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.
The idea proposed in the manuscript appeared of interest and was overall well explained. More clarity in the explanation of the choice of hyperparameters is still needed and would help get a proper evaluation of the solidity of the demonstration. Similarly, the lack of statistical analysis for the interpretation of the results diminishes the evaluation of the impact of the method.
- What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).
6
Author Feedback
First, we would like to sincerely thank the reviewers for their fruitful and constructive comments.
R1-3: We added a statistical analysis in the manuscript as follows: “We conducted a Wilcoxon test between methods on Dice scores. The significance is established at p-value<0.05. First, POPCORN had a significantly higher Dice compared to the other state-of-the-art approaches. Second, during the ablation study, POPCORN had a significantly higher Dice than the baseline, baseline with CR, the version without proximity graph, and the version without CR.” Contrary to R2’s claim, this analysis demonstrates that POPCORN brings significant improvements with respect to state-of-the-art strategies.
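As a hedged illustration of the kind of paired comparison described above, the following Python sketch implements an exact two-sided Wilcoxon signed-rank test on per-case Dice scores. It assumes the signed-rank (paired) variant, which the feedback does not state explicitly. The function name `signed_rank_exact` and the toy scores are made up for illustration; they are not the authors' code or data.

```python
from itertools import product


def signed_rank_exact(scores_a, scores_b):
    """Two-sided exact Wilcoxon signed-rank test on paired scores.

    Returns (w_plus, p_value). Zero differences are dropped and tied
    absolute differences share their average rank. The enumeration is
    exponential in the number of pairs, so this suits small samples only.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, averaging ranks within ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    total = sum(ranks)
    observed = min(w_plus, total - w_plus)

    # Under the null hypothesis every sign pattern is equally likely.
    hits = 0
    for signs in product((0, 1), repeat=n):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if min(w, total - w) <= observed:
            hits += 1
    return w_plus, hits / 2 ** n


# Made-up Dice scores in percent for six test cases (illustration only).
dice_method = [74, 71, 78, 69, 75, 73]
dice_baseline = [70, 69, 71, 68, 70, 72]
w_plus, p_value = signed_rank_exact(dice_method, dice_baseline)
significant = p_value < 0.05  # significance level quoted in the feedback
```

With every toy difference positive, only the two extreme sign patterns out of 64 are as extreme as the observed one, giving a p-value of 0.03125. For realistic sample sizes a library routine such as `scipy.stats.wilcoxon` would normally be preferred over this enumeration.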
R1 / R4: Details about the hyperparameter settings were added to the manuscript: “The method hyperparameters were chosen empirically according to the size of labeled and unlabeled datasets. First, 200 of the M=2901 unlabeled images were chosen after each training cycle that ran for 2 epochs (K=200, N=2) to limit computational burden. Second, the number of neighbors p=5 was selected considering the initial training data of 21 labeled images. We suggest that this value is a good compromise in order to consider relevant near neighbors while avoiding far neighbors which mislead data selection”.
R4: We also added the performance of our method with only a fraction of the dataset (M=1400) to see how the performance scales. We obtained: Dice=70.59%, Precision=68.26%, Sensitivity=75.91%. For reference, POPCORN with M=2901 obtained: Dice=73.09%, Precision=73.33%, Sensitivity=75.29%.
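For readers unfamiliar with the three reported metrics, here is a minimal Python sketch assuming the standard voxel-wise definitions of Dice, precision and sensitivity, which the feedback does not spell out. The 0.5 cut-off is the threshold the authors report in this feedback; treating values exactly at the threshold as foreground is an assumption. The helper names `binarize` and `segmentation_scores` and the toy values are hypothetical.

```python
def binarize(probabilities, threshold=0.5):
    """Turn per-voxel probabilities into a flat binary mask (0/1)."""
    # Assumption: values at or above the threshold count as lesion.
    return [1 if p >= threshold else 0 for p in probabilities]


def segmentation_scores(pred, truth):
    """Dice, precision and sensitivity for two flat binary masks."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    # Two empty masks agree perfectly, so Dice is defined as 1 there.
    dice = 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 1.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    return dice, precision, sensitivity


# Toy example: 3 true positives, 1 false positive, 1 false negative.
probabilities = [0.9, 0.8, 0.6, 0.7, 0.2, 0.1]
truth = [1, 1, 1, 0, 1, 0]
pred = binarize(probabilities)
dice, precision, sensitivity = segmentation_scores(pred, truth)
```

In this toy case all three scores equal 0.75. Precision falls when false positives are added and sensitivity falls when lesion voxels are missed, which is why the two can move in opposite directions as in the figures above.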
R3: About POPCORN handling of very different and anomalous patches, we did not reject them in order to show the robustness of our approach to these very different patches (these patches should be added in the end due to their difference). We only performed a quality check on our unlabeled data at the image level, to remove any corrupted images. However, if necessary, it is possible to omit anomalous patches by setting a maximum latent space distance during the selection.
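The optional distance cap mentioned above can be sketched in Python as follows. This is not the authors' proximity graph: the scoring rule (mean Euclidean distance to the `p` nearest training vectors), the function name `select_closest` and the parameter `max_dist` are all assumptions made for illustration, and the latent vectors are toy 2-D points.

```python
import math


def select_closest(unlabeled, training, k, p, max_dist=None):
    """Return indices of up to k unlabeled latent vectors, nearest first.

    Each unlabeled vector is scored by its mean Euclidean distance to its
    p nearest training vectors (an assumed rule). Vectors whose score
    exceeds max_dist are omitted when a cap is given.
    """
    scored = []
    for idx, u in enumerate(unlabeled):
        dists = sorted(math.dist(u, t) for t in training)
        score = sum(dists[:p]) / min(p, len(dists))
        if max_dist is None or score <= max_dist:
            scored.append((score, idx))
    scored.sort()
    return [idx for _, idx in scored[:k]]


# Toy latent vectors (hypothetical values).
training = [(0.0, 0.0), (1.0, 0.0)]
unlabeled = [(0.5, 0.0), (5.0, 5.0), (0.0, 1.0), (40.0, 40.0)]

easiest_first = select_closest(unlabeled, training, k=3, p=2)
capped = select_closest(unlabeled, training, k=4, p=2, max_dist=2.0)
```

Without a cap the three nearest samples are returned in order of increasing distance (indices 0, 2, 1), so distant samples are simply deferred to later selection steps. With `max_dist=2.0` the two far-away points are dropped altogether, which corresponds to omitting anomalous patches.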
R3: Concerning the applicability of the consistency regularization in the case of anatomical difference, it is true that the assumption of feature proximity at bottleneck level is not guaranteed for traditionally trained networks. However, our CR actively enforces images of the same region with similar segmentation to produce close latent representations. This constraint ensures the proximity even with anatomical diversity.
R4: Concerning the question about the threshold, we used a threshold of 0.5 to obtain the binary segmentation. This is better explained in the revised version.
R2: We disagree with the statement that using an updated latent distance for the proximity graph is a flaw. Our choice is intentional and was made by design. Indeed, our method is meant to work even when the available labeled data are limited. POPCORN ensures that data selection is performed with the latest trained model (latent space is considered more meaningful especially with labeled data scarcity).
R2: We do not claim that confirmation bias is limited using an increasing difficulty in data selection, contrary to R2’s comment. Instead, we argue that pseudo-labels produced by the trained model are more accurate for unlabeled images similar to the training data (easiest samples) at each selection step. Consequently, error propagation when producing the pseudo-labels is reduced, as assessed by our results.
R2: We do not criticise the use of perturbations on input images as a data augmentation technique for robustness. Instead, we consider them insufficient for CR in SSL segmentation tasks. Thus, in addition to realistic perturbations on input data, we also enforce images with similar segmentation to produce close latent representation (as mentioned in our original submission). The latter property ensures that latent representation also considers a higher level of abstraction as opposed to using only the input.
Post-rebuttal Meta-Reviews
Meta-review # 1 (Primary)
- Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.
The framework proposed in this paper is of interest and overall well explained, and the details added in the rebuttal help considerably in solidifying the overall message of the paper.
- After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.
Accept
- What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).
6
Meta-review #2
- Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.
The authors addressed the concerns on the results well. Given the overall positive feedback from the reviewers, I would recommend acceptance.
- After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.
Accept
- What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).
10
Meta-review #3
- Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.
This work presents a single framework that combines pseudo-labeling and consistency regularization. The paper is sound and the ideas are novel. During the reviews, minor issues were raised regarding a better presentation of the experimental results, which included providing statistical analyses and a better description of the setup.
These points have been satisfactorily addressed during the rebuttal and thus I recommend acceptance of this paper. The authors are advised to include the remarks of the reviewers in the paper’s final version.
- After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.
Accept
- What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).
9