
Authors

Siqi Wu, Chang Chen, Zhiwei Xiong, Xuejin Chen, Xiaoyan Sun

Abstract

Mitochondria segmentation from electron microscopy images has seen great progress, especially with learning-based methods. However, since model learning requires massive annotations, it is time- and labour-expensive to learn a specific model for each acquired dataset. On the other hand, it is challenging to generalize a learned model to datasets of unknown species or those acquired by unknown devices, mainly due to differences in data distribution. In this paper, we study unsupervised domain adaptation to enhance the generalization capacity, where no annotation for the target dataset is required. We start from an effective solution, which learns the target data distribution with pseudo labels predicted by a source-domain model. However, the obtained pseudo labels are usually noisy due to the domain gap. To address this issue, we propose an uncertainty-aware model to rectify noisy labels. Specifically, we insert Monte-Carlo dropout layers into a UNet backbone, where the uncertainty is measured by the standard deviation of predictions. Experiments on the MitoEM and FAFB datasets demonstrate the superior performance of the proposed model, in terms of adaptation between different species and acquisition devices.
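
For readers, a minimal sketch of the Monte-Carlo-dropout uncertainty estimation described above (PyTorch-style; the sigmoid output, the number of passes T, and the choice of which modules stay stochastic are illustrative assumptions, not the authors' exact setup):

```python
import torch
import torch.nn as nn

def mc_dropout_predict(model: nn.Module, x: torch.Tensor, T: int = 8):
    """Run T stochastic forward passes with dropout kept active and return
    the mean prediction and its per-pixel standard deviation (uncertainty)."""
    model.eval()
    # Keep only dropout modules in training mode so they remain stochastic.
    for m in model.modules():
        if isinstance(m, (nn.Dropout, nn.Dropout2d, nn.Dropout3d)):
            m.train()
    with torch.no_grad():
        samples = torch.stack([torch.sigmoid(model(x)) for _ in range(T)])
    return samples.mean(dim=0), samples.std(dim=0)
```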

Link to paper

DOI: https://doi.org/10.1007/978-3-030-87199-4_18

SharedIt: https://rdcu.be/cyl3U

Link to the code repository

https://github.com/ngchc/DA-MitoSeg

Link to the dataset(s)

https://github.com/ngchc/DA-MitoSeg/tree/main/fafb-valid


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper utilizes a pseudo-labeling scheme for domain-adaptive mitochondria segmentation, with experimental results outperforming several baseline methods.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    This paper addresses the problem that pseudo labels of the target domain generated by a model pretrained on the source domain are noisy. It proposes an uncertainty objective for pseudo-label rectification: Monte-Carlo dropout layers are added to the network, and the model's prediction uncertainty is measured by the standard deviation. Extensive experiments on the MitoEM and FAFB datasets show performance improvements. Besides, this paper carefully studies how the uncertainty objective helps domain adaptation, which is remarkable.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The proposed uncertainty-aware method builds on existing approaches, so its novelty is incremental. A more specific design could be integrated into the uncertainty objective.

    The comparisons with state-of-the-art methods and the ablation studies are somewhat weak; additional experiments should be conducted.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The method can be reproduced.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

    On page 4, “Monte-Carlo dropout enables stochastic sampling in both training and inference phases”. Why should dropout be used in the inference phase? Can I regard inference as generating pseudo labels for the target domain during training, or as testing the trained model on the target domain? Dropout layers are usually disabled at test time, unless you have some specific design.

    On page 4, “we generate the prediction p for a target sample x by T times inferences”. Does p here represent the soft model probability output or hard pseudo labels? If unreliable pseudo labels are to be filtered out via confidence and uncertainty, it should be a soft probability.

    This paper measures the model's prediction uncertainty by the standard deviation. However, it does not present the mathematical formulation of the uncertainty objective. In other words, how exactly is the uncertainty calculated?
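
    For concreteness, the standard Monte-Carlo-dropout estimate would be as follows (a reviewer-side reconstruction from the paper's description, with p_t the prediction of the t-th stochastic pass; not quoted from the paper):

```latex
\bar{p} = \frac{1}{T}\sum_{t=1}^{T} p_t, \qquad
u(p) = \sqrt{\frac{1}{T}\sum_{t=1}^{T} \left( p_t - \bar{p} \right)^{2}}
```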

    In the experiments section, what does the ‘Baseline’ method stand for? This is confusing in the paper. Does it mean testing the model trained on the source domain directly on the target domain?

    In the U-Net architecture, why do you add dropout layers only in the encoder? What if dropout layers were also incorporated in the decoder, or added only in the decoder?

    For pseudo-label filtering, the four threshold hyperparameters \tau_p, \tau_n, u_p, and u_n are obviously very important for model performance. Have you studied in your experiments how these hyperparameters influence model performance? This is an important point.
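
    A minimal sketch of the four-threshold filtering in question (the names \tau_p, \tau_n, u_p, u_n follow the paper's notation; the selection logic itself is an assumption about how such a filter is typically implemented, not the authors' code):

```python
import torch

def select_pseudo_labels(p_mean, u, tau_p, tau_n, u_p, u_n):
    """Keep only pixels that are both confident and low-uncertainty.

    p_mean : mean foreground probability over T MC-dropout passes
    u      : per-pixel standard deviation (uncertainty)
    Returns hard pseudo labels and a mask of pixels to use in the loss.
    """
    pos = (p_mean >= tau_p) & (u <= u_p)   # reliable mitochondria pixels
    neg = (p_mean <= tau_n) & (u <= u_n)   # reliable background pixels
    pseudo = pos.float()                   # 1 for foreground, 0 elsewhere
    mask = pos | neg                       # ambiguous pixels are ignored
    return pseudo, mask
```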

    This paper does not compare the proposed method with previous state-of-the-art domain adaptation methods. More comparison methods should be added for a complete evaluation. For example: 1) Rectifying Pseudo Label Learning via Uncertainty Estimation for Domain Adaptive Semantic Segmentation (https://arxiv.org/abs/2003.03773); 2) Prototypical Pseudo Label Denoising and Target Structure Learning for Domain Adaptive Semantic Segmentation (https://arxiv.org/abs/2101.10979); 3) MetaCorrection: Domain-aware Meta Loss Correction for Unsupervised Domain Adaptation in Semantic Segmentation (https://arxiv.org/abs/2103.05254); 4) Content-Consistent Matching for Domain Adaptive Semantic Segmentation (https://www.ecva.net/papers/eccv_2020/papers_ECCV/papers/123590426.pdf)

  • Please state your overall opinion of the paper

    borderline reject (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Incremental novelty and weak SOTA comparison experiments.

  • What is the ranking of this paper in your review stack?

    3

  • Number of papers in your stack

    5

  • Reviewer confidence

    Confident but not absolutely certain



Review #2

  • Please describe the contribution of the paper

    The paper proposes a new method for label rectification in unsupervised domain adaptation, based on estimating the source network uncertainty through Monte-Carlo drop-out. The method is applied to the problem of mitochondria segmentation in electron microscopy images.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • interesting and conceptually simple idea. While Monte-Carlo dropout has already been used for unsupervised domain adaptation ([Kurmi et al, Curriculum-based dropout…, BMVC 2019] or [Ringwald & Stiefelhagen, BMVC 2020]), prior work does not go beyond exploiting it for feature alignment.
    • extensive evaluation of domain adaptation performance for significant domain gaps
    • well-illustrated paper with key graphics for all important points
    • a new mitochondria dataset becomes available to the community if the authors release the mitochondria annotations for FAFB.
    • although the authors only evaluate the mitochondria prediction task, the method is clearly applicable to other use cases in microscopy
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • No discussion of adversarial dropout and the corresponding domain adaptation methods [Lee et al, Drop to Adapt, ICCV 2019].
    • Minor: the language could be improved
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The MitoEM part would be reproducible, but reproduction of FAFB results would require the authors to share their annotations - a potentially important contribution to the community.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
    • I do not understand why the authors choose to use the abbreviation MitoSeg for the segmentation problem itself. Conventionally, new methods receive new names, abbreviated or not, but old problems do not. I have never seen anyone rename image classification to “ImClass” and use this term in their paper.
    • As the method is not at all mitochondria-specific, the paper would benefit from a more thorough discussion of potential applicability to other microscopy problems.
    • Minor: page 3 “standard derivation” should be “standard deviation”, “perdition” should be “prediction”. These are just two examples; there are many other typos.
  • Please state your overall opinion of the paper

    accept (8)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    A simple new idea for label propagation, contributing to the important field of unsupervised domain adaptation.

  • What is the ranking of this paper in your review stack?

    3

  • Number of papers in your stack

    5

  • Reviewer confidence

    Confident but not absolutely certain



Review #3

  • Please describe the contribution of the paper

    This paper describes a domain adaptation method for mitochondria segmentation in EM images. The essential spirit of the described method is making the neural network able to predict uncertainty, using that uncertainty to rank the pseudo labels in the new domain, and then fine-tuning the model on the new domain with the filtered pseudo labels.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    I think the main contribution of this paper is presenting a very simple yet effective high-level strategy for domain adaptation. I completely agree with the authors on the logic of this high-level strategy, but quite a few core steps under it are not fully justified in this paper; see the next section for more details.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • The assumption that “a pseudo label with lower uncertainty has a higher probability to be the correct one” (claimed to be verified in Fig 4 and Table 3) is not justified. How is the number of residual noisy labels calculated? Do you have ground truth? If so, this needs to be explained in the paper. Also, there is a binarization step in determining incorrect pseudo labels. What cutoff is used? Is it the same for the confidence-based and uncertainty-based methods? Should it be the same? If not, how is it decided? Moreover, it is obvious that the fewer pseudo labels we keep, the fewer noisy labels remain. I think an important metric would be: suppose you use a 40% rank to filter the pseudo labels; among the remaining labels, what is the percentage of false-positive labels? (A sketch of this metric follows this list.)

    • The statement “we propose an uncertainty-aware model”, or something similar, is used a lot in the paper. My understanding is that the authors claim this as part of the novelty of the paper, which I cannot agree with. First, a simple search for “uncertainty aware segmentation” already turns up a lot of prior work. Also, why choose the specific combination of Monte-Carlo dropout and UNet as the uncertainty-aware model? Why not other methods, like the Probabilistic U-Net or other uncertainty estimation approaches? I think this is worth at least some conceptual discussion or even empirical comparison.

    • Is there any reason or consideration why this method would only be applicable to mitochondria segmentation? The whole paper describes the method as if it were specific to mitochondria.
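
    To make the metric suggested in the first bullet concrete, a small sketch (assuming MitoEM ground truth is available; keep_ratio=0.4 mirrors the “40% rank” example, and the per-pixel ranking is an assumption — the paper may rank per sample):

```python
import numpy as np

def residual_noise_rate(pseudo, uncertainty, gt, keep_ratio=0.4):
    """Keep the keep_ratio fraction of pixels with the lowest uncertainty
    and report the error rate among the retained pseudo labels."""
    flat_u = uncertainty.ravel()
    k = int(keep_ratio * flat_u.size)
    keep = np.argsort(flat_u)[:k]  # indices of the most certain pixels
    return (pseudo.ravel()[keep] != gt.ravel()[keep]).mean()
```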

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    As mentioned: “we manually annotate 8 sections with a size of 1736x1736 for quantitative evaluation.” Will the annotations be released for public access, to reproduce the results in the paper? How were these manual annotations done? By experts or students? How much can we trust these annotations?

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

    This paper starts with a very impressive high-level strategy (making the neural network able to predict uncertainty, using the uncertainty to rank the pseudo labels in the new domain, and then fine-tuning the model on the new domain using the filtered pseudo labels). But the actual methods presented in the paper lack justification for certain steps, such as the point above about the correlation between low uncertainty and low error in pseudo labels. I would highly recommend the authors revisit the method step by step and provide either a valid conceptual justification or empirical experiments to support why the proposed method is the best or most reasonable choice.

    In addition, I would recommend re-phrasing the method as a general method and testing its applicability in different scenarios, not restricting it to mitochondria segmentation.

    Also, I would suggest planning the release of the code for reproducibility (not mentioned in the current submission).

  • Please state your overall opinion of the paper

    borderline accept (6)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Based on the pros and cons, and the constructive feedback described above, I consider this paper more of a technical report than a scientific research paper.

  • What is the ranking of this paper in your review stack?

    5

  • Number of papers in your stack

    5

  • Reviewer confidence

    Very confident




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This paper proposes an uncertainty-based loss to rectify labels in unsupervised domain adaptation. The method is applied to mitochondria segmentation and is shown to be effective on two datasets (MitoEM and FAFB).

    Reviewers think the idea is simple yet effective. The empirical evaluation sufficiently demonstrates that the uncertainty objective does help domain adaptation.

    While some reviewers consider the contribution technically incremental, others think the application of an uncertainty objective to label rectification is novel. The authors should clarify this in the rebuttal and discuss the citations given by R1 and R2.

    R2 thinks the release of the FAFB annotations would be an important contribution to the community. Could the authors please confirm?

    Some questions on technical details are also raised and need to be addressed in the rebuttal.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    3




Author Feedback

R1: Clarification of novelty. Reply: The use of an existing UNet and Monte-Carlo dropout does not weaken the novelty of this paper, as we are the first to study the importance of label rectification for domain-adaptive EM segmentation. We point out a new avenue for advanced domain adaptation, i.e., uncertainty-aware label rectification. This novel strategy is conceptually compatible with different uncertainty-aware models, and we propose the first simple yet effective embodiment.

R1&R2: Discussion of the given citations. Reply: We thank the reviewers for providing these related works and will add a discussion in the revised paper. R2 mentioned one paper on adversarial dropout, whose non-stochastic mechanism differs from our proposed stochastic sampling. R1 mentioned four papers and requested a comparison. However, all of them are designed for natural image segmentation, where the streetscape images are substantially different from the EM images discussed in this paper. Thus, it may not be suitable to recognize these methods as the off-the-shelf ones for EM images.

R2&R3: Confirmation of data and code release. Reply: Yes. We will make the annotations for FAFB publicly available and share the code for reproducibility.

Other questions on technical details are addressed as follows. R1: The usage of Monte-Carlo dropout. Reply: We detail the preliminaries in the first paragraph of Sec. 2, which clarifies why stochastic sampling should be used in the inference phase. R1: Is the prediction p a soft model probability? Reply: Yes. R1: Mathematical formulation of the standard deviation. Reply: We will clarify the formulation of u(p). R1: Is ‘Baseline’ the model trained on the source domain and tested on the target domain? Reply: Yes. R1: Adding dropout layers in the decoder. Reply: It is an alternative implementation of the proposed method. However, the goal of this paper is not to find an optimal implementation by grid-searching the network architecture. We empirically set up a reasonable model and pay more attention to whether and how the uncertainty objective helps adaptive mitochondria segmentation. R1: Have you studied different settings for the thresholds? Reply: Yes. In Tables 1 and 2, we studied the performance at different soft thresholds, i.e., from 20% to 80%.

R3: How is the number of residual noisy labels calculated? Do you have ground truth? Reply: Yes. We utilize the ground-truth annotations provided with the MitoEM dataset for the calculation. R3: What cutoff is used in the binarization step? Should it be the same for confidence-based and uncertainty-based methods? Reply: The cutoff thresholds are determined by ranking, i.e., the soft threshold, as detailed in the penultimate paragraph of Sec. 2. Since we adopt soft thresholds, the absolute cutoff values need not be the same. R3: Measuring the remaining labels. Reply: We believe it is not very reasonable to measure the remaining labels, since only the selected pseudo labels are adopted for fine-tuning on the target domain. R3: A conceptual discussion of other uncertainty-aware models, such as the Probabilistic U-Net. Reply: The goal of the Probabilistic U-Net is to model the annotation uncertainty from multiple labelers, in the form of a distribution. As it requires multiple annotations in the training phase, this kind of uncertainty-aware model is not applicable to the setting of label rectification for domain adaptation. R3: How were the manual annotations done? Reply: The annotations were first made by three students and then proofread by an expert.
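
A tiny illustration of the ranking-based “soft threshold” mentioned in the reply (assumed semantics: a 40% soft threshold keeps the 40% most certain pixels, so the absolute cutoff is simply a percentile of the observed values and naturally differs between the confidence-based and uncertainty-based criteria):

```python
import numpy as np

def cutoff_from_soft_threshold(uncertainty, percent=40):
    """Convert a rank-based soft threshold (e.g. 40%) into an absolute
    cutoff: the percent-th percentile of the per-pixel uncertainties."""
    return np.percentile(uncertainty.ravel(), percent)
```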

R2&R3: Potential applicability to other microscopy problems. Reply: We thank the reviewers for recognizing the merit of our proposed method. We adopt mitochondria segmentation as a representative application scenario, mainly because the available datasets support a relatively comprehensive investigation, in terms of different species and acquisition devices.




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    Reviewers generally consider the paper a solid contribution, and the concerns were reasonably addressed. I am not fully convinced by the authors’ answer regarding novelty (raised by R1). Ideally, there should be a discussion of the challenges in the biomedical image context that were not considered by previous natural-image-focused methods. But overall, the proposed method is reasonable and effective. I recommend the paper be accepted.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    8



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This paper introduces an unsupervised domain adaptation method for mitochondria segmentation, which uses Monte-Carlo dropout layers in a U-Net architecture to measure the uncertainty of network predictions and rectify pseudo labels in the target domain. The rebuttal has addressed most concerns from the reviewers, e.g., the contribution of the paper and the discussion of other related work. In addition, the authors promise that the source code and data will be released to the public. Thus, the paper is recommended for acceptance.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    6



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    After reviewing the paper, the reviews, and the rebuttal, I am not convinced by the experimental section of the paper.

    Primarily following the arguments from R1, which are highly relevant, the main criticism is that the novelty is questionable. To support this point, a number of extremely relevant publications were given. The authors claim that “However, all of them [refs] are designed for natural image segmentation, where the streetscape images are substantially different from the EM images discussed in this paper. Thus, it may not be suitable to recognize these methods as the off-the-shelf ones for EM images.”

    This argumentation is quite weak. The paper is written in a general formulation, devoid of any explicit mention of microscopy-specific information in the methodology. Yet the authors argue that the references are not applicable because they are too general. The authors should have compared their method to more appropriate baselines.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Reject

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    17


