
Authors

Marvin Lerousseau, Marion Classe, Enzo Battistella, Théo Estienne, Théophraste Henry, Amaury Leroy, Roger Sun, Maria Vakalopoulou, Jean-Yves Scoazec, Eric Deutsch, Nikos Paragios

Abstract

The vast majority of semantic segmentation approaches rely on pixel-level annotations that are tedious and time-consuming to obtain and suffer from significant inter- and intra-expert variability. To address these issues, recent approaches have leveraged categorical annotations at the slide level, which generally suffer from poor robustness and generalization. In this paper, we propose a novel weakly supervised multi-instance learning approach that deciphers quantitative slide-level annotations which are fast to obtain and regularly present in clinical routine. The extreme potential of the proposed approach is demonstrated for tumor segmentation of solid cancer subtypes. The proposed approach achieves superior performance in out-of-distribution, out-of-location, and out-of-domain testing sets.

Link to paper

DOI: https://doi.org/10.1007/978-3-030-87237-3_24

SharedIt: https://rdcu.be/cymaf

Link to the code repository

N/A

Link to the dataset(s)

https://portal.gdc.cancer.gov/

https://paip2019.grand-challenge.org/

https://digestpath2019.grand-challenge.org/


Reviews

Review #1

  • Please describe the contribution of the paper

    The authors propose an approach for segmentation of WSIs using weakly supervised MIL. The approach is compared against a supervised method, an attention-based MIL, and an alpha-beta MIL. The method achieves the highest AUC in most cases. The method can be used for weakly supervised segmentation or as a tool to suggest regions to the expert and speed up the process of pixel-wise annotation of WSIs.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Using a weakly supervised approach, a relative improvement in WSI segmentation is achieved. The authors have performed extensive tests with a large amount of data and obtain better results than the state of the art.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Judging the improvement in AUC would have been easier if the experiments had been repeated to report standard deviations.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Data is obtained from the TCGA dataset. The method and parameters are explained in the manuscript; however, since the code is not released, it would take some effort to reproduce the paper.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
    • The quality of Figure 1 could be improved considerably.

    • It would be interesting to investigate the performance of the method on different kinds of cancer. Does the model perform equally well on all 32 cancer subtypes?

    • The four parameters relevant for taking noise in the tumor percentage into account are not evaluated. Perhaps by simulating some uncertainty in the tumor percentage, the authors could evaluate their effectiveness.

  • Please state your overall opinion of the paper

    Probably accept (7)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The authors are approaching a useful and interesting topic for segmentation of WSIs, and the method seems to be a useful tool to speed up experts in pixel-wise annotation.

  • What is the ranking of this paper in your review stack?

    4

  • Number of papers in your stack

    5

  • Reviewer confidence

    Somewhat confident



Review #2

  • Please describe the contribution of the paper

    This paper proposes a general tumor segmentation algorithm for histopathology whole slide images based on weakly supervised learning. In particular, the authors seek to leverage the weak label of the percentage of tumor in a slide, which is information that is easier to come by than careful pixel-wise annotation.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The problem that the authors explore is very timely and relevant.
    • The proposed methodology is surprisingly simple, elegant and efficient.
    • The numerical results are very encouraging.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • The presentation of the method could be improved and clarified.
    • The reported results could be expanded.
    • The term “out-of-distribution” seems to be used incorrectly.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors provide extensive details on their implementation.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
    • It took me a while to convince myself that the loss that the authors propose could indeed result in the desired output. The authors should consider expanding on their explanation: e.g., the ‘signal’ of the problem still comes from having both positive and negative cases, so that for some slides p_k is indeed 0, forcing f to recognize only tumor areas.
    • I think this paper could be improved further by showing more details about the obtained model. For example: to what extent do the predicted areas agree with the labels pk?
    • The authors evaluate the generalization of the obtained models on a list of different sets, which is very insightful. However, their “out-of-distribution” error should be called “in-distribution error”. Indeed, these samples are randomly drawn from the same distribution that the training data is sampled from; thus, these are samples from the same distribution. This is simply a held-out test set, which is perfectly fine. What the authors refer to as “out-of-domain” is what is typically referred to in the literature as “out-of-distribution”, because such samples are drawn from a different distribution from that of the training set.
    • The held-out test set is drawn by keeping 5 WSI from each of the 29 sub-types - are these drawn from different cases/patients from those in the training data? I’m hoping this is the case, as otherwise patients are used for both training and testing. Please clarify this in the revised version of the manuscript.
  • Please state your overall opinion of the paper

    strong accept (9)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This is a very simple and elegant idea, very nice work.

  • What is the ranking of this paper in your review stack?

    1

  • Number of papers in your stack

    4

  • Reviewer confidence

    Confident but not absolutely certain



Review #3

  • Please describe the contribution of the paper

    The authors propose a weakly supervised method for segmenting cancer vs. normal tissue in whole slide images. The method is based on Multiple Instance Learning and uses rough percentage estimates of cancer tissue as ground truth. By selecting the highest-confidence pixels first, the method proposes a segmentation mask that is recursively improved. The proposed method obtains remarkable performance on the various test sets.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The first paragraph clearly explains the motivation of the paper.
    • The method is simple in principle, but very effective.
    • The method is architecture agnostic.
    • The train and test splits are correctly done (at the patient level), which, even though it should be the norm, sadly is not.
    • Training is done on data that is already publicly available, without requiring additional annotations (apart from those needed to test and compare with other methods).
    • Very good evaluation in general.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The main weakness of the paper might be related to the space limitations: I would have liked a more detailed explanation of the process, perhaps with a figure showing the pipeline. The literature review could also be improved by describing what others did in more depth. But again, I understand the space limitations.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The optimisation setup is well defined in Section 3.3, with more detail than one commonly finds in similar papers. However, since the train/test/validation splits are not provided, it would be hard to reproduce the experiments and obtain the same values. A cross-validation scheme would have allowed obtaining means and standard deviations for the reported values, which would enable comparison even if the splits are not the same.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

    I commend the authors for the clearly written paper and the idea of using MIL to leverage large amounts of data that are just there. There is a noticeable improvement in the performance of the proposed method, even if it is not extremely novel. I think they make a good contribution to the community.

  • Please state your overall opinion of the paper

    accept (8)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The main reasons for recommending acceptance of this paper are the importance of the problem and the excellent evaluation done.

  • What is the ranking of this paper in your review stack?

    1

  • Number of papers in your stack

    5

  • Reviewer confidence

    Very confident




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This paper proposes a general tumour segmentation algorithm for histopathology whole slide images based on the percentage of the tumour area as a weakly supervised label. The method is extensively evaluated on TCGA data with various cancer types. All reviewers consider the method to be novel, elegant, and efficient, find the evaluation comprehensive, and recommend acceptance. The authors could incorporate the reviewers’ constructive comments in the updated manuscript if possible and consider releasing their code, which would be quite helpful for the community.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    1




Author Feedback

We would like to thank the reviewers and the meta-reviewer for their detailed, constructive, and pertinent comments which improved the quality of this paper.

As raised, the space limit indeed prevents us from adding further experimental analyses, which are covered in our future work, including the understanding of: (i) the impact of the 4 parameters of the approach that consider the uncertainty in percentage annotations, (ii) the impact of approximations in annotations, and (iii) the performance per cancer subtype. Besides, we plan to perform experiments on other classes such as stromal, normal, and necrotic tissue types. Regarding further corrections of the current paper, we do agree that the “out-of-distribution” term might be inappropriate and will replace it with “in-distribution”; however, we think that the “out-of-domain” denomination for FFPE whole slide images is appropriate, since FFPE slides have a radically different appearance from the snap-frozen slides which entirely constitute the training set.

Finally, we would like to provide our insights on one remark from the reviewers, in particular reviewer #2. We do not think that “the ‘signal’ of the problem still comes from having both positive and negative cases” holds for the proposed method, although this is true for all the other benchmarked approaches (alphabetaMIL, attentionMIL), which are precisely based on those binary labels. What our approach needs is some variability in the annotated percentages: one pitfall would be when all slides have the same percentage of tumor cells, implying that the underlying segmentation system cannot be properly trained. Our formulation should be able to grab a non-random signal if there is some variability in annotations; for instance, if some slides are annotated as having 10% tumor and others 100%, then a signal can be extracted by our approach but not by competing ones, since both slides have the same label of 1. We feel that this is an important pitfall of competing methods, since there exist concrete situations where all slides are positive: for instance, in The Cancer Genome Atlas, all FFPE slides are tumoral, making the competing methods not applicable. Furthermore, in a real-world context, due to WSI size and the cost of hardware, some cancer centers choose not to keep normal slides (i.e., with no apparent tumor), which implies the same consequences.
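The percentage-variability argument above can be illustrated with a minimal sketch of percentage-based pseudo-labeling, as described in the reviews (“selecting the highest-confidence pixels first”). This is an assumption-laden illustration, not the paper’s implementation: the function names, the top-k selection rule, and the binary cross-entropy loss are all illustrative choices.

```python
import numpy as np

def percentage_mil_targets(scores, tumor_pct):
    """Illustrative sketch: derive per-pixel pseudo-labels from a
    slide-level tumor percentage. The top `tumor_pct` fraction of
    pixel scores is treated as tumor (1), the rest as normal (0).
    A slide labeled 10% and one labeled 100% thus yield different
    training targets, even though both are 'positive' as binary bags."""
    n = scores.size
    k = int(round(tumor_pct * n))          # number of pixels assumed tumoral
    order = np.argsort(scores)[::-1]       # highest-confidence pixels first
    targets = np.zeros(n, dtype=np.float64)
    targets[order[:k]] = 1.0
    return targets

def bce_loss(scores, targets, eps=1e-7):
    """Binary cross-entropy between sigmoid(scores) and the pseudo-targets."""
    p = 1.0 / (1.0 + np.exp(-scores))
    p = np.clip(p, eps, 1.0 - eps)
    return float(-np.mean(targets * np.log(p) + (1 - targets) * np.log(1 - p)))

# Example: a slide annotated as 50% tumor pseudo-labels its two
# highest-scoring pixels as tumoral.
scores = np.array([2.0, -1.0, 0.5, -2.0])
targets = percentage_mil_targets(scores, 0.5)   # -> [1., 0., 1., 0.]
loss = bce_loss(scores, targets)
```

Under this sketch, a dataset in which every slide carries the same percentage gives every slide the same target structure and hence no discriminative signal, matching the pitfall described above.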


