
Authors

Saloni Agarwal, Mohamedelfatih Eltigani Osman Abaker, Ovidiu Daescu

Abstract

Current methods for whole slide image (WSI) histopathology subregion classification and survival prediction rely on phenotype clustering from randomly sampled image tiles or on analyzing key tiles selected by experts from the much larger WSIs. These approaches do not capture the whole tissue region present in a histopathology image, and they also miss the spatial distribution of features that could be critical for good survival predictors. We propose a novel method that extracts a whole slide feature map (WSFM) in the first step and then uses it to train the survival prediction model. Specifically, we partition the WSI into tiles, and for each tile extract InceptionV3 features followed by PCA dimension reduction. The low-dimensional features of each tile are stored as the channel information in the WSFM. The resulting WSFM preserves the tile adjacency information and captures the entire tissue in the WSI. To overcome the small-size data set concern inherent to previous methods, we design a siamese survival convolutional neural network (SSCNN) that takes the WSFM and multivariate clinical features as input and predicts the survival score. We train the SSCNN using a novel loss function that combines a modified pairwise ranking loss and a bounded inverse term. The key advantages of the proposed method are that it does not require pixel-level annotations, a notorious bottleneck, and it can be easily adapted for any type of tumor without performance dependence on other parameters like the number of clusters. Experimental results demonstrate the effectiveness of the proposed SSCNN over other state-of-the-art survival analysis approaches.
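
The WSFM construction described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `build_wsfm`, the per-slide PCA fit, the 64-dimensional random vectors standing in for InceptionV3 tile features, and the 6x5 grid are all assumptions made for the example.

```python
import numpy as np

def build_wsfm(tile_features, grid_shape, n_components=8):
    """Arrange PCA-reduced tile features on the tile grid.

    tile_features: dict mapping (row, col) -> 1-D feature vector of a tissue tile.
    grid_shape: (rows, cols) of the tile grid covering the slide.
    Returns an array of shape (rows, cols, n_components); tiles without
    tissue (absent from the dict) stay zero.
    """
    coords = list(tile_features)
    X = np.stack([tile_features[rc] for rc in coords]).astype(np.float64)
    # PCA via SVD on mean-centred features (fitting per slide is a simplification).
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    reduced = Xc @ vt[:n_components].T
    wsfm = np.zeros(tuple(grid_shape) + (n_components,))
    for (r, c), vec in zip(coords, reduced):
        wsfm[r, c] = vec  # tile position is preserved, so adjacency survives
    return wsfm

rng = np.random.default_rng(0)
# Random vectors stand in for per-tile CNN features; the size 64 is arbitrary.
features = {(r, c): rng.normal(size=64)
            for r in range(6) for c in range(5) if (r + c) % 3}
wsfm = build_wsfm(features, grid_shape=(6, 5), n_components=8)
print(wsfm.shape)
```

Each grid cell holds the reduced feature vector of the tile at that position, so a convolutional network applied to the map sees neighbouring tiles as neighbouring cells.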


Link to paper

DOI: https://doi.org/10.1007/978-3-030-87240-3_73

SharedIt: https://rdcu.be/cyl6Q

Link to the code repository

https://github.com/saloniagarwal0403/SSCNN

Link to the dataset(s)

https://github.com/saloniagarwal0403/SSCNN/tree/main/DataDistribution


Reviews

Review #1

  • Please describe the contribution of the paper
    • Use of whole slide data via whole slide feature map
    • Loss function based on pair-wise ranking loss for survival CNN analysis
  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The authors present an integrated set of mechanisms for survival analysis: a whole slide feature map, a loss based on survival ranking, and a siamese network for training. The whole slide feature map addresses some weaknesses of patch-sampling-based approaches. The ranking-based loss and the siamese network allow for a larger training dataset.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • The proposed approach is presented as a general method for survival analysis with any tumor type, but the authors use glioblastoma multiforme cases for experimental evaluation. The claim on generalizability of the method is not evaluated.
    • The whole slide feature map, a core contribution of the work, is similar to the method in [4] (Jaber, M.I., et al., “A deep learning image-based …”). The authors should better articulate the differences in their approach.
    • The ranking-based loss is similar to that proposed in [5] (Jing et al., “A deep survival analysis method based on ranking …”). The authors should better articulate how their loss function differs.
    • Two other deep learning methods [16,19] are used in the experimental evaluation and comparison. The authors should have also compared with the methods by Yao et al. [17] and Wulczyn et al. [14], which are more recent deep learning approaches.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The datasets are publicly accessible. The description of the methods and experimental setup appear to be presented in sufficient detail. The results and experiments could be reproduced with some help from the authors.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

    The paper targets an important problem and presents an integrated collection of methods for survival prediction. The proposed method performs better than the other methods used in the experimental evaluation. The authors could improve the paper by addressing the following weaknesses:

    • The proposed approach is presented as a general method for survival analysis with any tumor type, but the authors use glioblastoma multiforme cases for experimental evaluation. The claim on generalizability of the method is not evaluated.

    • The whole slide feature map, a core contribution of the work, is similar to the method in [4] (Jaber, M.I., et al., “A deep learning image-based …”). The authors should better articulate the differences in their approach.

    • The ranking-based loss is similar to that proposed in [5] (Jing et al., “A deep survival analysis method based on ranking …”). The authors should better articulate how their loss function differs.

    • Two other deep learning methods [16,19] are used in the experimental evaluation and comparison. The authors should have also compared with the methods by Yao et al. [17] and Wulczyn et al. [14], which are more recent deep learning approaches.

  • Please state your overall opinion of the paper

    borderline accept (6)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The technical approach of the paper appears to be sound and the experimental evaluation shows performance improvement with the proposed method. However, the lack of evaluation with additional cancer types and of experimental comparisons with more recent methods (e.g., [14] and [17]) weakens the work.

  • What is the ranking of this paper in your review stack?

    3

  • Number of papers in your stack

    5

  • Reviewer confidence

    Somewhat confident



Review #2

  • Please describe the contribution of the paper
    • In this work, the authors propose to make use of the entire whole slide image (WSI) for feature extraction, instead of using a subset of patches across the WSI for survival prediction.
    • Clinical features are also added to the model to provide more information about the patients.
    • The multi-modality model is trained like a siamese network to make the training of larger models with smaller datasets feasible.
  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The authors look to extract and utilize all the information present in the WSI, instead of the traditional use of a subset of all patches. While this can be challenging, the authors make clever use of dimensionality reduction, in the form of PCA, in order to make the proposed model more tractable.
    • The use of training similar to siamese networks for tackling the low sample problem makes the model more feasible and is a clever solution to a natural problem in the WSI setting.
    • The authors compare their method against several other methods, including linear and deep learning models, in order to provide a comprehensive comparison.
    • The novel survival loss function enables the authors to train the siamese network, while enforcing the ordering/ranking of samples.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • It is unclear whether PCA of the concatenated features in Fig. 1 will still capture the necessary variation across the different magnification scales. Perhaps one of the scales dominates? Is some form of normalization done prior to PCA?
    • In Table 1, only one metric, the concordance index, is reported across all models. Metrics such as the integrated Brier score and time-dependent AUCs should also be included to compare and contrast the different models on more than one metric. A concordance index of 0.62 is still low, since the maximum is 1. No comments were made on why the proposed, complex model is still unable to achieve higher concordance indices (e.g. limitation of survival data).
    • Further, the concordance index is reported for only one split/run of the training process. A 5-fold cross-validation, or the variation across multiple runs, should be reported in order to perform a complete comparison.
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
    • Details of the model are provided. However, the description of the hyper-parameter tuning process is not given.
    • The memory requirements and processing times for the proposed model (which uses the entire WSI) are not noted in the paper.
  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
    • Performance of the different models across different metrics should be reported across different folds/runs to provide a better experimental comparison of the different models.
    • Given the construction of their model, the authors can explore the use of post-hoc explainability to understand which of the clinical variables are important, and which sections of the WSI are important for the survival prediction.
    • A complete description in the caption of the figures will make the paper more accessible to the general audience.
    • The use of the proposed loss function could be illustrated with examples.
    • Why does the performance of the 16 PCA method dip at 6000 training pairs?
  • Please state your overall opinion of the paper

    borderline accept (6)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper addresses an important issue: utilizing the full WSI for prediction tasks. However, the evaluation is limited: there is no performance comparison across different runs/folds, or across different metrics, which makes the results incomplete.

  • What is the ranking of this paper in your review stack?

    4

  • Number of papers in your stack

    5

  • Reviewer confidence

    Somewhat confident



Review #3

  • Please describe the contribution of the paper

    This paper proposes a method that aims to learn a whole slide feature map for survival prediction. The survival model is trained with a pairwise ranking loss and a bounded inverse term. The method is claimed to be independent of cluster numbers and pixel-level annotations.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The loss function in (1)-(3) can be treated as a novel formulation in this paper, since it provides a practical way of training a survival model with a ranking loss on censored data.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    1) The proposed WSFM can still be treated as a tile based method. Although it encodes multi-scale features, I cannot see any difference from the tile based method in terms of processing WSIs;

    2) The authors also need to show which parts really contribute to the improvement in performance. It’s hard to tell which part works from the experiments.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The work appears reproducible, provided the split of training and test data is made available.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

    Perform an ablation study on the proposed method.

  • Please state your overall opinion of the paper

    borderline accept (6)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    novelty of the proposed method; writing; experimental setting

  • What is the ranking of this paper in your review stack?

    3

  • Number of papers in your stack

    4

  • Reviewer confidence

    Very confident




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This paper proposes a survival prediction model based on the learned whole slide feature map, which is trained with a pairwise ranking loss and a bounded inverse term. The reviewers have brought up well-constructed arguments about the limitations of the paper. In the rebuttal, please clarify the key difference between the proposed method and the methods in [4, 5]; it is not clear whether the whole slide feature map in the proposed method is similar to that in [4] and whether the rank loss function is the same as that in [5]. Why not compare the proposed method with the more recent baseline methods in [14, 17]? Please clarify how each component in the proposed method contributes to the performance improvement.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    7




Author Feedback

Q1) Please clarify the key difference between the proposed method and the methods in [4, 5]; It is not clear whether the whole slide feature map in the proposed method is similar to that in [4] and the rank loss function is the same as that in [5], or not.

Individual tiles (small patches from the WSI) are studied independently in most histopathology image-based methods, including in [4]. In our proposed method we extract a multiscale feature vector for each tile, as in [4], but instead of analyzing them independently we combine the tile feature vectors to form 2-dimensional whole slide feature maps (WSFM). We use the WSFM, which captures the whole tissue, for survival prediction instead of using information from single, independent tiles.

This paper predicts the relative survivability using a Siamese network. There are subtle differences between our loss function and the one in [5]. We propose a loss function for training the Siamese network, which includes a ranking loss component (L1) that is the same as the ranking loss in [5]. The authors in [5] added an extended mean squared error term to their total loss, while we do not use the mean squared error term. Instead, we add an inverse term (L2) to obtain a loss function that increases the loss drastically if the model predicts close survival times (differing by less than a constant c) for two instances whose actual survival times differ significantly (by more than ddiff).
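
The behaviour described above can be illustrated with a small sketch. The exact equations are in the paper, not in this rebuttal, so everything below is hypothetical: the hinge form of the ranking term, the reciprocal form and cap of the inverse term, and the default values of `c`, `d_diff` (standing for ddiff) and `cap` are assumptions chosen only to reproduce the qualitative behaviour.

```python
def pairwise_loss(pred_a, pred_b, time_a, time_b,
                  c=1.0, d_diff=6.0, cap=10.0, eps=1e-6):
    """Illustrative ranking loss plus a capped inverse penalty for one pair.

    Assumes patient a is known to outlive patient b (time_a > time_b), so the
    model should score a above b. All functional forms are hypothetical.
    """
    gap = pred_a - pred_b
    # Hinge-style ranking term: zero once a is scored above b by at least 1.
    ranking = max(0.0, 1.0 - gap)
    inverse = 0.0
    # Penalise near-equal predictions only when the true times differ a lot.
    if (time_a - time_b) > d_diff and abs(gap) < c:
        inverse = min(1.0 / (abs(gap) + eps), cap)  # bounded by `cap`
    return ranking + inverse

well_separated = pairwise_loss(5.0, 1.0, time_a=30.0, time_b=10.0)
too_close = pairwise_loss(2.1, 2.0, time_a=30.0, time_b=10.0)
similar_times = pairwise_loss(2.1, 2.0, time_a=12.0, time_b=10.0)
print(well_separated, too_close, similar_times)
```

In this sketch the same pair of nearly equal predictions is penalised much more heavily when the true survival times are far apart than when they are close, which is the effect the inverse term is described as having.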

Q2) Why not compare the proposed method with more recent baseline methods in [14, 17]?

In [14], the authors train the final model using a Censored Cross-Entropy loss. With this loss, they model survival prediction as a classification problem instead of a regression or ranking problem. Since we formulate survival prediction as a regression-based ranking problem, comparing our model performance with the classification-based method in [14] is unsuitable. For example, suppose there are two instances where the actual survival time of the first instance is larger than the actual survival time of the second, and they lie in the same category after interval formation as in [14]. According to the classification-based method [14], the prediction for this pair will be considered correct since both fall in the same interval. In our problem setting, however, if the predicted survival time of the first instance is less than the predicted survival time of the second instance, the prediction is considered incorrect.
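
The example in the preceding paragraph can be made concrete with a few numbers. The interval boundaries and survival times below are invented for illustration and are not taken from [14] or from the paper.

```python
import bisect

def interval_index(months, edges):
    """Return the index of the survival interval containing `months`."""
    return bisect.bisect_right(edges, months)

edges = [12, 24, 36]            # hypothetical interval boundaries in months
true_a, true_b = 22.0, 14.0     # patient a outlives patient b
pred_a, pred_b = 15.0, 20.0     # predictions with the order reversed

# Classification view: every true and predicted time falls in the same interval.
same_interval = (interval_index(true_a, edges) == interval_index(true_b, edges)
                 == interval_index(pred_a, edges) == interval_index(pred_b, edges))
# Ranking view: the predicted order must match the true order.
correctly_ranked = (pred_a > pred_b) == (true_a > true_b)
print(same_interval, correctly_ranked)  # True False
```

Both predictions land in the correct interval, so an interval classifier scores them as correct, while the pairwise ranking criterion marks the pair as wrongly ordered.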

The DeepAttnMISL model proposed in [17] depends heavily on the choice of the number of phenotype clusters. Across a range of 6 to 10 clusters, we find a significant difference of 0.102 in the c-index value. There is no one-size-fits-all solution for the number of clusters in various tumor types, and it is a difficult but critical hyperparameter to set for the model’s success. Similarly, in our method we need to choose the number of PCA features; however, the maximum difference in the c-index value between 8 and 16 PCA features is 0.004.
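
For readers less familiar with the metric behind these numbers, a minimal sketch of a concordance index with right-censoring (Harrell's formulation) is given here. It is not the authors' evaluation code; the function name, the convention that a higher score means longer predicted survival, and the toy data are assumptions.

```python
def concordance_index(times, events, scores):
    """Harrell's c-index; a higher score is assumed to mean longer survival.

    events[i] is 1 if death was observed and 0 if the patient was censored.
    A pair is comparable when the patient with the shorter time had an event.
    """
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i] == 1:
                comparable += 1
                if scores[i] < scores[j]:
                    concordant += 1.0      # predicted order matches true order
                elif scores[i] == scores[j]:
                    concordant += 0.5      # ties count as half
    return concordant / comparable if comparable else float("nan")

times = [5, 10, 12, 20]          # toy follow-up times
events = [1, 0, 1, 1]            # second patient is censored
scores = [0.1, 0.9, 0.8, 0.4]    # toy predicted survival scores
cindex = concordance_index(times, events, scores)
print(cindex)
```

A value of 0.5 corresponds to random ordering and 1.0 to a perfect ordering of all comparable pairs, which gives context for differences such as 0.102 versus 0.004.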

Q3) Please clarify how each component in the proposed method contributes to the performance improvement.

We trained this model with the pairwise ranking loss alone, without the inverse term, and it did not converge well. A combination of the ranking loss and the inverse term, as proposed in the paper, led to convergence with good validation results. Further, we also analyzed the performance of the model using the average of the tile feature vectors instead of forming the WSFM, but simply averaging the PCA-generated tile features for a WSI resulted in low performance. After experimenting with different settings, the best validation results were obtained by combining both techniques: the WSFM construction and the training of the Siamese network with the proposed loss function.
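
One way to see what the averaging baseline discards is that the mean of the tile feature vectors does not change when tiles are rearranged, whereas a feature map does. The sketch below uses a random 4x4x8 array as a stand-in for a WSFM; the sizes and the reversal used to rearrange tiles are arbitrary choices for illustration and say nothing about the reported performance gap itself.

```python
import numpy as np

rng = np.random.default_rng(0)
wsfm = rng.normal(size=(4, 4, 8))    # stand-in feature map: 4x4 tiles, 8 channels

# Rearrange tile positions (here: reverse their order) while keeping each
# tile's feature vector intact.
flat = wsfm.reshape(-1, 8)
rearranged = flat[::-1].reshape(4, 4, 8)

# The averaged descriptor is identical, so spatial layout is invisible to it.
same_average = np.allclose(wsfm.mean(axis=(0, 1)), rearranged.mean(axis=(0, 1)))
# The feature maps themselves differ, so a CNN on the map can use layout.
same_map = np.array_equal(wsfm, rearranged)
print(same_average, same_map)
```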




Post-rebuttal Meta-Reviews

Meta-review #1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This paper proposes a survival prediction model based on the learned whole slide feature map, which is trained with a pairwise ranking loss and a bounded inverse term. The reviewers have brought up well-constructed arguments about the limitations of the paper. In the rebuttal, the authors have clearly clarified the contribution of the proposed model and analyzed its performance.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    3



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This paper proposes a method that aims to learn a whole slide feature map for survival prediction. The method is somewhat novel. The rebuttal sufficiently addresses the concerns raised by the reviewers, such as the ablation study to show the contribution of each component, and the comparison with other similar methods.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    8



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    I am not recommending acceptance. The major issue to me is the novelty compared with existing work [4,5]. The clarification in the rebuttal is not convincing enough. As R1 correctly put, this is “an integrated collection of methods for survival prediction”.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Reject

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    12


