
Authors

Yuqing Liu, Weiwen Wang, Chuan-Xian Ren, Dao-Qing Dai

Abstract

Colorectal cancer (CRC) patients identified as having microsatellite instability (MSI) can receive precise targeted therapies, but existing MSI detection methods are not available to all patients due to various restrictions. The achievements of deep learning in image processing make it possible to use pathological images for MSI detection. However, traditional deep networks cannot achieve satisfactory performance because discrepancies between MSI patients reduce the generalization ability of deep learning models. Noisy labels also hinder the learning of an accurate model. To address these issues, we propose a model in a meta contrastive learning framework (MetaCon) accompanied by an attention-based feature fusion block. In MetaCon, we iteratively train a backbone with a cross-entropy loss and a contrastive loss to learn a patient-independent MSI classifier for patches segmented from pathological images. We then blend features of patches from the same patient in an attention-based way, automatically focusing on reliable patches. Finally, we make a patient-level prediction by voting. Experiments on two public datasets from The Cancer Genome Atlas show the superiority of our model over previous methods. The patient-level AUC is improved by 8% on average compared to the baseline model. Ablation studies demonstrate the effectiveness of each component in our model.

Link to paper

DOI: https://doi.org/10.1007/978-3-030-87237-3_26

SharedIt: https://rdcu.be/cymah

Link to the code repository

N/A

Link to the dataset(s)

https://doi.org/10.5281/zenodo.2530835

https://doi.org/10.5281/zenodo.2532612

Reviews

Review #1

Please describe the contribution of the paper

The manuscript introduces a meta-learning framework for the classification of microsatellite instability (MSI) from the H&E whole-slide images of patients with colorectal cancer. The approach performs cross-entropy-based binary classification of MSI in the meta-training stage using a regular CNN, and in the meta-testing stage the classifier from the meta-training stage is used as a feature extractor on which a projection head is trained with contrastive loss. Features extracted by both regular CNN and the contrastive learning are combined by an attention fusion block to make WSI level predictions. Comparison with other multiple-instance learning (MIL) approaches and ablation studies are conducted.
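
To make the contrastive objective mentioned above concrete, the following is a minimal sketch of a generic supervised contrastive loss over patch embeddings. It is an illustration only: the reviews do not state the exact loss used in the paper, and the function name `supervised_contrastive_loss`, the `temperature` value, and the use of patch-level MSI labels to define positive pairs are all assumptions.

```python
import math


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss (hypothetical helper, not the paper's code).

    embeddings: list of equal-length float vectors (e.g. projection-head outputs).
    labels: list of class labels, one per embedding.
    """
    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / norm for x in v]

    z = [normalize(v) for v in embeddings]
    n = len(z)
    total, n_anchors = 0.0, 0
    for i in range(n):
        # Temperature-scaled cosine similarity of anchor i to every other sample.
        sims = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n) if j != i
        }
        positives = [j for j in sims if labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without a same-label partner contributes nothing
        # Numerically stable log-sum-exp over all non-anchor samples.
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        n_anchors += 1
    return total / n_anchors if n_anchors else 0.0


if __name__ == "__main__":
    toy = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    print(supervised_contrastive_loss(toy, [1, 1, 0, 0]))  # labels agree with geometry
    print(supervised_contrastive_loss(toy, [1, 0, 1, 0]))  # labels contradict geometry
```

In this toy run the loss is much larger when the labels contradict the embedding geometry, which is the mechanism behind the concern raised later in this review: if patch-level labels are wrong, a loss of this kind pushes the features toward the wrong grouping.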
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1. The clinical motivation is stated clearly.
2. The task assignment for meta-learning and meta-testing is very interesting.
3. Experiment results are strong when compared with previous approaches.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
1. The motivation of using meta-learning is not strong enough.
2. The patch-level label accuracy is not justified, and the sampling strategy in training is ambiguous.
3. It is not fully convincing that contrastive learning is necessary for solving the MIL problem.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The method is described clearly and details on data preparation are also provided. Given the nature of the meta-learning model, more details would be helpful, but in general the manuscript has an adequate level of reproducibility for readers who are familiar with the topics.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
The strengths of the manuscript include the following:
1. The manuscript aims to solve a problem with clear clinical impact, as MSI is used as a regular biomarker in classifying patients and assessing treatment options for patients with colorectal cancer.
2. The choice to work on datasets from TCGA allows a fair evaluation against previous methods and increases the reproducibility of the proposed approach.
3. The integration of regular cross-entropy-based classification with contrastive learning in a meta-training and meta-testing setup is a very interesting idea and has a good level of novelty.
4. The attention fusion improves the performance, as shown in the ablation studies.
5. Experiments have demonstrated superior performance by the proposed approach when compared with previous MIL approaches.
The reviewer would like to see the following questions be addressed:
1. The authors claim that eliminating the discrepancies between patients is the major motivation of the proposed approach, but it is unclear whether meta-learning or contrastive learning is the basis for addressing the issue. Since contrastive learning is known for its ability to learn more robust features, the use of meta-learning needs more justification. It would also be nice if the discrepancies could be demonstrated quantitatively.
2. The sampling strategy in meta-training and meta-testing needs more clarification: the positive-to-negative patient ratio is described as 1:6, while the positive-to-negative ratios at patch level in meta-training and meta-testing are both 1:1. Please provide details on how the ratios were established. This needs more clarity especially because it is unknown from the reference paper [5] how accurate the patch-level labels are.
3. It is unknown whether contrastive learning or meta-learning contributes the most to the improvement in performance. Another experiment may need to be conducted in which a regular CNN with cross-entropy loss is trained in both meta-training and meta-testing. Such an experiment could be important since contrastive learning aims to expand the difference between positive and negative samples, while in MIL for digital pathology the patch-level labels are not always credible. Sometimes a large portion of negative patches on a WSI could be labeled as positive due to the nature of subject-level labeling. Whether contrastive learning based on weak labels could reinforce false labeling would be worth exploring.
4. The histogram in Figure 3 conveys similar information to that in Figure 2.
5. It would be useful to visualize the patches with their predicted probabilities, associated with the effect of attention mechanism.
Please state your overall opinion of the paper

borderline accept (6)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The proposed framework, especially the idea of assigning regular CNN classification and contrastive learning tasks to meta-training and meta-testing, respectively, is novel and could have more applications in the field. However, more justification is needed for such an approach to be suitable for the multiple-instance learning problem, where labels for training samples could be highly unreliable.
What is the ranking of this paper in your review stack?

2
Number of papers in your stack

5
Reviewer confidence

Very confident

Review #2

Please describe the contribution of the paper

This study attempts to overcome current limitations in predicting whether patients with colorectal cancer have microsatellite instability (MSI). Based on histological images, the investigators proposed a new model, a meta contrastive learning framework (MetaCon), accompanied by an attention-based feature fusion block. The goal was to: 1) alleviate inter-patient discrepancies; and 2) remove the noisy image patches automatically, to improve accuracy. Finally, the model provided patient-level predictions by majority voting. Experiments on two public datasets from The Cancer Genome Atlas appeared to show an increased performance of the proposed model over previous methods, with patient-level AUC improved by 8% on average compared to the baseline model. Ablation studies tested the impact of the proposed model components.
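
As an illustration of the patient-level majority voting described above, here is a minimal sketch that aggregates patch-level MSI probabilities into one prediction per patient. The function name `patient_level_vote`, the 0.5 patch threshold, and the rule that ties count as negative are assumptions made for this example; the reviews do not specify these details.

```python
from collections import defaultdict


def patient_level_vote(patch_probs, patient_ids, threshold=0.5):
    """Return {patient_id: (predicted_label, fraction_of_positive_patches)}.

    Hypothetical helper: each patch casts a binary vote, and a patient is
    called MSI (1) only if strictly more than half of the patches vote MSI.
    """
    votes = defaultdict(list)
    for prob, pid in zip(patch_probs, patient_ids):
        votes[pid].append(1 if prob >= threshold else 0)

    result = {}
    for pid, patient_votes in votes.items():
        frac = sum(patient_votes) / len(patient_votes)
        # Strict majority: a tie is resolved as negative (an assumption).
        result[pid] = (1 if frac > 0.5 else 0, frac)
    return result


if __name__ == "__main__":
    probs = [0.9, 0.8, 0.2, 0.1, 0.4, 0.6]
    ids = ["A", "A", "A", "B", "B", "B"]
    print(patient_level_vote(probs, ids))
```

The fraction of positive patches can also serve as a continuous patient-level score, which is one common way to obtain a patient-level AUC from patch-level predictions.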
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

1) Model development and testing used 2 large datasets, each with > 300 patients in training, and >100 patients in testing, which helps increase the reliability of the method.
2) The proposed model architecture is new, and results have shown improvement over ‘baseline’ approaches according to ablation studies and over several existing methods.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

1) The implementation details are not intuitive to understand, including the various equations, especially as they are not explained in depth.
2) The proposed method has shown the highest values in accuracy and AUC, but only by a small margin: approximately 0.03 to 0.06 in accuracy, compared to the baseline method with contrastive loss only. Statistical analysis may not show significance.
Please rate the clarity and organization of this paper

Satisfactory
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The methodologies are not easy to follow and therefore likely difficult to reproduce. There is no indication of the availability of code either.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

This paper includes some interesting ideas in the design of the proposed model, which may eventually help address the current challenges in identifying whether colorectal cancer patients have MSI, thereby improving treatment outcomes. However, several areas deserve further clarification and updates.

1) Fig. 1 shows the overall design of the model, but it is not easy to follow, particularly part b. In the figure caption, using the symbols alone can add confusion.

2) On page 4, under ‘Meta-Learning Framework’, it says that “…discrepancies between patients leads to the insufficient generalization ability of common deep learning models”. Aren’t machine learning/deep learning techniques designed to address inter-individual discrepancy? Please clarify.

3) On the same page (page 4), most of the equations are not explained, including some newly introduced symbols (e.g. theta, equations #1 and #2). Likewise, in the caption of Fig. 2, ‘t-SNE’ is not defined.

4) On page 5, it says that “…And the weight are determined by two factors: a response with the patch under reconstruction and a score of MSI probability.” It is unclear what ‘response’ means here, despite the inclusion of an ‘explanatory’ sentence right after it. In addition, on the same page, is ‘ReLU’ the process used to remove noisy labels? Please mention it explicitly if so.

5) On page 6, this sentence is unclear: “In feature fusion, with F(theta) fixed, the patch-level classifier C(alpha) and the projection operator P(beta); P(beta-bar) were fine-tuned for …”. The projection operator does not seem to be defined explicitly either.

6) At the bottom of page 6, it says “MetaCon achieved … and had a huge improvement compared to other methods”. ‘Huge’ may be overstated.
Please state your overall opinion of the paper

borderline reject (5)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The proposed method seems to be much more complex than some existing methods (e.g. Kather’s study), but the performance is only slightly higher, especially on the CRC-KR dataset (mean AUC was 0.87 for the current model versus 0.84 for Kather’s). In addition, the implementation details are not intuitive to understand, and most of the equations are not explicitly explained.
What is the ranking of this paper in your review stack?

4
Number of papers in your stack

5
Reviewer confidence

Confident but not absolutely certain

Review #3

Please describe the contribution of the paper

The authors proposed MetaCon, a novel method for MSI detection using pathological images based on meta contrastive learning. Specifically, a contrastive loss was introduced to learn a patient-independent feature extractor and an adaptive MSI classifier. A feature fusion block was also proposed to deal with noisy patch-level labels and effectively aggregate patches from the same patient for patient-level MSI prediction. MetaCon outperformed existing methods, and extensive ablation studies were performed to demonstrate the contribution of contrastive learning and the feature fusion block.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1. The authors proposed a novel meta contrastive learning framework that could learn patient-independent patch-level features while effectively distinguishing patches from different categories. The authors further demonstrated that learning patient-independent features could lead to improvement in MSI detection from WSIs.
2. One issue when utilizing WSIs is that the patch-level labels may not be consistent with the patient-level labels. While this is often addressed through multiple instance learning methods, the authors proposed a novel feature fusion block to address the noisy patch-level labels. The effectiveness of the proposed feature fusion block was further demonstrated through ablation studies.
3. The authors performed extensive experiments to demonstrate the superiority of the proposed method. Specifically, the authors provided important visualization and discussions for effective interpretation of the results.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

The meta contrastive learning framework and the feature fusion block are the two key components in the proposed method. While the authors provided useful analysis for understanding the contrastive learning part in Fig. 2 and Fig. 3, the authors could also provide intuitive visualizations to help understand the feature fusion block.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

Overall the reproducibility information is good. The authors provided detailed introduction of the proposed framework and provided important implementation details. The description of the datasets is also clear.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
1. While the authors already demonstrated the effectiveness of the feature fusion block through ablation studies, it would be better if the authors could include analysis to help readers understand the feature fusion block more intuitively. For example, for a given patch, what do patches with high weights look like visually compared with patches with low weights?
2. In the ablation study, what was the difference between ConL and MetaCon w\o FB? Specifically, is the contrastive loss in ConL calculated from the features after the feature extractor F? In Figure 3, could the authors provide feature similarity visualizations of MetaCon w\o FB to further demonstrate the difference between the proposed meta contrastive learning framework and ConL?
3. While the authors used a fixed training/testing split, there was still a large difference between the results from the bootstrap sessions, as indicated by the confidence interval. While the confidence interval of MetaCon seems to be smaller than those of other methods, it would be helpful if the authors could provide a brief discussion of the difference between the sessions with lower and higher performance. It might be helpful for understanding the robustness of the proposed framework.
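
For readers unfamiliar with the bootstrap confidence intervals discussed in the last comment, the following is a minimal sketch of a percentile bootstrap for a patient-level AUC on a fixed test set. It is a generic illustration under stated assumptions: the reviews do not describe how the authors ran their bootstrap sessions, and the helper names `auc` and `bootstrap_auc_ci`, the resampling of test patients with replacement, and the default of 1000 resamples are all hypothetical choices.

```python
import random


def auc(labels, scores):
    """Rank-based AUC: chance that a random positive outscores a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    # Ties between a positive and a negative score count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for AUC, resampling patients with replacement.

    Assumes `labels` contains both classes; otherwise no valid resample exists.
    """
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # redraw when a resample contains only one class
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


if __name__ == "__main__":
    y = [0, 0, 1, 1, 0, 1, 0, 1]
    s = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.5, 0.9]
    print(auc(y, s), bootstrap_auc_ci(y, s, n_boot=500))
```

Because the resampling is over test patients only, the width of such an interval reflects test-set sampling variability rather than variability from retraining the model.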
Please state your overall opinion of the paper

accept (8)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The authors proposed a method based on the meta contrastive learning framework and the feature fusion block for MSI detection from WSIs. The proposed method demonstrated novel technical contributions. The authors demonstrated the effectiveness of the proposed method and its key components through extensive evaluations.
What is the ranking of this paper in your review stack?

1
Number of papers in your stack

6
Reviewer confidence

Confident but not absolutely certain

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

The authors proposed MetaCon, a novel method for MSI detection using pathological images based on meta contrastive learning. The strengths of the paper include: 1) the novel contrastive learning framework for patch level feature learning; 2) extensive experiments on two large data sets; 3) good clinical motivation; 4) clear writing.
What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

1

Author Feedback

We would like to thank the reviewers for their valuable comments and constructive suggestions.

R#1-Q4.1: The motivation of using meta-learning is not strong enough.

Patches from different patients are assumed to follow different distributions because of discrepancies between patients. By dividing the patients in the training set into meta-train and meta-test sets, we hope the meta-learning process can simulate the distribution shift between patients.

R#1-Q4.2: The patch-level label accuracy is not justified, and the sampling strategy in training is ambiguous.

Because of the way labels are assigned, patch-level labels are not reliable. The patient-level ensemble strategy is intended to reduce the impact of unreliable patches. The sampling ratio is designed to balance positive and negative samples at the patch level, based on the actual proportion of MSI patients. For example, we randomly select one positive patient and six negative patients in one epoch. Then we randomly select 256 patches from the positive patient and 256 patches from the six negative patients.

R#1-Q7.3: It is unknown whether it is contrastive learning or meta-learning that contributes the most to the improvements on the performance.

In MetaCon, we recast the two steps of contrastive learning into two phases of meta-learning. We compared MetaCon w\o FB and ConL in the ablation study, which shows that using meta-learning and contrastive learning together is better than using contrastive learning alone. However, we did not evaluate the effect of using meta-learning alone. It is worth adding this ablation experiment later. Thanks for your constructive and insightful comments.

R#2-Q7.2: In page 4, under ‘Meta-Learning Framework’, it says that “…discrepancies between patients leads to the insufficient generalization ability of common deep learning models”. Aren’t machine learning/deep learning techniques designed to address inter-individual discrepancy? Please clarify.

Individual differences are objective and ubiquitous in medical images. In general machine learning/deep learning, the assumption that the training and testing data are independent and identically distributed may not hold in this scenario, because patches from different patients are more likely to come from different distributions; as a result, general methods generalize poorly. By dividing the patients in the training set into meta-train and meta-test sets, we hope the meta-learning process can simulate the distribution shift between patients.

R#2-Q7.3: In the same page (page 4), most of the equations are not explained, including some newly introduced symbols (e.g. theta, equations #1 and #2). Likewise, in the caption of Fig. 2, ‘t-SNE’ is not defined.

Thank you for pointing this out. Theta and alpha denote the learnable parameters of the feature extractor F and the classifier C, respectively. t-SNE is a widely used method for dimensionality reduction, and we will add it to the references.

R#2-Q7.4: It is unclear what ‘response’ means here, despite the inclusion of an ‘explanatory’ sentence right after it. In addition, in the same page, is ‘ReLU’ the process used to remove noisy labels? Please mention it explicitly if so.

The response indicates a certain relationship between the two patches, which is calculated by a network. ‘ReLU’ excludes some ‘noisy labels’ based on the principle of ‘the lower the probability is, the less reliable’.

R#2-Q7.6: In the bottom of page 6, it says “MetaCon achieved … and had a huge improvement compared to other methods”. ‘Huge’ may be overstated.

Thank you. We will correct that.

R#3-Q7.2: In the ablation study, what was the difference between ConL and MetaCon w\o FB?

In ConL, we train the feature extractor F followed by the projection operator P with the contrastive loss. We then replace P with a classifier C, fix the parameters of F, and fine-tune C. In MetaCon w\o FB, we used the contrastive loss in a meta-learning framework.
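
The sampling example given in the response to R#1-Q4.2 can be sketched as follows. This is one reading of that description, not the authors' implementation: the function name `sample_balanced_patches`, the pooling of negative patches across the six selected patients before sampling, and sampling without replacement are assumptions.

```python
import random


def sample_balanced_patches(patches_by_patient, labels,
                            n_neg_patients=6, n_per_class=256, seed=None):
    """Draw one class-balanced set of (patch, label) pairs for an epoch.

    patches_by_patient: {patient_id: [patch, ...]}
    labels: {patient_id: 1 for MSI, 0 otherwise}; patches inherit this label.
    """
    rng = random.Random(seed)
    pos_ids = [p for p, y in labels.items() if y == 1]
    neg_ids = [p for p, y in labels.items() if y == 0]

    # Patient ratio 1:6, mirroring the stated proportion of MSI patients.
    pos_patient = rng.choice(pos_ids)
    neg_patients = rng.sample(neg_ids, n_neg_patients)

    pos_pool = patches_by_patient[pos_patient]
    # Assumption: negative patches are pooled across the selected patients.
    neg_pool = [x for p in neg_patients for x in patches_by_patient[p]]

    # Patch ratio 1:1; cap at the pool size if a pool is small.
    pos = rng.sample(pos_pool, min(n_per_class, len(pos_pool)))
    neg = rng.sample(neg_pool, min(n_per_class, len(neg_pool)))
    return [(x, 1) for x in pos] + [(x, 0) for x in neg]


if __name__ == "__main__":
    patches = {f"p{i}": [f"p{i}_{k}" for k in range(300)] for i in range(10)}
    labels = {f"p{i}": (1 if i < 2 else 0) for i in range(10)}
    batch = sample_balanced_patches(patches, labels, seed=0)
    print(len(batch), sum(y for _, y in batch))
```

Under this reading, each negative patient contributes roughly a sixth of the negative patches, so the patient ratio stays at 1:6 while the patch ratio is 1:1.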


MetaCon: Meta Contrastive Learning for Microsatellite Instability Detection