
# Authors

Adalberto Claudio Quiros, Nicolas Coudray, Anna Yeaton, Wisuwat Sunhem, Roderick Murray-Smith, Aristotelis Tsirigos, Ke Yuan

# Abstract

Deep learning based analysis of histopathology images shows promise in advancing the understanding of tumor progression, tumor micro-environment, and their underpinning biological processes. So far, these approaches have focused on extracting information associated with annotations. In this work, we ask how much information can be learned from the tissue architecture itself.

We present an adversarial learning model to extract feature representations of cancer tissue, without the need for manual annotations. We show that these representations are able to identify a variety of morphological characteristics across three cancer types: Breast, colon, and lung. This is supported by 1) the separation of morphologic characteristics in the latent space; 2) the ability to classify tissue type with logistic regression using latent representations, with an AUC of 0.97 and 85% accuracy, comparable to supervised deep models; 3) the ability to predict the presence of tumor in Whole Slide Images (WSIs) using multiple instance learning (MIL), achieving an AUC of 0.98 and 94% accuracy.

Our results show that our model captures distinct phenotypic characteristics of real tissue samples, paving the way for further understanding of tumor progression and tumor micro-environment, and ultimately refining histopathological classification for diagnosis and treatment.

# Link to paper

SharedIt: https://rdcu.be/cymbl

# Reviews

### Review #1

• Please describe the contribution of the paper

The authors propose an unsupervised method, a Generative Adversarial Network with an extra encoder, which allows real tissue to be projected onto the model’s latent space. The topic is interesting, and this is an attempt to partially address the increasing demand for ground truth in digital histopathology.

• Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

1). The method is semi-novel, introducing an encoder to map generated tissue back to the GAN’s latent space. 2). The authors have conducted thorough experiments on three different types of cancer to show that the latent representation learned by the encoder is distinct and informative, and the results are comparable with supervised methods in terms of AUC.

• Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

My main concern about the proposed method is how much improvement can be brought by the encoder. The authors need to justify in the experiment part, e.g. to compare the latent representation before and after adding the encoder.

• Please rate the clarity and organization of this paper

Good

• Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The link to the code and pretrained models provided in the paper is not accessible. The following error appears: The repository is not found.

• Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

1. It would be better to show the details of the encoder architecture as the encoder is the main contribution/novelty in terms of methodology.
2. The authors need to compare the latent representation before and after adding the encoder, to provide more confidence/justification in the proposed method.
3. The breast cancer data is TMA, while in Figure 2 (left), the UMAP looks like a WSI, please explain this. In addition, in Figures 2 & 4, please clarify which column/row corresponds to real images. This is unclear.
4. Figure 3 should have a quantitative evaluation to make the comparison more solid and meaningful, rather than just visualization.
5. The authors used ‘Figure’ in the main contents while ‘Fig.’ in the captions, please be consistent.

• Please state your overall opinion of the paper

borderline accept (6)

• Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The paper has addressed an interesting topic and the application of unsupervised learning would be helpful in histopathology. The authors have conducted thorough experiments on different cancers, adding values to its generalization ability. However, the key experiment to show the improvements achieved by the encoder is lacking. Further solid justification based on results is needed.

• What is the ranking of this paper in your review stack?

1

• Number of papers in your stack

6

• Reviewer confidence

Confident but not absolutely certain

### Review #2

• Please describe the contribution of the paper

The authors proposed a novel generative adversarial network with representation learning properties to effectively extract features from WSI patches of cancer tissues. The author also demonstrated that the learned representations of the tissue images captured meaningful information related to the tissues through a series of experiments.

• Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

Introducing the representation learning properties into the generative adversarial network is an interesting idea for learning effective latent representations of the tissue images. The author also demonstrated the effectiveness of the learned representations in different aspects through different experiments, including latent space visualization and tissue image reconstruction, tissue type classification, and tumor prediction.

• Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

While the author demonstrates the usefulness of the proposed method through different applications, one major weakness of this paper is the lack of comparisons with baseline and existing methods. Based on the current results, it is hard to establish the effectiveness of the proposed method compared to some widely adopted approaches. For example, some existing works directly use ResNet pre-trained with ImageNet to extract features from WSI patches.

• Please rate the clarity and organization of this paper

Very Good

• Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The reproducibility information provided in this paper is good. The authors provided a detailed description of the proposed method and the used datasets. It would be better if the authors could provide more implementation details of the proposed network.

• Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
1. For the results shown in Fig. 2, besides visually inspecting the clustering structures, the authors could also use quantitative analysis to evaluate the clustering results. For example, the authors could use the silhouette score to evaluate whether different tissue types in the colorectal cancer tissue are well separated using the learned representation.
2. Comparisons with baseline and existing methods need to be added in the results section to fully demonstrate the effectiveness of the proposed method. One simple baseline comparison could be features extracted from a ResNet that is pre-trained on ImageNet. The authors could expand the results in Fig. 2 and Table 1 by adding evaluation results of the compared methods.
3. The proposed method focused on learning meaningful representations of the tissue images without the information from labels or annotations. However, applications in Section 3.2 and 3.3 are all supervised tasks. While these results in some way could reflect the effectiveness of the learned representations, there are other existing methods that could perform these tasks in a supervised setting, probably more effectively. Therefore, it would be better if the author could provide more applications in the unsupervised setting to fully demonstrate the effectiveness of the proposed method.
4. For the application in Section 3.3, both the attention-based deep MIL network and the learned representations from the proposed method could contribute to the high accuracy and AUC values reported by the authors. Therefore, an ablation study is needed to demonstrate the contribution of the learned representations to the tumor presence prediction task.
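
The silhouette score suggested in comment 1 above can be sketched as follows. This is a minimal, dependency-free illustration and not the authors' evaluation: the function names (`euclidean`, `silhouette_score`) and the toy 2-D points standing in for latent representations are assumptions made for this example only.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (range -1 to 1)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(euclidean(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(euclidean(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Toy stand-in for latent vectors: two tight, well-separated groups.
toy_latents = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
toy_tissue = ["tumor", "tumor", "stroma", "stroma"]
print(round(silhouette_score(toy_latents, toy_tissue), 3))
```

A score close to 1 indicates that tissue types form compact, well-separated groups in the representation space; a score near 0 or below indicates overlapping or mislabeled groups. In practice the same measure would be computed on the full latent vectors rather than on a 2-D projection.
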
• Please state your overall opinion of the paper

probably reject (4)

• Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

While the authors proposed a novel method based on GAN with representation learning properties for WSI patches, the lack of comparisons with baseline and existing methods in the results section made it hard to demonstrate the effectiveness of the learned representations.

• What is the ranking of this paper in your review stack?

4

• Number of papers in your stack

6

• Reviewer confidence

Confident but not absolutely certain

### Review #3

• Please describe the contribution of the paper

An adversarial learning model is designed for representative tissue extraction in an unsupervised manner. The representations are identified with high morphological characteristics.

• Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The proposed adversarial method is implemented for three tasks, namely reconstruction, classification and MIL. The experiments on large patient cohorts reveal the efficiency of the proposed unsupervised learning model. Good visualization. The references are adequate.

• Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

A few typos e.g., ‘representation’ in contribution 2) should be ‘representations’.

• Please rate the clarity and organization of this paper

Very Good

• Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

Very convincing. The code and pre-trained model are available. The datasets used and the hyper-parameters are well explained.

• Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
1. Does ‘Total AUC’ in Table 1 represent macro-AUC or micro-AUC? Please clarify. In the same table, why are the AUC and Accuracy for Stroma missing?
2. Confusion matrix may be a better way to illustrate the classification than Table 1.
3. $G(\omega)$ in Eq (1-2) is not a probability distribution. A better way should be $G\circ M (P_z)$.
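
The distinction raised in comment 1 above (macro-AUC versus micro-AUC) can be illustrated with a small sketch. It uses only the standard library; the helper names and the toy labels and scores are assumptions made for this example and do not come from the paper's Table 1.

```python
def binary_auc(y_true, y_score):
    """AUC as the probability that a random positive outranks a random
    negative, counting ties as one half."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def macro_auc(labels, scores, classes):
    """Unweighted mean of the one-vs-rest AUC of each class."""
    per_class = []
    for k, cls in enumerate(classes):
        y = [1 if lab == cls else 0 for lab in labels]
        s = [row[k] for row in scores]
        per_class.append(binary_auc(y, s))
    return sum(per_class) / len(per_class)


def micro_auc(labels, scores, classes):
    """Single AUC over all (sample, class) decisions pooled together."""
    y, s = [], []
    for lab, row in zip(labels, scores):
        for k, cls in enumerate(classes):
            y.append(1 if lab == cls else 0)
            s.append(row[k])
    return binary_auc(y, s)


# Hypothetical imbalanced toy problem: three tumor tiles, one stroma tile.
toy_classes = ["tumor", "stroma"]
toy_labels = ["tumor", "tumor", "tumor", "stroma"]
toy_scores = [[0.9, 0.1], [0.8, 0.2], [0.4, 0.6], [0.5, 0.5]]
print(macro_auc(toy_labels, toy_scores, toy_classes))
print(micro_auc(toy_labels, toy_scores, toy_classes))
```

On this toy input the two averages differ, which is why a table entry labelled only ‘Total AUC’ is ambiguous: macro averaging weights every class equally, whereas micro averaging weights every individual decision equally and is therefore dominated by the frequent classes.
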
• Please state your overall opinion of the paper

accept (8)

• Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The model is novel. The paper is well written and organized. I firmly believe the paper should be accepted for publication.

• What is the ranking of this paper in your review stack?

1

• Number of papers in your stack

5

• Reviewer confidence

Very confident

# Primary Meta-Review

• Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

The paper proposed a novel generative adversarial network with representation learning properties to effectively extract features from WSI patches of cancer tissues. The strengths of the paper include: 1) extensive experiments; 2) using an encoder to map generated tissue back to the GAN’s latent space. The following points should be addressed in the rebuttal: 1) an ablation study of the improvement brought by the encoder; 2) a comparison with baseline and existing methods.

• What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

6

# Author Feedback

We would like to thank the reviewers and AC for their effort in providing constructive and meaningful reviews. Please find our replies to the questions raised below.

1. ‘Comparison baseline and existing methods needed’: We further provide a comparison baseline with existing methods in Section 3.2 ‘Tissue type classification over latent representations’. The text already reports the best state-of-the-art performance, a Bayesian DNN [1] with accuracy/AUC of 0.995/0.992; we will move these results to Table 1 too. We will also add the performance of an RBF-SVM [2] (accuracy/AUC 0.968/0.874) to the table. These results reflect baseline performance of supervised deep learning and non-deep learning approaches. The high performance (accuracy/AUC 0.976/0.854) of our representations on logistic regression without any transformation or projection, and the fact that they are comparable with top supervised performance, demonstrate the effectiveness of our representations. In addition, we provide another reference method as a comparison to ours in Section 3.3 ‘Multiple Instance Learning on latent representations’. The Inception-V3 network from Coudray et al. [3] was tested on the same dataset and achieves an accuracy/AUC of 0.975/0.993, comparable to our accuracy/AUC results of 0.980/0.940. Collectively, these results provide evidence that our unsupervised representations can be as competitive as state-of-the-art supervised approaches in terms of the discriminative signals they capture.

2. ‘Ablation study of the improvement of the encoder/how much improvement can be brought by the encoder’: We would like to take this opportunity to further clarify the improvements/contributions made by our encoder. The main contribution of our encoder is that it allows us to create representations of real tissue. Without the encoder, the GAN cannot quickly and effectively produce any representation of real images to perform the three tasks studied in the manuscript. The question raised about the contribution of the learned representations to the MIL classifier and the proposal of an ablation study is interesting. However, we argue that our tissue representations are already discriminative for the three separate kinds of labels. This is supported by the clear separations in the latent spaces across the three tasks. These separations are intrinsic properties of the representations, achieved without any label. The classifiers, including logistic regression and MIL, are selected to best demonstrate the discriminative signal in the representations. In the case of the attention MIL, the attention is set up to explain the prediction rather than to improve performance; in our version we removed the CNN specified by the original model and replaced it with our representations. This evidence shows that the performance is mainly due to our novel unsupervised representations, therefore already addressing the need for an ablation study.
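
As an illustration of the attention MIL set-up described in point 2, the sketch below pools precomputed tile representations into a single slide-level vector using attention weights of the form softmax(w · tanh(V h)). It is a simplified, dependency-free sketch under stated assumptions, not the model used in the manuscript: the function name `attention_pool`, the fixed (untrained) parameters `V` and `w`, and the toy 2-D vectors are hypothetical.

```python
import math


def attention_pool(instances, V, w):
    """Attention-weighted average of a bag of instance vectors.

    instances: list of d-dimensional representations (one per tile)
    V: list of hidden-unit weight rows, each of length d
    w: list with one weight per hidden unit
    Returns (bag_vector, attention_weights).
    """
    logits = []
    for h in instances:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        logits.append(sum(wj * uj for wj, uj in zip(w, hidden)))

    # Numerically stable softmax over the instances of the bag.
    top = max(logits)
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    attn = [e / total for e in exps]

    dim = len(instances[0])
    bag = [sum(a * h[k] for a, h in zip(attn, instances)) for k in range(dim)]
    return bag, attn


# Hypothetical bag: two low-signal tiles and one high-signal tile.
toy_tiles = [[0.1, 0.0], [0.0, 0.2], [2.0, 2.0]]
toy_V = [[1.0, 1.0]]  # one hidden unit, fixed by hand for the example
toy_w = [3.0]
bag_vector, weights = attention_pool(toy_tiles, toy_V, toy_w)
print(weights)
```

In a trained model `V` and `w` would be learned jointly with a classifier applied to the pooled vector, and the per-tile weights are what gives the interpretability mentioned above: they indicate which tiles drive the slide-level prediction.
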

Details of the baseline and of all network architectures of our model can be found at the anonymized GitHub link below. They will also be included in the final version as an appendix: https://anonymous.4open.science/r/Adversarial-learning-of-cancer-tissue-representations-1C87

[1] Rączkowski et al. ‘ARA: accurate, reliable and active histopathological image classification framework with Bayesian deep learning’ Scientific Reports 2019.

[2] Kather et al. ‘Multi-class texture analysis in colorectal cancer histology’ Scientific Reports 2016.

[3] Coudray et al. ‘Classification and mutation prediction from non–small cell lung cancer histopathology images using deep learning’ Nature Medicine 2018.

# Post-rebuttal Meta-Reviews

## Meta-review # 1 (Primary)

• Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

The paper proposed a generative adversarial network with representation learning properties to effectively extract features from WSI patches of cancer tissues. The rebuttal sufficiently addresses the major concern of comparison with SOTA methods and ablation study for the encoder methods.

• After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept

• What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

9

## Meta-review #2

• Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

The authors propose a novel GAN-based network to learn representations from histological images in an unsupervised manner. Apart from synthesizing images, the learnt representation is also useful for downstream tasks such as classification or MIL classification. I think the paper is generally well-written, with well-justified motivation and sufficient evaluations from multiple aspects. The reviewers’ concerns (e.g. the role of the encoder) are also addressed in the rebuttal and I therefore support paper acceptance.

• After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept

• What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

6

## Meta-review #3

• Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

The presented idea of representation learning through an encoder to map back GAN outputs is indeed quite interesting. However, with most evaluations being in the form of qualitative (UMAP) illustrations, what one can truly achieve with learned representations in a clinical setting is not apparent. There is limited evaluation, and although the results added in the rebuttal improve this situation, the comparative improvements are not significant, with even the SVM doing quite well (although it is supervised).

This paper is a narrow call for me, but it can perhaps foster discussions and ignite other ideas in histopathology circles in MICCAI.

• After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept

• What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

9