Authors

Churan Wang, Xinwei Sun, Fandong Zhang, Yizhou Yu, Yizhou Wang

Abstract

Learning disease-related representations plays a critical role in image-based cancer diagnosis, because such representations are trustworthy, interpretable, and generalize well. A good representation should not only be disentangled from disease-irrelevant features, but also incorporate information about the lesion’s attributes (e.g., shape, margin), which are often identified first during clinical cancer diagnosis. To learn such a representation, we propose a Disentangle Auto-Encoder with Graph Convolutional Network (DAE-GCN), which adopts a disentangling mechanism guided by a GCN model in an AE-based framework. Specifically, we explicitly separate the encoded features into disease-related features and others. Although all of these features participate in image reconstruction, we employ only the disease-related features for disease prediction. In addition, to account for lesions’ attributes, we propose to leverage the attributes and adopt the GCN to learn them during training. Taking mammogram mass benign/malignant classification as an example, our DAE-GCN helps improve the performance and interpretability of cancer prediction, as verified by state-of-the-art performance on DDSM and three in-house datasets.

Link to paper

DOI: https://doi.org/10.1007/978-3-030-87240-3_5

SharedIt: https://rdcu.be/cyl5x

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    This work describes a novel Disentangle Auto-Encoder with Graph Convolutional Network, which explicitly encodes image extracted features along three hidden factors: macroscopic (e.g., qualitative descriptors), microscopic (e.g., radiomic/texture features), and irrelevant. The network was trained and validated using the DDSM and three other in-house cohorts. The approach outperformed other methods used for comparison across all datasets.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The idea of using a disentanglement mechanism to explicitly encode the image along three different factors seems novel and promising.
    • The work is demonstrated using the publicly available DDSM and three in-house datasets.
    • The approach achieved superior performance compared to baseline methods across all datasets.
    • The authors performed an ablation study and attempted to interpret the learned representation.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • The distinction between “microscopic” and “macroscopic” attributes is not clearly made: are they defined to be mutually exclusive?
    • Specific technical details about the network architecture and training strategy are lacking (details were not given in the supplementary material, although the paper states they would be).
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    One dataset (DDSM) is publicly available, and the authors have given the training/testing split. Little detail on network architecture and hyperparameter specification is provided.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
    1. It would be helpful to identify which datasets are screen-film (i.e., DDSM) versus full-field digital mammograms. The number of benign and cancer cases used during training/test should be clearly indicated.
    2. It would be helpful to report confidence intervals for the reported performance metrics, even though the difference in accuracy compared to other methods is substantial. The authors should also consider metrics such as precision/recall, particularly if a class imbalance exists.
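
    The second comment above can be illustrated with a minimal sketch of a percentile-bootstrap confidence interval for AUC, plus precision and recall at a fixed threshold. This is not the authors' evaluation code: the function names, the 0/1 label encoding, the 0.5 threshold, and the number of bootstrap resamples are all assumptions chosen for illustration.

```python
import random
from typing import List, Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels: Sequence[int], scores: Sequence[float],
                     n_boot: int = 1000, alpha: float = 0.05,
                     seed: int = 0) -> Tuple[float, float]:
    """Percentile bootstrap interval for AUC, resampling cases with replacement."""
    auc(labels, scores)  # raises early if only one class is present
    rng = random.Random(seed)
    n = len(labels)
    stats: List[float] = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi


def precision_recall(labels: Sequence[int], scores: Sequence[float],
                     threshold: float = 0.5) -> Tuple[float, float]:
    """Precision and recall when scores >= threshold are called positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

    With only a handful of test cases the bootstrap interval will be wide, which is itself informative when comparing methods on small in-house cohorts.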
  • Please state your overall opinion of the paper

    accept (8)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The work demonstrates that using a disentangling mechanism and explicitly encoding along three hidden types improves classification performance. While the work could be strengthened by providing additional details on how the network was implemented and applied to each of the four datasets, the presented results seem promising, and the investigation, including ablation and interpretability studies, is comprehensive.

  • What is the ranking of this paper in your review stack?

    2

  • Number of papers in your stack

    5

  • Reviewer confidence

    Confident but not absolutely certain



Review #2

  • Please describe the contribution of the paper

    The paper uses an auto-encoder and a GCN to disentangle the latent vector into macroscopic-related, microscopic-related, and disease-irrelevant subsets of features for breast cancer mass classification. The authors compared their results on four datasets: three in-house datasets and the publicly available DDSM. Additionally, the authors evaluated different components of their network by eliminating each module and reporting the effect on AUC for each dataset.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    1) The idea of disentangling the latent vector is nice and well established by previous work, and it brings interpretability, which is essential for medical applications, to deep learning models.
    2) The use of a GCN to incorporate and enforce certain knowledge about the disease is very interesting and useful.
    3) The provided comparisons are extensive and clearly show the improved performance of the proposed model.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    1) One of the problems that I have is the use of a plain auto-encoder: its latent space might not be continuous, so smooth interpolation becomes difficult if not impossible. A VAE does not have this problem. The authors didn’t provide any explanation for using an AE instead of a VAE.
    2) While the ability to classify masses is important, extracting the masses and feeding them separately to the network during inference is cumbersome and error-prone because it requires human intervention. Additionally, drawing the radiologist's attention to the abnormal lesion using the whole mammogram is very important and helpful. That is the reason that most recent works combine lesion detection and classification, to avoid overlooking a suspicious lesion. This issue affects the applicability of the proposed model for this specific application.
    3) One important aspect of lesion classification is classifying lesions as mass or calcification, in addition to benign or cancerous. The authors didn’t consider calcification versus mass classification, which in turn affects the applicability of the proposed model.
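
    On the first weakness above, the practical difference between an AE and a VAE comes down to two standard ingredients: the encoder outputs a mean and log-variance instead of a point, and a KL term pulls each encoded distribution toward a standard normal prior. The sketch below shows only those two pieces in plain Python; it is a generic illustration, not part of the reviewed paper, and the function names are hypothetical.

```python
import math
import random
from typing import List, Sequence


def reparameterize(mu: Sequence[float], logvar: Sequence[float],
                   rng: random.Random) -> List[float]:
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), where sigma = exp(0.5 * logvar)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]


def kl_to_standard_normal(mu: Sequence[float], logvar: Sequence[float]) -> float:
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, logvar))
```

    A plain auto-encoder has neither the sampling step nor the KL penalty, which is why nothing constrains nearby latent points to decode to similar images.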

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors didn’t provide any details about the architecture of the encoder, decoder, and GCN as well as the hyper-parameter settings. This significantly decreases the chance to reproduce the work.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

    Please refer to the weaknesses above for a detailed explanation of my suggestions below.
    a) Evaluate a VAE instead of an AE and report the results.
    b) Consider classifying lesions as calcification or mass, in addition to cancerous or benign.
    c) Adding a lesion suggestion network to extract the abnormal area from the whole image would significantly increase the effectiveness of the model for breast cancer classification.

  • Please state your overall opinion of the paper

    borderline accept (6)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Although the work has some shortcomings, noted under weaknesses, I believe the methodology for latent vector disentanglement using a GCN is sound and provides interpretability to black-box DL models. The proposed model is not limited to this application, and therefore readers can benefit from this work.

  • What is the ranking of this paper in your review stack?

    3

  • Number of papers in your stack

    5

  • Reviewer confidence

    Very confident



Review #3

  • Please describe the contribution of the paper

    The paper improves disease prediction on mammogram images using a network that enables feature disentanglement through three separate losses focusing on 1) image reconstruction using an auto-encoder, 2) attribute learning using a Graph Convolutional Network, and 3) disease prediction based on a classic feature-based prediction. Here, the losses are applied to different parts of the feature representation learned by the encoder. This process effectively splits the feature representation into three distinct parts: features relevant to the microscopic aspects of the disease depicted in the images, features relevant to the macroscopic aspects, and features irrelevant for the disease classification. The paper shows that this approach provides superior performance compared to other methods and evaluates the meaningfulness of the learned features by evaluating their corresponding classification performance and by visualizing their individual image reconstructions.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The idea of leveraging disease description attributes within a GCN setup to improve feature disentanglement for the macroscopic aspects of the image is very interesting. Although in principle taken from [4], the combination with the disentanglement aspect and the transfer into the medical domain is novel. The portion of the feature vector that is used grows successively across the three losses: the smallest part is processed within the GCN, a bigger part containing this subset is used for disease classification, and the full representation is used for reconstruction, a task that is in principle independent of the disease. This setup elegantly disentangles the features, which is also validated by performing a classification on the relevant and irrelevant parts of the feature vector and visualizing the image reconstructions, which mostly show the expected behavior. The validation of the method is also rather extensive and performed on four different datasets.
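
    The nested use of the feature vector described above can be sketched as a simple partition of one latent vector into three views. This is an illustration under stated assumptions, not the authors' implementation: the function name, the dictionary keys, the ordering of the parts within the vector, and the sizes `n_macro` and `n_micro` are all hypothetical, since the paper's architecture details are not given in this review.

```python
from typing import Dict, List, Sequence


def split_latent(z: Sequence[float], n_macro: int, n_micro: int) -> Dict[str, List[float]]:
    """Return nested views of one latent vector.

    Assumed layout (hypothetical): [macroscopic | microscopic | disease-irrelevant].
    """
    if n_macro < 0 or n_micro < 0 or n_macro + n_micro > len(z):
        raise ValueError("part sizes must be non-negative and fit inside the latent vector")
    z = list(z)
    n_disease = n_macro + n_micro
    return {
        # Smallest part: supervised by the attribute-learning (GCN) loss.
        "gcn_input": z[:n_macro],
        # Larger part that contains the GCN part: used for disease classification.
        "classifier_input": z[:n_disease],
        # Full vector: used for image reconstruction.
        "decoder_input": z,
        # Remainder: seen only by the reconstruction loss.
        "irrelevant": z[n_disease:],
    }
```

    Because the classifier never sees the trailing slice, any information stored there can only help reconstruction, which is the mechanism that pushes disease-irrelevant content into that part.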

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    In my opinion, certain parts of the paper are not explained very well. The explanation of the attribute learning setup using the GCNs in Sec. 2.2 is unclear; without knowing [4], it is a little difficult to follow. Also, Table 2, which shows the results of the ablation study, is not completely clear, e.g., L_G is introduced without being referred to anywhere in the text. It would be good if the authors could improve the clarity of their writing here. To assess the quality of the validation and estimate prediction stability, it could also be helpful to perform a cross-validation on the data instead of a single split into training, validation and testing sets.
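
    The cross-validation suggested above could look like the following minimal sketch of a stratified k-fold split, which keeps the benign/malignant ratio roughly equal across folds. The function name, the default `k = 5`, and the fixed seed are assumptions for illustration; the paper itself reports a single train/validation/test split.

```python
import random
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def stratified_folds(labels: Sequence[Hashable], k: int = 5, seed: int = 0) -> List[List[int]]:
    """Assign case indices to k folds, dealing each class out round-robin."""
    if k < 2:
        raise ValueError("k must be at least 2")
    rng = random.Random(seed)
    by_class: Dict[Hashable, List[int]] = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds: List[List[int]] = [[] for _ in range(k)]
    for y in sorted(by_class):
        idx = by_class[y]
        rng.shuffle(idx)  # randomize which cases of this class land in which fold
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds
```

    Each fold serves once as the held-out set while the remaining folds are used for training, and the spread of the per-fold metric gives a rough estimate of prediction stability.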

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors have provided a detailed description of their experimental setup as well as their evaluation metrics. The code is not provided.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

    As described in the review, it would be good to explain the mentioned parts of the methodology more clearly, so that the reader can follow the paper more easily. Specifically, the part describing the attribute learning could be improved in my opinion as well as the setup of table 2.

  • Please state your overall opinion of the paper

    accept (8)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The idea of leveraging disease description attributes within a GCN setup to improve feature disentanglement for the macroscopic aspects of the image is very interesting. This combination seems to be novel and proves effective in the evaluation.

  • What is the ranking of this paper in your review stack?

    2

  • Number of papers in your stack

    5

  • Reviewer confidence

    Confident but not absolutely certain




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The reviewers reached a consensus on the merits of this work. Please check the reviewers' detailed comments and update the paper accordingly.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    3




Author Feedback

N/A


