
Authors

Rodrigo Bonazzola, Nishant Ravikumar, Rahman Attar, Enzo Ferrante, Tanveer Syeda-Mahmood, Alejandro F. Frangi

Abstract

Prospective studies with linked image and genetic data, such as the UK Biobank (UKB), provide an unprecedented opportunity to extract knowledge on the genetic basis of image-derived phenotypes. However, the extent of phenotypes tested within so-called genome-wide association studies (GWAS) is usually limited to handcrafted features, where the main limitation to proceed otherwise is the high dimensionality of both the imaging and genetic data. Here, we propose an approach where the phenotyping is performed in an unsupervised manner, via autoencoders that operate on image-derived 3D meshes. Therefore, the latent variables produced by the encoder condense the information related to the geometry of the biological structure of interest. The network’s training proceeds in two steps: the first is genotype-agnostic and the second enforces an association with a set of genetic markers selected via GWAS on the intermediate latent representation. This genotype-dependent optimisation procedure allows the refinement of the phenotypes produced by the autoencoder to better understand the effect of the genetic markers encountered. We tested and validated our proposed method on left-ventricular meshes derived from cardiovascular magnetic resonance images from the UKB, leading to the discovery of novel genetic associations that, to the best of our knowledge, had not yet been reported in the literature on cardiac phenotypes.

Link to paper

DOI: https://doi.org/10.1007/978-3-030-87240-3_67

SharedIt: https://rdcu.be/cyl6K

Link to the code repository

N/A

Link to the dataset(s)

N/A

Reviews

Review #1

Please describe the contribution of the paper

This paper proposes an unsupervised deep learning method to study the genetic basis of CMR-derived phenotypes. The method consists of two steps. The first step is to extract intermediate representations of imaging data, and the second step is to study the association between these intermediate variables and the genetic data.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

This paper proposed an unsupervised learning method to extract representations of the imaging data, and then used the resulting phenotype data to conduct a GWAS to analyze the association between imaging and genetics.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
1. The motivation is not clearly presented. It looks hard to interpret the extracted imaging feature.
2. There were many grammatical mistakes.
Please rate the clarity and organization of this paper

Poor
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

Fair.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

Please clearly present the motivation of this work. Do you want to identify new risk loci? Or do you want to extract good intermediate imaging features? As a deep learning method, how does its performance compare to conventional GWAS?
Please state your overall opinion of the paper

strong reject (2)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

Confident.
What is the ranking of this paper in your review stack?

8
Number of papers in your stack

3
Reviewer confidence

Very confident

Review #2

Please describe the contribution of the paper

The manuscript proposed an innovative autoencoder model based on graph convolutional neural networks and optimized the model using a collaborative approach.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

It is a very innovative approach to integrate imaging and genetics using an autoencoder with collaborative learning, as presented in Fig 1. The authors described the motivation and contribution of the proposed model well.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

It would be better if the authors presented richer comparisons with several baseline algorithms, such as CCA or PCA.
Please rate the clarity and organization of this paper

Excellent
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The paper clearly described their implementations.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

The paper is overall nicely written with interesting ideas. However, there are several questions I had that could help me better assess the value of the proposed methods, especially in the experiment. There’s no need for running additional experiments for any of my questions.
Please state your overall opinion of the paper

strong accept (9)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The proposed model is very innovative. In the manuscript, the author proposed a novel cost function to integrate imaging and genetics. The motivation of the cost function was very concrete.
What is the ranking of this paper in your review stack?

1
Number of papers in your stack

3
Reviewer confidence

Very confident

Review #3

Please describe the contribution of the paper

The paper presents an auto-encoder of 3D surface mesh models of the heart’s left ventricle that is simultaneously optimized to improve the odds of a significant correlation in a Genome-Wide Association Study. The authors train the encoder in two stages: first, by using graph convolutions on the mesh based on the graph Laplacian, and second by adding a term expressing preference for encoding components that have greater correlation with a selected set of SNPs. The authors train, validate, and test the pipeline on 29000 subjects from the UK Biobank, claiming to discover a new heart-related SNP.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The paper addresses an important area in imaging genetics, i.e. maximizing discoverability. GWAS is a bit old-hat, but remains a useful screening tool for yet-undiscovered variants, knowledge of which can be passed on to basic science. The choice of spectral embedding with a Chebyshev polynomial kernel is reasonable, and the pre-processing of the data appears to be done thoroughly and well. The learning & GWAS study is done on a sufficiently large sample, and the validation of the method is solid, with varying seeds for splitting the data and assessing the p-value distribution for each of a set of hyperparameters. That the authors connect their putative discovery with an interpretable trait (sphericity) by way of the auto-encoder component is also a bonus.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

The novelty of the method is somewhat limited, although this can be forgiven, given the goal of improving GWAS discovery. The main weakness of the method is that it relies on optimizing the encoder components’ correlation with SNPs that are already sufficiently strongly correlated to pass Bonferroni correction. This type of approach appears circular in its reasoning. Although not circular in the statistical/ML sense, this is still a recipe for spurious findings. To properly optimize complex traits for “discoverability,” the process has to be agnostic to specific variants, especially from the very same dataset. More sensible approaches target general heritability, defined for example by GCTA https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3014363/. Even approaches like this https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5946904/ use overall GWAS from external data to define “more discoverable” traits. The most likely outcome here is that the same variant would be putatively discovered without the second (genetic) stage of the auto-encoder training. As a more minor point, the autoencoder comprises only 4 components in the end (after stage 1), which makes the entire procedure seem somewhat unnecessary, given the low dimensionality to begin with. Also, it seems the initial auto-encoding was done on the entire dataset at once (although this is not totally clear from the algorithm outline), which would obviously bias the results further.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

Authors include code in the supplement & describe the tools for pre-processing. Data is UKBB, so can be accessed with some effort by any researcher.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

See weaknesses
Please state your overall opinion of the paper

borderline reject (5)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

At the end, the paper addresses an important issue, but the simplistic approach to the genomics aspect makes this a work of limited value.
What is the ranking of this paper in your review stack?

3
Number of papers in your stack

3
Reviewer confidence

Very confident

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

The reviewers recognize the interest of the presented approach to enhance discoverability of relevant imaging-genetics associations. The presented autoencoding framework was found sound, while the experimental assessment on the UK Biobank is compelling. However, the paper seems to suffer from a lack of clarity in illustrating the benefits of the proposed two-step approach over conventional GWAS analysis. There seems to be an issue with the circularity of the analysis, as both screening and validation of variants are performed on the same dataset.
What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

5

Author Feedback

We thank all the reviewers for their comments. Overall, they seem to concur on the importance of this emerging field and our general approach/novelty. The main comments to be addressed concern circularity and comparison to older statistical methods. To clarify for Reviewer-1, the training of our method consists of three steps (not two, as stated in the review): the first step extracts intermediate representations of imaging data, the second step studies the association between the shape features and the genetic data using GWAS, and the third step incorporates the salient SNP found back into the estimation of shape features through a new loss function. Finally, during inference, we show the improvement in the strength of the association of the updated shape features with the genomic loci.
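
The third training step described above can be illustrated with a minimal sketch. It assumes the genotype-dependent term is the negative squared Pearson correlation between one latent variable and the SNP dosage, scaled by a weight `w_snp` (standing in for the w_{SNP} mentioned later in the rebuttal). The exact form of the loss used in the paper is not stated on this page, and the function names and toy numbers below are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def genotype_aware_loss(recon_error, latent, dosage, w_snp):
    """Assumed loss: reconstruction error minus w_snp times the squared correlation.

    Minimising it rewards latent values that track the SNP dosage (0, 1 or 2
    copies of the effect allele). With w_snp = 0 it reduces to the
    genotype-agnostic reconstruction objective.
    """
    r = pearson(latent, dosage)
    return recon_error - w_snp * r ** 2

# Toy batch: one latent variable per subject and that subject's SNP dosage.
recon_error = 0.20
latent = [0.1, 0.4, 0.5, 0.9, 1.1, 1.4]
dosage = [0, 0, 1, 1, 2, 2]

loss_agnostic = genotype_aware_loss(recon_error, latent, dosage, w_snp=0.0)
loss_genotype = genotype_aware_loss(recon_error, latent, dosage, w_snp=0.5)
print(loss_agnostic, loss_genotype)
```

In a real training loop the same term would be computed per mini-batch on differentiable tensors, so that gradients flow back into the encoder; the plain-Python version only shows how the two objectives combine.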

Reviewer-2: Although it was not clear from the comment, we assume the comparison reference to PCA was for the shape representation and CCA was for the purpose of GWAS association. Our prior work using statistical methods such as PCA instead of the graph convolutional mesh revealed no meaningful GWAS associations. Since the mesh encoder captures the relationship between voxels inherent in the finite element mesh modeling, the resulting embedding is more suitable for finding the associations in GWAS. Our focus was on discovering relevant shape features associated with genetic loci without pre-programming GWAS with selected features as done in conventional GWAS. However, it is possible that CCA will be useful for GWAS itself although sparse sample graph-based CCA on such large dimensions will be computationally expensive requiring matrix deflation techniques.

Reviewer-3: We emphasize that, to our knowledge, our paper is the first work to extract deep shape features for genetic discovery using unsupervised geometric deep learning. It makes two specific novel contributions: 1) use of variationally regularized convolutional mesh autoencoders on graphs for the purpose of discovery of genetic associations of shape features. 2) using a genome-dependent loss function to optimize the shape representation for genetic association. Previous work on 3D facial meshes relied on more traditional ML approaches like hierarchical clustering and PCA on shape for GWAS association.

On the circularity question, we clarify that the purpose of the second training step is to find a refined representation (i.e. a better phenotype) for association with genetics, one that stresses the morphological changes that lead to an increase in this association. It was not intended to discover new genetic loci. We found that a term enforcing correlation of a latent variable with SNP dosage is indeed effective in improving the association in an independent sample. This is not obvious a priori due to the small effect of single genetic variants on the phenotype. Finally, there is no circularity in the statistical/ML sense, since no optimization is performed on the GWAS sample. We can assume a uniform distribution of p-values under the null hypothesis of no effect, which in turn allows us to perform hypothesis testing in the usual way without increasing the risk of spurious findings. While the improvement in association is modest, it is still statistically significant as shown in Fig. 3, with w_{SNP}=0 corresponding to no optimization; this supports its inclusion in the paper. We believe that by taking special measures during training, as well as using a larger sample size, the optimization can be further improved. We further argue that in a more general scenario, this refinement could also help find more true-positive associations for the case of a polygenic trait. We also clarify that the testing of the improved association (i.e. in the final GWAS) was performed on a patient dataset disjoint from the one used for initial encoding, thus avoiding the bias due to overlap of datasets. The paper will be thoroughly revised for any grammatical errors in the final version.
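
The statement about uniformly distributed p-values under the null can be checked with a toy simulation. This is a sketch under stated assumptions and not the paper's GWAS: it draws a latent variable independently of a simulated SNP dosage and tests their correlation with a Fisher z-transform, which is an approximation. The minor allele frequency, sample sizes and function names are all hypothetical.

```python
import math
import random

def correlation_p_value(x, y):
    """Two-sided p-value for zero correlation, via the Fisher z-transform (approximate)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 1.0  # a constant variable carries no evidence of association
    r = cov / math.sqrt(var_x * var_y)
    r = max(-0.999999, min(0.999999, r))  # keep atanh finite
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))

def simulate_null_p_values(n_tests=2000, n_subjects=200, maf=0.3, seed=0):
    """P-values for a latent variable that is independent of the SNP dosage."""
    rng = random.Random(seed)
    p_values = []
    for _ in range(n_tests):
        # Dosage = number of minor alleles over two independent draws.
        dosage = [(rng.random() < maf) + (rng.random() < maf)
                  for _ in range(n_subjects)]
        latent = [rng.gauss(0.0, 1.0) for _ in range(n_subjects)]
        p_values.append(correlation_p_value(latent, dosage))
    return p_values

p_values = simulate_null_p_values()
fraction_below_005 = sum(p < 0.05 for p in p_values) / len(p_values)
mean_p = sum(p_values) / len(p_values)
print(f"fraction with p < 0.05: {fraction_below_005:.3f}, mean p: {mean_p:.3f}")
```

If the p-values are roughly uniform, about 5% of them should fall below 0.05 and their mean should sit near 0.5. That holds here because the tested sample plays no part in choosing the phenotype, which is the situation the rebuttal describes for its disjoint GWAS sample.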

Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

The relevance of the approach is discussed, in particular with respect to the optimization results shown in Figure 3. Concerning the circularity issue, it is made clear that the multi-step approach is performed on disjoint discovery and validation sets.

Overall, the rebuttal is satisfactory in addressing the major concerns expressed during the review. Due to the originality of the proposed approach, and the large sample considered in the study, the feeling is that the work could make a nice contribution to the conference.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept
What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

7

Meta-review #2

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

The authors have addressed the clarity with respect to the key circularity issue.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept
What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

6

Meta-review #3

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

This paper proposes to learn genetic associations with image-based phenotypes using an unsupervised graph autoencoder. Validation is on left-ventricular shapes from the UK Biobank.

One reviewer finds the motivation unclear.

A second reviewer finds it clear but finds it lacking better comparisons.

A third reviewer appreciates the method but questions whether discoveries may be due to circular reasoning.

The reviews disagree strongly; however, two of them are too short and, in my opinion, fail to meet what is expected of a reviewer. The most constructive review raises a point on circularity: optimizing the encoder components with genetic SNPs may lead to spurious discoveries. From my understanding of the paper and of the authors’ rebuttal, the aim of the approach is not necessarily to discover new SNPs but rather to provide richer, fine-tuned phenotyping shape descriptors. The algorithmic contribution uses existing methods and is less original in that respect, but their use to fine-tune phenotyping, with an extensive validation, is in my opinion a contribution with potential impact worth publishing.

For these reasons, my recommendation is toward acceptance.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept
What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

11


Image-derived phenotype extraction for genetic discovery via unsupervised deep learning in CMR images