
Authors

Fan Yang, Rui Meng, Hyuna Cho, Guorong Wu, Won Hwa Kim

Abstract

Given a population of longitudinal neuroimaging measurements defined on a brain network, exploiting temporal dependencies within the sequence of data and corresponding latent variables defined on the graph (i.e., a network encoding relationships between regions of interest (ROIs)) can greatly benefit characterization of the brain. Here, it is important to distinguish time-variant (e.g., longitudinal measures) and time-invariant (e.g., gender) components so that they can be analyzed individually. For this, we propose an innovative Disentangled Sequential Graph Autoencoder, which combines the Sequential Variational Autoencoder (SVAE), graph convolution and a semi-supervised framework to learn a latent space composed of time-variant and time-invariant latent variables that characterize a disentangled representation of the measurements over all ROIs. Incorporating target information in the decoder with a supervised loss lets us achieve more effective representation learning towards improved classification. We validate our proposed method on longitudinal cortical thickness data from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) study. Our method outperforms baselines with traditional techniques, demonstrating benefits for effective longitudinal data representation for predicting labels and longitudinal data generation.

Link to paper

DOI: https://doi.org/10.1007/978-3-030-87196-3_34

SharedIt: https://rdcu.be/cyl2E

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper proposes a novel Sequential Autoencoder model which incorporates the graph information via graph convolution operation and jointly models supervised and unsupervised data.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    1) This paper learns an ideal disentangled representation that separates time-independent content or anatomical information from dynamical or modality information and conditionally generates synthetic sequential data; 2) This study performs semi-supervised tasks which can jointly incorporate supervised and unsupervised data for classification tasks; 3) This paper leverages graph structure to robustly learn the disentangled latent structure.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    1. This article does not introduce the biological information of the subjects, such as age, gender, etc., which is very important in the task of medical diagnosis.
    2. Why is the classification accuracy not necessarily high when the reconstruction root mean square error is small, as in the results of the first and second rows in Table 1?
    3. It is not enough to measure classification performance only by accuracy. Why are sensitivity and specificity not reported?
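
To illustrate the point about reporting sensitivity and specificity alongside accuracy, here is a minimal sketch. The labels are hypothetical and not taken from the paper, and the helper name `binary_metrics` is an assumption made for this example. It shows that a classifier which always predicts the majority class can still reach a moderate accuracy while detecting no positive case at all.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, sensitivity, specificity) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = (tp + tn) / len(pairs)
    # Guard against empty classes so the function never divides by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return accuracy, sensitivity, specificity


# Hypothetical imbalanced test set: 7 negatives (0) and 3 positives (1).
y_true = [0] * 7 + [1] * 3
# A degenerate classifier that always predicts the majority class.
y_majority = [0] * 10

acc, sens, spec = binary_metrics(y_true, y_majority)
print(f"accuracy={acc:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

Accuracy alone would hide the fact that this degenerate classifier misses every positive subject, which is why the per-class measures are informative when the groups are unequal in size.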
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    This study lacks a detailed description of the model parameter settings, which makes it difficult to reproduce.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

    1. Describe the model parameter settings in detail; 2. More comparative experiments are needed to demonstrate the superiority of your method.

  • Please state your overall opinion of the paper

    borderline accept (6)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The experimental results and experimental settings in this article are unsatisfactory

  • What is the ranking of this paper in your review stack?

    2

  • Number of papers in your stack

    3

  • Reviewer confidence

    Somewhat confident



Review #2

  • Please describe the contribution of the paper

    In this work, the authors design a deep framework for brain network representation learning that incorporates a variational auto-encoder. With this model, the authors claim to encode both time-variant and time-invariant factors of network topologies, which is beneficial to longitudinal studies, e.g., of AD progression.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    In general, this paper explicitly addresses technical details with mathematical explanations of each framework component, making this paper easy to understand. It also introduces a semi-supervised model in an end-to-end learning fashion and combines two types of brain structural measurements, i.e. DTI connections and cortical surface morphometry, which leverage multimodal learning.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Though the method is straightforward, there are still a couple of points in the experiments on which I would like to see more discussion:

    1. According to Fig. 2/3, some key information is missing. For example, what does the color bar indicate? Rather than the absolute reconstruction patterns, I would prefer an error heat map.
    2. Besides, to me, the timestamps t_0~t_4 are ordered along with disease progression; however, it is strange to see similar patterns for t_0/t_2 and t_1/t_4. That is confusing. Any comment?
    3. The authors try to extract the time-variant factor using the proposed model. But in the experimental setting, they deliberately average all DTI networks for all groups, which significantly smooths the longitudinal information. That reduces the contribution of involving network data. In addition, why do you use only 1-hop graph convolution?
    4. In the quantitative analysis, the architecture of the baseline models is not properly designed, leading to biased judgment. For example, the supervised version of S3VAE could also be easily modified into semi-supervised learning as in Eq. 3, rather than two-stage learning. Meanwhile, it is hard to tell whether the improvement comes from graph convolution or variational sequential encoding; hence an ablation study with a clear discussion is much needed. I think this piece of information would help to build a solid conclusion about the true contributions of this study.
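
The setting questioned in point 3 (one adjacency matrix averaged over all subjects, followed by a 1-hop graph convolution) can be sketched as follows. This is an illustration under stated assumptions, not the authors' layer: it uses plain row-normalised neighbourhood averaging with self-loops and omits the learned weight matrix and nonlinearity of a full graph convolution. The function names and the toy 3-ROI networks are hypothetical.

```python
def average_adjacency(adjacencies):
    """Element-wise mean of several n x n adjacency matrices (lists of lists)."""
    n = len(adjacencies[0])
    k = len(adjacencies)
    return [[sum(a[i][j] for a in adjacencies) / k for j in range(n)]
            for i in range(n)]


def one_hop_propagate(adj, features):
    """One propagation step: every ROI becomes the weighted mean of itself
    and its direct (1-hop) neighbours."""
    n = len(adj)
    out = []
    for i in range(n):
        # Add a self-loop so the ROI's own measurement is retained.
        weights = [adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
        total = sum(weights)
        out.append([
            sum(weights[j] * features[j][f] for j in range(n)) / total
            for f in range(len(features[i]))
        ])
    return out


# Toy example: 3 ROIs, two hypothetical subject networks, one feature per ROI.
a1 = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
a2 = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
shared = average_adjacency([a1, a2])   # static graph shared by all subjects
thickness = [[3.0], [2.0], [1.0]]      # hypothetical per-ROI measurements
smoothed = one_hop_propagate(shared, thickness)
print(smoothed)
```

In this toy graph ROI 0 and ROI 2 are not directly connected, so the value at ROI 2 does not enter the output for ROI 0 after a single step. Reaching two-hop neighbours would require stacking layers or using higher powers of the adjacency matrix.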
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    No code was found, but I think the whole pipeline is clear to follow.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

    Please check my comments in the weaknesses part.

  • Please state your overall opinion of the paper

    borderline reject (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Lack of enough convincing evidence in experiments to draw a positive conclusion of its novelty.

  • What is the ranking of this paper in your review stack?

    3

  • Number of papers in your stack

    5

  • Reviewer confidence

    Very confident



Review #3

  • Please describe the contribution of the paper

    The paper proposes a sequential graph variational autoencoder (VAE) that separates time-variant and time-invariant data. The VAE is semi-supervised. The graph is built from an atlas (Destrieux atlas) with 148 brain regions.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Inclusion of ROI graph in the sequential graph VAE is a novel contribution.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Many details on the data are not given. How many subjects are in each group after merging to get two groups? How many timepoints? What is the time-varying information (is it cortical thickness?) and what is the static data? Is there imbalance? Which data are unsupervised? Evaluation metrics are RMSE (for reconstruction) and accuracy (for classification). Accuracy is not a good measure if there is class imbalance. As the authors acknowledge, autoencoders that disentangle time-variant and time-invariant data are not new. The new part seems to be the graph convolution. The authors should provide experimental evidence that using the graph convolution is useful. The comparison with S3VAE may not be fair. The results depend on the “naive neural network” that was used, which is not explained.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Code and data are not available. Details such as number of timesteps are not given.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

    The authors present visualizations of the reconstructed brains but not of the disentangled representations, which would be interesting. In the references, please capitalize words like Alzheimer’s and acronyms such as VAE, MRI and MR. Correct typos such as longidutinal, woule and supervise (Table 1). Highlight the best results in Table 1. Use other metrics like F1-score or AUC if there is class imbalance. Explain the naive neural network used in S3VAE.

  • Please state your overall opinion of the paper

    borderline accept (6)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Experimental results are insufficient and are not convincing. Data

  • What is the ranking of this paper in your review stack?

    3

  • Number of papers in your stack

    5

  • Reviewer confidence

    Confident but not absolutely certain




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This work proposes a deep framework for brain network representation learning by incorporating a variational auto-encoder. It is a novel idea. The authors also present their results on longitudinal brain image studies.

    In the rebuttal, please address the ambiguities and provide more details in response to the reviewers’ comments, e.g., why is the reconstruction root mean square error small while the classification accuracy is not necessarily high? Please also address the specific questions from Reviewer 2.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    7




Author Feedback

We sincerely thank the reviewers for their thoughtful feedback, which will substantially help improve this paper. We do our best to address all the concerns.

Q1 (Rev #1) Reconstruction RMSE small but classification accuracy not necessarily high? A: The objective function in Eq. 4 consists of two terms. The first term focuses on reconstruction given label information, while the second is for classification performance. The hyperparameter α balances these two terms; thus, there is a trade-off between reconstruction and classification. Furthermore, the reconstruction is conditional on the label information but does not model the label, and the label is considered only in the second term, which directly handles classification. Thus, a better reconstruction does not necessarily guarantee a better classification.

Q2 (Rev #2) Baseline architecture not properly designed leading to biased judgment. A: To the best of our knowledge, no semi-supervised sequential variational auto-encoder (SVAE) model was proposed before this paper. Our model is the first approach to consider semi-supervised learning in SVAE. The reviewer mentions that S3VAE can be modified for semi-supervised learning; however, there are two foreseeable issues with this. First, designing an architecture to model the label information is not trivial; one way is to build another MLP with softmax to directly model the relation between label and latent representations while keeping the S3VAE structure. We actually tried this experiment, but unfortunately, both reconstruction and classification performances were worse than those of two-stage learning. Second, if we made significant modifications to S3VAE to make it sufficiently suitable for our target, it would be hard to say that S3VAE is still a baseline, as it would indeed be a newly proposed model.

Q3 (Rev #2) Question on similar patterns for t_0/t_2 and t_1/t_3 in Fig 2/3. A: As we applied image processing (tissue segmentation and surface reconstruction) for each time point separately, it is possible that the change of cortical thickness is not consistent along time in some subjects. However, the average change of cortical thickness looks consistent along the neurodegeneration stage in our dataset. To avoid confusion, we will replace the individual result with average reconstruction results in the final version.

Q4 (Rev #2) What is the color bar indicating? A: The color bar represents raw measures on the ROIs. We considered the error heat map before, but it is hard to visualize the difference, as the error is very small.

Q5 (Rev #2) Why average DTI networks for all groups. A: The graph structure is assumed to be static and shared across all groups, as we leverage the global graph structure to robustly learn the disentangled latent structure to model the dynamics on the ROIs rather than on the DTI networks.

Q6 (Rev #1 / #3) Missing details of the dataset. A: The dataset includes 140 subjects with the Pre-AD group (93 subjects/330 records) and the Pro-AD group (47 subjects/170 records). The mean (std) of ages and sex ratio (Male:Female) in the Pre-AD group and the Pro-AD group are 74.02 (6.72) / (185:145) and 74.87 (6.92) / (95:75), respectively. We will add this information in the later version.

Q7 (Rev #1 / #3) Use of accuracy as classification performance measurement. A: Accuracy is a standard measure, and our training set (100 subjects) is relatively balanced with a ratio of 68:32 between Pre/Pro AD groups. We will consider Precision and Recall as well in the later version.

Q8 (Rev #3) Concern that the comparison with S3VAE may not be fair. A: We think two-stage learning with S3VAE is a reasonable baseline. As Rev #3 pointed out, we use the graph convolution in the model, while S3VAE depends only on the “naive neural network” without the graph. Therefore, comparing the results of our model with those of S3VAE indeed demonstrates that leveraging graph information and an integral learning structure (not two-stage learning) is useful.
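
The trade-off described in the answer to Q1 can be sketched numerically. Eq. 4 is not reproduced on this page, so the sketch assumes a simple weighted sum of a reconstruction term and α times a classification term, with mean squared error and cross-entropy as stand-ins and without the KL terms that a variational objective would also contain. The two "models", their outputs, and all function names are hypothetical.

```python
import math


def mse(x, x_hat):
    """Mean squared reconstruction error (stand-in for the first term)."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def cross_entropy(probs, label):
    """Negative log-probability of the true class (stand-in for the second term)."""
    return -math.log(probs[label])


def combined_loss(x, x_hat, probs, label, alpha):
    """Assumed form: reconstruction term + alpha * classification term."""
    return mse(x, x_hat) + alpha * cross_entropy(probs, label)


x, label = [1.0, 2.0, 3.0], 1
# Hypothetical model A: reconstructs well, classifies with low confidence.
model_a = {"x_hat": [1.0, 2.0, 3.1], "probs": [0.4, 0.6]}
# Hypothetical model B: reconstructs worse, classifies with high confidence.
model_b = {"x_hat": [1.2, 2.2, 3.3], "probs": [0.1, 0.9]}

for alpha in (0.01, 1.0):
    loss_a = combined_loss(x, model_a["x_hat"], model_a["probs"], label, alpha)
    loss_b = combined_loss(x, model_b["x_hat"], model_b["probs"], label, alpha)
    preferred = "A" if loss_a < loss_b else "B"
    print(f"alpha={alpha}: loss_A={loss_a:.4f} loss_B={loss_b:.4f} -> prefers {preferred}")
```

With a small α the objective favours the model with the lower reconstruction error, while a larger α favours the better classifier, so a low RMSE by itself says little about classification accuracy.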




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This work proposes a deep framework for brain network representation learning by incorporating a variational auto-encoder. It is a novel idea. The authors also present their results on longitudinal brain image studies.

    The weakness is that some algorithmic and experimental details are missing. The authors wrote a good rebuttal, explaining and emphasizing their strengths. The work provides new ideas on a semi-supervised sequential variational auto-encoder model. It may inspire new work in this direction. An “Accept” recommendation is made to recognize this novelty.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    9



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This paper proposes a Sequential Autoencoder model for the semi-supervised setting that incorporates graph information via a graph convolution operation. This is a novel idea with a proper set of experiments. Some points raised by the reviewers are addressed in the rebuttal. If the paper is accepted, the authors are advised to address these concerns in the paper too.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    12



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The paper proposes an autoencoder-based framework for disentanglement of time-varying and time-independent information, with application to modeling multimodal variability in Alzheimer’s disease. This joint use of disentanglement and graph convolution is recognised as novel by the reviewers. The main criticism concerns the lack of thorough experimental validation and of an assessment of the quality of the results, which makes it difficult to draw a definitive conclusion about the solidity of the contribution.
    The authors’ reply focuses on the fact that no model in the state of the art is similar to the one proposed here, and that modifying the existing S3VAE would deviate too much from the original state-of-the-art approach. Other clarifications on the experimental data and accuracy measures are also provided. The overall conclusion is that the work still lacks a convincing experimental validation. Very little is shown about the disentanglement properties, or about how the time-variant and time-invariant features are handled and modelled. An ablation study would also be important to identify the value of each component of the framework. Finally, the reproducibility of the work appears to be low: model parameters and experimental choices are not clearly illustrated and justified in the text.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Reject

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    16


