
Authors

Oytun Demirbilek, Islem Rekik

Abstract

Learning how to estimate a connectional brain template (CBT) from a population of brain multigraphs, where each graph quantifies a particular relationship between pairs of brain regions of interest (ROIs), allows us to pin down the unique connectivity patterns shared across individuals. Specifically, a CBT is viewed as an integral representation of a set of highly heterogeneous graphs that ideally meets the centeredness (i.e., minimum distance to all graphs in the population) and discriminativeness (i.e., distinguishes the healthy from the disordered population) criteria. So far, existing works have been limited to integrating and fusing a population of brain multigraphs acquired at a single timepoint. In this paper, we unprecedentedly tackle the question: “Given a baseline multigraph population, can we learn how to integrate and forecast its CBT representations at follow-up timepoints?” Addressing this question is of paramount importance in predicting common alterations across healthy and disordered populations. To fill this gap, we propose Recurrent Multigraph Integrator Network (ReMI-Net), the first graph recurrent neural network which infers the baseline CBT of an input population at timepoint t1 and predicts its longitudinal evolution over time. Our ReMI-Net is composed of recurrent neural blocks with graph convolutional layers using cross-node message passing to first learn hidden-state embeddings of each CBT node (i.e., brain region of interest) and then predict its evolution at the consecutive timepoint. Moreover, we design a novel time-dependent loss to regularize the CBT evolution trajectory over time and further introduce a learnable normalization layer to generate well-centered CBTs from time-dependent hidden-state embeddings. Finally, we derive the CBT adjacency matrix from the learned hidden graph representation. ReMI-Net significantly outperformed benchmark methods in both centeredness and discriminative connectional biomarker discovery criteria in demented patients.

Link to paper

DOI: https://doi.org/10.1007/978-3-030-87234-2_55

SharedIt: https://rdcu.be/cyl8R

Link to the code repository

https://github.com/basiralab/ReMI-Net

Link to the dataset(s)

N/A

Reviews

Review #1

Please describe the contribution of the paper

The paper proposes a method to derive a group-level connectome from an ensemble of individual-level connectomes presented as multi-view graphs at different time points. The aim is to obtain a group-level connectome for the population and characterise its evolution over time. This allows comparison of connectomes across different groups of subjects, for example AD patients and healthy controls as performed in the paper.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

Inspired by previous works, the authors propose to obtain node embeddings by aggregating information from neighbours, as done in graph neural networks. They also propose a way to incorporate neighborhood information from the previous time point. This is the right approach to obtain graph embeddings. In addition, the method allows learning the connectome template common to the population. This allows identifying regional biomarkers of two groups of subjects and how these biomarkers vary at different stages of the disease.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

The proposed methods have been tested on one dataset with only two time points. The improvements over the competing method are marginal. In order to validate the recurrent model, validation on more datasets with more time points and more experiments are required. The detected biomarkers of AD need validation or justification by literature. The paper is difficult to comprehend and figures need improvement.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The code is not publicly available, so reproducing the method would not be easy, although the code of the earlier method that the authors improved upon is available.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

The paper proposes a method to derive a group-level connectome from an ensemble of individual-level connectomes presented as multi-view graphs at different time points. The aim is to obtain a group-level connectome for the population and characterise its evolution over time. This allows comparison of connectomes across different groups of subjects, for example AD patients and healthy controls as performed in the paper.

Inspired by previous works, the authors propose to obtain node embeddings by aggregating information from neighbours, as done in graph neural networks. They also propose a way to incorporate neighborhood information from the previous time point. This is the right approach to obtain graph embeddings. In addition, the method allows learning the connectome template common to the population. This allows identifying regional biomarkers of two groups of subjects and how these biomarkers vary at different stages of the disease.

The proposed methods have been tested on one dataset with only two time points. The improvements over the competing method are marginal. In order to validate the recurrent model, validation on more datasets with more time points and more experiments are required. The detected biomarkers of AD need validation or justification by literature. The paper is difficult to comprehend and figures need improvement.
Please state your overall opinion of the paper

Probably accept (7)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The authors improved upon a previous technique and extended it to predict the time evolution of a group-level connectome. However, the method needs further experiments and the results need validation. The recurrent method was tested on a dataset with only two time points, which I think is insufficient to demonstrate its validity. The paper is difficult to read and comprehend.
What is the ranking of this paper in your review stack?

1
Number of papers in your stack

3
Reviewer confidence

Very confident

Review #2

Please describe the contribution of the paper

This paper proposed an extension to the previous connectional brain template (CBT) estimation method. The novelty of the proposed method is mainly in the usage of a recurrent neural network that integrates information at multiple time points so that the obtained CBT is more consistent across time. The recurrent structure also promotes information to be shared at different time points.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1. The method is written pretty clearly with straightforward illustration figures.
2. The idea of calculating a longitudinal CBT is novel. It can be useful to have CBTs that are different at each time point and compare between them.
3. The method is compared against a decent amount of other methods and shows pretty good improvements.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
1. It is still unclear whether having a smaller Frobenius distance is enough to call a graph good. A graph has topological structures and Frobenius distance is not able to reflect the structure difference.
2. As this is a method that utilizes a recurrent neural network, it is not enough to evaluate it on sequences of length 2. As only two time points are involved, I am not fully convinced of its ability to handle more time points.
3. Somewhat relevant to point 1, it is a bit odd to get the median graph by taking the element-wise median. In the end, we want a graph instead of a vector. An averaging method that takes the graph structure into consideration is desired.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The framework is evaluated on the open source ADNI dataset although the preprocessing and subject selection details are omitted. No code link is provided. It can be hard to reproduce the results in the paper.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
1. Considering the complexity of the algorithm and the size of the dataset (number of subjects, number of time points), the framework’s advantage (especially the recurrent structure) would be more convincing if a dataset with more time points were used, although this may not be feasible for the paper in its current form.
2. In the current setting, the CBT is considered good only when it is a Frobenius distance-based center. However, I am doubtful whether this is sufficient or appropriate. The Frobenius distance does not measure the topological structure of the graph. Some graph topological measures should be reported to validate that the CBT is indeed representative of the population graphs. It would be interesting to see whether graph topological measures such as modularity are also around the population average.
3. It would be better if the authors reported how the hyperparameters were chosen.
4. There are some typos in the paper. In the last paragraph on page 5, “loss L2” -> “loss L_t”. On page 4, “t_i<=t_1” -> “t_i>=t_1”.
Please state your overall opinion of the paper

borderline reject (5)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

I am not fully convinced by the evaluation metrics for the CBT, and the dataset size is not appropriate considering the complexity of the framework.
What is the ranking of this paper in your review stack?

3
Number of papers in your stack

5
Reviewer confidence

Somewhat confident

Review #3

Please describe the contribution of the paper

The manuscript proposed a novel method of learning longitudinal connectional brain templates (CBT). The new method consists of 1) view normalizers, 2) subject-specific RNN learning cells that transform paired messages with a neighbor of the i-th node into the hidden state of the same node at the next time point, 3) a time-varying loss that optimizes for centeredness with a similarity constraint between time points, 4) a single learning cycle of hyperparameters, and 5) a method for constructing population-level templates based on element-wise medians.

The benchmark results include: 1) representativeness, quantified as the average Frobenius distance between the training set CBT prediction and testing set samples; 2) reproducibility, evaluated as overlap rates of the top 15 selected discriminative ROIs in cross-validation. The manuscript also provides two types of potential applications: 1) comparing the group-specific CBTs to identify differentiating ROIs, and 2) selecting the ROIs that progress the most.
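The two benchmark quantities described above can be illustrated with a minimal sketch. It assumes a single connectivity view per sample stored as nested Python lists, and the helper names (`frobenius_distance`, `mean_frobenius_to_samples`, `top_k_overlap`) are hypothetical; none of this is taken from the authors' code, and the paper's exact handling of multiple views and folds may differ.

```python
import math

def frobenius_distance(a, b):
    """Frobenius distance between two equally sized matrices (nested lists)."""
    return math.sqrt(sum((x - y) ** 2
                         for row_a, row_b in zip(a, b)
                         for x, y in zip(row_a, row_b)))

def mean_frobenius_to_samples(cbt, samples):
    """Average distance from one template to every held-out connectivity matrix."""
    return sum(frobenius_distance(cbt, s) for s in samples) / len(samples)

def top_k_overlap(scores_a, scores_b, k):
    """Fraction of the k highest-scoring ROI indices shared by two folds."""
    def top(scores):
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        return set(ranked[:k])
    return len(top(scores_a) & top(scores_b)) / k

# Toy data (assumption): a 2-ROI template and two held-out samples.
cbt = [[0.0, 0.5], [0.5, 0.0]]
test_samples = [
    [[0.0, 0.5], [0.5, 0.0]],   # identical to the template
    [[0.0, 1.5], [1.5, 0.0]],   # each off-diagonal entry differs by 1
]
print(mean_frobenius_to_samples(cbt, test_samples))

# Toy per-ROI discriminability scores from two cross-validation folds.
print(top_k_overlap([0.9, 0.1, 0.8, 0.2], [0.7, 0.6, 0.1, 0.9], k=2))
```

A lower mean distance indicates a more centered template, and a higher overlap indicates more reproducible ROI selection; neither number says anything about graph topology, which is the concern raised by Review #2.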
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The major strength and novelty lie in the longitudinal joint modeling of the CBT process. This novelty builds on previous work on modeling cross-sectional CBTs with graph neural networks. Compared to the existing method, the proposed method improves the convolutional cell by including pairwise messages within the neighborhood. More importantly, the newly proposed loss allows stable joint modeling across multiple time points, and the method is the first of its type.

It is interesting that the proposed longitudinal modeling approach outperforms the existing method that models each time point separately. The recurrent prediction structure utilizes repeated measurements when learning the hidden embeddings, which partially explains the possible gain in model fitting. It remains to be verified whether the gain comes mostly from the modified GNN or from the joint modeling across time points. Nonetheless, the work demonstrates the great potential of longitudinal CBT modeling. Potential clinical applications include predicting individualized disease progression by comparing it to the typical progression for precision-health purposes, which can be crucial for promoting more effective early intervention in neurodegenerative diseases.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

The sample sizes (32 + 35 = 67) and the number of time points (n_t = 2) are limited in the data application. It would be better to have repeated datasets with more time points to further demonstrate the strength of the new method.

It is hard to tell whether the improved performance (for example, in Table 1, right hemisphere) is primarily due to the modified GNN learning cell or the longitudinal modeling strategy.
Please rate the clarity and organization of this paper

Excellent
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The procedure to reproduce the results is clearly described in the manuscript, although no code is provided with the submission.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

It is unclear what can be implied if a subject is very distant from the population CBT. For example, would that suggest a faster progression of the disease, or the other way around? Could there be future work that can provide such interpretability, for example, provided that training data of behavioral scores or lab testing results are available? It can also be of great impact if the derived outcomes such as distance from the population CBT can be verified as meaningful biomarkers.
Please state your overall opinion of the paper

strong accept (9)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

My recommendation is based on the method novelty and the potential significance in clinical applications. The proposed method demonstrates a novel yet flexible way of modeling complex longitudinal connectome processes. Such method has potential impact in precision medicine of neurodegenerative diseases.
What is the ranking of this paper in your review stack?

1
Number of papers in your stack

5
Reviewer confidence

Confident but not absolutely certain

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

The approach is interesting and Figure 1 is well designed. The authors need to address two concerns raised by the reviewers: 1) What is the rationale for using the Frobenius distance as a measure? Does it have any biological meaning? 2) How can the authors justify that the results will hold for more than two time points? As a meta-reviewer, I respect our reviewers’ suggestions. However, I am not sure what the exact scientific meaning of this work is. We already have neuroimaging studies based on group-level results, but I think the biggest issue so far is that it is difficult (or impossible) to transfer those to the individual level, e.g., individual prediction. What useful clinical outcome or knowledge does this work deliver, besides the approach itself, and why is it important? Another concern is that this paper has only 67 samples, and I am not sure how many parameters are involved in the proposed complicated RNN-based graph model. How can the authors justify that the results are not overfitting? And as we know, most current brain imaging datasets have only a few time points (fewer than 3; even UKB has fewer than 5, I think), so why do we need a complicated model like an RNN, which is indeed powerful with long sequences, instead of a simpler model such as regression?
What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

7

Author Feedback

The pioneering aspects of our work were highlighted by all reviewers (R1&R2&R4) and AC: (1) Novel: The manuscript proposed a novel method (R4) and the novelty of the method lies in the proposed recurrent neural network that integrates information (R2). (2) Effective: The experimental results showed that the longitudinal modeling approach outperforms benchmarks (R4) with pretty good improvements (R2). (3) Clinical feasibility: Our method allows identifying regional biomarkers and how these biomarkers vary at different stages of the disease (R1). Additionally, our method has potential clinical applications that can be crucial for early intervention (R4). (4) Clarity: The method is written pretty clearly with straightforward illustration figures (R2). “The approach is interesting and Figure 1 is well designed” (AC).

Clarifying concerns: *Dataset size and different datasets (R1&R2&R4&AC): We agree that longitudinal medical imaging datasets are scarce, which leads to a small number of timepoints. We evaluated our method using two independent datasets (AD/LMCI right and left hemispheres). On top of that, we further simulated a dataset with 6 timepoints and 200 subjects, and our method significantly outperformed all benchmarks (p<0.03). These results will be included as supplementary material in the final version.

*Complexity of the model and number of hyperparameters (AC): The main purpose of using an RNN is to obtain graph embeddings (R1) by incorporating neighborhood information from the previous time point (R1) non-linearly. The model has only 4 hyperparameters: hidden size (n_h), l_t regularizer (alpha), learning rate, and number of random training samples (n_k) for centeredness loss.

*Individual prediction (AC): In the results section, in the paragraph of “CBT discriminativeness and biomarker reproducibility test”, we provide top K regions as features. We already trained an MKL classifier for biomarker discovery at the individual level. A simple classifier (e.g., nonlinear SVM) can use these features to predict the labels at the individual level. In the extension of our work into a journal paper, we will also provide connectivities as features, and report the accuracies of an independent classifier that uses CBT-selected features.

*Topological soundness and biological meaning (R2&AC): We report new results using node strength as a topological measure to investigate the topological soundness of the learned CBT. We calculated the MAE between the node strength vector of the learned CBT and that of each individual in the population. Average results across folds for our method are 2.25 and 2.02 for RH and LH, respectively, while DGN gave 3.50 and 3.57. This will be included in the final version. Due to the space limit, we present the results for node strength only, from which other topological measures are derived (e.g., closeness centrality).
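The node-strength comparison described above can be sketched as follows. This is only an illustration under stated assumptions: each graph is a single weighted adjacency matrix stored as nested lists, node strength is taken as the row sum, and the helper names are hypothetical. The rebuttal does not specify how views and subjects were aggregated, so the plain averaging here is a guess.

```python
def node_strength(adj):
    """Sum of edge weights attached to each node (row sums of the adjacency matrix)."""
    return [sum(row) for row in adj]

def mean_absolute_error(u, v):
    """Mean absolute difference between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(u, v)) / len(u)

def mean_strength_mae(cbt, population):
    """Average MAE between the template's node strengths and each individual's."""
    cbt_strength = node_strength(cbt)
    errors = [mean_absolute_error(cbt_strength, node_strength(g)) for g in population]
    return sum(errors) / len(errors)

# Toy data (assumption): a 3-ROI template and two individuals.
cbt = [[0, 1, 2],
       [1, 0, 3],
       [2, 3, 0]]
population = [
    [[0, 1, 2], [1, 0, 3], [2, 3, 0]],   # same strengths as the template
    [[0, 2, 2], [2, 0, 3], [2, 3, 0]],   # one edge is stronger by 1
]
print(mean_strength_mae(cbt, population))
```

A smaller value means the template's per-node connectivity load is closer to that of the individuals, which is the sense in which the rebuttal compares 2.25 and 2.02 against 3.50 and 3.57.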

*Overfitting (AC): In the results section, we reported the comparisons for each fold based on two evaluation strategies (the last model and best model) to justify that the model is not overfitting. The rationale for evaluating these strategies is to ensure that the test loss at the max epochs and the test loss with early stopping are similar.

*Post-training refinement (R2): As stated in the paper and Fig. 1, the element-wise median is applied to a set of subject-specific CBTs to select the most centered connectivities. Therefore, the result is not a vector but an adjacency matrix, which we named the population-specific CBT.
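To make the point about the output shape concrete, here is a minimal sketch of an element-wise median over a set of square matrices. The function name and toy values are hypothetical, and the sketch covers only the median step, not how the subject-specific CBTs are produced.

```python
from statistics import median

def elementwise_median(matrices):
    """Entry-by-entry median over a list of equally sized square matrices."""
    n = len(matrices[0])
    return [[median(m[i][j] for m in matrices) for j in range(n)]
            for i in range(n)]

# Toy data (assumption): three subject-specific 2-ROI templates.
subject_cbts = [
    [[0.0, 0.2], [0.2, 0.0]],
    [[0.0, 0.4], [0.4, 0.0]],
    [[0.0, 0.9], [0.9, 0.0]],
]
population_cbt = elementwise_median(subject_cbts)
print(population_cbt)
```

The result keeps the n-by-n shape of its inputs, and because each entry (i, j) is computed from the same values as entry (j, i) when the inputs are symmetric, the output is a symmetric adjacency matrix as well.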

*Validation of biomarkers (R1): As discussed in the results section (clinical discoveries), our findings overlap with those of Yang et al. (2019).

*Reproducibility (R1&R2&R4): The source code will be shared publicly via GitHub upon acceptance.

*Source of gain (R4): The gain comes mostly from the modified GNN, as shown by the results we provided for other GNN architectures with joint modeling across timepoints (i.e., our model variants).

Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

Because all the reviewers’ comments are positive, I will respect our reviewers’ suggestions.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept
What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

22

Meta-review #2

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

This was a difficult decision. While the reviewers were fairly positive, the meta-reviewer raised key issues with the paper that I do not believe were adequately addressed. Specifically, the response to the dataset size comment is to reference new experiments, which would not have the chance to undergo peer review. Another concern is the need for an RNN when longitudinal neuroimaging datasets only contain 2-5 time points. I do not think that the desire to fuse neighboring information necessarily warrants an RNN. That said, the main weakness is with regards to subject-level prediction. As far as I can tell, the procedure outlined by the authors is to fit the model on each dataset separately, identify discriminative ROIs, and then train a subsequent classifier on these features. If accurate, then this process double-dips, in that the entire dataset is used for feature selection and then again in the five-fold CV. It is unclear to me that the method would extract discriminative (or even reproducible) features in a truly nested setting.
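The nested setting this meta-review asks for can be sketched as below: feature selection is repeated inside every fold using training rows only, so held-out subjects never influence which features are chosen. The selection score (absolute difference of class means) and the nearest-centroid classifier are simple stand-ins chosen for illustration; they are not the authors' CBT-based selection or MKL classifier, and all names are hypothetical.

```python
def top_k_features(X, y, k):
    """Rank features by absolute difference of class means, using only the rows given."""
    def class_mean(label, j):
        vals = [row[j] for row, lab in zip(X, y) if lab == label]
        return sum(vals) / len(vals)
    n_feat = len(X[0])
    scores = [abs(class_mean(1, j) - class_mean(0, j)) for j in range(n_feat)]
    return sorted(range(n_feat), key=lambda j: scores[j], reverse=True)[:k]

def nearest_centroid_predict(X_train, y_train, x, feats):
    """Predict the label whose training centroid is closest on the selected features."""
    best_label, best_dist = None, float("inf")
    for label in (0, 1):
        rows = [r for r, lab in zip(X_train, y_train) if lab == label]
        centroid = [sum(r[j] for r in rows) / len(rows) for j in feats]
        dist = sum((x[j] - c) ** 2 for j, c in zip(feats, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def nested_cv_accuracy(X, y, n_folds, k):
    """Cross-validated accuracy with feature selection redone inside each fold."""
    correct = 0
    for fold in range(n_folds):
        test_idx = [i for i in range(len(X)) if i % n_folds == fold]
        train_idx = [i for i in range(len(X)) if i % n_folds != fold]
        X_tr = [X[i] for i in train_idx]
        y_tr = [y[i] for i in train_idx]
        feats = top_k_features(X_tr, y_tr, k)   # held-out rows are never seen here
        for i in test_idx:
            correct += nearest_centroid_predict(X_tr, y_tr, X[i], feats) == y[i]
    return correct / len(X)

# Toy data (assumption): feature 0 separates the classes, feature 1 does not.
X = [[0.1, 0.5], [0.2, 0.4], [0.0, 0.6], [0.3, 0.5],
     [1.1, 0.5], [0.9, 0.4], [1.2, 0.6], [1.0, 0.5]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
print(nested_cv_accuracy(X, y, n_folds=4, k=1))
```

The double-dipping variant would call `top_k_features(X, y, k)` once on the full dataset before the loop; comparing the two accuracies on real data is one way to check how much the reported discriminativeness depends on that leakage.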
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Reject
What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

7

Meta-review #3

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

The paper proposed a novel method for learning longitudinal connectional brain templates (CBT). The idea is novel. Although the validation set is small, the authors carried out thorough comparisons and a discussion of clinical discoveries. The rebuttal on applying group-level results to individual prediction is strong.

The weakness is the lack of appropriate validation. It is not acceptable that the authors promised a lot of new content to be included in the final version. The question about the choice of the Frobenius distance was not answered in the rebuttal.

The “Accept” recommendation is to recognize the novelty and soundness of the proposed method and its complete validation experiment design. It may promote new research on longitudinal CBTs in the future.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept
What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

6


Recurrent Multigraph Integrator Network for Predicting the Evolution of Population-Driven Brain Connectivity Templates