
Authors

Yuan-Xing Zhao, Yan-Ming Zhang, Ming Song, Cheng-Lin Liu

Abstract

Despite many recent advances, computer-aided mild cognitive impairment (MCI) conversion prediction is still a very challenging task for two reasons: 1) the abnormal areas are subtle compared to the size of the whole brain, and 2) the feature dimension is much larger than the number of samples. To tackle these problems, we propose a region ensemble model that uses a divide-and-conquer strategy to capture a finer representation of the disease. Specifically, features are extracted independently from non-overlapping regions and then fused according to attention scores to describe the subject. Moreover, we design a novel loss that models the relationship between different stages of the disease to explicitly regularize the training process. Experiments on public data sets for MCI conversion prediction demonstrate that our method achieves state-of-the-art performance. Specifically, the area under the receiver operating characteristic curve (AUC) is improved from 79.3% to 85.4%. Beyond that, each region’s contribution can be assessed quantitatively using the proposed method.
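
As an illustration of the fusion step described in the abstract, the following minimal sketch combines per-region feature vectors using attention weights. It is not the authors' implementation: the softmax normalization, the function name `attention_fuse`, and the toy inputs are assumptions, since the abstract does not say how the attention scores are computed or normalized.

```python
import math


def attention_fuse(region_features, region_scores):
    """Fuse per-region feature vectors into one subject-level vector.

    region_features: list of equal-length feature vectors, one per region.
    region_scores:   one raw attention score per region (hypothetical input).
    Returns (fused_vector, weights). The weights sum to 1 and can be read
    as each region's relative contribution.
    """
    # Assumption: scores are normalized with a softmax over regions.
    top = max(region_scores)
    exps = [math.exp(s - top) for s in region_scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of the region features, dimension by dimension.
    dim = len(region_features[0])
    fused = [
        sum(w * feat[d] for w, feat in zip(weights, region_features))
        for d in range(dim)
    ]
    return fused, weights


if __name__ == "__main__":
    # Three toy regions with 2-dimensional features.
    features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    scores = [0.2, 1.5, -0.3]
    fused, weights = attention_fuse(features, scores)
    print("weights:", weights)
    print("fused:", fused)
```

Because the weights are normalized, they give a per-subject ranking of regions, which is the kind of quantitative assessment the abstract refers to.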

Link to paper

DOI: https://doi.org/10.1007/978-3-030-87240-3_18

SharedIt: https://rdcu.be/cyl5Q

Link to the code repository

N/A

Link to the dataset(s)

http://adni.loni.usc.edu

Reviews

Review #1

Please describe the contribution of the paper

This paper develops a region-wise approach that builds regional subnetworks, which are then fused with attention scores for prediction.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

An interesting novelty is a regularized loss criterion that encodes the known order (ranks) of the probabilities of the different AD diagnostic classes.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- Lack of cross-validation! While testing on a held-out, supposedly independent dataset (ADNI-2) is meant to demonstrate generalizability, it would have been more convincing to randomize the splits further, in the true sense of cross-validation, to mitigate the effect of the hand-crafted hyper-parameters (of which there are several).
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

Using open datasets is a plus! However, the analysis code is not shared.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

Run proper cross-validation to show that the results are not a fluke of hyper-parameters hand-crafted for the specific splits of ADNI-1 and ADNI-2.
Please state your overall opinion of the paper

borderline reject (5)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

A good paper presenting a rather interesting loss criterion; however, it is weakened substantially by weak or absent cross-validation.
What is the ranking of this paper in your review stack?

3
Number of papers in your stack

5
Reviewer confidence

Very confident

Review #2

Please describe the contribution of the paper

The paper presents a novel method to predict MCI conversion. They introduce a region ensemble network and a relation regularized loss function to tackle the problems related to the MCI conversion task. They conduct ablation studies to study the value of each component, and also compare their results with prior studies.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1. The paper introduces a novel method to combat the difficulties associated with MCI conversion prediction, namely the subtle changes in abnormal areas and the large feature dimension compared to the number of available samples. Their approach addresses these challenges and outperforms existing methods by introducing a region ensemble network and a new relation regularized loss function.
2. The paper is thorough: it provides ablation studies that show the value of each component in the method, as well as a comparison with other methods.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

The main weakness of the paper is the lack of clarity in some parts, particularly the description of the relation regularized loss function. Several acronyms are used without being spelled out first (e.g., NC, sMCI, pMCI). The inequalities on the probabilities on page 5 are unclear and need a better explanation. Similarly, the relation matrices need a better explanation. Some model-related acronyms (e.g., CBAM, BN) should also be spelled out on their first occurrence.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The dataset is public. The authors mention in reproducibility statement that they will release code and trained models after acceptance.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
1. The paper will benefit from more clarity and better explanation of the relation regularized loss function section, including spelling out acronyms, explaining the inequalities with clinical relevance, and better explanation of the relation matrices. Some nomenclature is inconsistent, e.g. L_rank.
2. Some spelling and typo errors (e.g. “As shown in the figure, our model can localize different subjects’ abnormalities, which is a valuable property in the clinical”). Please double check and correct these errors.
3. Several acronyms used without prior spelling out. Please spell out each acronym on their first occurrence.
4. Fig 2: Consistent color scheme for bar plot and AUC plot would help in easier comprehension.
5. Fig 2 and Table 1: How are the sensitivity and specificity values computed? By thresholding the probability maps? How were these thresholds decided? Is the same method used to compute the sensitivity and specificity of the other models? For a fair evaluation, all methods must have all metrics computed in the same way.
6. Table 1: Statistical tests must be performed to show that the proposed method improves results statistically significantly.
7. Please confirm if code and trained models will be made publicly available.
Please state your overall opinion of the paper

Probably accept (7)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The paper provides a novel methodology for MCI conversion prediction using a region ensemble and a relation regularized loss. The authors perform ablation experiments to verify the utility of each component and also compare performance with other methods. Overall, the paper is well written, although it needs clarification in some parts and some statistical tests to establish statistical significance.
What is the ranking of this paper in your review stack?

1
Number of papers in your stack

5
Reviewer confidence

Somewhat confident

Review #3

Please describe the contribution of the paper

This paper proposes a region ensemble model using a divide and conquer strategy to capture the disease’s finer representation.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

(1) It uses a divide-and-conquer strategy and attention mechanism to extract the discriminative features and locate abnormal areas (2) Proposes a relation regularized loss to regularize the model’s training process through additional samples. (3) The experimental results show the effectiveness of the proposed prediction model.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

(1) In Fig. 1, the authors state that 15 region-based diagnosis sub-networks are adopted in their model. Why do the authors use 15 sub-regions? (2) This method designs an ensemble sub-network to automatically identify discriminative regions in the whole brain via an attention module. How does the model perform without the attention module?
(3) The authors define P_c as the predicted probability of X belonging to class c, and then give four conditions. This definition is not clear. (4) What is the “SEN” in Fig. 2 and Table 1? The authors also report the performance under different SEN values.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The whole framework is a little complex, raising my concern about the reproducibility of this work.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

(1) In Fig. 1, the authors state that 15 region-based diagnosis sub-networks are adopted in their model. Why do the authors use 15 sub-regions? (2) This method designs an ensemble sub-network to automatically identify discriminative regions in the whole brain via an attention module. How does the model perform without the attention module?
(3) The authors define P_c as the predicted probability of X belonging to class c, and then give four conditions. This definition is not clear. (4) What is the “SEN” in Fig. 2 and Table 1? The authors also report the performance under different SEN values.
Please state your overall opinion of the paper

Probably accept (7)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The overall idea is interesting and the paper is well organized.
What is the ranking of this paper in your review stack?

1
Number of papers in your stack

4
Reviewer confidence

Very confident

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

This paper proposes a region-wise approach that builds regional subnetworks, which are then fused with attention scores for prediction. The major concerns are: 1) It is not clear why 15 sub-regions were finally used, considering that the segmentation model generates 130 regions. What are the results if these 130 regions are used directly or merged into a different number of sub-regions? 2) There is no analysis of the parameters’ influence. It is critical to see the different contributions of the three terms in Eq. 8. 3) The proposed relation regularized loss function lacks a clear explanation, and it is not clear how the model trained on four categories (AD, NC, sMCI and pMCI) is applied to the two-category classification (pMCI vs. sMCI). 4) The experimental evaluation seems superficial. Only ADNI-1 and ADNI-2 were used, as the training and testing sets, respectively. What are the results when ADNI2 is used as training data, and when a cross-validation strategy is used? What is the performance without the attention module? 5) The writing needs to be improved significantly to remove typos and grammar errors, since some sentences are misleading (for example, the paragraph in the “Ablation Studies” section) and some acronyms are undefined.
What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

4

Author Feedback

We thank AC and reviewers for their valuable suggestions and will explain the concerns one by one.

Q1: Not clear why 15 regions were finally used. A1: We use 15 regions in our method mainly for 3 reasons.

1) Finer brain region segmentation results in a higher segmentation error. Hence, we use the 15 regions to balance the segmentation model’s accuracy and the diagnosis model’s interpretability.

2) A 3D CNN model consumes a large amount of GPU memory. To use more regions, we have to reduce the size of each region sub-network to fit the whole model into the GPU. We have tried training models using 15, 70, and 134 regions. However, the AUCs of the 70- and 134-region models are 2-3% lower than that of the 15-region model.

3) Criteria on brain partitions are inconsistent in the definition of some small regions [Ref 1, Ref 2]. Therefore, we merge these small regions into larger ones to eliminate the conflict.

[Ref 1] Fan L, et al. The Human Brainnetome Atlas: A New Brain Atlas Based on Connectional Architecture.
[Ref 2] Glasser M F, et al. A multi-modal parcellation of human cerebral cortex.

Q2: It is critical to see the contributions of these three terms in Eq. 8. A2: Some contributions of the three terms in Eq. 8 are analyzed in Section 3.3 and shown in Fig. 2. We further assessed the influence of different combinations of the weights. With λ1=1 fixed: λ2=0.01 gives AUC=0.783; λ2=0.10 gives AUC=0.812; λ2=0.50 gives AUC=0.850. With λ2=1 fixed: λ1=0.01 gives AUC=0.808; λ1=0.10 gives AUC=0.828; λ1=0.50 gives AUC=0.852. With λ1=λ2=1: AUC=0.854. We conclude that both modules improve the performance and that the model is not very sensitive to these parameters.

Q3: Lack of clear explanation of the relation regularized loss and how to apply the model trained on four categories to the binary classification? A3-1: The relation regularized loss (Eq. 6) is a ranking loss that models the relationship between categories based on prior knowledge. When the relation between the output probabilities is consistent with the prior knowledge, Eq. 6 is small; otherwise, it is large. Furthermore, B < A and A > B are equivalent, so we only optimize the relation A > B, which is set to “1” in the relation matrix. Using this loss with additional samples, we can reduce the solution space and find the optimal solution more efficiently. A3-2: The model outputs a four-dimensional score vector for each sample, corresponding to the four categories. To make the binary prediction, we compare the sMCI and pMCI scores and pick the category with the higher score.
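
The two answers above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's Eq. 6: the hinge form of the penalty, the `margin` parameter, the class ordering in `CLASSES`, and the example relation matrix for an AD sample are all hypothetical, because the rebuttal only states that the loss is a ranking loss driven by a relation matrix whose "1" entries mark the relations A > B to be enforced.

```python
# Assumed class order for the four-dimensional output (hypothetical).
CLASSES = ["NC", "sMCI", "pMCI", "AD"]


def relation_ranking_loss(probs, relation, margin=0.0):
    """Penalize violated orderings between class probabilities.

    relation[a][b] == 1 means prior knowledge says probs[a] > probs[b].
    Assumption: each violated pair contributes a hinge penalty.
    """
    loss = 0.0
    for a, row in enumerate(relation):
        for b, flag in enumerate(row):
            if flag == 1:
                loss += max(0.0, margin - (probs[a] - probs[b]))
    return loss


def predict_binary(scores):
    """Reduce a four-class score vector to sMCI vs. pMCI.

    Picks whichever of the two scores is higher, as described in A3-2.
    Assumption: a tie is resolved as sMCI.
    """
    s_score = scores[CLASSES.index("sMCI")]
    p_score = scores[CLASSES.index("pMCI")]
    return "pMCI" if p_score > s_score else "sMCI"


if __name__ == "__main__":
    # Hypothetical prior for an AD sample: P_AD > P_pMCI > P_sMCI > P_NC.
    relation_ad = [
        [0, 0, 0, 0],  # NC
        [1, 0, 0, 0],  # sMCI > NC
        [0, 1, 0, 0],  # pMCI > sMCI
        [0, 0, 1, 0],  # AD > pMCI
    ]
    consistent = [0.1, 0.2, 0.3, 0.4]
    inconsistent = [0.4, 0.3, 0.2, 0.1]
    print(relation_ranking_loss(consistent, relation_ad))
    print(relation_ranking_loss(inconsistent, relation_ad))
    print(predict_binary([0.1, 0.5, 0.3, 0.1]))
```

Only one direction of each pair is stored in the matrix, which mirrors the statement that B < A and A > B are equivalent and need to be optimized once.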

Q4: What about the results when ADNI2 is used as training data? What about using cross-validation? What about the performance without using the attention module? A4: We followed the experimental setting of the compared work in the paper to make the comparison easy and fair. Following the advice of the AC and reviewers, we have performed more experiments. A4-1: We swapped the training and testing sets, training the model on ADNI2 and testing it on ADNI1. The AUC is 0.778. It is lower than that of the model trained on ADNI1 because ADNI2 contains far fewer pMCI subjects than ADNI1 (38 vs. 239). A4-2: Like the compared method, we used 10% of the training subjects for validation and did not perform cross-validation when writing the paper. Due to time constraints, we cannot finish the cross-validation now and will perform it in the future. Additionally, our model does not seriously overfit the training set, because the relation loss and the region loss regularize the whole training process and help it avoid getting trapped in local optima. A4-3: We retrained the model without the attention module. The AUC is 0.846, which is slightly lower than that of the original method. Note that the attention module aims to describe each region’s weight rather than to improve the accuracy of the model.
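
The cross-validation requested by the reviewers could be set up along the following lines. This is a minimal sketch of stratified k-fold splitting over subject labels, not an experiment the authors ran: the function name `stratified_kfold`, the choice of `k`, the seed, and the toy label counts are assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_kfold(labels, k=5, seed=0):
    """Yield (train_indices, test_indices) for k stratified folds.

    Each class is shuffled and dealt round-robin into the folds, so every
    fold keeps roughly the same class ratio as the full set.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)

    for i in range(k):
        test = sorted(folds[i])
        train = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train, test


if __name__ == "__main__":
    # Toy labels only; these are not the ADNI subject counts.
    labels = ["pMCI"] * 10 + ["sMCI"] * 20
    for fold_id, (train, test) in enumerate(stratified_kfold(labels, k=5)):
        n_pmci = sum(labels[i] == "pMCI" for i in test)
        print(fold_id, len(train), len(test), n_pmci)
```

Stratification matters here because the pMCI class can be small (38 subjects in ADNI2, as noted in A4-1), and an unstratified split could leave a fold with very few converters.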

Q5: The writing needs to be improved significantly. A5: We are sorry for the confusion and inconvenience due to the writing. We will do our best to correct the typos and grammar errors in the next version of our paper.

Post-rebuttal Meta-Reviews

Meta-review #1 (Primary)

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

Most of the major concerns have been well addressed.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept
What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

3

Meta-review #2

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

This paper introduces a novel methodology with region ensembling and relation-based regularization via a ranking loss. The rebuttal covers the reviewers’ comments with reasonable answers and supporting results. However, it should be guaranteed that the acronyms and typos are revised to improve legibility in a camera-ready version.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept
What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

3

Meta-review #3

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

The proposed idea of the region ensemble network and a relation regularized loss function is novel. Concerns about a less clear explanation of the loss function and selection of regions for subnetworks were addressed in the rebuttal.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept
What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

2


Region Ensemble Network for MCI Conversion Prediction With a Relation Regularized Loss