
Authors

Hyuna Cho, Gunwoong Park, Amal Isaiah, Won Hwa Kim

Abstract

Brain development in adolescence is synthetically influenced by various factors such as age, education, and socioeconomic conditions. To identify an independent effect from a variable of interest (e.g., socioeconomic conditions), statistical models such as General Linear Model (GLM) are typically adopted to account for covariates (e.g., age and gender). However, statistical models may be vulnerable with insufficient sample size and outliers, and multiple tests for a whole brain analysis lead to inevitable false-positives without sufficient sensitivity. Hence, it is necessary to develop a unified framework for multiple tests that robustly fits the observation and increases sensitivity. We therefore propose a unified flexible neural network that optimizes on the contribution from the main variable of interest as introduced in original GLM, which leads to improved statistical outcomes. The results on group analysis with fractional anisotropy (FA) from Diffusion Tensor Images from Adolescent Brain Cognitive Development (ABCD) study demonstrate that the proposed method provides much more selective and meaningful detection of ROIs related to socioeconomic status over conventional methods.

Link to paper

DOI: https://doi.org/10.1007/978-3-030-87234-2_40

SharedIt: https://rdcu.be/cyl8A

Link to the code repository

N/A

Link to the dataset(s)

N/A

Reviews

Review #1

Please describe the contribution of the paper

In this paper the authors developed a novel framework, called CoCoNet, built with multiple pairs of linear models correcting for covariates and optimising on statistical sensitivity. This approach led to improved statistical outcomes compared to traditional methods based on GLM when evaluating diffusion-based measurements from different groups of adolescents, classified on the basis of their socioeconomic conditions.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

• Novel statistical framework (CoCoNet) for multiple tests that fits the observation (e.g. ROI measurements), corrects for covariates and increases the sensitivity of the outcome compared to conventional statistics (e.g. GLM)
• Appropriate comparisons with conventional approaches based on GLM
• Clarity of writing, comprehensive description of the main methods/results
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

I do not see major weaknesses in the manuscript. I only have a few doubts/concerns, which I explain below.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

Some details on the methods are missing from the manuscript, presumably because of space constraints. If the code is released after acceptance, as the reproducibility checklist suggests, it would be of great help to all interested readers/researchers.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
In this paper the authors investigated the performance of a novel framework, named CoCoNet, that constructs an ensemble of multiple pairs of linear models that correct for covariates and optimise on statistical sensitivity. The authors demonstrated the superiority of this method compared to the other baseline methods (conventional GLM, GLM with NN and LASSO) in detecting differences between adolescents, divided into different groups on the basis of the poverty criteria. The paper is interesting, the methodology is sound and the results are convincing, with possible applications in other domains. I only have a few comments that I hope will help the authors to further improve the manuscript:
- In the abstract, the authors state that “multiple statistical tests for a whole brain analysis lead to inevitable false-positives if the method is not sufficiently sensitive”. Could you please comment on this statement? If multiple comparison corrections are applied, why should false positives still be largely present in conventional statistical analyses?
- Figure 1 is not straightforward to interpret. I would suggest carefully revising it to make it clearer for readers;
- In Section 4, I would add some more details on the three baseline methods, in particular about GLMNN and GLMLasso, as they are only briefly mentioned here. Moreover, how did you choose the sparsity coefficient for LASSO?
- Regarding the hyperparameters for CoCoNet, why was the number of groups (the p parameter) set to 1, given that there are at least two groups in the comparisons? Moreover, did the authors try adding more covariates to the model besides these simple ones, in order to test the robustness of the method when several nuisance variables are included, as is currently done in most studies? Why was the alpha value for the Bonferroni correction set to 0.01 rather than the more conventional 0.05?
- Could you please comment on the p-values resulting from CoCoNet, which are very different from those of the three baseline methods, as well as on the need to correct for multiple comparisons in this novel framework too?
- I would encourage the authors to make their code freely available to other researchers facing similar issues, as the scientific community would benefit from this type of novel methodology.
Please state your overall opinion of the paper

Probably accept (7)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

Novel and interesting framework for facing a common issue in neuroimaging studies, clarity of the manuscript and detailed results.
What is the ranking of this paper in your review stack?

1
Number of papers in your stack

5
Reviewer confidence

Confident but not absolutely certain

Review #2

Please describe the contribution of the paper

This work proposed a novel Artificial Neural Network (ANN) framework called Covariate Correcting Network (CoCoNet). CoCoNet corrects for covariate effects, optimizes multiple models for whole brain analyses, and the implementation of CoCoNet in this work identified associations between socioeconomic status and brain outcomes, which were not detected (or detected with lower sensitivity) with GLM approaches.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- A novel formulation of an ANN architecture that addressed several issues that are common in OLS-based approaches, such as multicollinearity, outliers, and false-positive findings.
- A strong evaluation of CoCoNet on the ABCD study, which resulted in significantly increased statistical sensitivity compared to several GLM approaches.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- The discussion is not very clear regarding the associations between socioeconomic factors and the brain regions identified by CoCoNet. The target audience would want to know why certain brain regions must not be missed, i.e., why it is important to use CoCoNet instead of simpler statistical models such as GLM.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

Sufficient details have been provided regarding the models, datasets, and evaluation.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
Main text:
- Check spelling consistency for “Full/Reduced Model/model” throughout the text.
- Page 5: show quantitative improvements
- Page 5: add comma between “148 regions” and “and”
- Page 5: please add citation for Adam optimizer.
- Page 8: The last sentence of Section 4.2 is quite long, and the logic is not very easy to understand, consider shortening and revising.
- Equation (4): explain gamma_l1, gamma_R, and gamma_F (instead of stating gamma’s).
- Equation (6): what were the gamma_R and gamma_F set to in this work?
- Table 1: Age (Mean ± Std)
- Table 3 caption: indicate the methods used (i.e. CoCoNet and GLM). Also, add note that GLM and GLM_lasso were not reported for Table 3(b) (I assume no significant p-values?)
- Fig 1: Move to Page 5 (near the paragraph that describes CoCoNet overall architecture)
- Fig 1: In the diagram, add labels for the regularizers
- Fig 2 and 3: color scales are too small to read.
References: The year format for citations 12, 13, 18, 19, 26, 30 is not consistent.

Supplementary:
- Please also cite the Supplementary Figures and Table in the main text (wherever appropriate).
- Supplementary Fig 1: fix the spacing for “p-value”.
- Supplementary Fig 2: spell out BP and NP.
- Supplementary Table 1: spell out BP and NP.
Please state your overall opinion of the paper

strong accept (9)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

This paper is well written and clear in general, although certain aspects can be improved. What I like the most is the effort spent reviewing and discussing the long-existing but often-ignored issues in GLM approaches: multicollinearity, outliers, and false-positive findings from multiple statistical tests. Following that, strong evidence was provided comparing the proposed CoCoNet with several GLM approaches. The authors should be commended for their efforts in putting these together in a short conference paper. I look forward to seeing this paper accepted in revised form.
What is the ranking of this paper in your review stack?

1
Number of papers in your stack

3
Reviewer confidence

Confident but not absolutely certain

Review #3

Please describe the contribution of the paper

The manuscript is concerned with the multiple testing issue in brain image studies, for example, in the setting where one tests the association between a socioeconomic factor and a large number of brain regions. The goal of the study is to improve test sensitivity through a penalized linear optimization problem, which maximizes F statistics and takes a sparsity constraint and a prediction accuracy constraint into account. However, the resulting inferential results are invalid due to the lack of type-I error control.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The application of flexible penalized linear optimization can have great potential in discovering sparse imaging signals. However, any follow-up inference has to be based on careful investigation of its impact on the resulting null distribution.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

The F-test p-values and inferential results of the proposed new method are not valid. The F distribution of the regular F statistic under the null is only guaranteed under the normality assumption of regression residuals, which can hardly remain true given the proposed neural network fitting procedure. Because of the unknown null distribution, the proposed modified F* statistics can lead to uncontrolled high type-I error if one still enforces F distribution as the null distribution.
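For reference, the conventional F-test discussed here compares a reduced model (covariates only) with a full model (covariates plus the variable of interest). The following is a minimal sketch of that standard test on synthetic data, assuming NumPy and SciPy are available; the function name `nested_f_test` and the variables `age` and `group` are illustrative assumptions, and this is not the F* statistic or the fitting procedure of the paper under review.

```python
import numpy as np
from scipy import stats


def nested_f_test(y, covariates, variable):
    """F-test of a full model (covariates + variable) against a reduced one."""
    n = y.shape[0]
    intercept = np.ones((n, 1))
    x_reduced = np.hstack([intercept, covariates])
    x_full = np.hstack([x_reduced, variable])

    def rss(design):
        # Ordinary least squares fit, then residual sum of squares.
        beta, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ beta
        return float(resid @ resid)

    rss_r, rss_f = rss(x_reduced), rss(x_full)
    df_num = x_full.shape[1] - x_reduced.shape[1]  # number of tested terms
    df_den = n - x_full.shape[1]                   # residual degrees of freedom
    f_stat = ((rss_r - rss_f) / df_num) / (rss_f / df_den)
    # The F(df_num, df_den) reference distribution assumes Gaussian residuals.
    p_value = float(stats.f.sf(f_stat, df_num, df_den))
    return f_stat, p_value


rng = np.random.default_rng(0)
n = 200
age = rng.normal(10.0, 0.6, size=(n, 1))                # synthetic covariate
group = rng.integers(0, 2, size=(n, 1)).astype(float)   # synthetic binary group
noise = rng.normal(0.0, 1.0, size=n)

y_null = 0.5 * age[:, 0] + noise        # no group effect
y_alt = y_null + 2.0 * group[:, 0]      # strong group effect

f_null, p_null = nested_f_test(y_null, age, group)
f_alt, p_alt = nested_f_test(y_alt, age, group)
print(f"null: F={f_null:.2f}, p={p_null:.3g}")
print(f"alt:  F={f_alt:.2f}, p={p_alt:.3g}")
```

The p-value in this sketch is only as trustworthy as the reference distribution passed to `stats.f.sf`, which is the point at issue in this review.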

Admittedly, constructing valid inference for similar problems can be challenging. For example, despite the popularity of lasso (Tibshirani, 1996; title: Regression shrinkage and selection via the lasso), it was not until the more recent work by Lockhart et al. (2014; title: A SIGNIFICANCE TEST FOR THE LASSO) that a valid asymptotic inference method became available. However, without any justification of the resulting distribution of the statistic under the null, the p-values proposed in this manuscript can be very inflated. For example, 52 of the 148 ROIs are selected as significant by the proposed method, whereas traditional GLM only selects 6, which raises serious concerns about false positives, especially since the dataset does not provide ground truth information.

The manuscript lacks numerical confirmation of successfully controlling type-I error rates when there is no true signal.
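A numerical check of this kind can be sketched as a small Monte Carlo loop: simulate data with no group effect, run the test, and count how often it rejects. The example below is a hedged illustration using only the Python standard library; it substitutes a simple permutation test on the difference in group means for the procedure under review (whose code is not available), and the function names and settings are assumptions chosen for illustration.

```python
import random
import statistics


def perm_p_value(values, labels, n_perm, rng):
    """Permutation p-value for an absolute difference in group means."""
    def stat(lab):
        g1 = [v for v, l in zip(values, lab) if l == 1]
        g0 = [v for v, l in zip(values, lab) if l == 0]
        return abs(statistics.fmean(g1) - statistics.fmean(g0))

    observed = stat(labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if stat(shuffled) >= observed:
            hits += 1
    # The +1 terms keep the p-value valid (it is never exactly zero).
    return (hits + 1) / (n_perm + 1)


def empirical_type1_rate(n_sims=200, n=40, n_perm=199, alpha=0.05, seed=0):
    """Fraction of null datasets on which the test rejects at level alpha."""
    rng = random.Random(seed)
    labels = [0] * (n // 2) + [1] * (n // 2)
    rejections = 0
    for _ in range(n_sims):
        # Null data: outcomes are drawn independently of the group labels.
        values = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if perm_p_value(values, labels, n_perm, rng) <= alpha:
            rejections += 1
    return rejections / n_sims


rate = empirical_type1_rate()
print(f"empirical type-I error rate at alpha=0.05: {rate:.3f}")
```

A rejection rate close to the nominal level on null data is what such a confirmation would look for; the same loop could wrap any other test in place of `perm_p_value`.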

The choice of dichotomizing socioeconomic variables can lead to a possible drop in testing power.

Bonferroni correction used in section 4 is overly conservative because it is dominated by other family-wise-error-rate control methods such as Holm (1979; A simple sequentially rejective multiple test procedure).
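The difference between the two corrections can be seen on a toy example. The sketch below implements Bonferroni and Holm's step-down procedure in plain Python; the p-values are made up for illustration and are not taken from the paper.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject H_i when p_i <= alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def holm(p_values, alpha=0.05):
    """Holm step-down: compare the k-th smallest p-value to alpha / (m - k + 1)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one test fails, all larger p-values are retained
    return reject


p_values = [0.001, 0.011, 0.012, 0.04, 0.2]  # illustrative values only
bonf_reject = bonferroni(p_values)
holm_reject = holm(p_values)
print("Bonferroni:", bonf_reject)  # [True, False, False, False, False]
print("Holm:      ", holm_reject)  # [True, True, True, False, False]
```

Both procedures control the family-wise error rate at the same level, and Holm rejects every hypothesis that Bonferroni rejects.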
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

No reproducible code is provided with the submission.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

There are recently developed multiple testing methods for imaging settings (for example, Zhang et al., 2011; title: MULTIPLE TESTING VIA FDRL FOR LARGE SCALE IMAGING DATA) that might be more appropriate candidates for head-to-head comparison.
Please state your overall opinion of the paper

strong reject (2)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The proposed inferential method does not provide valid type-I error control, and therefore leads to invalid p-values and results.
What is the ranking of this paper in your review stack?

5
Number of papers in your stack

5
Reviewer confidence

Very confident

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

Two reviewers were very enthusiastic about the paper. However, Reviewer 3 argues that the assumed null distribution is not statistically correct. As a result, the method does not control the Type-I error rate, leading to invalid inferences. This reviewer also cites earlier work (Zhang et al., 2011) which might be a fairer baseline to the proposed method. The authors should carefully address the concerns raised by Reviewer 3. If, in fact, the statistical assumptions are wrong, then this paper should not be accepted.
What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

5

Author Feedback

We appreciate all the reviewers for their enthusiasm and constructive comments. We address all the raised concerns below.

Rev #4 Q) Normality assumption missing / unknown null distribution / inflated p-values / Type-I error control / LASSO driven p-values / Dichotomization of groups.
A: We partially agree with the reviewer and appreciate the chance to clear up these concerns.
1) Yes, the method (like all other F-tests) works only under the Gaussian assumption. We should have made this clearer and will clarify it.
2) Soundness of the framework? Statistical Parametric Mapping (SPM) is a very conventional approach, and the use of GLM or mixed-effect models is common in neuroimaging. While SPM performs multiple ROI/voxel-wise regressions, we propose to solve them simultaneously in a unified framework. Almost identical results between GLM_NN and GLM_OLS show that this is doable; GLM_NN shares the same inductive bias as CoCoNet without the l1-penalty.
3) Validity of findings? The main focus of our work is to robustly estimate the MSEs of the reduced and full models, which leads to a more precise F. The idea of improving the estimation of F can be seen elsewhere (Ye et al., ICML 2012), and what matters is whether the data (or data-derived statistics) follow the right distribution. With the l1-penalty in CoCoNet, we observed a marginal decrease in R2 (<0.01 on average across 148 ROIs) and the residuals should be very close to F-distribution. There is already a substantial literature performing statistical analyses under the Normality assumption on the ABCD study (Fine et al., JAMA Psychiatry 2019, Goldstone et al., JAH 2020), and we have also checked our regression residuals using QQ-plots, which turn out to be highly Gaussian.
4) The increase in F and decrease in p-value were what we aimed for. Regardless of sample size, without sufficient effect sizes, it is impossible to detect any signals using conventional approaches even if they exist. The ABCD statistical team showed that meaningful effects are often garnered from small effects accompanied by significant p-values (https://doi.org/10.1101/2020.09.01.276451). This is because this is not a clinical sample, and meaningful effects may be quite small but biologically relevant. Also, the cortical parcellations identified in our work are biologically feasible and can be grouped together within a lobe of the brain (e.g., the frontal lobe). It has been shown to be sensitive to the environment in preadolescence by other works (see section 4.2).
5) The l1-penalty in CoCoNet is applied to the vector of \beta, which suppresses uninformative ROIs, and \gamma controls its sparsity. For GLM_LASSO, however, the l1-penalty is applied across all 148 tests, and we did observe uncommon behaviors, as the reviewer expected.
6) Our definition of socioeconomic groups follows many recent ABCD papers, e.g., Marshall et al., Nature Medicine 2020. Taking income as a continuous variable is practically impossible, as it was self-reported and tax data is unavailable. Therefore, the use of these categories minimizes reporting error at the cost of statistical power.
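The residual check mentioned in point 3 can be sketched as follows. This is a hedged illustration on synthetic data, assuming NumPy and SciPy are available; the variable names and simulated values are assumptions, and the rebuttal does not specify how the authors produced their QQ-plots.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 500
age = rng.normal(10.0, 0.6, size=n)                 # synthetic covariate
group = rng.integers(0, 2, size=n).astype(float)    # synthetic binary group
y = 0.4 + 0.05 * age + 0.02 * group + rng.normal(0.0, 0.03, size=n)

# Ordinary least squares fit of the full model and its residuals.
design = np.column_stack([np.ones(n), age, group])
beta, *_ = np.linalg.lstsq(design, y, rcond=None)
residuals = y - design @ beta

# probplot returns the Q-Q points and a least-squares line through them;
# r close to 1 means the residual quantiles track the Gaussian quantiles.
(theoretical_q, ordered_resid), (slope, intercept, r) = stats.probplot(
    residuals, dist="norm"
)
shapiro_stat, shapiro_p = stats.shapiro(residuals)
print(f"Q-Q correlation r={r:.4f}, Shapiro-Wilk p={shapiro_p:.3f}")
```

Plotting `theoretical_q` against `ordered_resid` gives the usual QQ-plot. A check like this speaks only to the Gaussian-residual assumption of the fitted regression.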

Rev #2 / #3 Q) False positives (FP) in conventional statistical analyses? A: FPs exist but are removed after the correction. We used Bonferroni to control type-I error.
Q) Choice of hyperparameter for LASSO? A: We tried various values in [0.0001, 0.1] and chose 0.01, which gave the most stable convergence.
Q) Number of variables and significance? A: p=1 is sufficient to represent a group as a binary variable, i.e., X=0 or 1. Race, site and religion were available, but they only marginally changed the results. For \alpha, a more rigorous threshold is recommended, as 0.05 is a rough cut-off in disease association studies (see Jafari et al., Cell 2019).
Q) Why use CoCoNet? What are the clear indications of socioeconomic factors? A: We are able to reject the same null with improved statistical outcomes; please see 4) for Rev #4 above.

P.S. We will correct the formatting issues in a later version. Thanks.

Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

I would tend to agree with Reviewer 4 about the dubious statistical validity. The authors seem to conflate statistical significance with optimization procedures (e.g., sparsity or “suppressing” ROIs). In addition, they do not provide justification for how CoCoNet can select twice as many significant regions as GLM. It seems like their model simply outputs lower p-values, so that more regions will survive correction. It is unclear to me that it is capitalizing on some true property of the underlying data distribution.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Reject
What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

10

Meta-review #2

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

The authors addressed all the main issues, including the two issues from R4 that meta-reviewer #1 asked them to address. I have no further concerns regarding flaws in the statistics after the rebuttal.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept
What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

7

Meta-review #3

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

I think the authors have done a good job of addressing the reviewers’ questions. I would like to recommend acceptance.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept
What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

7


Covariate Correcting Networks for Identifying Associations between Socioeconomic Factors and Brain Outcomes in Children