
# Authors

Dongren Yao, Erkun Yang, Hao Guan, Jing Sui, Zhizhong Zhang, Mingxia Liu

# Abstract

Major depressive disorder (MDD) is a common and costly mental illness whose pathophysiology is difficult to clarify. Resting-state functional MRI (rs-fMRI) provides a non-invasive solution for the study of functional brain network abnormalities in MDD patients. Existing studies have shown that multiple indexes derived from rs-fMRI, such as fractional amplitude of low-frequency fluctuations (fALFF), voxel-mirrored homotopic connectivity (VMHC), and degree centrality (DC), help depict functional mechanisms of brain disorders from different perspectives. However, previous methods generally treat these indexes independently, without considering their potentially complementary relationship. Moreover, it is usually very challenging to effectively fuse multi-index representations for disease analysis, due to the significant heterogeneity among indexes in the feature distribution. In this paper, we propose a tensor-based multi-index representation learning (TMRL) framework for fMRI-based MDD detection. In TMRL, we first generate multi-index representations (i.e., fALFF, VMHC, and DC) for each subject, followed by patch selection via group comparison for each index. We further develop a tensor-based multi-task learning model (with a tensor-based regularizer) to align multi-index representations into a common latent space, followed by MDD prediction. Experimental results on 533 subjects with rs-fMRI data demonstrate that TMRL outperforms several state-of-the-art methods in MDD identification.

SharedIt: https://rdcu.be/cyl5P


# Reviews

### Review #1

• Please describe the contribution of the paper

This paper proposes a multi-task learning model for detection of major depressive disorder using functional MRI data. The model employs a novel regularizer based on the recently proposed tensor nuclear norm (which measures the rank of a block circulant matrix constructed by all projection matrices) to align multi-index representations into a common latent space, followed by MDD prediction. Experimental results demonstrate that the proposal outperforms several state-of-the-art methods in MDD identification.

• Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The application of the recently proposed tensor nuclear norm for regularizing the multi-task learning model is novel and interesting. Overall, the paper is well written and the method looks sound. The code will be freely released to the public via GitHub.

• Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

The method (both model and algorithm) is not clearly presented/explained (See detailed comments 1-4).

• Please rate the clarity and organization of this paper

Satisfactory

• Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
1. The code will be freely released to the public via GitHub.
2. Still, the method (both model and algorithm) is not clearly presented/explained (see detailed comments 1-4), which raises concerns about whether the algorithm was correctly implemented and whether the experimental results are correctly interpreted.
• Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

1. In the formulated problems (1) and (4), the tensor nuclear norm $\|\mathbf{\mathcal{U}}\|_{\circledast}$ is not defined. Since the tensor nuclear norm $\|\mathbf{\mathcal{U}}\|_{\circledast}$ used in the paper is new (not one of the commonly used norms defined via the Tucker or CP ranks), it needs to be defined/explained. However, the tensor nuclear norm is neither defined nor adequately explained. The paper does not even cite the original paper that proposed this norm: [a] Lu, Canyi, et al. “Tensor robust principal component analysis with a new tensor nuclear norm.” IEEE transactions on pattern analysis and machine intelligence 42.4 (2019): 925-938. In particular, the description “The high-order tensor low-rank norm $\|\mathbf{\mathcal{U}}\|_{\circledast}$ measures the rank of a block circulant matrix constructed by all projection matrices, where $\|\cdot\|_{\circledast}$ is the tensor nuclear norm” is not very clear to me. For example, how exactly is the block circulant matrix defined? It would be clearer if it were described, e.g., as follows: $\|\mathbf{\mathcal{U}}\|_{\circledast}$ is the tensor nuclear norm of $\mathbf{\mathcal{U}}$ defined in [Lu, Canyi, et al., 2019]. The tensor nuclear norm of $\mathbf{\mathcal{U}}$ can be thought of as a convex relaxation/surrogate of the rank of the block circulant matrix $\operatorname{bcirc}\left(\mathbf{\mathcal{U}}\right) \in \mathbb{R}^{dV \times CV}$ of $\mathbf{\mathcal{U}}$ [Lu, Canyi, et al., 2019]:
\begin{equation*}
\operatorname{bcirc}\left(\bm{\mathcal{U}}\right) =
\begin{bmatrix}
\mathbf{U}^{(1)} & \mathbf{U}^{(V)} & \dots & \mathbf{U}^{(2)} \\
\mathbf{U}^{(2)} & \mathbf{U}^{(1)} & \dots & \mathbf{U}^{(3)} \\
\vdots & \vdots & \ddots & \vdots \\
\mathbf{U}^{(V)} & \mathbf{U}^{(V-1)} & \dots & \mathbf{U}^{(1)}
\end{bmatrix}.
\end{equation*}
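
To make the suggested definition concrete, here is a minimal NumPy sketch of the block circulant matrix and of the tensor nuclear norm in the sense of [Lu, Canyi, et al., 2019], where the norm equals the nuclear norm of the block circulant matrix divided by $V$, or equivalently the average nuclear norm of the Fourier-domain frontal slices. The function names `bcirc` and `tensor_nuclear_norm`, the random tensor, and the sizes are assumptions made for illustration; this is not the authors' code.

```python
import numpy as np

def bcirc(T):
    """Block circulant matrix of a d x C x V tensor.

    Block (i, j) is the frontal slice with index (i - j) mod V,
    which reproduces the layout written out above.
    """
    d, C, V = T.shape
    rows = [np.hstack([T[:, :, (i - j) % V] for j in range(V)]) for i in range(V)]
    return np.vstack(rows)

def tensor_nuclear_norm(T):
    """Average of the matrix nuclear norms of the Fourier-domain frontal slices."""
    V = T.shape[2]
    T_hat = np.fft.fft(T, axis=2)  # FFT along the third (index/view) mode
    return sum(np.linalg.norm(T_hat[:, :, v], ord="nuc") for v in range(V)) / V

# Hypothetical sizes: d = 5 features, C = 2 classes, V = 3 indexes.
rng = np.random.default_rng(0)
U = rng.standard_normal((5, 2, 3))

tnn_fft = tensor_nuclear_norm(U)
tnn_bcirc = np.linalg.norm(bcirc(U), ord="nuc") / U.shape[2]
print(bcirc(U).shape, tnn_fft, tnn_bcirc)
```

The two values agree because the block circulant matrix is block-diagonalized by the discrete Fourier transform, so its singular values are those of the Fourier-domain slices.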

2. The algorithm to solve the formulated problem (4) is not clearly/correctly presented. First, the presented algorithm is the alternating direction method of multipliers (ADMM) [13], NOT the Augmented Lagrange Multiplier (ALM) [12]. Please note that the presented algorithm solves an equivalent problem (by introducing an auxiliary tensor variable $\mathbf{\mathcal{G}}$):
\begin{equation*}
\begin{aligned} \min_{\mathbf{\tilde{U}},\bm{\mathcal{G}}} \quad & \|\mathbf{\tilde{U}}^{T} \mathbf{\tilde{X}} - \mathbf{\tilde{Y}}\|_F^2 + \alpha \|\bm{\mathcal{G}}\|_{\circledast} + \beta \operatorname{tr}\left(\mathbf{\tilde{U}}^{T} \mathbf{M} \mathbf{\tilde{U}}\right) \\ \textrm{s.t.} \quad & \bm{\mathcal{U}} = \bm{\mathcal{G}} \end{aligned}
\end{equation*}


and to solve this problem it updates $\mathbf{\mathcal{U}}$ and $\mathbf{\mathcal{G}}$ in an alternating or sequential fashion, instead of updating them jointly. Second, the presented algorithm is not complete since it does not mention how the dual variable $\mathbf{\mathcal{W}}$ is updated. Third, the closed-form solution of problem (6) should be provided. Putting all together, the ADMM algorithm consists of the iterations:

\begin{algorithmic}
\STATE Update $\bm{\mathcal{U}}$ via solving problem (6):
$\mathbf{\tilde{U}}^{k+1} = \left(2 \mathbf{\tilde{X}}\mathbf{\tilde{X}}^{T}+2\beta\mathbf{M}+ \rho \mathbf{I}_{dV}\right)^{-1} \left(2 \mathbf{\tilde{X}} \mathbf{\tilde{Y}}^{T} - \mathbf{\tilde{W}}^k + \rho \mathbf{\tilde{G}}^k\right).$
\STATE Update $\bm{\mathcal{G}}$ according to Theorem 1.
\STATE Update the dual variable $\bm{\mathcal{W}}$:
$$\bm{\mathcal{W}}^{k+1} = \bm{\mathcal{W}}^k + \rho \left(\bm{\mathcal{U}}^{k+1} - \bm{\mathcal{G}}^{k+1}\right).$$
\end{algorithmic}
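
The three updates listed above can be sketched in NumPy as follows. This is only an illustration under stated assumptions: $\mathbf{\tilde{X}} \in \mathbb{R}^{dV \times N}$, $\mathbf{\tilde{Y}} \in \mathbb{R}^{C \times N}$, $\mathbf{M}$ is symmetric positive semidefinite, and the $\bm{\mathcal{G}}$ update of Theorem 1 is taken to be the tensor singular value thresholding operator of [Lu, Canyi, et al., 2019] with threshold $\alpha/\rho$. The helper names, the random data, and the parameter values are hypothetical and do not come from the paper.

```python
import numpy as np

def to_tensor(U_tilde, d, V):
    """Stacked (dV x C) matrix [U^(1); ...; U^(V)] -> d x C x V tensor."""
    C = U_tilde.shape[1]
    return U_tilde.reshape(V, d, C).transpose(1, 2, 0)

def to_stacked(T):
    """d x C x V tensor -> stacked (dV x C) matrix."""
    d, C, V = T.shape
    return T.transpose(2, 0, 1).reshape(V * d, C)

def t_svt(T, tau):
    """Soft-threshold the singular values of each Fourier-domain frontal slice."""
    T_hat = np.fft.fft(T, axis=2)
    out = np.empty_like(T_hat)
    for v in range(T.shape[2]):
        P, s, Qh = np.linalg.svd(T_hat[:, :, v], full_matrices=False)
        out[:, :, v] = (P * np.maximum(s - tau, 0.0)) @ Qh
    return np.fft.ifft(out, axis=2).real  # real up to rounding for real input

def admm(X, Y, M, d, V, alpha=0.1, beta=0.1, rho=1.0, n_iter=300):
    dV, C = X.shape[0], Y.shape[0]
    G = np.zeros((dV, C))
    W = np.zeros((dV, C))
    A = 2 * X @ X.T + 2 * beta * M + rho * np.eye(dV)  # fixed across iterations
    residuals = []
    for _ in range(n_iter):
        # U update: closed-form solution of the quadratic subproblem (6).
        U = np.linalg.solve(A, 2 * X @ Y.T - W + rho * G)
        # G update: proximal step of (alpha / rho) * tensor nuclear norm.
        G = to_stacked(t_svt(to_tensor(U + W / rho, d, V), alpha / rho))
        # Dual update.
        W = W + rho * (U - G)
        residuals.append(np.linalg.norm(U - G))  # primal residual ||U - G||_F
    return U, G, W, residuals

# Hypothetical problem sizes and random stand-in data.
rng = np.random.default_rng(0)
d, V, C, N = 4, 3, 2, 30
X = rng.standard_normal((d * V, N))
Y = rng.standard_normal((C, N))
B = rng.standard_normal((d * V, d * V))
M = B @ B.T / (d * V)  # stand-in symmetric PSD matrix
U, G, W, residuals = admm(X, Y, M, d, V)
print(residuals[0], residuals[-1])
```

Tracking the primal residual $\|\bm{\mathcal{U}} - \bm{\mathcal{G}}\|_F$ as done here is one simple way to monitor whether the constraint is being satisfied as the iterations proceed.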

3. In Section 2.2, “Prediction with Metric Learning”, why is the latent representation of a new test subject $\mathbf{z} = [\mathbf{z}^{(1)}; \mathbf{z}^{(2)}; \dots; \mathbf{z}^{(V)}] \in \mathbb{R}^{dV}$ defined as
\begin{equation*}
\mathbf{\hat{z}} =
\begin{bmatrix}
\mathbf{U}^{(1)}\\
\mathbf{U}^{(2)}\\
\vdots\\
\mathbf{U}^{(V)}
\end{bmatrix}^{ T}
\begin{bmatrix}
\mathbf{z}^{(1)}\\
\mathbf{z}^{(2)}\\
\vdots\\
\mathbf{z}^{(V)}
\end{bmatrix} = \sum_{v=1}^V {\mathbf{U}^{(v)}}^{ T} \mathbf{z}^{(v)} \in \mathbb{R}^C,
\end{equation*}


not as

\begin{equation*}
\mathbf{\hat{z}} =
\begin{bmatrix}
{\mathbf{U}^{(1)}}^{ T}\mathbf{z}^{(1)}\\
{\mathbf{U}^{(2)}}^{ T}\mathbf{z}^{(2)}\\
\vdots\\
{\mathbf{U}^{(V)}}^{ T}\mathbf{z}^{(V)}
\end{bmatrix} \in \mathbb{R}^{CV}
\end{equation*}


? It seems that the latter representation is more powerful since it has a much higher dimension ($C*V \gg C$).
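
The difference between the two candidate representations can be seen in a small NumPy sketch. The sizes and random values below are assumptions for illustration only; the point is that the first form sums the per-index projections into $\mathbb{R}^C$, while the second keeps them as separate blocks in $\mathbb{R}^{CV}$.

```python
import numpy as np

rng = np.random.default_rng(0)
d, C, V = 4, 2, 3  # hypothetical sizes: features per index, classes, indexes

# Per-index projection matrices U^(v) (d x C) and test features z^(v) (d,).
U_blocks = [rng.standard_normal((d, C)) for _ in range(V)]
z_blocks = [rng.standard_normal(d) for _ in range(V)]

U_tilde = np.vstack(U_blocks)   # stacked projection, shape (dV, C)
z = np.concatenate(z_blocks)    # stacked test features, shape (dV,)

# First form: stacked projection applied to stacked features (sum over indexes).
z_hat_sum = U_tilde.T @ z

# Second form: per-index projections kept as separate blocks.
z_hat_cat = np.concatenate([Uv.T @ zv for Uv, zv in zip(U_blocks, z_blocks)])

print(z_hat_sum.shape, z_hat_cat.shape)
```

Summing the $V$ blocks of the concatenated form recovers the first form, so the first representation discards the per-index breakdown that the second one retains.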

4. In Supplementary Materials Theorem 2, “the optimization problem in Eq. (5)” should be “the optimization problem in Eq. (4)”. Moreover, can you briefly show why the iterates of the primal variables $\mathbf{\mathcal{G}}, \mathbf{\tilde{U}}$ converge to an optimal solution? Please note that according to reference [13] (subsection 3.2.1), the objective function and dual variables of the ADMM iterates converge but the primal variables need not.

5. Other mistakes/typos: In the second line below Eq. (1), “Bregman Discrepancy” should be “Bregman divergence”. For consistency, the $\sum_{i,j=1 (i\neq j)}^V$ in Eq. (1) should be changed to $\sum_{1 \leq i < j \leq V}$, or alternatively, a scaling constant of 2 should be inserted before the third term in Eq. (4). Section 2.2, “Prediction with Metric Learning”: The $\mathbf{U}$ should be written as $\mathbf{\tilde{U}}$ (which is defined in Eq. (2)). In Eq. (7), $(\mathbf{x}_i,\mathbf{x}_j)$ should be $(\mathbf{x}_i-\mathbf{x}_j)$. Also, please make clear that $\mathbf{x}_i = [\mathbf{x}_i^{(1)}; \mathbf{x}_i^{(2)}; \dots; \mathbf{x}_i^{(V)}] \in \mathbb{R}^{dV}$. In line 1 below Eq. (7), the transpose operator $^{ T}$ should be applied to the right $\mathbf{U}$ instead of the left one. It is a little confusing that the same symbol $\mathbf{M}$ is used in both Eq. (3) and Eq. (7) for two different matrices. Please use a different notation. “ASD” should be “MDD”. “$m$ sample” should be “the $m$ samples”. Section 3, “Competing Methods”: The range $\{0.01, 0.05, \dots, 10\}$ is not completely clear. Is it $\{0.01, 0.05, 0.1, 0.5, 1, 5, 10\}$? Section 3, “Results of MDD Detection”: The description “the TMRL achieves at least 4\% improvement in terms of ACC and SPE values and 2\% improvement in terms of SEN and F1 metrics” does not seem accurate. According to Table 2, the improvement of TMRL over the second-best method is ACC: (0.642-0.594)/0.594 = 8.08\%, SEN: (0.643-0.621)/0.621 = 3.54\%, SPE: (0.639-0.579)/0.579 = 10.36\%, F1: (0.654-0.626)/0.626 = 4.47\%. Section 3, “Comparison with State-of-the-Arts”: “one single index” should read as “a single index”.
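
The improvement figures quoted in comment 5 can be reproduced with a few lines of Python. The values are the Table 2 numbers as cited in this review; the dictionary layout and function name are illustrative assumptions.

```python
# Table 2 values as quoted in this review: (TMRL, second-best method).
scores = {
    "ACC": (0.642, 0.594),
    "SEN": (0.643, 0.621),
    "SPE": (0.639, 0.579),
    "F1": (0.654, 0.626),
}

def improvements(tmrl, runner_up):
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute_points = (tmrl - runner_up) * 100
    relative_percent = (tmrl - runner_up) / runner_up * 100
    return round(absolute_points, 1), round(relative_percent, 2)

results = {name: improvements(*pair) for name, pair in scores.items()}
for name, (abs_pts, rel_pct) in results.items():
    print(f"{name}: +{abs_pts} points absolute, +{rel_pct}% relative")
```

The relative gains match the percentages given above. The absolute gains (4.8, 2.2, 6.0, and 2.8 points) would be consistent with the quoted “at least 4\%” and “2\%” statement if that statement is read as percentage points, so stating which convention is used may resolve the discrepancy.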

Supplementary Materials: Please cite the source/reference for Theorem 1, e.g., [Lu, Canyi, et al., 2019]. In Theorem 1, please make it clear that $*$ denotes the tensor-tensor product (t-product) [M. E. Kilmer and C. D. Martin, 2011] and $^T$ denotes the conjugate transpose of a tensor [Lu, Canyi, et al., 2019].

borderline accept (6)

• Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

See main strengths and main weaknesses of the paper.

• What is the ranking of this paper in your review stack?

2

• Number of papers in your stack

4

• Reviewer confidence

Confident but not absolutely certain

### Review #2

• Please describe the contribution of the paper

This paper proposes to fuse multi-index representations for major depressive disorder (MDD) classification. The fusion is based on multi-task learning that maps the fALFF, DC, and VMHC features into a latent space in which the representations are complementary rather than redundant.

• Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1. The feature representations used in this research are clinically and practically meaningful.
2. Comprehensive experiments were carried out, and the results show a clear performance improvement.
3. The proposed method has significant potential; it could be useful for patients with MDD.
• Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

1. Although the results of the proposed method are better than those of other state-of-the-art methods, the values are still very low (all below 0.66), which reduces the clinical significance of the proposed method.

• Please rate the clarity and organization of this paper

Excellent

• Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

A statement of ethics approval is missing from the Materials part.

• Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
1. The authors should give more description in the caption of Fig. 1, e.g., the meaning of the differently colored arrows.

accept (8)

• Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

This paper proposes a novel method based on multi-task learning. The method design is rigorous and reasonable, and it achieves better performance for MDD diagnosis, which is clinically useful. Although the MDD diagnosis results are not good enough, I think this method may have potential in other clinical applications.

• What is the ranking of this paper in your review stack?

1

• Number of papers in your stack

4

• Reviewer confidence

Confident but not absolutely certain

### Review #3

• Please describe the contribution of the paper

The authors propose TMRL, a Tensor-based Multi-index Representation Learning framework. TMRL is applied to low-level rs-fMRI descriptors (fALFF, VMHC, DC) to predict Major Depressive Disorder (MDD). It is a linear model on tensors of multi-index representations, in which the multiple tasks represent the subject's MDD status.

• Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
• Original low rank tensor regularisation of a multitask linear model.

• Consistent comparison with state-of-the-art works.

• Consistent benchmark of different predictive models.

• Insightful neuroimaging-based interpretation of the relevant features.

• Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
• The usefulness of multi-task learning for MDD is not clear, since the prediction relies on a single piece of MDD information only (1 or 0).

• The use of fALFF, VMHC, and DC is not clearly motivated. In particular, functional connectivity matrices, which are often considered state-of-the-art features, are not included in the comparative study.

• Please rate the clarity and organization of this paper

Excellent

• Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The dataset is private (from a local hospital). As mentioned in the manuscript, the authors have committed to publishing the code on GitHub once the review process is complete.

• Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
• Please include functional connectivity matrices in the comparison in order to highlight the benefits of TMRL.

• Please include non-linear models in the comparison of predictive models.

borderline accept (6)

• Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The method is original and the application shows consistent improvements in terms of MDD prediction.

• What is the ranking of this paper in your review stack?

2

• Number of papers in your stack

5

• Reviewer confidence

Confident but not absolutely certain

# Primary Meta-Review

• Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

The proposed method is novel, and the comprehensive experiments were well conducted and support the validity of the method.

However, the authors are strongly encouraged to address the reviewers’ comments on the clinical significance of the performance and the motivation for using the fALFF, VMHC, and DC features.

• What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

1
