Authors

Jingyun Yang, Jie Hu, Yicong Li, Heng Liu, Yang Li

Abstract

Among symptoms of cerebral palsy (CP), the degree of hand function impairment in young children is hard to assess due to large inter-personal variability and differences in evaluators’ experience. To help design better treatment strategies, accurate identification and delineation of manual ability injury level is a major clinical concern. Periventricular leukomalacia (PVL), a form of brain lesion in periventricular white matter in premature infants, is a leading cause of CP and has clinical associations with motor function injuries. In this paper, we exploit the correlation between PVL lesion segmentation and manual ability classification (MAC) to improve the identification performance of both tasks for T2 FLAIR MRI scans. In particular, we propose a semi-supervised multitask learning framework to jointly learn from heterogeneous datasets. Two clinically related auxiliary tasks, lesion localization and ventricle segmentation, are also incorporated to improve the classification accuracy while requiring only a small amount of manual annotations. Using two datasets containing 24 labeled PVL samples and 87 labeled MAC samples, the proposed model significantly outperforms single-task methods, achieving a Dice score of 0.607 for PVL lesion segmentation and 84.3% accuracy for manual ability classification.

Link to paper

DOI: https://doi.org/10.1007/978-3-030-87234-2_43

SharedIt: https://rdcu.be/cyl8F

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    The authors have proposed a semi-supervised multitask learning framework that exploits the correlation between periventricular leukomalacia lesion segmentation and manual ability classification to improve the identification performance of both tasks on T2 FLAIR MRI scans. To do so, the proposed approach incorporates lesion localization and ventricle segmentation information in a novel framework. The proposed approach is validated on two different datasets.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper is clear and well structured.

    A novel framework called Semi-supervised Heterogeneous Multi-task Network is proposed.

    Two different datasets are considered in the experiments.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Many relevant parameters that are required to train the framework are missing.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Several details needed to train the proposed framework are missing, so it would be hard to reproduce the results. The datasets used are also not publicly available.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

    Please specify all the parameters used to train the framework including the learning rate, the parameters λ_x, the batch size, size of the images, etc.

  • Please state your overall opinion of the paper

    accept (8)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper proposed a novel method developed for an important clinical application. The presented experiments show some improvements obtained by the proposed approach in comparison to the state-of-the-art methods. A minimal ablation study is also introduced in the experiments.

  • What is the ranking of this paper in your review stack?

    1

  • Number of papers in your stack

    5

  • Reviewer confidence

    Not Confident



Review #2

  • Please describe the contribution of the paper

    The proposed method was implemented and compared with a single-task method on two datasets. The experimental results demonstrate that the proposed model significantly outperforms the single-task method for PVL lesion segmentation and manual ability classification in premature infants.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    1) The authors proposed a semi-supervised multi-task learning framework for joint PVL detection and manual ability classification using T2 FLAIR MRI scans. 2) Lesion localization and ventricle segmentation were incorporated as two clinically related auxiliary tasks to improve the classification accuracy while requiring only a small amount of manual annotations.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    1) This paper lacks a clear explanation of how and why the multi-task learning works. 2) The proposed method is verified only on a small dataset, without comparison with state-of-the-art methods, so its superiority is not convincing.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The results of this paper may not be reproducible.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

    1) The authors need to express clearly the correlation between PVL segmentation and manual ability classification in Fig. 2.
    2) The authors need to explain how each lesion center is defined.
    3) In the experiment, the authors did not provide clear descriptions of the training, validation and testing data.
    4) In the experiment, the authors did not provide detailed experimental settings.
    5) As the experimental dataset is rather small, the authors need to evaluate the proposed model on more data, which would be more convincing.
    6) There are some typos and grammar errors, such as “lesions localization” vs. “lesion localization” and “ventricle segmentaion”.

  • Please state your overall opinion of the paper

    reject (3)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The authors did not provide clear descriptions on network architecture and enough experimental data for method validation.

  • What is the ranking of this paper in your review stack?

    1

  • Number of papers in your stack

    3

  • Reviewer confidence

    Very confident



Review #3

  • Please describe the contribution of the paper

    In this paper, the authors use multi-task learning to improve the interpretability of the features encoded by the deep learning network and, in turn, increase MAC accuracy and PVL detection performance.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The proposed method is not complicated, but it is very clever and useful; performance on both MAC and lesion segmentation increased significantly.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Some key information related to the images was missing.

  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Matched what the authors described in reproducibility checklist.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

    1. What is the age of the subjects in the images? The authors state in the introduction that PVL is a WM injury in preterm infants, but later say that CP was diagnosed in children. So are the images used in this study preterm infant brain MRIs or children's MRIs?
    2. Why did the authors decide to use lesion localization and ventricle segmentation as auxiliary tasks? Some hypothesis is needed in the introduction.
    3. Why did the authors decide to use T2 FLAIR images instead of other modalities? This motivation is missing from the introduction.

  • Please state your overall opinion of the paper

    strong accept (9)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The results were significantly improved and the method is reasonable.

  • What is the ranking of this paper in your review stack?

    2

  • Number of papers in your stack

    5

  • Reviewer confidence

    Very confident




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    In this paper, the authors use multi-task learning to improve the interpretability of the features encoded by the deep learning network and, in turn, increase MAC accuracy and PVL detection performance. There are several strengths. First, the proposed method is not complicated, but it is very clever and useful; performance on both MAC and lesion segmentation increased significantly. Second, lesion localization and ventricle segmentation were incorporated as two clinically related auxiliary tasks to improve the classification accuracy while requiring only a small amount of manual annotations. In addition, two different datasets are considered in the experiments. There are some weaknesses. First, many relevant parameters that are required to train the framework are missing, making it hard to reproduce the results. Second, the proposed method is verified only on a small dataset, without comparison with state-of-the-art methods, so its superiority is not convincing. When resubmitting, the authors should address the problems listed below:
    1) The authors need to express clearly the correlation between PVL segmentation and manual ability classification in Fig. 2.
    2) The authors need to explain how each lesion center is defined.
    3) In the experiment, the authors did not provide clear descriptions of the training, validation, and testing data.
    4) In the experiment, the authors did not provide detailed experimental settings.
    5) As the experimental dataset is rather small, the authors need to evaluate the proposed model on more data, which would be more convincing.
    6) There are some typos and grammar errors, such as “lesions localization” vs. “lesion localization” and “ventricle segmentaion”.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    6




Author Feedback

We would like to express our sincere thanks to the three reviewers for their valuable comments. We address the major concerns one by one as follows.

1. Clarification on the Figures: Fig. 2 shows the architecture of the model, which jointly learns a common encoder for all target and auxiliary tasks. The correlation among these tasks enables us to assume that their discriminative features lie in a common multi-scale feature space, represented by the encoder network.

2. Details on Auxiliary Tasks: For selecting auxiliary tasks, clinical evidence shows that PVL lesions are regularly detected in specific locations such as the thalamus and basal ganglia, and that the area of the ventricle changes due to leukomalacia. Thus we leverage information learned from Lesion Localization and Ventricle Segmentation to improve the performance of the target tasks. Given the training data for lesion segmentation, we automatically create labels for Lesion Localization as follows: extract the topological structure of the lesion shape from the binary segmentation mask by outermost border following (Suzuki, 1985) and compute the mass center coordinates using spatial moments.
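The mass-center step in item 2 can be illustrated with a minimal sketch. It is not the authors' code: it skips the border-following step entirely, treats the whole binary mask as a single lesion, and computes the center from the zeroth and first spatial moments. The function name `mass_center` and the toy mask are hypothetical.

```python
import numpy as np

def mass_center(mask):
    """Return the center of mass of a binary mask from its spatial moments.

    Assumption: the mask contains one lesion; a multi-lesion mask would first
    need to be separated into individual lesions (not shown here).
    """
    mask = np.asarray(mask, dtype=bool)
    m00 = mask.sum()  # zeroth moment: number of foreground pixels/voxels
    if m00 == 0:
        raise ValueError("mask is empty, center of mass is undefined")
    # np.indices gives one coordinate grid per axis; each first moment is the
    # sum of that coordinate over the foreground.
    coords = np.indices(mask.shape)
    return tuple(float((c * mask).sum() / m00) for c in coords)

# Toy 2D example: a 3x2 rectangular "lesion" in a 5x5 image.
toy = np.zeros((5, 5), dtype=np.uint8)
toy[1:4, 2:4] = 1
center = mass_center(toy)  # (row, column) of the lesion center
```

The same function works unchanged on 3D masks, since the moments are computed per axis.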

3. Clarifications on the Dataset (age, modality and data size): Dataset1 is captured from PVL patients between 19 and 28 months old. Although PVL mostly occurs in newborns, premature infants younger than 1 year old often cannot tolerate MRI examinations, so we use MRI images of infants between 1 and 2 years old who are still in the relatively early stage of PVL. Dataset2 is captured from CP patients with MA tests, aged between 4 and 12 years. These children are older since CP commonly develops from PVL and can only be diagnosed when children are old enough to perform certain physical tasks. Our study uses joint training to learn a better model that predicts MA level based on MRI images, which is a novel design. We choose T2 FLAIR instead of other modalities, as it is sensitive to brain pathological changes and can highlight small lesions around the ventricle. The data from PVL and CP children are difficult to collect, and to the best of our knowledge our study is the first to utilize such a systematic dataset. Our framework is therefore specifically geared toward the small data situation. Despite the small dataset size, cross-validation shows that the results are relatively stable. The standard error of MAC is 1.2e-3 and of PVL segmentation is 1.6458e-2 (in the SHMN-4 setting).

4. Additional Training Settings (sample size and parameters): Model training is done on two datasets, containing 24 images with PVL segmentation masks and 87 images with manual ability classification labels (68 mild, 29 severe), respectively. We divide both Dataset1 and Dataset2 into 70% training, 10% validation and 20% testing for the two target tasks. For the auxiliary tasks, Lesion Localization is trained on Dataset1 and Ventricle Segmentation is trained on Dataset2, both with a 9:1 training-validation split. Due to the image size, 182x218x182, and memory constraints, we set a batch size of 2. We use the Adam optimizer with an initial learning rate of 3e-4 and set it to decrease periodically if the losses do not improve enough. During training, the coefficients of all tasks are equally set to 1, as our work mainly focuses on the multi-task framework. Better results could be obtained if the coefficients were chosen through cross-validation.
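The settings in item 4 can be made concrete with a small sketch. It only illustrates a 70/10/20 index split and a "reduce the learning rate when the loss stops improving" rule; it is not the authors' training code. The rebuttal gives the initial learning rate (3e-4) but not the decay factor, patience, or improvement threshold, so `factor`, `patience`, `min_delta`, the seed, and the names `split_indices` and `PlateauLR` are all assumptions made for illustration.

```python
import random

def split_indices(n, train_frac=0.7, val_frac=0.1, seed=0):
    """Shuffle sample indices and split them into train/val/test lists.

    The test set receives whatever remains after the train and val portions.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = int(train_frac * n)
    n_val = int(val_frac * n)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

class PlateauLR:
    """Multiply the learning rate by `factor` after `patience` epochs
    without an improvement of at least `min_delta` (assumed values)."""

    def __init__(self, lr=3e-4, factor=0.5, patience=2, min_delta=1e-4):
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, loss):
        if loss < self.best - self.min_delta:
            self.best = loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs >= self.patience:
                self.lr *= self.factor
                self.bad_epochs = 0
        return self.lr

# 24 PVL-labeled samples, as stated in the rebuttal.
train, val, test = split_indices(24)

# Validation loss that stops improving after the first epoch.
sched = PlateauLR()
for loss in [1.0, 1.0, 1.0]:
    sched.step(loss)
```

With integer truncation, small datasets do not divide exactly into 70/10/20, so the test portion absorbs the remainder; how the authors handled rounding is not stated.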

5. Additional Comparison with Related Works: There is no direct comparison with existing methods since MRI-based MA classification has not been done before. We have compared our model with a SOTA multi-task learning architecture (Ref. 4 in the paper), which adopts subnets to perform multiple tasks, by augmenting the 2D U-Net baseline model with an image classification subnet. Using the same datasets, the accuracy of MAC and the Dice score of PVL segmentation are 0.627 and 0.531 respectively, compared to 0.667 and 0.583 for our result with SHMN-2. This demonstrates that our architecture choice for multi-task learning is more effective.
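For readers less familiar with the segmentation metric quoted in item 5, the following is a minimal sketch of the Dice score between two binary masks, 2|A∩B| / (|A| + |B|). It is not the authors' evaluation code; the function name `dice_score` and the convention of returning 1.0 when both masks are empty are assumptions.

```python
import numpy as np

def dice_score(pred, target):
    """Dice overlap between two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    if pred.shape != target.shape:
        raise ValueError("masks must have the same shape")
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    intersection = np.logical_and(pred, target).sum()
    return float(2.0 * intersection / denom)

# Toy example: the prediction covers two pixels, one of which is correct.
pred = np.array([[1, 1], [0, 0]])
target = np.array([[1, 0], [0, 0]])
score = dice_score(pred, target)  # 2 * 1 / (2 + 1)
```

Because small lesions contribute few voxels, a difference of a handful of voxels can shift the Dice score noticeably, which is worth keeping in mind when comparing values obtained on very small test sets.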




Post-rebuttal Meta-Reviews

Meta-review #1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    All the main issues have been well addressed during the rebuttal.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    3



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This paper presents a semi-supervised multitask learning framework for two tasks: periventricular leukomalacia lesion segmentation and manual ability classification on FLAIR MRI. To this end, the proposed method incorporates both lesion localization and ventricle segmentation. The authors assess their method in comparison with a single-task method on two datasets (one containing voxel-wise lesion labels and the other only related to the MAC). The cross-validation results show significant improvement over the single-task approach.

    I think the authors clarify most of the points requested by the reviewers, but the request for more data is not really addressed. I understand this is challenging for certain populations.

    In my opinion, the work presents a minor methodological contribution, and the validation is very limited (for the lesion segmentation task, very few subjects are available). In the rebuttal, the authors indicate that no state-of-the-art method is available, but they now seem to have another multi-task method and results adapting Ref. 4, with another task. On one side, they mention a 2D U-Net baseline model, but I thought their baseline was 3D (at least it is stated like this in the paper), with an additional classification task. The new results with this additional multi-task approach appear quite similar to their results (0.627 and 0.531 respectively, compared to 0.667 and 0.583), so I would not draw any strong conclusion about the improvement of the proposed method. In fact, this makes me hesitate more about the additional value of this work and biases my decision towards not recommending it for acceptance.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Reject

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    15



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The paper presents an interesting, clear and simple idea for increased performance in an interesting context that is likely to have applications in other situations. The weaknesses regarding clarification of design choices and experimental design are well clarified in the rebuttal, and the paper promises interesting interaction at MICCAI.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    6


