
Authors

Yuexiang Li, Yanping Wang, Guang Lin, Yi Lin, Dong Wei, Qirui Zhang, Kai Ma, Guangming Lu, Zhiqiang Zhang, Yefeng Zheng

Abstract

In recent years, there has been increasing awareness of the occurrence of fatigue fractures. Athletes and soldiers, who engage in unaccustomed, repetitive or vigorous activities, are potential victims of such fractures. Because fatigue fractures develop slowly, early detection can effectively protect athletes and soldiers from material bone breakage, which may force an early end to their careers. In this paper, we propose a triplet-branch network (TBN) for accurate fatigue fracture grading, which enables physicians to promptly provide appropriate treatment. Specifically, the proposed TBN consists of three branches for representation learning, classifier learning and grade-related prior-knowledge learning, respectively. The former two branches tackle the problem of class-imbalanced training data, while the last one embeds grade-related prior knowledge into the framework via an auxiliary ranking task. Extensive experiments have been conducted on our fatigue fracture X-ray image dataset. The experimental results show that our TBN can effectively address the problem of class-imbalanced training samples and achieve satisfactory accuracy for fatigue fracture grading. We will release the source code once the paper is accepted for publication.

Link to paper

DOI: https://doi.org/10.1007/978-3-030-87240-3_43

SharedIt: https://rdcu.be/cyl6j

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper proposes a triplet-branch network for fatigue fracture grading using imbalanced training data. Compared to state-of-the-art methods such as label distribution-aware margin (LDAM) loss and bilateral branch network (BBN), the proposed method achieves higher average classification accuracy.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The authors proposed a new method named triplet-branch network for representation learning, classifier learning and grade-related prior-knowledge learning. The experiments show the proposed method outperforms state-of-the-art methods.
    2. The generalization evaluation shows that the proposed method may be useful for other classification problems whose training data are imbalanced.
    3. Extensive tests have been conducted such as 5-fold cross validation and generalization evaluation using different datasets.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The proposed TBN method shows only marginally better results than the state-of-the-art approaches, so there is still considerable room to improve it.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors have stated that they will release the source code once the paper is accepted for publication.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

    For future work, I would recommend more experiments on benchmark datasets to compare the performance of the proposed TBN with state-of-the-art methods such as LDAM and BBN.

  • Please state your overall opinion of the paper

    Probably accept (7)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The proposed method is new and experimental results show it outperformed the state-of-the-art approaches, and the generalization evaluation shows it may be useful for other grading problems in medical applications.

  • What is the ranking of this paper in your review stack?

    1

  • Number of papers in your stack

    5

  • Reviewer confidence

    Confident but not absolutely certain



Review #2

  • Please describe the contribution of the paper

    This paper proposes a triplet-branch network for fatigue fracture grading. The authors propose three branches (conventional, re-balancing and regularization) that differ in their input sampling. This is primarily done to alleviate the class-imbalance problem in fatigue fracture grading. In addition, they incorporate a ranking loss to further improve grading accuracy.
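The three input sampling schemes are not spelled out in this review, so the following is only a sketch under stated assumptions: the conventional branch is taken to sample images uniformly, the re-balancing branch to sample classes in inverse proportion to their frequency (the reversed sampler described for BBN), and the regularization branch to sample every class with equal probability. The function names and the toy label list are hypothetical, and the samplers in the paper may differ in detail.

```python
import random
from collections import Counter, defaultdict


def class_sampling_probs(labels, strategy):
    """Return {class: probability that a drawn sample belongs to that class}."""
    counts = Counter(labels)
    if strategy == "instance":
        # Every image is equally likely, so classes follow their frequency.
        weights = {c: float(n) for c, n in counts.items()}
    elif strategy == "reversed":
        # Assumed BBN-style re-balancing: weight is inverse class frequency.
        n_max = max(counts.values())
        weights = {c: n_max / n for c, n in counts.items()}
    elif strategy == "class_uniform":
        # Every class is equally likely, regardless of its size.
        weights = {c: 1.0 for c in counts}
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}


def draw_index(labels, strategy, rng):
    """Draw one sample index: pick a class first, then an image of that class."""
    probs = class_sampling_probs(labels, strategy)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    classes = sorted(probs)
    chosen = rng.choices(classes, weights=[probs[c] for c in classes], k=1)[0]
    return rng.choice(by_class[chosen])


if __name__ == "__main__":
    # Toy long-tailed label list: 6 images of grade 0, 3 of grade 1, 1 of grade 2.
    toy_labels = [0] * 6 + [1] * 3 + [2] * 1
    for name in ("instance", "reversed", "class_uniform"):
        print(name, class_sampling_probs(toy_labels, name))
    print(draw_index(toy_labels, "reversed", random.Random(0)))
```

Under these assumptions the rarest grade is drawn least often by the conventional sampler, most often by the reversed sampler, and exactly as often as every other grade by the class-wise uniform sampler.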

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The paper is well written and each module is well explained along with its motivation.
    2. The results of the proposed method show a significant improvement in accuracy.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. Although the paper is a good read in general, I feel that the proposed method is incremental in comparison to BBN [Ref1]. Adding only a class-wise uniform sampler might not be a significant novel contribution.

    2. The ranking loss is, in some sense, a broader form of the cumulative learning module, since it gives a relative grading between f_c, f_g and f_r (I agree it would not tell which grade each belongs to, only the relative grading between them). I wonder what would happen if only the ranking loss were employed and the classifier were then trained on top of it.

    3. Another experiment, BBN with the ranking loss, could be performed. The jump in accuracy from TBN without the ranking loss to TBN is huge (4%) compared to the jump from BBN to TBN (2%). I suspect that the ranking loss, rather than the class-wise re-sampler, is what contributes significantly to the accuracy.

    References: [Ref1] Zhou, B., Cui, Q., Wei, X.S., Chen, Z.M.: BBN: Bilateral-branch network with cumulative learning for long-tailed visual recognition. In: IEEE Conference on Computer Vision and Pattern Recognition (2020)

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The author(s) promise to release the code (if accepted). In addition, the model is derived from BBN, and the details of the backbone architecture, learning rate, optimizer and number of epochs are given in the Implementation details section.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

    Please refer to weaknesses section.

  • Please state your overall opinion of the paper

    borderline accept (6)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Although I have concerns about the novelty of the class-wise sampler and the ranking loss (in comparison to a previous work, BBN), I would still like to give a borderline accept to this paper. The results seem impressive, and combining a different sampling strategy with another existing loss (the ranking loss) is actually helpful in this case.

  • What is the ranking of this paper in your review stack?

    2

  • Number of papers in your stack

    5

  • Reviewer confidence

    Confident but not absolutely certain



Review #3

  • Please describe the contribution of the paper

    In this work, the authors propose a triplet-branch network to address the problem of fatigue fracture grading in X-rays. The proposed method is built upon the bilateral-branch network (BBN), which is specifically designed for long-tailed visual recognition. To overcome the overfitting problem, the authors introduce a Class-wise Uniform Sampler to alleviate the sampling bias and regularize the learning process. In addition, a novel auxiliary ranking task is proposed to incorporate the prior knowledge of class dependency. Evaluation is conducted on a fairly large in-house dataset with 2725 X-ray images and a public dataset, APTOS 2019, with 3662 fundus images.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. Automatic quantification of fatigue fracture can resolve the subjectiveness of human grading and also potentially improve clinical diagnosis efficiency. Thus, the task of fatigue fracture grading is of clinical interest;
    2. The proposed auxiliary ranking task and the modifications made to the existing BBN method are well motivated and technically sound;
    3. Comprehensive evaluation is conducted on both the fatigue fracture grading task with the in-house dataset and the diabetic retinopathy severity grading task with the public dataset;
    4. Noticeable improvements are achieved on both tasks;
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The impact of some hyperparameters is not discussed. For example, how do the authors determine the values of a and b in the final loss function L?
    2. The design of y_aux in equation (5) is too straightforward. In the current design, the final value of y_aux can be affected by the scale of y_c, y_r, and y_g. A rescaling or normalization process may be necessary (please consider the differences between y_c, y_r, y_g = 1, 2, 3 and y_c, y_r, y_g = 2, 3, 4);
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Most of the implementation details are provided in the paper, but the settings for some hyperparameters (such as the loss weights a and b) should also be included.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

    Overall, this is good work with clear writing and a well-motivated, reasonable design. Meanwhile, I have the following suggestions for the authors:

    1. For pathological grading problems, reporting the Kappa coefficient is also a good way to demonstrate the effectiveness of the proposed method;

    2. For better interpretability, CAM/Grad-CAM and other deep neural network visualization methods are found to be very helpful;

    3. State-of-the-art fracture classification/detection work on X-ray images should be covered in the literature review, for example:

    Jiménez-Sánchez, A., Mateus, D., Kirchhoff, S., Kirchhoff, C., Biberthaler, P., Navab, N., González Ballester, M.A., Piella, G.: Medical-based deep curriculum learning for improved fracture classification. In: International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 694–702. Springer (2019)

    Cheng, C.T., Wang, Y., Chen, H.W., Hsiao, P.M., Yeh, C.N., Hsieh, C.H., Miao, S., Xiao, J., Liao, C.H., Lu, L.: A scalable physician-level deep learning algorithm detects universal trauma on pelvic radiographs. Nature Communications 12(1), 1–10 (2021)

    Kalmet, P.H.S., Sanduleanu, S., Primakov, S., Wu, G., Jochems, A., Refaee, T., Ibrahim, A., Hulst, L.v., Lambin, P., Poeze, M.: Deep learning in fracture detection: a narrative review. Acta Orthopaedica 91(2), 215–220 (2020)
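As an illustration of the Kappa coefficient suggested in item 1 above, here is a minimal sketch of the quadratic weighted variant, which is a common choice for ordinal grades because it penalizes a prediction more the further it lies from the true grade. The review does not say which Kappa variant is meant, so the quadratic weighting, the function name and the toy grades are all assumptions.

```python
def quadratic_weighted_kappa(y_true, y_pred, num_classes):
    """Cohen's kappa with quadratic weights for integer grades 0..num_classes-1."""
    n = len(y_true)
    # Observed confusion matrix: rows are true grades, columns are predictions.
    observed = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1
    true_hist = [sum(row) for row in observed]
    pred_hist = [sum(observed[i][j] for i in range(num_classes))
                 for j in range(num_classes)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(num_classes):
        for j in range(num_classes):
            # Quadratic penalty grows with the distance between the grades.
            weight = (i - j) ** 2 / (num_classes - 1) ** 2
            # Expected count if true and predicted grades were independent.
            expected = true_hist[i] * pred_hist[j] / n
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected
    if weighted_expected == 0.0:
        # Degenerate case: all true and predicted grades are the same class.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected


if __name__ == "__main__":
    # Toy example with four hypothetical grades.
    truth = [0, 1, 2, 3, 3, 2]
    preds = [0, 1, 2, 3, 2, 2]
    print(quadratic_weighted_kappa(truth, preds, num_classes=4))
```

A value of 1 indicates perfect agreement, 0 indicates agreement no better than chance, and negative values indicate agreement worse than chance.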

  • Please state your overall opinion of the paper

    Probably accept (7)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    My rating primarily considers (1) the potential clinical impact of fatigue fracture grading; (2) the soundness of the proposed method; and (3) the relatively solid evaluation.

  • What is the ranking of this paper in your review stack?

    1

  • Number of papers in your stack

    4

  • Reviewer confidence

    Very confident




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This paper proposes a triplet-branch network for fatigue fracture grading, especially on imbalanced training data. Given three consistent positive reviews, I recommend accepting this submission. The authors should address the detailed comments from the reviewers in the camera-ready manuscript.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    3




Author Feedback

N/A


