
Authors

Fei Lyu, Baoyao Yang, Andy J. Ma, Pong C. Yuen

Abstract

Developing a Universal Lesion Detector (ULD) that can detect various types of lesions from the whole body is of great importance for early diagnosis and timely treatment. Recently, deep neural networks have been applied to the ULD task, and existing methods assume that all training samples are well-annotated. However, the partial label problem is unavoidable when curating large-scale datasets, where only a subset of the instances is annotated in each image. To address this issue, we propose a novel segmentation-assisted model, where an additional semantic segmentation branch with a superpixel-guided selective loss is introduced to assist the conventional detection branch. The segmentation branch and the detection branch help each other to find unlabeled lesions with a mutual-mining strategy, and the mined suspicious lesions are then ignored during fine-tuning to reduce their negative impact. Evaluation experiments on the DeepLesion dataset demonstrate that our proposed method allows the baseline detector to boost its average precision by 13%, outperforming the previous state-of-the-art methods.

Link to paper

DOI: https://doi.org/10.1007/978-3-030-87240-3_12

SharedIt: https://rdcu.be/cyl5I

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    Proposes a new segmentation-assisted model to handle the partial label problem in universal lesion detection (ULD), in which a semantic segmentation branch with a superpixel-guided selective loss is added. Also proposes a mutual-mining strategy in which the two branches act as peer networks to find unlabelled lesions and reduce their negative impact on model training.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    • Simple but interesting idea; it looks quite effective and has some practical value.
    • Presents a new mutual-mining strategy in which the detection and segmentation branches jointly find unlabelled lesions.
    • Improves ULD performance on DeepLesion compared with state-of-the-art methods.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    • For the detection branch, why is Mask R-CNN selected rather than Faster R-CNN directly?

    • The selection of many key parameters in the implementation-details section is not clear; how do they affect the overall performance?
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
    • Uses the publicly available DeepLesion dataset.
    • Some details of the experimental setting are provided.
    • Code is not provided.
  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
    • The overall idea is interesting. However, some details in the method section are not clear. E.g., why is Mask R-CNN selected as the detection branch? Mask R-CNN is basically an extension of Faster R-CNN that adds a segmentation mask head to the detector. If an additional segmentation branch is needed, why not use Faster R-CNN directly for the detection branch?
    • What is the overall computational cost?
    • Page 5 says “we select “a” sample…”; what is the value of “a”? Page 6 says “the selection ratio “a” for mutual-mining is 50%.” Is this the same variable?
    • Equation 3: what does the label “-1” mean, and how are the proposals labelled “-1” handled?
  • Please state your overall opinion of the paper

    borderline accept (6)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    A simple idea, but it looks quite effective and has some practical value.

  • What is the ranking of this paper in your review stack?

    1

  • Number of papers in your stack

    3

  • Reviewer confidence

    Very confident



Review #2

  • Please describe the contribution of the paper

    The partial label problem is common in medical images. With the popularization of deep neural networks for the universal lesion detection (ULD) task, developing a ULD that can detect various types of lesions from the whole body is of great importance. This paper introduces a novel segmentation-assisted model to handle the partial label problem in universal lesion detection: a semantic segmentation branch with a superpixel-guided selective loss is added to assist the detection branch. Besides, a mutual-mining strategy is proposed to find unlabeled lesions and effectively reduce their negative impact. Finally, experiments show that ULD performance on DeepLesion is improved over previous state-of-the-art methods.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    1. The paper is clearly structured.
    2. Solving the partial label problem and finding more lesions is of great clinical value.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. Some parameters are not clearly defined and expressed (τ_sus, etc.).
    2. There are still doubts about the criterion for randomly selecting superpixels. (How many superpixels are selected? Why? What is the motivation for selecting superpixels around the lesion?)
    3. For the experimental results shown in Table 1, reporting additional evaluation metrics might make the results more convincing.
    4. The visualization of segmentation results on the test set is not comprehensive enough.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    First of all, the network architecture of this paper is clear and easy to understand. Secondly, the flow chart, network composition, and the calculation of the relevant formulas are described in detail, so the method can be reproduced and then applied to medical image analysis problems.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
    1. First of all, it is strongly recommended that the threshold parameters be introduced in more detail, so that readers can better understand the experimental process.
    2. Second, in the experiment in Table 1, additional evaluation metrics might make the results more convincing.
    3. Thirdly, a more detailed description of the ablation experiments would help explain the advantages of the superpixel-guided and mutual-mining strategies.
    4. Finally, there are still doubts about the generality of the segmentation-assisted model, which might be demonstrated by additional experiments.
  • Please state your overall opinion of the paper

    borderline accept (6)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The innovation of this paper and the richness of the experiments are somewhat questionable.

  • What is the ranking of this paper in your review stack?

    2

  • Number of papers in your stack

    5

  • Reviewer confidence

    Confident but not absolutely certain



Review #3

  • Please describe the contribution of the paper

    The paper proposes to improve universal lesion detection by jointly training a detection branch and a segmentation branch, then using the two branches to mine missing annotations. The segmentation branch uses a superpixel-guided selective loss. Evaluation on the DeepLesion dataset demonstrates that the proposed method boosts the average precision of the baseline detector by 13%.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. A novel joint training and mutual mining strategy is proposed. The superpixel-guided selective loss is also inspiring.
    2. The proposed method outperforms four existing state-of-the-art methods. Ablation studies show the effectiveness of the proposed strategies.
    3. The paper is clearly written with informative figures and tables.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The author claimed that “our method that is trained with the mined suspicious lesions and hard negative samples allows a Mask R-CNN based detector to boost its average precision (AP) by 13%”. However, the baseline method was not trained with the additional labels in 844 completely annotated training subvolumes [2]. Thus, it is not suitable to directly compare the two results.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Satisfactory.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
    1. In the mutual-mining strategy, how about treating the intersection of the two branches' mined lesions as true lesions and using them as positive labels during fine-tuning? Currently they are ignored.

    2. In the superpixel-guided selective loss, how are labels assigned to the superpixels? Specifically, are the labels 0 or 1 for superpixels neighbouring the lesion areas? I think they should be 0 since they are not in the lesion region. However, what if a superpixel lies partially in the lesion region?

    3. The superpixel-based pseudo labels should be compared with other pseudo labels, such as those generated by Mask R-CNN, [16], and [22], using similar pixel weights (Eq. 1) but different mask contours. The contours generated by Mask R-CNN, [16], or [22] should be smoother than superpixels and may thus lead to better segmentation accuracy. In addition, the pseudo mask in [16] is simpler to generate than superpixels.

  • Please state your overall opinion of the paper

    Probably accept (7)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Novel and inspiring method. Improved results.

  • What is the ranking of this paper in your review stack?

    1

  • Number of papers in your stack

    5

  • Reviewer confidence

    Very confident




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    Addressing the problem of partially labelled data is critical in medical imaging, particularly in detection problems when using data mined from PACS. Approaches to address this issue are highly welcome.

    All reviewers agreed that the authors presented an interesting approach to this problem, with an interesting joint mining strategy and superpixel-guided loss. This would be a good contribution to the MICCAI meeting. While the reviews were positive, some important shortcomings are present. All reviewers highlighted concerns, particularly about the clarity and justification of experimental/hyper-parameter choices. R3 also asked some very good questions (Detailed Comments section). As much as possible, please try to address these shortcomings/questions.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    3




Author Feedback

We would like to thank all the reviewers for the insightful comments and constructive suggestions.

==Q0: Explain the experimental/hyper-parameter choices in more detail. ==R0: Unless otherwise noted, all hyper-parameters follow [4] for Mask R-CNN. The threshold parameters are: t_low (we remove detected bounding boxes with confidence scores lower than t_low), t_sus (we assign label -1 to post-RPN proposals whose maximum IoU with the mined lesions is higher than t_sus), and t_neg/t_pos (if a proposal's maximum IoU with the ground truth is higher than t_pos, its label is 1 (lesion); if its maximum IoU is lower than t_neg, its label is 0 (background)). The thresholds were chosen empirically; we do not list all the experiments because of the page limit.
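The label-assignment rule described above can be sketched as follows. This is an illustrative sketch only: the function name, array interface, and default threshold values are our own assumptions, not the authors' implementation.

```python
import numpy as np

def assign_proposal_labels(iou_gt, iou_mined, t_pos=0.5, t_neg=0.3, t_sus=0.5):
    """Assign training labels to RPN proposals (illustrative sketch).

    iou_gt    -- max IoU of each proposal with annotated (ground-truth) lesions
    iou_mined -- max IoU of each proposal with mined suspicious lesions
    Returns 1 (lesion), 0 (background), or -1 (ignored in the loss).
    """
    labels = np.full(len(iou_gt), -1, dtype=int)  # default: unassigned -> ignored
    labels[iou_gt >= t_pos] = 1                   # confident positives
    labels[iou_gt < t_neg] = 0                    # confident negatives
    # Proposals overlapping mined suspicious lesions are ignored rather than
    # treated as background, so unlabeled lesions do not act as negatives.
    labels[(iou_mined >= t_sus) & (labels != 1)] = -1
    return labels
```

Under this rule, a proposal that matches a mined suspicious lesion is excluded from the loss instead of being penalized as background, which is the stated purpose of the "-1" label.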

R#1: ==Q1: For the detection branch, why is Mask R-CNN selected? ==R1: Because Mask R-CNN has been shown to boost the detection performance of Faster R-CNN, and we follow [16], which employs Mask R-CNN for the detection branch and constructs a pseudo mask for each lesion region based on the RECIST annotation. ==Q2: What is the overall computational cost? ==R2: The model size is around 417MB, and training takes about 10 hours using four NVIDIA Tesla V100 GPUs with 32GB memory. ==Q3: Page 5, “a” sample; Page 6, the selection ratio “a”: the same variable? ==R3: Yes, α is the selection ratio. ==Q4: Equation 3, what does the label “-1” mean? ==R4: The proposals with label ‘-1’ are ignored during fine-tuning, i.e., they are not sampled when calculating the loss.

R#2: ==Q1: More evaluation metrics and experiments could make the results more convincing. ==R1: We have actually done more experiments than reported, such as selecting the threshold parameters, but omitted them due to the page limit. As for the strategies for calculating the loss of the segmentation branch, we also compared different selective masks and showed the superiority of the superpixel-based masks. Currently we follow [2] in evaluating lesion detection performance, but we will explore more metrics designed for this task. ==Q2: How many superpixels are selected? What is the motivation for selecting superpixels around the lesion? ==R2: The total number is 150; we also experimented with other numbers and found that 150 gives the best result. Lesion-surrounding pixels may also contain some lesion-level information [8] that would otherwise be ignored; therefore we also select superpixels around the lesion.

R#3: ==Q1: The baseline method was not trained with the additional 844 sub-volumes [2]; it is not suitable to directly compare the two results. ==R1: Unlike [2], our mined suspicious lesions are generated by a mutual-mining module during training, so our method can boost performance without additional annotations, as shown in Table 1. For a fair comparison with [2], we apply our trained detector to the 844 completely annotated training sub-volumes and select the hard negative samples for retraining. ==Q2: In the mutual-mining strategy, how about treating the intersection of the two branches as true lesions? ==R2: Because a mined lesion could be either an unlabeled lesion or a false positive, ignoring them prevents false positives from introducing noise into the training process. Moreover, we have also run experiments treating them as true lesions, and found the result is not as good as ignoring them. ==Q3: How are labels assigned to the superpixels? ==R3: We only calculate the loss for pixels within superpixels that have non-zero values, and only the lesion-related superpixels plus an equal number of randomly selected superpixels have non-zero values. ==Q4: The superpixel-based pseudo labels should be compared with other pseudo labels such as [16] and [22]. ==R4: We actually follow [16] to construct a pseudo mask for each lesion region based on the RECIST annotation, but in our method the selective masks (weight maps) are generated based on the superpixels.
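The superpixel-guided selective mask described in R3 above can be sketched as follows; the function name and interface are hypothetical, assuming per-pixel superpixel ids and a binary pseudo lesion mask as inputs.

```python
import numpy as np

def selective_weight_map(superpixels, lesion_mask, rng=None):
    """Build a selective mask (weight map) over the image (illustrative sketch).

    superpixels -- HxW integer array of superpixel ids
    lesion_mask -- HxW binary array marking (pseudo) lesion pixels
    Returns an HxW float map: 1 where the segmentation loss is computed
    (lesion-related superpixels plus an equal number of random ones), 0 elsewhere.
    """
    rng = rng or np.random.default_rng(0)
    lesion_ids = np.unique(superpixels[lesion_mask > 0])    # superpixels touching lesions
    other_ids = np.setdiff1d(np.unique(superpixels), lesion_ids)
    n = min(len(lesion_ids), len(other_ids))
    sampled = rng.choice(other_ids, size=n, replace=False)  # same amount, chosen at random
    selected = np.concatenate([lesion_ids, sampled])
    return np.isin(superpixels, selected).astype(np.float32)
```

Pixels outside the selected superpixels get weight 0 and so contribute nothing to the segmentation loss, which matches the authors' description of only lesion-related and equally many random superpixels having non-zero values.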


