Authors

Jiacheng Wang, Lan Wei, Liansheng Wang, Qichao Zhou, Lei Zhu, Jing Qin

Abstract

Skin lesion segmentation from dermoscopy images is of great importance for improving the quantitative analysis of skin cancer. However, the automatic segmentation of melanoma is a very challenging task owing to the large variation of melanoma and ambiguous boundaries of lesion areas. While convolutional neural networks (CNNs) have achieved remarkable progress in this task, most existing solutions are still incapable of effectively capturing global dependencies to counteract the inductive bias caused by limited receptive fields. Recently, transformers have been proposed as a promising tool for global context modeling by employing a powerful global attention mechanism, but one of their main shortcomings when applied to segmentation tasks is that they cannot effectively extract sufficient local details to tackle ambiguous boundaries. We propose a novel boundary-aware transformer (BAT) to comprehensively address the challenges of automatic skin lesion segmentation. Specifically, we integrate a new boundary-wise attention gate (BAG) into transformers to enable the whole network to not only effectively model global long-range dependencies via transformers but also, simultaneously, capture more local details by making full use of boundary-wise prior knowledge. Particularly, the auxiliary supervision of BAG is capable of assisting transformers to learn position embedding as it provides much spatial information. We conducted extensive experiments to evaluate the proposed BAT, and the experiments corroborate its effectiveness, consistently outperforming state-of-the-art methods on two well-known datasets.

SharedIt: https://rdcu.be/cyhLM

Reviews

Review #1

• Please describe the contribution of the paper

This paper proposes a transformer-based method for skin lesion boundary segmentation. The model contains a novel boundary-aware transformer (BAT) alongside the convolutional part to model global features, and integrates a new boundary-wise attention gate (BAG) into the transformer to capture local details. The proposed method achieves state-of-the-art results on the ISIC2016 and ISIC2018 datasets.

• Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1. The transformer is very popular in computer vision. This paper introduces the transformer to medical image segmentation.
2. This paper addresses the loss of local information in the original transformer through a boundary-wise attention gate and demonstrates its effectiveness in the ablation experiments.
3. This paper also compares its results with transformer-based methods for medical image segmentation, such as TransUNet and MedT, and obtains promising results.
• Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

1. The paper “CA-Net: Comprehensive Attention Convolutional Neural Networks for Explainable Medical Image Segmentation” reports a Dice of 92.08% on ISIC2018, which is better than the proposed method (91.2%). The superiority of the method needs to be reconfirmed.
2. We are quite familiar with CE-Net, and our result with it on ISIC2018 is 92.2% Dice. The superiority of the method needs to be reconfirmed.
3. Regarding the experiments, ISIC2016 and ISIC2018 may have similar data distributions, so the method may not be robust on other skin datasets.
4. The baseline in the ablation study is not clear.
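
For reference, the Dice score compared in the points above measures the overlap between a predicted mask and the ground-truth mask. The following is a minimal Python sketch, assuming binary masks given as nested lists of 0/1. The name `dice_score` and the convention of returning 1.0 for two empty masks are illustrative assumptions, not taken from the paper or from any official ISIC evaluation code. Papers also differ in whether Dice is averaged per image or pooled over a dataset, which may be one reason reported numbers are hard to compare directly.

```python
def dice_score(pred, target):
    """Dice = 2 * |P intersect T| / (|P| + |T|) for two binary masks.

    Both masks are nested lists (rows of 0/1 values) of the same shape.
    Returning 1.0 when both masks are empty is an assumed convention.
    """
    # Flatten both masks into 0/1 integer lists.
    p = [int(bool(v)) for row in pred for v in row]
    t = [int(bool(v)) for row in target for v in row]
    if len(p) != len(t):
        raise ValueError("masks must have the same number of pixels")

    intersection = sum(a & b for a, b in zip(p, t))
    total = sum(p) + sum(t)
    if total == 0:
        return 1.0  # both masks empty: treated as perfect agreement
    return 2.0 * intersection / total


if __name__ == "__main__":
    prediction = [[1, 1], [0, 0]]
    ground_truth = [[1, 0], [0, 0]]
    print(dice_score(prediction, ground_truth))
```

In this toy case one pixel overlaps while the two masks contain three foreground pixels in total, so the score is 2 * 1 / 3.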

• Please rate the clarity and organization of this paper

Satisfactory

• Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

This paper can be reproduced.

• Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
1. I suggest the authors rethink the superiority and effectiveness of the method.
2. I suggest the authors redo the CE-Net experiments and obtain equivalent results.
3. I suggest the authors replace one of the datasets with an additional skin dataset to prove the robustness of the method.
4. The effectiveness of dilated convolution is not obvious in this paper and needs a more detailed explanation.
5. The writing and the presentation quality of the figures can be improved. I suggest the authors state which dataset is used in Fig. 3.

probably reject (4)

• Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
1. As far as I know, the method in this paper is not the state of the art on ISIC2018 when compared with the results in “CA-Net: Comprehensive Attention Convolutional Neural Networks for Explainable Medical Image Segmentation”.
2. Our results using CE-Net on ISIC2018 reach 92.2% Dice, compared with 89.1% in this paper, and also outperform this paper’s method.
3. This paper introduces the popular transformer to medical segmentation and addresses the missing local information problem through a boundary-wise attention gate (BAG).
• What is the ranking of this paper in your review stack?

2

• Number of papers in your stack

2

• Reviewer confidence

Very confident

Review #2

• Please describe the contribution of the paper

This paper proposes a novel boundary-aware transformer for skin lesion segmentation.

• Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1. This paper proposes a new transformer-based method for skin lesion segmentation, which integrates a new boundary-wise attention gate (BAG).
2. A detailed comparison experiment is conducted, and an ablation study is also provided.
• Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

1. This method is applied to skin lesion segmentation, which can be seen as a binary segmentation task. How does it perform in multi-class segmentation tasks?
2. Fig. 2 has been compressed too much. I could not see detailed information such as the backbone parameters.
3. The authors should illustrate the network more clearly, for example the backbone architecture (ResNet-50 or ResNet-101). In Fig. 2, what confuses me is the skip connection. Are the features from stage 4 transferred to the decoder?
• Please rate the clarity and organization of this paper

Excellent

• Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

There is no problem with reproducibility.

• Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

This paper proposes a boundary-aware transformer for skin lesion segmentation. To capture global features, the authors employ a powerful global attention mechanism to enhance feature extraction. In detail, an encoder network is first adopted, followed by a transformer architecture. Then, a prediction map is restored by a dedicated decoder. To tackle ambiguous lesion boundaries, the authors integrate a novel boundary-wise attention gate (BAG) into the transformer to enable the whole network to learn global and local features. Overall, this paper is good. My major concerns are as follows:

1. Why does the method use a CNN-based backbone to extract features rather than a pure transformer such as ViT? As stated in the abstract and introduction, capturing local information is important, yet sufficient local details are dropped by the strides and pooling operations in a CNN-based network. Why, then, does the proposed method keep a CNN-based backbone?
2. What is the position embedding in the proposed method, and how is it initialized?

accept (8)

• Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

This paper employs the recently popular transformer architecture for skin lesion segmentation. It is not a simple incremental contribution.

• What is the ranking of this paper in your review stack?

1

• Number of papers in your stack

5

• Reviewer confidence

Very confident

Review #3

• Please describe the contribution of the paper

This paper adopts the transformer network to solve the problem of skin lesion segmentation, and achieves the SOTA performance on two public datasets. To handle the ambiguous boundary of skin lesions, they propose a boundary-wise attention gate to emphasize the features of ambiguous boundary pixels.

• Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

1) Using transformers in skin lesion segmentation achieves good results. 2) A specific module is proposed to improve SETR to handle the ambiguous boundary of skin lesions. 3) The writing is clear and the experiments are convincing.

• Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

1) It is not clear how the predicted key-patch map $M_{pred}$ in Eq. 4 is obtained. 2) In Fig. 2, $L_{map}$ seems to compare the output of the encoder with the ground-truth key-patch map, which is not consistent with the definition in Eq. 4. 3) There is a typo, “MLA block”, in Section 2.2. It should be MLP or MSA block. 4) The new algorithm for finding ambiguous boundary pixels does not entirely make sense to me. Why not directly set boundary pixels with low gradient magnitude as ambiguous boundary pixels? 5) How large is the patch size in image sequentialization? 6) Since this paper focuses on the boundary, using the Hausdorff distance as an additional evaluation metric could make the results more convincing.
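
To make point 6 concrete, the Hausdorff distance between two segmentation masks can be sketched as below. This is a minimal, brute-force Python illustration under stated assumptions: masks are nested lists of 0/1, boundary pixels are foreground pixels with at least one 4-connected background or out-of-image neighbour, and distances are in pixel units. The helper names are hypothetical and not taken from the paper. Practical evaluations often use a faster implementation or a robust variant such as the 95th-percentile distance.

```python
import math


def boundary_pixels(mask):
    """Return (row, col) of foreground pixels touching background or the image edge."""
    height, width = len(mask), len(mask[0])
    points = []
    for r in range(height):
        for c in range(width):
            if not mask[r][c]:
                continue
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                outside = rr < 0 or rr >= height or cc < 0 or cc >= width
                if outside or not mask[rr][cc]:
                    points.append((r, c))
                    break
    return points


def directed_hausdorff(points_a, points_b):
    """Largest distance from a point in A to its nearest point in B."""
    return max(min(math.dist(p, q) for q in points_b) for p in points_a)


def hausdorff_distance(mask_a, mask_b):
    """Symmetric Hausdorff distance between the boundaries of two binary masks."""
    a = boundary_pixels(mask_a)
    b = boundary_pixels(mask_b)
    if not a or not b:
        raise ValueError("both masks must contain at least one foreground pixel")
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


if __name__ == "__main__":
    prediction = [[1, 0, 0, 0], [0, 0, 0, 0]]
    ground_truth = [[0, 0, 0, 1], [0, 0, 0, 0]]
    print(hausdorff_distance(prediction, ground_truth))
```

Unlike Dice, which averages over the whole region, this metric is driven by the single worst boundary mismatch, which is why it is a natural complement for a boundary-focused method.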

• Please rate the clarity and organization of this paper

Excellent

• Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

It may need the original code to reproduce the proposed method.

• Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

accept (8)

• Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

Despite a few minor problems, the paper is well written overall, the method is novel, and the results are good.

• What is the ranking of this paper in your review stack?

1

• Number of papers in your stack

3

• Reviewer confidence

Very confident

Primary Meta-Review

• Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

This paper proposes a novel boundary-aware transformer for skin lesion segmentation.

Strengths:

• The paper’s novelty lies in integrating a new boundary-wise attention gate into a transformer architecture to overcome some problems that are specific to medical images and to capture local details.
• A detailed comparison with SotA transformer-based methods and an ablation study are provided and show convincing results.
• Paper is well written.

Weaknesses:

• There are a few clarity issues and questions that can be addressed (please see each reviewer’s comments).
• The authors might want to cite a reference related to the SotA in skin lesion segmentation (as suggested by Rev #1).
• What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

2