Authors

Mingyan Qiu, Chenxi Zhang, Zhijian Song

Abstract

Accurate and automatic segmentation of the prostate sub-regions is of great importance for the diagnosis of prostate cancer and quantitative analysis of prostate. By analyzing the characteristics of prostate images, we propose a hybrid attention ensemble framework (HAEF) to automatically segment the central gland (CG) and peripheral zone (PZ) of the prostate from a 3D MR image. The proposed attention bridge module (ABM) in the HAEF helps the Unet to be more robust for cases with large differences in foreground size. In order to deal with low segmentation accuracy of the PZ caused by small proportion of PZ to CG, we gradually increase the proportion of voxels in the region of interest (ROI) in the image through a multi-stage cropping and then introduce self-attention mechanisms in the channel and spatial domain to enhance the multi-level semantic features of the target. Finally, post-processing methods such as ensemble and classification are used to refine the segmentation results. Extensive experiments on the dataset from NCI-ISBI 2013 Challenge demonstrate that the proposed framework can automatically and accurately segment the prostate sub-regions, with a mean DSC of 0.881 for CG and 0.821 for PZ, the 95% HDE of 3.57 mm for CG and 3.72 mm for PZ, and the ASSD of 1.08 mm for CG and 0.96 mm for PZ, and outperforms the state-of-the-art methods in terms of DSC for PZ and average DSC of CG and PZ.
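For readers unfamiliar with the headline metric, the Dice similarity coefficient (DSC) quoted in the abstract can be sketched for a single label as below. This is a minimal illustration, not the authors' evaluation code: the function name `dice`, the flat label lists, and the convention of returning 1.0 when a label is absent from both masks are all assumptions made here.

```python
def dice(pred, truth, label):
    """Dice similarity coefficient for one label over flat label sequences.

    DSC = 2 * |P ∩ T| / (|P| + |T|), where P and T are the sets of voxels
    assigned `label` in the prediction and the ground truth.
    """
    p = [v == label for v in pred]
    t = [v == label for v in truth]
    intersection = sum(1 for a, b in zip(p, t) if a and b)
    denominator = sum(p) + sum(t)
    # Assumed convention: a label absent from both masks counts as a perfect match.
    if denominator == 0:
        return 1.0
    return 2.0 * intersection / denominator


# Toy example with labels 0 (background), 1 (PZ), 2 (CG); values are illustrative only.
prediction = [0, 1, 1, 2]
ground_truth = [0, 1, 2, 2]
pz_dsc = dice(prediction, ground_truth, label=1)
cg_dsc = dice(prediction, ground_truth, label=2)
```

In practice the masks would be 3D volumes flattened to one dimension, and the score would be computed per case before averaging.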

Link to paper

DOI: https://doi.org/10.1007/978-3-030-87193-2_51

SharedIt: https://rdcu.be/cyhMu

Link to the code repository

N/A

Link to the dataset(s)

N/A

Reviews

Review #1

Please describe the contribution of the paper

The main contribution of this paper is an approach to handle organ structures of highly varying sizes in an image for producing reasonably accurate segmentations. The results show improvements over existing state of the art.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The paper is fairly well written, albeit the method and presentation of results can be improved. The paper performs rigorous analysis with ablation tests and comparisons to existing methods.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

While the idea of attention gating, specifically the dot-product gating proposed here, is interesting, it's not clear how it really works. From the method descriptions and equations, it looks like it's computing a local self-similarity and then a dot product of this local similarity with a feature embedding. How this translates to attention gating should be explained better. The purpose of and need for feature aggregation from multiple feature levels should also be explained better and demonstrated with ablation results.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

OK.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

The main problem with the paper is the lack of a clear explanation of the attention method. One suggestion is to provide a very brief background on the closely related dot-product-based attention gating method by Oktay and to explain, both in words and with formulas, how the proposed method differs. From my reading, it does not look like anything more than a repeated dot product of the features and their eigenvalues. How does this improve on the existing work, both qualitatively (show the attention features) and theoretically (show the formulation for contrast)? Without a proper explanation, it's not clear what this method really does.

Also, isn’t computing the eigendecomposition computationally expensive? How much more computationally intensive is this method compared to non-local self-attention or Oktay’s method?
Please state your overall opinion of the paper

borderline reject (5)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The paper potentially has some strengths, especially given the improved accuracy over the existing methods. However, the method and formulation of the self-attention are unclear, which makes it difficult to evaluate the methodological soundness of the proposed method.
What is the ranking of this paper in your review stack?

4
Number of papers in your stack

5
Reviewer confidence

Very confident

Review #2

Please describe the contribution of the paper

This work presents a hybrid attention ensemble framework (HAEF) to segment the central gland (CG) and peripheral zone (PZ) of the prostate from a 3D MR image. HAEF consists of a prostate location network (PL-Net), attention bridge network (AB-Net), targeted segmentation network (TS-Net), ensemble and classification. Experimental results show that the proposed network outperforms state-of-the-art methods.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1. This work presents a hybrid attention ensemble framework to segment CG and PZ from MRI images.
2. The developed network uses a PL-Net to locate the prostate, an AB-Net to obtain a zonal segmentation result, and a T-Net to locate CG and PZ, and then integrates these segmentation results with a classification result to improve the segmentation accuracy.
3. An attention mechanism is incorporated into AB-Net to fuse features at different layers.
4. Experimental results on ISBI 2013 challenge verify the effectiveness of the developed network.
5. Ablation study experiments are conducted to validate the effectiveness of major modules.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
1. This work suffers from limited technical novelty. The main novel part is generating attention maps from the fusion of multiple encoder features to weight the features at each encoder layer of the AB-Net. The other components (PL-Net, T-Net, and the classification network) are U-Net and DenseNet.
2. In Section 2.4, how are the output of AB-Net and the outputs of T-Net voted on to generate the final result?
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

I think the readers can implement the developed network, and a released code and their results will help a lot.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
1. All equations should end with ‘,’ or ‘.’ as appropriate.
2. The whole pipeline in Fig. 2 contains multiple steps. It would be better to number the steps in Fig. 2 in order.
Please state your overall opinion of the paper

borderline accept (6)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

This work presents a hybrid attention ensemble framework (HAEF) for segmenting CG and PZ from MR images in multiple steps. Experimental results show the effectiveness of the developed network and its major components. However, the novel part is the use of an attention mechanism to fuse features at different CNN layers, and such an idea has been discussed in previous segmentation work [Ref-1]. [Ref-1] Deep Attentional Features for Prostate Segmentation in Ultrasound, MICCAI, 2018.
What is the ranking of this paper in your review stack?

2
Number of papers in your stack

5
Reviewer confidence

Very confident

Review #3

Please describe the contribution of the paper

The current study proposed a hybrid attention ensemble framework (HAEF) to automatically segment the central gland (CG) and peripheral zone (PZ) of the prostate from 3D MR images, which contains multiple stages to optimize the segmentation results. The proposed attention bridge module (ABM) in the HAEF helps the U-Net to be more robust for cases with large differences in foreground size. A classical U-Net and a U-Net integrated with the ABM (AB-Net) were used in the first two stages for coarse segmentation. The cropped sub-ROIs were sent to T-Nets, which were integrated with channel and spatial attention, for fine segmentation, and the results of AB-Net and the T-Nets were fused by voting to give the final segmentation results. Classification results from DenseNet169 were used to remove mis-segmentations. Extensive experiments on the dataset from the NCI-ISBI 2013 Challenge demonstrate that the proposed framework can automatically and accurately segment the prostate sub-regions.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The current study proposed a hybrid attention ensemble framework (HAEF) to automatically segment the central gland (CG) and peripheral zone (PZ) of the prostate from 3D MR images, which contained multi stages to optimize the segmentation results. To address the difference in the voxel proportion of the whole prostate among different samples, an attention bridge module (ABM) was proposed to fill the semantic gap between the shallow, fine-grained encoder features and the deep, coarse-grained decoder features. In order to deal with low segmentation accuracy of the PZ caused by small proportion of PZ to CG, multi-stage segmentation mode was used to increase the proportion of voxels in the ROI. T-Net integrated with attention mechanism in the channel and spatial domain was proposed to enhance the multi-level semantic features of the target for fine segmentation. Moreover, post-processing methods including ensemble and classification were used to refine the segmentation results.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
1. The coarse segmentation accuracy of PL-Net and AB-Net is a prerequisite for the fine segmentation in the later stages. The segmentation accuracy of PL-Net and AB-Net should be reported to show that the coarse segmentation is accurate enough for those stages.
2. In equation (1), what are the d and l in e_d^l? In my understanding, this should be e_l.
3. The diagram of the ABM in Fig. 2 is not clear. How d_l is calculated and the role of d_l in the calculation of S should be explained clearly.
4. The W in equations (3) and (4) was not explained clearly.
5. Diagrams of AB-Net and T-Net should be provided.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The code of this work was not provided, while the dataset used for training and testing is public. Reproducibility is therefore average.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
The current study proposed a hybrid attention ensemble framework (HAEF) to automatically segment the central gland (CG) and peripheral zone (PZ) of the prostate from 3D MR images, which contains multiple stages to optimize the segmentation results. The attention bridge module (ABM) and the channel and spatial attention mechanisms were used to fill the semantic gap between the shallow, fine-grained encoder features and the deep, coarse-grained decoder features and to enhance the multi-level semantic features of the target. Post-processing methods including ensemble and classification were used to refine the segmentation results. Extensive experiments on the dataset from the NCI-ISBI 2013 Challenge demonstrate that the proposed framework can automatically and accurately segment the prostate sub-regions. Comments:
1. The coarse segmentation accuracy of PL-Net and AB-Net is a prerequisite for the fine segmentation in the later stages. The segmentation accuracy of PL-Net and AB-Net should be reported to show that the coarse segmentation is accurate enough for those stages.
2. In equation (1), what are the d and l in e_d^l? In my understanding, this should be e_l.
3. The diagram of the ABM in Fig. 2 is not clear. How d_l is calculated and the role of d_l in the calculation of S should be explained clearly.
4. The W in equations (3) and (4) was not explained clearly.
5. Diagrams of AB-Net and T-Net should be provided.
Please state your overall opinion of the paper

Probably accept (7)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The current study proposed a hybrid attention ensemble framework (HAEF) to automatically segment the central gland (CG) and peripheral zone (PZ) of the prostate from 3D MR images, which contains multiple stages to optimize the segmentation results. The proposed method is innovative and described clearly. I therefore suggest accepting it after revision.
What is the ranking of this paper in your review stack?

3
Number of papers in your stack

5
Reviewer confidence

Confident but not absolutely certain

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

This work is well organised and presents clear methodology and results. In the comparison studies, only the averages were compared, without statistical testing. With the limited data and high variance, the improvement due to the proposed method is not clearly established. I would recommend acceptance of this work if the benefit of the proposed HAEF is clearly demonstrated, rather than leaderboard rankings. Therefore, I would invite the authors to add this information, in addition to addressing the comments from all three reviewers.
What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

5

Author Feedback

Q1. It’s not clear how the attention method (our ABM) works (R1 Q4&7). The way d_l is calculated and the role of d_l should be explained (R3 Q4.3&7.3). A1. Following the reviewers’ suggestions, we compare the principles of AG and our ABM to illustrate how the ABM works. The AG (Oktay’s) is formulated as AG = Sigmoid(S(e_l, d_(l+1))), where e_l denotes the encoder features of the current layer l and d_(l+1) denotes the decoder features of layer l+1. AG uses only one method, concatenation, to calculate the similarity score S. There are three main differences between our ABM and AG. First, since the proportion of the prostate varies greatly among samples, it is important to generate well-organized features that contain abundant semantic and fine information. To maximize the use of information in each encoder layer, we fuse the cross-layer features to obtain richer spatial detail. Our ABM is formulated as ABM = Sigmoid(S(c_l, d_l)), where c_l is the contextual feature obtained by multi-scale feature fusion of the encoder. Second, the ABM is inspired by dot-product attention (Luong et al.) and uses the decoder features of the current layer, d_l, as the gating signal for information filtering, instead of the features of the coarser-grained layer (d_(l+1)) used in the AG module. The d_l acts similarly to the gating mechanism in an RNN, which filters information and better captures long-range dependencies. Third, we consider four alternatives for calculating similarity scores (see Eq. 3 for details), and ‘general’ was selected as the best method experimentally. We compare the performance of AG and our ABM on prostate zonal segmentation. Experimental results show that our Unet + ABM outperforms Unet + AG in terms of DSC for PZ (0.782 vs 0.757) and CG (0.875 vs 0.860).
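To make the ABM formulation in A1 more concrete, the gate ABM = Sigmoid(S(c_l, d_l)) can be sketched for a single voxel as below. The paper's Eq. 3 is not reproduced on this page, so the score here follows the 'general' form from Luong et al., S(c, d) = d^T W c. The function names, the use of one scalar gate per voxel, the multiplication of the gate back onto c_l, and the toy weights are all assumptions for illustration and may differ from the authors' implementation.

```python
import math


def general_score(c, d, W):
    """Luong-style 'general' score d^T W c for feature vectors c and d."""
    Wc = [sum(W[i][j] * c[j] for j in range(len(c))) for i in range(len(d))]
    return sum(d[i] * Wc[i] for i in range(len(d)))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def abm_gate_voxel(c_l, d_l, W):
    """Gate the fused encoder feature c_l with the decoder feature d_l.

    Assumption: a single scalar gate per voxel rescales every channel of c_l.
    """
    gate = sigmoid(general_score(c_l, d_l, W))
    return [gate * v for v in c_l]


# Toy two-channel example; W is a hypothetical learned matrix (identity here).
W = [[1.0, 0.0], [0.0, 1.0]]
c_l = [1.0, 0.0]   # fused multi-scale encoder feature at one voxel
d_l = [0.0, 1.0]   # decoder feature of the same layer at the same voxel
gated = abm_gate_voxel(c_l, d_l, W)  # score is 0, so the gate is 0.5
```

Under this reading, the difference from AG described in A1 lies in the inputs to S (c_l and d_l rather than e_l and d_(l+1)) and in the choice of score function, not in the sigmoid gating itself.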

Q2. How much more computationally intensive is this method compared to Oktay’s method? (R1 Q7). A2. The number of parameters of our method is 6.85 M, which is slightly higher than 5.94 M of Oktay’s method.

Q3. The reproducibility of the paper. (R2&3 Q6). A3. We’d like to release the code after the paper is accepted.

Q4. The novel part is to use attention mechanism to fuse features at different CNN layers. And such idea has been discussed in previous segmentation works [Ref-1]. (R2 Q4.1&9) [Ref-1] Deep Attentional Features for Prostate Segmentation in Ultrasound, MICCAI, 2018. A4. Our idea is not the same as [Ref-1]. [Ref-1] leverages the multi-layer encoder features to refine the features at each individual encoder layer, while our method filters the multi-layer encoder features using a gating signal from the decoder features d_l, with the ultimate goal of obtaining a finer segmentation result. Another contribution of this paper is a complete attention-based segmentation framework, which can be used to address common problems in automatic zonal segmentation, such as organ structures whose size varies greatly across images, the large difference in size between PZ and CG, and mis-segmentation at the apex and base of the prostate.

Q5. How to vote the output of AB-Net and the outputs of T-Net to generate the final result? (R2 Q4.2). A5. Voxels can be classified as 0 (background) / 1 (PZ) / 2 (CG) in ABNet, 0/1 in TNet-PZ and 0/2 in TNet-CG. The category of each voxel is determined by the majority vote among the three outputs for that voxel. E.g., if the results for a voxel from ABNet, TNet-PZ and TNet-CG are (1, 1, 2), the voting result for that voxel is 1, i.e. PZ. If there is a tie, then, based on the results in Table 3, the tie-break rule gives priority to the result of TNet-PZ, followed by ABNet (e.g. for (0, 1, 2), the final result is 1). For the case of (0, 1, 0), the TNet-CG result does not determine the PZ category and is ignored. The other two results (0, 1) are compared, which is a tie, and the final result is 1. For the case of (0, 0, 2), the result of TNet-PZ is ignored and the voting result is 0.
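The voting rule in A5 can be sketched per voxel as follows. The rebuttal does not enumerate every combination of outputs, so this sketch adopts one reading that reproduces the four worked cases above: a 0 from either T-Net is treated as an abstention rather than a vote for background, and ties are broken in favour of TNet-PZ first and ABNet second. The function name `vote_voxel` and the abstention reading are assumptions, not the authors' code.

```python
from collections import Counter

BACKGROUND, PZ, CG = 0, 1, 2


def vote_voxel(abnet, tnet_pz, tnet_cg):
    """Fuse the three per-voxel labels into one final label.

    abnet   : 0, 1 or 2 from ABNet
    tnet_pz : 0 or 1 from TNet-PZ
    tnet_cg : 0 or 2 from TNet-CG
    """
    votes = [abnet]
    # Assumption: a T-Net only votes when it predicts its own target class.
    if tnet_pz == PZ:
        votes.append(PZ)
    if tnet_cg == CG:
        votes.append(CG)

    counts = Counter(votes)
    top = max(counts.values())
    winners = {label for label, n in counts.items() if n == top}
    if len(winners) == 1:
        return winners.pop()

    # Tie: TNet-PZ has priority if it voted, otherwise fall back to ABNet.
    if tnet_pz == PZ:
        return PZ
    return abnet


# The four cases discussed in A5, as (ABNet, TNet-PZ, TNet-CG):
results = [vote_voxel(*case) for case in [(1, 1, 2), (0, 1, 2), (0, 1, 0), (0, 0, 2)]]
# results == [1, 1, 1, 0]
```

Applied to whole volumes, the same function would be evaluated at every voxel of the three aligned label maps.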

Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

The supporting reviewers focused on the technical details of the existing methodology, and the authors’ rebuttal provided a good explanation. Some points are still based on intuition and vague motivation. For example, in “well-organized features consisting of abundant semantic and fine information is important”, it is not clear what “well-organized” means or what “abundant semantic and fine information” is. More importantly, the rebuttal did not address the statistical concerns raised by the AC. Most conclusions were drawn from comparing only the expected values (means), with subjective concluding comments, e.g. “only slightly higher”. Variances are reported, and they are high in Tables 1-4, but no statistical test results were reported. Therefore, it is not clear whether such differences in means were caused by overfitting the entire small dataset with only 10 test cases, by chance, or by model performance. I worry that the results may be misleading in their current presentation.
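As an illustration of the kind of test the meta-reviewer asks for, a paired comparison of per-case DSC values between two methods on the same test cases could be sketched as below. This uses an exact two-sided sign-flip permutation test, which is feasible for 10 cases (2^10 = 1024 sign patterns). The choice of test, the function name, and the per-case numbers are all assumptions for illustration; the numbers are placeholders and are not results from the paper.

```python
from itertools import product


def paired_sign_flip_test(scores_a, scores_b):
    """Exact two-sided sign-flip permutation test on paired differences.

    Under the null hypothesis that the two methods are exchangeable on each
    case, every sign pattern of the paired differences is equally likely.
    Returns the fraction of patterns whose |sum| is at least the observed one.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs))
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        stat = abs(sum(s * d for s, d in zip(signs, diffs)))
        if stat >= observed - 1e-12:  # small tolerance for floating-point ties
            extreme += 1
        total += 1
    return extreme / total


# Placeholder per-case DSC values for 10 test cases (NOT taken from the paper).
method_a = [0.80, 0.75, 0.83, 0.79, 0.70, 0.85, 0.78, 0.81, 0.74, 0.77]
method_b = [0.78, 0.76, 0.80, 0.75, 0.71, 0.82, 0.77, 0.79, 0.70, 0.76]
p_value = paired_sign_flip_test(method_a, method_b)
```

With only 10 paired cases the smallest attainable two-sided p-value is 2/1024, so such a test has limited power, which is part of the concern raised above.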
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Reject
What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

15

Meta-review #2

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

The authors have addressed/clarified most major concerns raised by the reviewers. The AC would like to recommend “Accept” of this paper.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept
What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

10

Meta-review #3

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

The authors’ rebuttal effectively addressed the major concerns and comments. Their answers to Q4 and Q5 should be added to the revised version for clarification and reproducibility, respectively.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept
What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

9

A hybrid attention ensemble framework for zonal prostate segmentation