Authors

Jiaqi Yang, Xiaoling Hu, Chao Chen, Chialing Tsai

Abstract

Structural accuracy of segmentation is important for fine-scale structures in biomedical images. We propose a novel Topological-Attention ConvLSTM Network (TACLNet) for 3D anisotropic image segmentation with high structural accuracy. We adopt ConvLSTM to leverage contextual information from adjacent slices while achieving high efficiency. We propose a Spatial Topological-Attention (STA) module to effectively transfer topologically critical information across slices. Furthermore, we propose an Iterative Topological-Attention (ITA) module that provides a more stable topologically critical map for segmentation. Quantitative and qualitative results show that our proposed method outperforms various baselines in terms of topology-aware evaluation metrics.

Link to paper

DOI: https://doi.org/10.1007/978-3-030-87193-2_21

SharedIt: https://rdcu.be/cyhLN

Link to the code repository

https://github.com/YangjiaqiDig/Topology-Attention-ConvLSTM

Link to the dataset(s)

ISBI12: http://brainiac2.mit.edu/isbi_challenge/home

ISBI13: http://brainiac2.mit.edu/SNEMI3D/home

CREMI: https://cremi.org/

Reviews

Review #1

Please describe the contribution of the paper

This paper develops a Spatial Topology-Attention (STA) module to process a 3D image as a stack of 2D image slices and adopts ConvLSTM to leverage the information from adjacent slices. In order to effectively transfer topology-critical information across slices, the authors propose an Iterative Topology-Attention (ITA) module that provides a more stable topology-critical map for segmentation. Experimental results show that the proposed method outperforms several baselines in terms of topology-aware evaluation metrics.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

– The paper extends the 2D topology-preserving methods [5,11,12] to 3D efficiently by using ConvLSTM and the proposed spatial topology-attention module. The major contribution is to incorporate the critical-point locations of adjacent slices into the current slice by designing a new attention module. This is a good extension of the topology-preserving methods to 3D.

– Because the critical points in the persistent homology method are unstable, the authors propose an iterative topology-attention module to increase training stability. Although simple, it seems effective.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

– The experimental evaluation can be more comprehensive. Authors did not evaluate on the popular ISBI13 dataset, and authors did not compare with the most recent SOTA method [12]. Authors also failed to report the segmentation accuracy metrics, such as dice score.

– The performance improvement on the ISBI12 dataset is very minor compared to the previous method TopoLoss [11]. E.g., the ARI is worse than TopoLoss, and the VOI and Betti error are quite comparable.

– It is not clear how fast the computation of persistent homology is. Is the whole workflow end-to-end trainable?
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The reproducibility depends on whether the authors share their code. The computation of persistent homology is not trivial.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

Please see the detailed comments in above sections.
Please state your overall opinion of the paper

Probably accept (7)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

Considering the reasonably good methodological contribution, but also the non-comprehensive experiments and the relatively minor performance improvements, I recommend probable accept.
What is the ranking of this paper in your review stack?

2
Number of papers in your stack

5
Reviewer confidence

Very confident

Review #2

Please describe the contribution of the paper

This paper presents a new strategy based on the known “ConvLSTM network” to improve the segmentation for 3D images. The contribution of this work is to incorporate context information for a more precise segmentation.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

a) A novel strategy to segment 3D images by combining existing methods based on deep learning and attention models, which helps avoid loss of information in complex images. Each part of the method was evaluated, showing the contribution of the complete pipeline.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

a) The method is not clear in the abstract.
b) Abbreviations are used without introduction.
c) Some sentences are not entirely true, e.g. “Deep learning methods have achieved human-level performance for image segmentation”.
d) Related work is not introduced in the text, only cited in the references.
e) The method section is hard to follow.
f) The results are promising.
g) Could the method be used to segment cardiac structures from 3D images? Heart segmentation is a challenge, and it would be nice to test the method on this complex task with recognized datasets from MICCAI challenges.
h) The computational time is not presented; reporting it would be interesting to validate the efficiency of the method.
Please rate the clarity and organization of this paper

Poor
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The work is validated on known datasets, and therefore this method could be reproduced to segment 3D cardiac structures.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

a) The method is not clear in the abstract.
b) Abbreviations are used without introduction.
c) Some sentences are not entirely true, e.g. “Deep learning methods have achieved human-level performance for image segmentation”.
d) Related work is not introduced in the text, only cited in the references.
e) The method section is hard to follow.
f) The results are promising.
g) Could the method be used to segment cardiac structures from 3D images? Heart segmentation is a challenge, and it would be nice to test the method on this complex task with recognized datasets from MICCAI challenges.
h) The computational time is not presented; reporting it would be interesting to validate the efficiency of the method.
Please state your overall opinion of the paper

borderline accept (6)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The paper presents good results when compared with other related works. However, the method is difficult to follow and there is no information on sensitivity to parameter changes.
What is the ranking of this paper in your review stack?

4
Number of papers in your stack

4
Reviewer confidence

Very confident

Review #3

Please describe the contribution of the paper

This paper proposes a Topology Attention Module based on convLSTM for 3D medical image segmentation. This module captures rich structural context and stabilizes the attention.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The motivation is valid. The inter-slice correlation is important for the segmentation
- The experiments demonstrate the effectiveness of the proposed method
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- Lack of comparison with the state-of-the-art.
- Lack of experiments on various datasets.
- The authors made several claims that are not supported by the experiments
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

As the authors are willing to release the code, the work is reproducible.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
- Lack of comparison with the state-of-the-art. There are many segmentation methods that are getting published every year. Why just compare with these baseline methods?
- Lack of experiments on various datasets. The authors claim that the method is “applicable to both isotropic and anisotropic images” (paragraph 3, Sec. 1). Which dataset is isotropic/anisotropic in the experiments? This claim is not supported by the experiments.
- The authors claim that the method is efficient (paragraph 3, Sec. 1), which is not supported by the experiments.
- According to the method section, only 3 slices are input to the network each time for the segmentation. I am not convinced that 3 slices are enough for capturing complex structural information.
- Notations are confusing. f is defined as a continuous-valued function, but is also used for the computation of attention.
- The Iterative Topology-Attention (Sec. 3.2) is ensembled with the attentions of previous epochs at training time. How is it computed at test time?
Please state your overall opinion of the paper

borderline reject (5)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

- Several unsupported claims
- Lack of comparison with the state of the art
- Lack of demonstration on various datasets
What is the ranking of this paper in your review stack?

3
Number of papers in your stack

8
Reviewer confidence

Very confident

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

This paper received mixed reviews. Most of the reviewers agreed on the novelty of this paper; however, they also agreed that the performance improvement is marginal and the presentation quality is poor. R4 is more critical, noting that the work lacks comparison with SOTA and needs to be validated on various datasets. In particular, R4 indicates that some claims of this paper are not supported in its current form, and the authors are encouraged to address this issue in the rebuttal. In addition, I feel the title of the paper is too broad, as only the EM imaging type was used in this work.
What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

2

Author Feedback

We sincerely thank the (meta-)reviewers for their constructive feedback, and we will improve our presentation accordingly. Below we address some major concerns. We start with R4’s concerns, which are more critical.

R4: Compared with SOTA? A: Despite the extensive literature on segmentation, only a few methods focus on structural accuracy. We did compare with SOTA structure-preserving segmentation methods (Mosin. [CVPR’18], TopoLoss [NeurIPS’19]) and showed that our method improves the topological accuracy, especially for anisotropic images. In principle, our cross-slice topological attention approach can naturally extend to all these methods and improve their structural accuracy.

R4: The claim “applicable to both isotropic and anisotropic images” is not supported by the experiments. A: An isotropic image is a special case of anisotropic image with less discrepancy across slices due to higher sampling rate in the z-dimension. For this reason, our method is naturally applicable to isotropic images as well. In addition, as slices are closer and their topologies are more consistent, our method should only work better on isotropic images. We understand the reviewers’ concern about the lack of experiment to explicitly support this claim. We will only emphasize the application on anisotropic images in our final version.

R1: Is it trained end to end? R4: The claim of efficiency is not supported by the experiments. R1, R3: How fast is persistent homology? A: The model is trained end-to-end. The computational efficiency benefit is relative to 3D topology-preserving methods (e.g., TopoLoss [11]). We compute topological information on a stack of 2D images rather than directly on a 3D image. This significantly reduces the computational expense. For the CREMI dataset, our method takes ~1.2 hours per epoch to train, whereas TopoLoss (3D version) takes ~2.8 hours per epoch.
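As a quick check of the timing figures quoted in this response, the per-epoch ratio can be worked out as below. The variable names are illustrative, and the numbers are the approximate values stated in the rebuttal, not new measurements.

```python
# Approximate per-epoch training times on CREMI, as quoted in the rebuttal.
taclnet_hours = 1.2       # proposed method (topology computed on 2D slices)
topoloss_3d_hours = 2.8   # TopoLoss, 3D version

# How many times faster one epoch is, and the fraction of time saved.
speedup = topoloss_3d_hours / taclnet_hours
time_saved_fraction = 1 - taclnet_hours / topoloss_3d_hours

print(f"speedup: {speedup:.2f}x")
print(f"time saved per epoch: {time_saved_fraction:.0%}")
```

With these approximate figures, one epoch is roughly 2.3 times faster, i.e. a little over half of the per-epoch training time is saved.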

R4: Lack of experiments on various datasets. R3: Apply to 3D cardiac structures? A: Topology-preserving segmentation methods have been proven useful in various datasets beyond EM images, e.g., cardiac images [5] and arteries [12]. Although we only demonstrated the efficacy of TACLNet on EM images, we are confident that our method will also generalize to these datasets, and we will apply our method to cardiac images in the future.

R4: 3 slices are not enough for capturing complex structural information. A: We’ll add an ablation study in the final version. It is possible that more neighboring slices bring more modeling benefits, while the computation time can also increase.
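To make the trade-off in this answer concrete, here is a minimal sketch of how a 3D stack could be split into windows of consecutive slices. It assumes a stride of 1 and no padding at the volume boundaries; the function name `slice_windows` is hypothetical, and the page does not state how the paper actually forms its inputs.

```python
def slice_windows(num_slices, window=3):
    """Return tuples of `window` consecutive slice indices along z.

    Assumption for illustration: stride 1, no boundary padding.
    """
    if window < 1 or window > num_slices:
        raise ValueError("window must be between 1 and num_slices")
    return [tuple(range(start, start + window))
            for start in range(num_slices - window + 1)]


# A 5-slice stack with the 3-slice window discussed by R4.
for indices in slice_windows(5, window=3):
    print(indices)
```

Under these assumptions, a larger window gives each forward pass more context but also more slices to process, which matches the computation-time concern raised in the answer.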

R4: How is Iterative Topology-Attention (ITA) computed during test time? A: Both ITA and STA are only used for the topological loss during training to improve the ConvLSTM model. They are not used at inference time.

R1: Compare with [12]. A: The novelty of our work comes from the ability to handle topological discrepancy across slices. [12, ICLR’21] was published at the same time as we submitted to MICCAI. It uses discrete Morse theory rather than persistent homology to design the topological loss. We believe it is easy to incorporate the attention module with this new loss to achieve even better performance. The attention mechanism is applicable to similar losses.

R1: Evaluate on ISBI13 dataset. A: We will include the ISBI13 results in the final version. We expect similar results as ISBI13 is very similar to ISBI12 and CREMI.

R1: Report dice score. A: The strength of our method is on structural accuracy (measured in terms of ARI, VOI and Betti Error), but we’ll report the DICE score for pixel accuracy in the final version for completeness.
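For reference, the Dice score requested by R1 is the standard overlap measure 2|P ∩ T| / (|P| + |T|) between a predicted mask P and a ground-truth mask T. The sketch below is a plain-Python illustration on flattened binary masks; the function name and the convention of returning 1.0 when both masks are empty are assumptions, not taken from the paper.

```python
def dice_score(pred, target):
    """Dice coefficient for two flat binary masks of equal length."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    # Count pixels that are foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground pixels in each mask.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2 * intersection / total


pred = [1, 1, 0, 0]
target = [1, 0, 0, 0]
print(dice_score(pred, target))
```

Because Dice only counts pixel overlap, two segmentations with similar Dice can still differ in connectivity, which is why the response treats it as a complement to ARI, VOI and Betti error rather than a replacement.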

R1: The improvement is very minor for ISBI12 dataset. A: ISBI12 is a well-studied dataset. The performance has been saturated. Even so, we still made improvements over [11]. And our method outperforms [11] and others by a significant margin for CREMI.

Meta: Title is too broad. A: We will change our title to: “A Topology-Attention ConvLSTM Network and Its Application to EM Images”.

Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

Most of the reviewers agreed on the novelty of this paper, but they also noted its poor presentation. In addition, most of the reviewers raised concerns about the experiments: no comparison with SOTA and not enough datasets, e.g., ISBI 12 and 13. Although the authors explained their technical novelty and the potential performance on different datasets in the rebuttal, it is still hard to assess the actual contribution to the community. I would like to suggest that the authors expand their dataset pool and compare with more SOTA methods in their future submission.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept
What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

10

Meta-review #2

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

Although I believe this work is interesting and novel in the way it addresses topology preservation in image segmentation, I was not convinced by the authors’ rebuttal. In particular, I was not convinced by the answers on the scope of the experiments (which is restricted), on the comparison to SoTA, or on some hypothetical experiments that may (or may not) confirm the authors’ hypothesis.

In my opinion, the paper is not ready for MICCAI; hence I recommend rejection for this paper.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Reject
What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

16

Meta-review #3

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

The rebuttal has addressed the critiques from the reviewers. This is an interesting method that incorporates 2D-based topology attention into 3D segmentation.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept
What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

7

A Topological-Attention ConvLSTM Network and Its Application to EM Images