
Authors

Xiaoyu Liu, Yueyi Zhang, Zhiwei Xiong, Chang Chen, Wei Huang, Xuejin Chen, Feng Wu

Abstract

The pipeline of connectomics usually divides the large-scale electron microscopy volumes into multiple 3D blocks and segments them independently. The segmentation results in adjacent blocks demand subtle merging so that corresponding neurons can be correctly stitched. In this paper, we propose the first deep learning based neuron stitching method for connectomics. Specifically, we densely slide a 3D window along the shared face of two adjacent blocks to generate the training and testing input. A classifier based on a 3D convolutional neural network is utilized to identify whether two instance objects from adjacent blocks should be merged. The stitching label is obtained from the in-block segmentation of dedicated blocks. Experimental results on isotropic and anisotropic datasets demonstrate that our stitching method outperforms state-of-the-art methods.
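
As a rough illustration of the sliding-window step described in the abstract, the sketch below moves a 2D footprint over the shared face of two label volumes and records which pairs of segments touch inside each window. The function name `candidate_pairs`, the window and stride sizes, the choice of the X axis as the stitching axis, and the use of 0 as background are all assumptions made for this example; the paper's actual window is 3D and its exact sampling scheme is not given on this page.

```python
import numpy as np

def candidate_pairs(seg_a, seg_b, window=(8, 8), stride=(4, 4)):
    """Collect label pairs that touch across the shared face, per window.

    seg_a, seg_b: integer label volumes of shape (Z, Y, X). The last X-slice
    of seg_a is assumed to be adjacent to the first X-slice of seg_b.
    Returns {(label_a, label_b): [(z0, y0), ...]} with window origins.
    """
    face_a = seg_a[:, :, -1]
    face_b = seg_b[:, :, 0]
    if face_a.shape != face_b.shape:
        raise ValueError("blocks must share a face of identical shape")
    size_z, size_y = face_a.shape
    pairs = {}
    for z0 in range(0, max(size_z - window[0], 0) + 1, stride[0]):
        for y0 in range(0, max(size_y - window[1], 0) + 1, stride[1]):
            a = face_a[z0:z0 + window[0], y0:y0 + window[1]].ravel()
            b = face_b[z0:z0 + window[0], y0:y0 + window[1]].ravel()
            keep = (a > 0) & (b > 0)  # ignore background on either side
            for pair in set(zip(a[keep].tolist(), b[keep].tolist())):
                pairs.setdefault(pair, []).append((z0, y0))
    return pairs
```

Each returned pair would then be cropped into a 3D patch around its window origin and passed to the classifier.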

Link to paper

DOI: https://doi.org/10.1007/978-3-030-87237-3_42

SharedIt: https://rdcu.be/cymay

Link to the code repository

N/A

Link to the dataset(s)

N/A

Reviews

Review #1

Please describe the contribution of the paper

The authors propose a deep learning based approach for stitching the segmentation results of neurons in electron microscopy volumes. In particular, given an existing segmentation method, they simulate a sliding window in the 3D results to generate training blocks for a convolutional network that decides if two overlapping instances need to be merged. Their approach is evaluated in three publicly available EM datasets.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

While classic approaches are usually based on clustering an over-segmented result using regional adjacency graphs and pre-defined local features, the proposed method uses a CNN to learn those features and converts the problem into a binary classification problem.

The label samples used to train the CNN are automatically extracted from the real segmentation outputs of the method whose patches need to be agglomerated.

The experimental results on three public EM datasets used in connectomics show improvements with respect to two previous approaches: one overlap-based (Rhoana) and one graph-based (Gala, using 33 and 61 local features).
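
The automatic label extraction noted among the strengths above can be sketched as follows. The abstract only states that the stitching label comes from the in-block segmentation of dedicated blocks; the majority-overlap rule used here, and the names `majority_label` and `stitching_label`, are assumptions for illustration and not necessarily the authors' procedure.

```python
import numpy as np

def majority_label(mask, ref_seg):
    """Most frequent non-zero label of ref_seg inside a boolean mask (0 if none)."""
    vals = ref_seg[mask]
    vals = vals[vals > 0]
    if vals.size == 0:
        return 0
    labels, counts = np.unique(vals, return_counts=True)
    return int(labels[np.argmax(counts)])

def stitching_label(mask_a, mask_b, ref_seg):
    """Return 1 if both segments map to the same segment of the dedicated block.

    mask_a, mask_b: boolean masks of the two candidate segments, expressed in
    the coordinate frame of a dedicated block that straddles the shared face.
    ref_seg: in-block segmentation of that dedicated block.
    """
    label_a = majority_label(mask_a, ref_seg)
    label_b = majority_label(mask_b, ref_seg)
    return int(label_a != 0 and label_a == label_b)
```

Because the reference comes from the same segmentation method rather than from manual annotation, any errors of that method inside the dedicated block would carry over into the labels.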
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

The ablation test is a bit limited. The use of class activation maps is mentioned there to visualize the regions that are important for the CNN decisions, although it is not really part of an ablation. Therefore, the only experiment in that direction is related to the depth of the proposed CNN.

The execution times in Table 1 show only inference time for the proposed method, while those of the Gala-based approaches also include the calculation of the pre-defined features. The training time of both approaches should also be mentioned.

The proposed approach is limited to segmentation methods that produce boundary maps.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

No information is provided about the availability of the code or experimental data.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

As mentioned before, the training times should probably be mentioned for all methods for a fair comparison.

Regarding the ablation study, it would be very interesting to see the impact of more than the depth of the network. For example, one may think of using other inputs or restricting the existing ones, as well as studying the impact of the block size in both the inference speed and the final result.

The training details provided are really limited. I would recommend following a checklist similar to the one provided by Dodge et al. (https://arxiv.org/abs/1909.03004) to facilitate the reproducibility and replicability of the proposed work.

The supplementary document is not mentioned in the main paper.
Please state your overall opinion of the paper

Probably accept (7)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The paper describes an interesting new approach using CNNs for stitching neuron segmentation results from adjacent volumes and compares it with classic existing approaches on public datasets. A deeper analysis of the proposed methodology would enrich the work further and clarify the impact of some components of the proposed framework.
What is the ranking of this paper in your review stack?

2
Number of papers in your stack

5
Reviewer confidence

Confident but not absolutely certain

Review #2

Please describe the contribution of the paper

The paper introduces a novel method for stitching segmented blocks for large-scale instance segmentation in 3D. The application is in the segmentation of neurons in EM volumes, where stitching is particularly challenging as the processes are small and alignment of slices is often imperfect.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The paper addresses an important problem in a conceptually simple, but novel, way. Overlap-based stitching is clearly not adequate, but RAG-based approaches are tricky and difficult to tune. It’s great to see a new contribution to this problem which will hopefully open the door to other learning-based approaches.
- The method is sufficiently well validated, on both isotropic and anisotropic data
- The performance improvement is impressive
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

Can’t point out anything major
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The method is described in detail and should be reusable if the code is released. It would be very important to specify which part of the FAFB dataset is used for evaluation as it would allow direct comparison for future algorithms. I can’t see from the reproducibility statement if this will be the case.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

The paper will be easier to understand at first glance if the figure captions are more self-contained. Currently, the figures are not understandable without reading the text.
Please state your overall opinion of the paper

strong accept (9)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The paper proposes a novel approach to solve an important problem. The approach is simple enough to be usable in practice and, provided the code is released, will likely become an impactful contribution to the field of large-scale instance segmentation.
What is the ranking of this paper in your review stack?

2
Number of papers in your stack

5
Reviewer confidence

Very confident

Review #3

Please describe the contribution of the paper

The paper proposes a deep learning pipeline to stitch the EM segmentation results in small blocks for the whole volume. Near the block intersecting face, the proposed method extracts four channels of features to feed into a customized neural network model and predicts if the two segments of interest should merge or not. On three popular datasets, the proposed method achieves good results.
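
To make the four-channel input mentioned above concrete, here is a minimal sketch of how such a tensor might be assembled. The channel order (raw image, boundary map, mask of the first segment, mask of the second segment) is an assumption inferred from the input types named in the reviews and rebuttal (image, boundary, binary mask); the function name `build_input` is hypothetical.

```python
import numpy as np

def build_input(raw, boundary, seg, label_a, label_b):
    """Stack a 4-channel patch of shape (4, Z, Y, X) for a candidate pair.

    raw, boundary: image and boundary-map crops spanning both blocks.
    seg: label crop spanning both blocks, with labels unique across blocks.
    label_a, label_b: the two segments whose merge is being decided.
    """
    mask_a = (seg == label_a).astype(np.float32)
    mask_b = (seg == label_b).astype(np.float32)
    channels = [raw.astype(np.float32), boundary.astype(np.float32), mask_a, mask_b]
    return np.stack(channels, axis=0)
```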
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The paper proposes an end-to-end solution for the segmentation stitching problem, which is important for large-scale segmentation.
- The proposed method achieves good results.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- Limited novelty. (1) Converting the stitching problem into a binary classification problem is not new: the block stitching problem can be seen as a special case of the error correction problem starting from over-segmented results. In the error correction field, the problem has already been converted into a binary classification one [B,C] and further integrated into a graph for global optimization [D]. (2) Converting GT segmentation into labels for binary classification is straightforward and standard.
- Missing related works: For agglomeration methods, the paper misses [A]; for error correction, the paper misses a line of works [B,C,D].
- Lack of justification for the method design. The paper puts forward the design without much justification from ablation studies. (a) For the model, the paper proposes the BasicBlock module, and it is unclear whether it is better than the standard ResBlock. It’s good to have new designs, but they need to be backed up with ablation studies. Otherwise, it is hard to understand the importance of the proposed module, e.g. whether being “adjustable in choosing stride number” is helpful. (b) For the input features, it would be good to have an ablation study evaluating the importance of each type of channel (image, boundary, binary seg).
- Lack of strong baseline methods. For comparison, the graph-based methods are non-deep-learning. For the overlap-based method, the paper uses a non-state-of-the-art method [7]. The state-of-the-art segmentation method [2] (used for in-block segmentation) provides results on two of the datasets, where it uses a simple affinity-based region graph merging method. The paper’s results would be more convincing with a head-to-head comparison against it. Currently, it is unclear if the proposed stitching method is useful for [2].
[A] High-precision automated reconstruction of neurons with flood-filling networks. Januszewski et al., 2018
[B] An Error Detection and Correction Framework for Connectomics. Jung et al., 2017
[C] Guided Proofreading of Automatic Segmentations for Connectomics. Haehn et al., 2018
[D] Biologically-Constrained Graphs for Global Connectomics Reconstruction. Matejek et al., 2019
Please rate the clarity and organization of this paper

Satisfactory
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
- The code is not submitted.
- The paper provides enough details for the model architecture for reproduction.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
- It’ll be good to put the paper in a better context, e.g. error correction and compare with relevant baseline methods.
- For the method design, it’ll be good to use or compare with standard building blocks.
- For the experimental results, it’ll be good to compare with strong baselines, e.g. [2]. Also, it’ll be good to provide more experimental details. For example, the CREMI dataset has 6 volumes and it’s unclear which volumes the method is trained on. Good performance under the standard benchmark settings, e.g. the CREMI leaderboard or the results in [2], would be convincing. Otherwise, it is unclear if the proposed method is indeed useful for the community.
Please state your overall opinion of the paper

reject (3)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The paper misses the related work and comparison in the segmentation error-correction field, which makes the claimed novelty unconvincing. Further, the experimental results are unconvincing due to the lack of comparison with strong baselines, e.g. [2], and the non-standard experimental settings on the popular datasets. Also, the proposed method design is not well justified with ablation studies.
What is the ranking of this paper in your review stack?

3
Number of papers in your stack

4
Reviewer confidence

Very confident

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

This paper proposes a method to stitch segmentation results together for large scale 3D connectome images. The proposed method treats the stitching problem as a binary classification problem for each pair of adjacent patches.
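
Once every candidate pair has a merge score from the binary classifier, the pairwise decisions still have to be turned into a consistent relabeling. One common way to do this is a union-find pass, sketched below; the threshold of 0.5, the `(block_id, label)` key format, and the function name `merge_labels` are assumptions for illustration, and this page does not say how the authors apply their decisions.

```python
def merge_labels(pair_scores, threshold=0.5):
    """Map each segment to a representative after applying merge decisions.

    pair_scores: {(seg_a, seg_b): score}, where each segment is a hashable,
    orderable key such as (block_id, label), unique across blocks.
    Returns {segment: representative_segment}.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for (a, b), score in pair_scores.items():
        root_a, root_b = find(a), find(b)
        if score >= threshold and root_a != root_b:
            # Attach the larger root to the smaller one for a stable result.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    return {x: find(x) for x in parent}
```

Merges are transitive under this scheme, so a single false positive can join two otherwise separate neurons; that is one reason the merge/split balance of the classifier matters.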

This paper received diverging scores. While R1 and R2 considered the proposed approach novel and well-motivated, R3 raised the following concerns which should be addressed carefully during rebuttal.

1) The proposed method shares similarity with existing error-correction methods. The novelty needs to be clarified.
2) The ablation study is not sufficient to justify different design choices (also raised by R1).
3) Strong baselines are missing.

Note that the authors may not submit additional experimental results in the rebuttal. The rebuttal is meant only for clarification.
What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

5

Author Feedback

We thank all reviewers for their comments, especially the two who gave very positive comments (e.g., important problem, novel idea, and promising results) on our paper. We also appreciate the opportunity to further address their concerns. Overall, we believe the following issues raised by R3 can be clarified:

1) The proposed method shares similarity with existing error-correction methods. The novelty needs to be clarified.

Reply: Although the proposed method for neuron stitching and the existing error-correction methods [B, C, D mentioned by R3] share similarities in terms of segmentation error correction, they differ in some key aspects.

i) Different tasks. The neuron stitching task aims to stitch the neurons between adjacent segmented blocks, which is very useful in connectomics. It is different from the error-correction task (Ref. [B, C, D]), which corrects in-block errors caused by the segmentation algorithm itself. For example, the “boundary shift” described in Fig. 1 is pervasive in the neuron stitching task, but seldom occurs in the error-correction task. We are the first to solve the neuron stitching task via an end-to-end deep neural network.

ii) Different solutions. Ref. [B] proposes a two-stage method that utilizes two UNet++-like networks to detect and correct the segmentation errors. Ref. [C] develops two classifiers to detect potentially erroneous 2D regions. Ref. [D] introduces biological priors and time-consuming global optimization to refine the segmentation results. In our work, we propose a classifier to judge whether two 3D neurons should be merged between adjacent segmented blocks. The input of our classifier differs from that of Ref. [C, D], and its output can be used directly to decide whether to merge. In addition, we develop an effective way to generate training patches and propose a speed-up strategy.

iii) Different GTs. All of Ref. [B, C, D] utilize manual annotation for GT generation. Our GT is generated by segmenting dedicated blocks, which is a simple yet time-saving way for the task.

We will clarify the novelty and add the mentioned references in the camera-ready version if the paper is accepted.

2) The ablation study is not sufficient to justify different design choices (also raised by R1).

Reply: i) Insufficient ablation study. In the main paper and supplementary material, we provide ablation studies of network depth, stride number and number of training samples. Here, we present the ablation study on the network input channels (requested by R3). The results are as follows:

Input Channel      Validation Acc.   VOI
mask               0.9152            0.7560
mask+Bound.        0.9192            0.7537
mask+Raw           0.9231            0.7474
mask+Bound.+Raw    0.9403            0.7379

The raw image and boundary map provide useful texture and structure information, respectively, for the classification decision. It can be seen from the table that the input combination of binary mask, raw image and boundary map has the best performance. We will add the result of this ablation study to the supplementary material.

ii) Concerns from R3 about the “Basic Block Module”. The “Basic Block Module” is actually a basic building block of the classic ResNet18. Essentially, it is composed of ResBlocks with minor modifications to fit our task. This module is not the main contribution of our paper, so it is not necessary to perform an ablation study on it.

3) Strong baselines are missing (raised by R3).

Reply: We compare our method with the non-deep-learning graph-based method and the traditional overlap-based method, which are the most representative methods. Although Ref. [2], mentioned by R3, includes graph merging operations, it is just an in-block segmentation method, whereas our stitching method does not segment blocks but merges segmented blocks. In fact, the segmented blocks we use for stitching are generated using the method proposed in [2]. In summary, we don’t think Ref. [2] is a suitable baseline for our task.
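
For readers unfamiliar with the VOI column in the rebuttal's ablation results, VOI (variation of information) is the sum of two conditional entropies between a predicted and a reference segmentation, with lower values being better. The sketch below computes both terms from flat label sequences. The use of base-2 logarithms and the function name are assumptions for this example; the page does not state which logarithm base or implementation the authors used.

```python
import math
from collections import Counter

def variation_of_information(seg, gt):
    """Return (split, merge) terms of VOI for two equal-length label sequences.

    split = H(seg | gt): high when one GT segment is broken into several pieces.
    merge = H(gt | seg): high when one predicted segment covers several GT segments.
    The total VOI is split + merge.
    """
    seg, gt = list(seg), list(gt)
    if len(seg) != len(gt) or not seg:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(seg)
    joint = Counter(zip(seg, gt))
    count_seg = Counter(seg)
    count_gt = Counter(gt)
    split = 0.0
    merge = 0.0
    for (s, g), c in joint.items():
        p = c / n
        split -= p * math.log2(c / count_gt[g])   # -p(s,g) * log p(s | g)
        merge -= p * math.log2(c / count_seg[s])  # -p(s,g) * log p(g | s)
    return split, merge
```

For label volumes stored as arrays, the flattened voxel labels (for example `seg.ravel().tolist()` with NumPy) can be passed in directly.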

Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

The concerns raised by R3 seem to be reasonably addressed. The novelty compared with the existing papers cited by R3 is clarified well. The ablation study concern is alleviated as well. As for strong baselines, the authors are convincing when they claim that the method [2] is not a good baseline, but still no strong baselines are provided. Overall I recommend the paper to be accepted.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept
What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

6

Meta-review #2

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

The rebuttal better justified the novelty and contribution of the proposed method. The experiments also seem appropriate.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept
What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

9

Meta-review #3

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

The major weaknesses 1) and 3) were well addressed by the authors in the rebuttal. While the ablation study in the manuscript is not sufficient, the idea of converting the stitching problem into a binary classification problem is well motivated.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept
What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

5


Learning Neuron Stitching for Connectomics