
Authors

Jianing Wang, Dingjie Su, Yubo Fan, Srijata Chakravorti, Jack H. Noble, Benoit M. Dawant

Abstract

We propose an atlas-based method to segment the intracochlear anatomy (ICA) in the post-implantation CT (Post-CT) images of cochlear implant (CI) recipients that preserves the point-to-point correspondence between the meshes in the atlas and the segmented volumes. To solve this problem, which is challenging because of the strong artifacts produced by the implant, we use a pair of co-trained deep networks that generate dense deformation fields (DDFs) in opposite directions. One network is tasked with registering an atlas image to the Post-CT images and the other network is tasked with registering the Post-CT images to the atlas image. The networks are trained using loss functions based on voxel-wise labels, image content, fiducial registration error, and a cycle-consistency constraint. The segmentation of the ICA in the Post-CT images is subsequently obtained by transferring the predefined segmentation meshes of the ICA in the atlas image to the Post-CT images using the corresponding DDFs generated by the trained registration networks. Our model can learn the underlying geometric features of the ICA even though they are obscured by the metal artifacts. We show that our end-to-end network produces results that are comparable to the current state of the art (SOTA), which relies on a two-step approach that first uses conditional generative adversarial networks to synthesize artifact-free images from the Post-CT images and then uses an active shape model-based method to segment the ICA in the synthetic images. Our method requires a fraction of the time needed by the SOTA, which is important for end-user acceptance.
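The mesh-transfer step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `sample_ddf` and `transfer_vertices`, the `(D, H, W, 3)` field layout, displacements expressed in voxel units, and the convention that the field maps atlas coordinates to Post-CT coordinates are all assumptions made for illustration.

```python
import numpy as np

def sample_ddf(ddf, points):
    """Trilinearly interpolate a displacement field at continuous voxel coordinates.

    ddf:    array of shape (D, H, W, 3), displacements in voxel units (assumed layout).
    points: array of shape (N, 3), coordinates in the same (z, y, x) voxel space.
    """
    dims = np.array(ddf.shape[:3])
    pts = np.clip(points, 0, dims - 1)                     # stay inside the volume
    lo = np.minimum(np.floor(pts).astype(int), dims - 2)   # lower corner of each cell
    frac = pts - lo                                        # position inside the cell
    out = np.zeros((len(pts), 3))
    for dz in (0, 1):
        for dy in (0, 1):
            for dx in (0, 1):
                offset = np.array([dz, dy, dx])
                # Weight of this corner: product of per-axis linear weights.
                w = np.prod(np.where(offset == 1, frac, 1.0 - frac), axis=1)
                idx = lo + offset
                out += w[:, None] * ddf[idx[:, 0], idx[:, 1], idx[:, 2]]
    return out

def transfer_vertices(vertices, ddf):
    """Move atlas mesh vertices with the displacement sampled at each vertex."""
    return vertices + sample_ddf(ddf, vertices)
```

Because each vertex is displaced individually and the mesh is never rebuilt, vertex `i` of the transferred mesh still corresponds to vertex `i` of the atlas mesh, which is the point-to-point correspondence the abstract refers to.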

Link to paper

DOI: https://doi.org/10.1007/978-3-030-87202-1_2

SharedIt: https://rdcu.be/cyhPI

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    The authors propose a deep learning method to register an atlas image to post-op CT imaging of the cochlea. This method, during training, uses two networks to learn and perform the registration in two directions: (i) from the post-op CT to the atlas and (ii) from the atlas to the post-op CT. The networks are trained using a loss function that combines numerous metrics including voxel-wise labels, image content, fiducial registration error, and cycle-consistency of the transformations. Registration results are compared to another state-of-the-art approach using a dataset of n=624 images.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    • The co-trained network formulation of registration is interesting and appears to provide improved registration performance over a single network.
    • Ablation experiments demonstrate the contributions of each term in the loss function.
    • The dataset (n=624 total images) is large, with a relatively large test subset (n=93).

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    • Results do not exceed those of the current state-of-the-art method and possible advantages of the proposed method are not discussed.
    • The method contains numerous loss terms that are weighted by empirically chosen values.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The dataset and imaging information used in this study are well described. Some details regarding model training are not provided. Also, sensitivity to the choice of hyperparameters in the loss term weights is not available.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

    This is an interesting method that contains a complex set of loss terms. The ablation experiments are nice to demonstrate the contribution of each loss term to the method’s overall performance. However, enthusiasm for the method is slightly diminished due to performance not exceeding that of the state-of-the-art method “cGAN+ASM”. While these results are ok, the paper does not provide a discussion about possible advantages (e.g. running time) of the proposed approach over cGAN+ASM that would be beneficial to strengthen the impact of this work.

    The work on consistent image registration by Christensen et al. is highly relevant to this work and should probably be cited in the introduction/background section.

    Christensen, Gary E., and Hans J. Johnson. “Consistent image registration.” IEEE transactions on medical imaging 20.7 (2001): 568-582.

    The Figure numbers are incorrect. Please verify that all figures and references to these figures in the text are accurate. Figure 1 appears and then the next figure to appear also starts as “Figure 1” which makes all following figures incorrectly labeled.

    Sec. 2.2 & Table 1: There are many loss terms in the training step and the weights of these were chosen empirically. How sensitive is model performance to the choice of weights?

    Fig. 4: It might be helpful to add within the caption that “a description of the numerical value color legend can be found in the text.”

    Sec. 3: Some training details could be added to improve the description. For example, how many epochs were required for training, what optimization method was used, how were the model weights initialized, and was early stopping used on the validation data?

    Sec. 4: In the abstract, it is mentioned “Our proposed approach also produces results in a fraction of the time needed by the current state of the art.” What were the model running times? These results are not provided in the paper.

  • Please state your overall opinion of the paper

    probably reject (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Enthusiasm for the method is diminished due to performance not exceeding that of the state-of-the-art method “cGAN+ASM”. While not exceeding competing methods is fine, the paper does not provide either a discussion about possible advantages or results to show an advantage (e.g. running time) of the proposed approach over cGAN+ASM. Furthermore, some implementation details are incomplete.

  • What is the ranking of this paper in your review stack?

    4

  • Number of papers in your stack

    4

  • Reviewer confidence

    Very confident



Review #2

  • Please describe the contribution of the paper

    The authors propose an atlas-based method to segment the intracochlear anatomy in the post-operative CT images of cochlear implantation surgery recipients that, the authors claim, preserves point-to-point correspondences between the meshes in the atlas and the segmented volumes.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The authors employ a pair of co-trained CNNs that generate dense deformation fields in opposite directions. One network was tasked with registering an atlas image to the Post-CT images and the other network was tasked with registering the Post-CT images to the atlas image. Different loss functions were utilized in the segmentation approach. The authors have also presented both successful and failure cases as example outputs.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The introduction section lacks motivation for using DDFs and two CNNs for the task. The overall model size and computations seem fairly big for the task at hand. From what I understand, the CI is normally close to the lateral wall of the scala tympani, whereas in Fig. 1 it is shown close to the modiolar wall. How certain are the authors of the ground truth, given that the ICA is impossible to visualize on clinical CT images?

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The code and data are not made available. Very little information is given on the details of each sub-network.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

    What was the motivation for using DDFs and two CNNs for the task? Was the initial cropping done automatically or manually? Was the registered GT verified by an expert/surgeon? A detailed ablation study on the effect of each loss function (without combination with other losses) and the usefulness of the opposite CNNs would provide important insights. In future work, evaluating the approach on micro-CT or histology data could help validate it with certainty.

  • Please state your overall opinion of the paper

    accept (8)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The authors have tackled a difficult problem with slightly less accurate results than the SOTA but with less segmentation time. However, the motivation for the approach has not been fully addressed. The intracochlear anatomy cannot be seen in post-operative CT scans, so claiming that any segmentation method correctly predicts the segmentation is nearly impossible. Therefore I have only chosen accept for my recommendation.

  • What is the ranking of this paper in your review stack?

    1

  • Number of papers in your stack

    3

  • Reviewer confidence

    Confident but not absolutely certain



Review #3

  • Please describe the contribution of the paper

    This paper presents an atlas-based method for the segmentation of intracochlear anatomy (ICA) in the post-implantation CT images. Due to the clinical request, the segmentation of ICA has to be performed in a registration manner to preserve the point-to-point correspondence from the atlas model to the segmented model. The author designs a pair of co-trained networks to estimate the deformation field from two opposite directions, on which a cycle consistency constraint is created during training. The experimental results on a large in-house dataset show that the proposed method can provide comparable performance in comparison to the state-of-the-art two-step method while costing less time in only one step.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    1) This work is well-motivated by a clinical problem.
    2) The design of the cycle consistency learning manner is somewhat novel.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    1) The clarity and expression of this paper should be improved. There are many terms in Section 2.2 to represent different objects in different spaces, which makes the paper relatively hard to follow. The author may need to simplify these representations or use more figures to express the detail of the pipeline.
    2) The registration spaces used in this study are not clearly defined. According to the author’s statement, there are two spaces in the registration pipeline: (i) the atlas space where the atlas-CT is defined and (ii) the post-operative space where both the pre-CT and the post-CT are defined. I doubt whether it is correct to define the pre-CT and the post-CT in the same space. Although the author mentioned in Section 2.1 that “For each ear, the Pre-CT image is rigidly registered to the Post-CT image,” there may still be some non-rigid deformation from pre-CT to post-CT due to the surgery.
    3) According to the example illustrated in Figure 1, the ICA shows complex shapes in 3D space. However, the image resolution is relatively low and coarse. I doubt whether this kind of imaging quality is sufficient to infer the dense deformation field, especially given the severe metal artifacts. Maybe an affine matrix is adequate to represent the transformation.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The author did not release the source code. The experiments are conducted on an in-house dataset.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

    1) Section 2.1, “Our atlas image is a Pre-CT image of an ear that is not in the 624 ears.”: The author should give more information about how this atlas Pre-CT image was chosen. What if we choose another Pre-CT as the atlas image? Will the proposed method be sensitive to that?
    2) Section 2.1: According to the author’s statement, the input image should be resampled and cropped to a uniform size of 64×64×64 with an isotropic spacing of 0.2mm. This is a relatively small region. To make it contain the cochlea, do we need some kind of manual intervention or a preprocessing step of target localization? If so, will that also apply to the testing phase of the proposed method? The author should clarify this in Section 2.4.
    3) There are two “Fig. 1”. Please fix all the figure numbers throughout the paper.
    4) Section 3, “The 624 ears are partitioned into 465 ears for training, 66 ears for validation, and 93 ears for testing.”: Please specify how many different objects are involved in this study.
    5) Section 4: As the author states in the abstract, the proposed method achieves comparable accuracy compared to the current state-of-the-art two-step method but needs less processing time. Please report the quantitative result of the testing time in the results section.

  • Please state your overall opinion of the paper

    probably reject (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper is well-motivated by a clinical problem with practical value. However, the clarity and organization should be improved to make it easier to follow. My major concern comes from the second point I listed in the “main weaknesses of the paper”. I think it may be improper to assume that the pre-CT and the post-CT are in the same space, even though they have been aligned through rigid registration. This makes me suggest a decision of “probably reject (4)”.

  • What is the ranking of this paper in your review stack?

    4

  • Number of papers in your stack

    6

  • Reviewer confidence

    Confident but not absolutely certain




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    I believe it is an interesting application with a sound and potentially very useful way to perform registration - using heavy (weak) supervision during training. Therefore, I would invite this paper to rebuttal for 1) addressing the first point from Reviewer 1, clarifying the difference from the existing methods; and 2) the second point from Reviewer 3 - from the two reviewers that indicated low scores.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    9




Author Feedback

We thank the reviewers and the meta-reviewer (MR) for their positive feedback. We are also pleased to read that both the MR and the reviewers thought that our problem is challenging and important and that the method we propose is interesting and potentially very useful. We address the questions raised by the MR and specific points raised by the reviewers as space permits.

Q1 by MR: please compare the proposed method to the SOTA as requested by R1. We stated that the proposed method produced results comparable to the SOTA’s but in a fraction of the time without providing timing information; a clear oversight. The SOTA is a two-step process: (1) generate a synthetic pre-op image from a post-op image with cGANs trained for this purpose and (2) apply an ASM-based method to the synthetic image. Step 2 requires the very accurate registration of an atlas to the image to be segmented to initialize the ASM. This is achieved through an affine and then a non-rigid intensity-based registration in a VOI that includes the inner ear. Step 1 takes about 0.3s while step 2 takes on average 75s. The proposed method only requires providing a VOI that includes the inner ear to the networks described herein and inference time is also about 0.3s. Segmentation is thus essentially instantaneous with the proposed method while it takes over a minute with the SOTA. This is of importance for clinical deployment and end-user acceptance. It will be clarified.

Q2 by MR: please address the second point from R3. The reviewer questions the fact that the post-op and pre-op images can be registered accurately because the surgery may induce some non-rigid deformation. This is a very good point but our application is unusual. The cochlea is a cavity surrounded by bone and the surgery consists of threading an electrode array through a small hole into that cavity. There is thus no non-rigid deformation of the cochlea. Registration of pre- and post-operative images was also used to train the cGANs that are used in the SOTA.

Response to questions raised by the reviewers:

R1: At this point the weights have been selected by looking at training performance on a small number of epochs. A larger study is ongoing but training is time consuming (~one week).

R1: The work by Christensen should be cited indeed.

R2: Why DDF and two CNNs. We need to establish a point-to-point correspondence between an atlas and a patient image. This is automatic with the ASM-based SOTA technique. To do this via registration we need to project vertices from the atlas to the patient image. We also want to use image similarity measures to compute the transformation. Doing both requires knowledge of the transformation from the atlas space to the patient space and from the patient space to the atlas space. Imposing inverse consistency also regularizes the transformations.

R2 and also R3: Does the method require cropping. Yes, as does the SOTA, and it is done automatically.

R2: Cochlear implants are normally close to the cochlear wall but it does not appear to be the case here. There are different types of electrodes. Some are lateral wall and some are perimodiolar; the one shown is perimodiolar. We have validated our ground truth on images acquired with microCTs and we have used our technique clinically on hundreds of images. This has been published but providing references would violate anonymity rules unfortunately.

R2 and R3: It is nearly impossible to segment post-operative images because the intra-cochlear anatomy cannot be seen in these images. THANK YOU. This is challenging indeed and that is why we are very pleased and frankly surprised by the results. We hypothesize that the network has been able to learn the shape of the cochlea and can fit this shape to partial information in the image. Since the submission we have conducted experiments that support this hypothesis.

Overall clarity, lack of details, and editorial comments. Thanks for the feedback. We will modify the article accordingly.
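The inverse-consistency idea mentioned in the response to R2 can be made concrete with a small NumPy sketch: composing the atlas-to-patient field with the patient-to-atlas field should return every point to where it started. This is only an illustration under stated assumptions, not the loss used in the paper; the names `interp_field` and `cycle_residual`, the `(D, H, W, 3)` layout, and displacements in voxel units are hypothetical choices.

```python
import numpy as np

def interp_field(field, pts):
    """Trilinear interpolation of a (D, H, W, 3) displacement field at (N, 3) points."""
    dims = np.array(field.shape[:3])
    p = np.clip(pts, 0, dims - 1)                        # sample at the nearest in-volume location
    lo = np.minimum(np.floor(p).astype(int), dims - 2)
    frac = p - lo
    out = np.zeros((len(p), 3))
    for corner in np.ndindex(2, 2, 2):
        offset = np.array(corner)
        w = np.prod(np.where(offset == 1, frac, 1.0 - frac), axis=1)
        idx = lo + offset
        out += w[:, None] * field[idx[:, 0], idx[:, 1], idx[:, 2]]
    return out

def cycle_residual(u_ab, u_ba):
    """Mean distance between each voxel and its position after the round trip A -> B -> A."""
    d, h, w = u_ab.shape[:3]
    grid = np.stack(
        np.meshgrid(np.arange(d), np.arange(h), np.arange(w), indexing="ij"),
        axis=-1,
    ).reshape(-1, 3).astype(float)
    forward = grid + u_ab.reshape(-1, 3)                 # points mapped from A into B
    back = forward + interp_field(u_ba, forward)         # mapped from B back into A
    return np.linalg.norm(back - grid, axis=1).mean()
```

A residual of zero means the two fields are exact inverses of each other on the grid; penalizing a nonzero residual during training is one common way to couple two registration networks.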




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    Although Rev 1 & 3 recommended rejection with low scores, their critical comments are much less substantial from my reading. The rebuttal adequately addressed all the comments on limitations. I therefore have no reason to reject this work that contains a novel application with a clear description of the sound methodology and experiments.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    9



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The work could be interesting, but the writing should be significantly improved.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    10



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This submission proposes an atlas-based segmentation network for extracting the intracochlear anatomy in the post-implantation CT images. The studied problem is challenging and the proposed method produces comparable results compared to the SOTA method. My main concerns are the unclear design choices of the proposed network and the clarity of the manuscript. It is not clear why two networks are required to register an atlas image to the post-CT image. The authors argue for this in the rebuttal letter, but one network with an inverse transformation could satisfy their requirements. Also, a more polished manuscript would be much more readable, including the notations, the figures, the organization, etc. The current version of this submission is at a preliminary stage, and more work is desired before publishing.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Reject

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    16


