Authors

Paula López Diez, Josefine Vilsbøll Sundgaard, François Patou, Jan Margeta, Rasmus Reinhold Paulsen

Abstract

We propose a pipeline for the characterization of facial and cochlear nerves in CT scans, a task specifically relevant for cochlear implant surgery planning. These structures are hard to locate in clinical CT scans due to their small size relative to the image resolution, the lack of contrast, and the proximity to other similar structures in this region. We define key landmarks around the facial and cochlear nerves and locate them using deep reinforcement learning with communicative multi-agents based on the C-MARL model. These landmarks are used as initialization for customized characterization methods. These include the automated direct measurement of the diameter of the cochlear nerve canal and extraction of the cochlear nerve cross-section followed by its segmentation using active contours. We also derive a path selection algorithm for optimal geodesic pathfinding selection based on Dijkstra’s algorithm for the characterization of the facial nerve. A total of 119 clinical CT images from preoperative patients have been used to develop this pipeline that produces accurate characterizations of these nerves in the cochlear region and provides reliable measurements for computer-aided diagnosis and surgery planning.
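The geodesic pathfinding step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the cost volume, the function names `neighbor_offsets` and `geodesic_path`, and the edge-weight rule (step length times the cost of the voxel being entered, which makes the graph directed) are assumptions made for illustration. The 6/18/26 connectivity options follow the definitions given in the author feedback.

```python
import heapq
import itertools

import numpy as np


def neighbor_offsets(connectivity=26):
    """Voxel offsets for 6- (faces), 18- (faces + edges) or 26- (faces + edges + corners) connectivity."""
    max_nonzero = {6: 1, 18: 2, 26: 3}[connectivity]
    return [o for o in itertools.product((-1, 0, 1), repeat=3)
            if 0 < sum(c != 0 for c in o) <= max_nonzero]


def geodesic_path(cost, start, goal, connectivity=26):
    """Dijkstra's algorithm on a 3D cost volume between two voxel indices.

    Assumed edge weight: Euclidean step length times the cost of the voxel
    being entered, so the weight of u -> v can differ from v -> u.
    """
    offsets = neighbor_offsets(connectivity)
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, v = heapq.heappop(heap)
        if v == goal:
            break
        if d > dist.get(v, np.inf):
            continue  # stale heap entry
        for o in offsets:
            n = (v[0] + o[0], v[1] + o[1], v[2] + o[2])
            if not all(0 <= n[i] < cost.shape[i] for i in range(3)):
                continue  # outside the volume
            nd = d + float(np.linalg.norm(o)) * float(cost[n])
            if nd < dist.get(n, np.inf):
                dist[n] = nd
                prev[n] = v
                heapq.heappush(heap, (nd, n))
    if goal not in dist:
        raise ValueError("goal is not reachable from start")
    # Walk the predecessor links back from the goal to recover the path.
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1], dist[goal]
```

In practice the cost volume would be derived from image intensities so that voxels inside the nerve canal are cheap to traverse, and `start` and `goal` would be two of the detected landmarks; how the paper defines these costs is not stated in this page.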

Link to paper

DOI: https://doi.org/10.1007/978-3-030-87202-1_50

SharedIt: https://rdcu.be/cyhQ5

Link to the code repository

N/A

Link to the dataset(s)

N/A

Reviews

Review #1

Please describe the contribution of the paper

This paper applied a communicative multi-agent reinforcement learning method for localizing multiple landmarks in inner ear CT scans. The predicted landmarks were used to measure cochlear nerve canal diameter and segment facial and cochlear nerves in an automated way.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The proposed method is automated. Given a CT scan, the localization of landmarks and the segmentation of nerves are performed without human input. According to the authors, this is the first time these measurements have been automated.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

The multi-agent reinforcement learning method used was not novel (https://arxiv.org/abs/2008.08055, cited). The authors applied the method to a different dataset and added some existing heuristic (non-learnable) algorithms (active contours without edges, Dijkstra) to establish the procedure.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
- The data will not be accessible.
- Models were well explained.
- Evaluation process was well explained but some analysis was subjective, performed by authors.
- Code would not be released according to checklist.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
Regarding the reinforcement learning formulation in Section 3.
- It is not clear how the reward is defined.
- It is not clear how an agent is considered as “reaching oscillation”.
Regarding the facial nerve path finding in Section 3.
- It is not clear how the directed graph is defined given the 3D image: given two neighbouring voxels, how is the direction determined?
Please state your overall opinion of the paper

accept (8)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
- This paper established the first automated procedure for the application.
- Although the method components were not novel, the combined solution remained interesting as a baseline for this domain.
What is the ranking of this paper in your review stack?

1
Number of papers in your stack

4
Reviewer confidence

Somewhat confident

Review #2

Please describe the contribution of the paper

This paper has presented a reinforcement learning based method for detection of landmarks for the characterization of cochlear and facial nerves in CTs. The proposed method results in a pipeline for automatic segmentation of both neural structures in the area close to the cochlea.

Seven landmarks are determined by experts in cochlear CT images, with 2 for the cochlear nerve and 5 for the facial nerve. For each of the 7 landmarks, the authors use 3 agents and the C-MARL model to achieve the goal (find the optimal landmark locations). Cochlear nerves are determined by segmenting the cross-section using the Chan-Vese approach. The labyrinth segment of the facial nerve is determined using Dijkstra’s algorithm.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

This paper presents an interesting and novel implementation of reinforcement learning for landmark detection in cochlea CT images; the landmarks are then used for facial nerve and cochlear nerve characterization in CTs. Annotating the 7 landmarks is easier than delineating the facial nerve and cochlear nerve in CTs. Thus, training data for the landmark detection problem is easier to collect.

The use of the C-MARL model for the reinforcement learning process during landmark localization allows explicit communication among agents (averaging the weights in fully connected layers) and implicit communication among agents (sharing CNN parameter weights). The adoption of a multi-scale strategy allows the agents to gradually reach the goal.

Overall, the strongest innovation point is the usage of landmarks for segmentation of facial nerves and cochlear nerves. This avoids complicated annotations of those nerves in 3D images.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

The major concern I had with this paper is that it did not compare its method with other state-of-the-art facial nerve/cochlear nerve segmentation methods. For instance, [15] has presented a method for segmentation of cochlear facial nerves with deformable models and an atlas-based method. The performance seemed to be already satisfactory. The paper did not compare itself with this method and other related methods. It would be interesting to see a comparison and discussion of previous methods and the method proposed by the authors.

Another concern I have is in section 2 where the authors mentioned “A region of interest is cropped” but no details have been given on how this region of interest was cropped. Is this an automatic step, or manual step?

The results section needs more discussion. For instance, in Figure 6, how were the “correct” and “not precise” paths determined at the inference stage?
Please rate the clarity and organization of this paper

Satisfactory
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

Not applicable
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
1. Comparison with other methods should be included. At the least, the discussion section should include the advantages of the proposed method compared with other existing methods, and the results should be compared too, if possible.
Please state your overall opinion of the paper

borderline accept (6)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The idea for using landmarks for segmentation of complicated structures in CTs could make the annotation step much easier. This is an innovative step and the results are promising.

However, further comparison studies with existing methods should be included to better see the potential of the proposed method.
What is the ranking of this paper in your review stack?

1
Number of papers in your stack

1
Reviewer confidence

Somewhat confident

Review #3

Please describe the contribution of the paper

The paper proposes a pipeline to identify landmarks belonging to the CN and FN, followed by the extraction of metrics related to the health of the nerves. The pipeline starts with a MARL-based method to identify 7 significant landmarks. An active contour based approach then segments the nerves. Finally, a set of heuristics aids the identification and characterisation of the nerve canal.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- Strong results
- Clinically interesting pipeline
- To the best of my knowledge the pipeline is novel
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- The overall pipeline is novel, to the best of my knowledge, but the distinct parts of it are not
- The structure of the paper, especially concerning section 3, methods, needs more attention with proper paragraphs and further analysis of various points, like the active contours variant
- No comparison with other methods even though the authors have identified various related works.
Please rate the clarity and organization of this paper

Satisfactory
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

Given the heavy dependence on existing open sourced methods, reproducing this paper is possible.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
- As mentioned above, the organisation of the paper is lacking in the methods part. Moreover, the analysis of the pipeline modules assumes the reader’s expertise in all parts. A bit more detail on the various approaches would be appreciated, specifically on the active contours variant used.
- This is a somewhat difficult paper to analyse as, for example there is limited novelty and contributions in the modules that make up the pipeline. However to their credit the authors claim contributions on assembling the different modules into a pipeline. Hence this paper can be better considered as a clinical contribution / method validation one. In that case to make it ideal one would expect a bit more analysis on the shortcomings and limitations of the proposed pipeline, an ablation study on the various component choices, for example why were 3 agents used per landmark etc.
- Finally the paper is lacking comparisons with other related methods attempting similar tasks, the authors have correctly identified a few related works but have not compared against them. To their credit they have put their method in context.
Please state your overall opinion of the paper

borderline accept (6)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

As mentioned above the paper is in need of additional experimentation and analysis yet I can see the clinical value of the proposed method.
What is the ranking of this paper in your review stack?

2
Number of papers in your stack

5
Reviewer confidence

Confident but not absolutely certain

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

In this work a multi agent reinforcement learning approach is used to identify facial and cochlear nerves. Reviewers agree that this is a clinically relevant method showing good results and that there is some contribution in applying the reinforcement learning paradigm in the context of this application, since multiple agents communicate through weight sharing on how to localize the landmarks of interest. However, reviewers state a number of shortcomings that have to be clarified in a rebuttal, like a deeper clarification of the novel contributions of the proposed methodology, especially regarding the used multi agent reinforcement learning method as indicated by reviewer 1, and the lack of comparison to non deep (reinforcement) learning based methods for localization of these landmarks. Authors are encouraged to identify and address the main points of criticism in their rebuttal, including the issues mentioned above.
What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

3

Author Feedback

Firstly, thank you all for your valuable time and insightful comments.

The novelty of this work is the use and adaptation of several state-of-the-art methods, including the C-MARL model, for a clinically very difficult task: location and quantification of challenging soft-tissue structures in clinical CT scans. The method was designed and successfully applied to the CN (first automated approach of its characterization on CTs) and the more complex region of the FN (labyrinth segment). This pipeline allows using the power of deep learning (DL) without access to time-consuming full 3D manual segmentation of multiple CT images that would usually be required for DL segmentation, presenting an alternative to overcome the lack of accessible fully annotated datasets. However, this also means that analyzing the performance relies on a more qualitative analysis of the outcome due to the absence of ground-truth needed for quantitative evaluation.

A comparison of results with previous approaches would indeed add valuable information and allow us to quantitatively compare performance, as pointed out by the reviewers. However, several factors make this comparison a rather difficult task in this specific scenario. There is no previous work that we know of characterizing the CN in CT images. For the FN characterization comparison, most of the previous approaches focus on the segmentation of other segments of the nerve, relevant for surgery planning (mastoid segment), as in [15], but do not address the labyrinth segment necessary to compute the distance between the FN and the cochlea. The work that does include the labyrinth segment in its characterization, as in [6], drops in performance in this region due to the complex anatomy and does not provide explicit metrics for this region to allow direct comparison. The data used for this project consist of clinical CTs of rather low resolution (0.3 mm on average), which are representative of images commonly used in clinical practice but do not allow for direct comparison with approaches tested in μCT (0.001–0.020 mm), which are normally used in the state-of-the-art methods. However, being able to characterize these structures in this more complex scenario allows the pipeline to work in challenging real-life cases and is very promising for testing it in higher resolution CT images.

Regarding the technical comments, the level of detail had to be limited due to the restricted space, but we are glad to add some information. The reward value of the DRL is a function of the difference in distance from the target landmark location to the last and second-to-last states: if the agent is getting closer the reward is positive, and negative otherwise. The oscillation state is reached when the last actions of an agent present a cyclic pattern with recursive visits to the same locations within a buffer of 15 actions.

The 3D connectivity values used to select neighbors when building the directed graph for the FN tracking are 6 (voxels sharing a face), 18 (also sharing edges), and 26 (also sharing corners). The region cropping step is an automated step centered on the estimated landmark position and with a known physical size of 30 × 30 mm.

The following categories were used to evaluate the landmarks: a correct landmark is in a location that fully satisfies the features’ criteria for that landmark; a very close location means that the landmark is close to the optimal location and the methods based on its location should not suffer from the deviation; a wrong location is clearly off the structure. Concerning the path selection algorithm: a correct path successfully tracks the nerve in the whole region; a not precise path is not completely aligned with the centerline of the nerve but still provides a fair idea of the nerve’s location; a wrong trace does not follow the nerve path.
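The reward and oscillation criterion described in this feedback can be sketched as follows. This is a hedged illustration, not the authors' code: the names `step_reward` and `OscillationDetector` are hypothetical, the reward is assumed to be exactly the decrease in Euclidean distance to the target, and the `min_revisits` threshold is an assumption, since the feedback only states that repeated visits within a buffer of 15 actions are detected.

```python
from collections import Counter, deque

import numpy as np


def step_reward(prev_pos, curr_pos, target):
    """Positive when the agent moved closer to the target, negative otherwise."""
    prev_d = np.linalg.norm(np.subtract(prev_pos, target))
    curr_d = np.linalg.norm(np.subtract(curr_pos, target))
    return float(prev_d - curr_d)


class OscillationDetector:
    """Flags an agent that keeps revisiting the same location.

    buffer_size=15 follows the author feedback; min_revisits is an
    assumed threshold chosen only for this illustration.
    """

    def __init__(self, buffer_size=15, min_revisits=3):
        self.history = deque(maxlen=buffer_size)
        self.min_revisits = min_revisits

    def update(self, pos):
        """Record the new position and report whether the agent is oscillating."""
        self.history.append(tuple(pos))
        most_visits = Counter(self.history).most_common(1)[0][1]
        return most_visits >= self.min_revisits
```

An agent bouncing between two neighbouring voxels would trigger the detector on its fifth step under these assumed settings, at which point an episode would typically be terminated and the agent's position taken as its landmark estimate.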

Post-rebuttal Meta-Reviews

Meta-review #1 (Primary)

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

All reviewers were in favor of the presented paper. In my opinion, authors have addressed remaining reviewer criticisms and requests for clarification sufficiently. Given the limited size of the MICCAI paper format and the fact that no further experiments can be expected, I assume the value of the paper as it is can be a contribution to the community. Clarifications and minor changes are expected to be done in a revised version, in case of acceptance.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept
What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

1

Meta-review #2

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

Although the authors proposed an interesting approach, I believe that a method that has not been compared to SOTA should not be accepted for publication at such a strong conference as MICCAI. There are a number of generic CNN-based methods for landmark localization with publicly available code that authors could use for comparison. I also understand how frustrating manual segmentation can be, but since the task is to extract the nerves in the image, some manual GT segmentation is needed to evaluate the method.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Reject
What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

11

Meta-review #3

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

This paper has presented a reinforcement learning based method for detection of landmarks for the characterization of cochlear nerve and facial nerves in CTs. All reviewers found the paper well written and were satisfied with the results. The authors have properly addressed the remaining concerns on novelty and bench-marking.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept
What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

2

Facial and cochlear nerves characterization using deep reinforcement learning for landmark detection