Authors

Mert Asim Karaoglu, Nikolas Brasch, Marijn Stollenga, Wolfgang Wein, Nassir Navab, Federico Tombari, Alexander Ladikos

Abstract

Depth estimation from monocular images is an important task in localization and 3D reconstruction pipelines for bronchoscopic navigation. Various supervised and self-supervised deep learning-based approaches have proven themselves on this task for natural images. However, the lack of labeled data and the bronchial tissue’s feature-scarce texture make the utilization of these methods ineffective on bronchoscopic scenes. In this work, we propose an alternative domain-adaptive approach. Our novel two-step structure first trains a depth estimation network with labeled synthetic images in a supervised manner; then adopts an unsupervised adversarial domain feature adaptation scheme to improve the performance on real images. The results of our experiments show that the proposed method improves the network’s performance on real images by a considerable margin and can be employed in 3D reconstruction pipelines.

Link to paper

DOI: https://doi.org/10.1007/978-3-030-87202-1_29

SharedIt: https://rdcu.be/cyhQr

Link to the code repository

N/A

Link to the dataset(s)

N/A

Reviews

Review #1

Please describe the contribution of the paper

This paper studies depth estimation from a single laparoscopy image. A two-step framework is proposed, consisting of (1) supervised training on synthetic data and (2) a GAN-based feature domain transfer method. Experiments show that it can obtain accurate results.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The idea to first perform training on synthetic data and then transfer it to the real-world data is interesting.

This paper is well written and easy to follow.

Experiments include 3D modeling and manual registration, which look good.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

More details about the domain adaptation method are needed. A figure that shows the details of the CNN structures would be very helpful.
Please rate the clarity and organization of this paper

Excellent
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

An experienced researcher in this field would be needed to reproduce the method. No details of the CNN structures are given.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

None
Please state your overall opinion of the paper

accept (8)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

Clear structure, the idea is novel and the experiments can support the claims.
What is the ranking of this paper in your review stack?

1
Number of papers in your stack

5
Reviewer confidence

Somewhat confident

Review #2

Please describe the contribution of the paper

This paper proposes an alternative domain-adaptive approach. The two-step structure first trains a depth estimation network with labeled synthetic images in a supervised manner, then adopts an unsupervised adversarial domain feature adaptation scheme to improve the performance on real images.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The topic is interesting and the authors put a lot of effort in the paper
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

The paper presents little modification to well established methods (REF[23,6,16,13, 11]) to solve the presented problem.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

can be reproduced
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

Please compare your method to other state of the art that work on the same problem instead of comparing to different configuration or learned models.

The caption of Table 1 should be above the table. The text in Fig. 3 is very tiny and cannot be read.
Please state your overall opinion of the paper

borderline accept (6)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The comparison and contribution are limited.
What is the ranking of this paper in your review stack?

1
Number of papers in your stack

5
Reviewer confidence

Very confident

Review #3

Please describe the contribution of the paper

A bronchoscopic depth estimation method is proposed which includes two steps: first, a U-Net-based supervised network is trained on synthetic images. Then, an unsupervised adversarial domain feature adaptation scheme adopted from [29] is employed.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

Obtaining the groundtruth for bronchoscopic depth estimation is difficult. This paper provides a self-supervised method for bronchoscopic depth estimation, which is valuable. The experiments illustrate the encouraging performance of the idea.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

While experiments are important to validate the effectiveness of deep learning methods, the paper does not provide ablation studies of the important modules of the approach. Besides, the method is not compared with existing depth estimation methods, nor with the original domain feature adaptation method [29].
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

Private data is used for training and validation. Code and data will not be made available. This creates difficulties for the reproducibility of the paper.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
1. It would be necessary to add comparison with SOTA depth estimation methods.
2. Please add the unit of the values in the Table.
Please state your overall opinion of the paper

borderline reject (5)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

A self-supervised bronchoscopic depth estimation method is proposed. The proposed method is well presented. However, the technical novelty of the proposed method needs clarification. Besides, ablation studies and comparison to SOTA methods are necessary.
What is the ranking of this paper in your review stack?

4
Number of papers in your stack

5
Reviewer confidence

Confident but not absolutely certain

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

This paper was well-reviewed by three experts. The authors should provide more details about the domain adaptation and CNN structures. The novelty of this paper should be further clarified, since the proposed method is similar to a currently available method. Additionally, it would strengthen this paper if the authors provided ablation studies and a comparison to existing depth estimation methods.
What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

6

Author Feedback

We thank the reviewers for their positive feedback. We are highly motivated by the fact that they found our work novel [R1], well-structured [R1, R2, R3], and acknowledge the challenging nature of the problem [R3].

We would like to clarify a set of points:

Novelty:

In our work, we target improving an existing approach [29] to the problem of monocular depth estimation in bronchoscopic images, which is a crucial open problem in the field. In [29] the domain-feature adaptation approach is utilized for day-to-night-time adaptation of a depth estimation network in natural scenes. In our studies, we augment this method with additional improvements, such as the combination with Coordinate Convolution layers [16] at the adapted embedding levels for synthetic-to-real image adaptation, tailoring the network specifically for our targeted application. Furthermore, and in contrast to other works, we assess the usability of our approach in a SLAM-based navigation scenario. References [23, 16, 6, 13] are various fundamental computer vision papers that have inspired our architecture, but none of them could provide specific solutions for the bronchoscopy environment without major adaptation. [11] is an explicit image-level domain-transfer approach which we use only in the experimental comparison.
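
For readers unfamiliar with coordinate convolution, the following is a minimal sketch of its usual formulation: normalized row and column coordinate channels are appended to a feature map before it is passed to an ordinary convolution. The function name `add_coord_channels`, the [-1, 1] scaling, and the channel ordering are illustrative assumptions and are not taken from the paper or the rebuttal.

```python
def add_coord_channels(feature_map):
    """Append normalized row (y) and column (x) coordinate channels.

    feature_map is a nested list of shape C x H x W (channels, rows, columns).
    Returns a new nested list of shape (C + 2) x H x W.
    Hypothetical helper for illustration only.
    """
    height = len(feature_map[0])
    width = len(feature_map[0][0])

    def scale(index, size):
        # Map index 0..size-1 linearly to [-1, 1]; a single-pixel axis maps to 0.
        return 0.0 if size == 1 else 2.0 * index / (size - 1) - 1.0

    # y_channel varies along rows, x_channel varies along columns.
    y_channel = [[scale(r, height) for _ in range(width)] for r in range(height)]
    x_channel = [[scale(c, width) for c in range(width)] for _ in range(height)]

    # Copy the input channels so the caller's data is not mutated.
    copied = [[row[:] for row in channel] for channel in feature_map]
    return copied + [y_channel, x_channel]


# Usage: a single-channel 2 x 3 feature map gains two coordinate channels.
features = [[[0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]]
augmented = add_coord_channels(features)
```

A convolution applied to `augmented` can then condition its response on image position, which is the property the rebuttal relies on when adapting the embedding levels.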

Description of domain adaptation approach:

In Section 1 we describe the fundamental issues regarding monocular bronchoscopic scenes and the current literature and reason why training on synthetic images is a valuable alternative. Furthermore, we discuss how the domain gap between the synthetic and real images causes non-negligible performance drops when the model is deployed. Elaborating more on this, we introduce our solution referring to the work [29] that inspired us. In Section 2.2 and Figure 1, we provide the technical details of our improved network with the implementation details in Section 3.
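
As a rough illustration of adversarial feature adaptation in general, the sketch below computes the two competing objectives from discriminator outputs: the discriminator learns to separate synthetic-image features from real-image features, while the real-image encoder is trained to make its features indistinguishable from synthetic ones. The standard non-saturating GAN losses, the labelling convention (synthetic = 1, real = 0), and the function names are assumptions made for illustration; neither the rebuttal nor the abstract states the exact loss used.

```python
import math

EPS = 1e-12  # guards against log(0)


def discriminator_loss(p_synthetic, p_real):
    """Binary cross-entropy for the discriminator.

    p_synthetic: discriminator outputs in [0, 1] for synthetic-image features
                 (target label 1 under the assumed convention).
    p_real:      discriminator outputs in [0, 1] for real-image features
                 (target label 0).
    """
    loss_syn = -sum(math.log(p + EPS) for p in p_synthetic) / len(p_synthetic)
    loss_real = -sum(math.log(1.0 - p + EPS) for p in p_real) / len(p_real)
    return loss_syn + loss_real


def encoder_adversarial_loss(p_real):
    """Non-saturating loss for the encoder applied to real images.

    The loss is low when the discriminator mistakes real-image features
    for synthetic ones, i.e. when its outputs on them are close to 1.
    """
    return -sum(math.log(p + EPS) for p in p_real) / len(p_real)


# Usage: an undecided discriminator (all outputs 0.5) on two samples per domain.
d_loss = discriminator_loss([0.5, 0.5], [0.5, 0.5])
e_loss = encoder_adversarial_loss([0.5, 0.5])
```

In a two-step scheme of this kind, the network trained on synthetic images would typically stay fixed during adaptation and only the real-image encoder and the discriminator would be updated, alternating between the two losses.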

Evaluation against other methods:

As described in Section 1, recent publications in monocular endoscopic depth estimation require either depth ground truth [2] or pose labels [25] for training, which we lack in our setup. Developed for sinus-endoscopy scenes, which are more feature-rich compared to bronchoscopic images, [17, 18] utilize features extracted with structure from motion (SfM) for a self-supervised training scheme. We tested various SfM frameworks on our bronchoscopic data but failed to extract adequate useful information due to the small number of visual features in the images. We therefore couldn’t compare against these architectures. Instead, since [19] utilizes an outdated architecture for explicit domain transfer, we decided to compare against an extended version of it: we developed a more advanced CycleGAN-based structure and tested our model against it. We also compared against [29], improved by adding one more discriminator, which we found to work better for bronchoscopic images.

Ablation studies:

We performed two ablation studies. First, we compared our proposed domain feature adaptation against a direct domain transfer method and second, we compared the effect of the coordinate convolution layers [16], which is one of the major changes we added on top of [29]. We found that both the adaptation at the feature level and the coordinate convolutions are essential for the performance of our method.

Reproducibility:

In Sections 2 and 3 we describe the configuration and training setup of the architecture with references to the original publications where their implementation details are further explained. We decided not to add a figure of the network architecture to the main paper due to space constraints. To make the paper more self-contained, we will add a network architecture diagram to the paper or its supplementary material. We will also make sure that the paper contains all the necessary information to reproduce our results.

Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

This AC was satisfied that the rebuttal addressed most of the main questions raised by the reviewers.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept
What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

2

Meta-review #2

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

This paper proposes a domain adaptation on features to estimate depth in bronchoscope endoscopic images, using a contributed adversarial strategy.

One reviewer finds the pipeline interesting, but notes that it lacks sufficient details on the domain adaptation.

A second reviewer finds limited novelty compared to existing methods with lack of comparison.

A third reviewer also mentions a lack of comparative study on depth estimations.

A consensus emerges on the missing comparison with the state of the art. The rebuttal indicates an evaluation against [19] and [29] (the basis of this paper), as well as an ablation study, but I fail to see them in the manuscript. From my understanding, these are promised additional post-submission experiments that were not provided at submission time. As is, the paper, in my opinion, lacks a minimal comparison with at least [29], its source of inspiration. Such a comparison is needed to understand how the proposed novelty contributes compared to the state of the art.

For these reasons, my recommendation is toward rejection.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Reject
What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

26

Meta-review #3

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

The authors have responded to the major concerns raised by the reviewers and AC. The main contributions and differences compared with the SOTA in the computer vision domain have been emphasized in the rebuttal letter. Issues related to experiments (i.e. evaluation against other methods, ablation study and reproducibility) have been clarified.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept
What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

4

Adversarial Domain Feature Adaptation for Bronchoscopic Depth Estimation