
Authors

Megha Kalia, Tajwar Abrar Aleef, Nassir Navab, Peter Black, Septimiu E. Salcudean

Abstract

Surgical instrument segmentation for robot-assisted surgery is needed for accurate instrument tracking and augmented reality overlays. Therefore, the topic has been the subject of a number of recent papers in the CAI community. Deep learning-based methods have shown state-of-the-art performance for surgical instrument segmentation, but their results depend on labelled data. However, labelled surgical data is of limited availability and is a bottleneck in surgical translation of these methods. In this paper, we demonstrate the limited generalizability of these methods on different datasets, including robot-assisted surgeries on human subjects. We then propose a novel joint generation and segmentation strategy to learn a segmentation model with better generalization capability to domains that have no labelled data. The method leverages the availability of labelled data in a different domain. The generator does the domain translation from the labelled domain to the unlabelled domain and simultaneously, the segmentation model learns using the generated data while regularizing the generative model. We compared our method with state-of-the-art methods and showed its generalizability on publicly available datasets and on our own recorded video frames from robot-assisted prostatectomies. Our method shows consistently high mean Dice scores on both labelled and unlabelled domains when data is available only for one of the domains.

Link to paper

DOI: https://doi.org/10.1007/978-3-030-87202-1_39

SharedIt: https://rdcu.be/cyhQB

Link to the code repository

https://github.com/tajwarabraraleef/coSegGAN

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    In this paper the authors propose a surgical instrument segmentation framework that includes an image-to-image translation method to address the lack of labeled clinical data. This makes it possible to train a segmentation model with labeled data from a different domain. The generative network is based on cycleGAN and the segmentation model is based on U-Net. A shape constraint is included to prevent changes to the surgical instruments during translation. The results outperform state-of-the-art methods.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    When clinical data are not available, it is common to use translated labels, which are generated from other domains, to train a segmentation model. The novelty of the paper lies in the way the framework is trained: the generative and segmentation models are trained together, so that the quality of both improves simultaneously. A focal loss and a structural loss are incorporated to improve accuracy.
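For readers unfamiliar with the focal loss mentioned in this review, the following is a minimal per-pixel sketch of the standard binary formulation, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t). The function names and the defaults `gamma=2.0` and `alpha=0.25` are assumptions for illustration (they are the commonly cited defaults); the values used in the paper are not stated on this page.

```python
import math


def cross_entropy(p, y, eps=1e-7):
    """Binary cross-entropy for one pixel; p is the predicted foreground probability, y is 0 or 1."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -math.log(p if y == 1 else 1.0 - p)


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-7):
    """Binary focal loss for one pixel (gamma and alpha are assumed defaults, not the paper's settings)."""
    p = min(max(p, eps), 1.0 - eps)
    p_t = p if y == 1 else 1.0 - p              # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    # (1 - p_t) ** gamma shrinks the loss of well-classified pixels
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# A confidently correct pixel contributes far less than a badly wrong one.
easy = binary_focal_loss(0.9, 1)
hard = binary_focal_loss(0.1, 1)
print(f"easy pixel: {easy:.6f}, hard pixel: {hard:.6f}")
```

With `gamma=0` the modulating factor disappears and the expression reduces to a class-weighted cross-entropy, which is one way to sanity-check an implementation.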

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The authors added an explicit latent space loss to preserve structural properties of the final scene; namely, the structural loss prevents changes to the surgical instruments. However, its impact is marginal. Only the Dice index is reported.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Implementation details are given. It seems code and data will be available after the acceptance of the paper.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

    The paper is generally well written and the proposal is clearly presented. However, only the Dice index is reported; Intersection-over-Union (IoU) and area under the curve (AUC) are usually included as well.
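As a point of reference for the metrics named in this comment, here is a small sketch of how the Dice index and IoU can be computed for a pair of binary masks. The function name `dice_and_iou`, the flat-list mask representation, and the toy masks are assumptions for illustration; this is not the evaluation code of the paper.

```python
def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two flat sequences of 0/1 labels of equal length."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    pred_sum = sum(1 for p in pred if p)
    target_sum = sum(1 for t in target if t)
    union = pred_sum + target_sum - intersection
    if union == 0:
        # Both masks are empty; treating this as a perfect match is a convention, not a rule.
        return 1.0, 1.0
    dice = 2.0 * intersection / (pred_sum + target_sum)
    iou = intersection / union
    return dice, iou


# Toy example: 3 predicted pixels, 3 ground-truth pixels, 2 in common.
pred = [1, 1, 1, 0, 0, 0]
target = [1, 1, 0, 1, 0, 0]
dice, iou = dice_and_iou(pred, target)
print(f"Dice = {dice:.4f}, IoU = {iou:.4f}")
```

For a single mask pair the two metrics are monotonically related by IoU = Dice / (2 - Dice), so they rank individual predictions identically; they can differ once scores are averaged over many frames.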

  • Please state your overall opinion of the paper

    strong accept (9)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper addresses segmentation of surgical instruments with no labeled data. The authors propose to translate labels from a different domain while avoiding changes to the surgical instruments. The results outperform state-of-the-art methods.

  • What is the ranking of this paper in your review stack?

    1

  • Number of papers in your stack

    2

  • Reviewer confidence

    Confident but not absolutely certain



Review #2

  • Please describe the contribution of the paper

    This paper combines cycleGAN, structural loss, and segmentation module to solve the problem of generalizing surgical instrument segmentation to unlabelled data. Experimental results demonstrate that the proposed method achieves good results on several datasets.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • Well-motivated problem and real datasets
    • Good results
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • Limited novelty. As far as I understand, this paper is merely a combination of existing techniques: cycleGAN, structural loss, and segmentation module.
    • It is very nice to show a failure case. Yet it is not clear why it fails: the failure case in the right column of Fig. 3 is very similar to the case in the second column of Fig. 3.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The paper is rather clear. The authors also intend to release their code. Therefore, I believe one can reproduce the results with some minor efforts.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

    Please address the novelty problem and why the failure case does not work.

  • Please state your overall opinion of the paper

    borderline reject (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    limited novelty of the paper.

  • What is the ranking of this paper in your review stack?

    2

  • Number of papers in your stack

    2

  • Reviewer confidence

    Very confident



Review #3

  • Please describe the contribution of the paper

    The authors present a method for segmentation of Surgical Instruments on Unlabelled Data.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The authors present a novel method that outperforms state-of-the-art algorithms for the segmentation of unlabeled data.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    It would be interesting to understand why the UCL data was associated with lower generalizability. No major weakness.

  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Clear and easy to reproduce.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

    It will be interesting to see an in-depth investigation regarding the results. Please report the SD in addition to the mean.
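The request to report the standard deviation alongside the mean can be illustrated with a short sketch over per-frame Dice scores. The helper name `summarize_dice` and the three scores are made up for illustration and are not results from the paper; the sample standard deviation (n - 1 denominator) is assumed here.

```python
import statistics


def summarize_dice(scores):
    """Return (mean, sample standard deviation) of a list of per-frame Dice scores."""
    mean = statistics.mean(scores)
    # stdev uses the n - 1 denominator and needs at least two values
    sd = statistics.stdev(scores) if len(scores) > 1 else 0.0
    return mean, sd


# Hypothetical per-frame Dice scores, for illustration only.
frame_dice = [0.90, 0.80, 0.70]
mean, sd = summarize_dice(frame_dice)
print(f"Dice: {mean:.3f} +/- {sd:.3f}")
```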

  • Please state your overall opinion of the paper

    accept (8)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This is a very impressive result considering no labelled data. The authors incorporate a generative model to translate the train-set image domain to another one and then used the train-image domain to segment the images. This is a nice and creative idea that facilitates the segmentation of unlabeled endoscopic surgeries. It is the best work in my set of papers.

  • What is the ranking of this paper in your review stack?

    1

  • Number of papers in your stack

    5

  • Reviewer confidence

    Confident but not absolutely certain




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The paper proposes a surgical instrument segmentation method that uses cycleGAN, focal loss and structural loss to address the problem of unlabelled clinical data. The paper is well written, and the experimentation is thorough and convincing. The reviewers’ comments must be addressed in the camera-ready version, including highlighting the novelty, discussing the failure case and adding the IoU metric.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    2




Author Feedback

*The paper has limited novelty. As far as I understand, this paper is merely a combination of existing techniques: cycleGAN, structural loss, and segmentation module.

We disagree with the reviewer. The novelty and main contribution of the paper is a framework in which these individual components work jointly to segment surgical instruments in videos from real surgeries without labels. Indeed, our framework consists of a combination of generative and segmentation modules, but the particular combination is not obvious. Moreover, we have shown in the paper that neither of these modules produces good results when used individually. In fact, we have shown that when cycleGAN is used alone for data augmentation to go from the non-surgical domain (with labeled images) to the surgical domain (no labels), the model does not generalize well to the surgical domain (Table 1). This is the case for all state-of-the-art (SOTA) methods trained on the data using cycleGAN augmentation alone (Table 1). Here, we would like to highlight another major challenge in unsupervised image-to-image (I2I) mapping techniques (such as cycleGAN), i.e., the change of shape of objects during domain translation. This problem is shown in Fig. 1. This shortcoming limits the use of any such I2I mapping method, especially in medical and surgical applications. Our framework mitigates these two major challenges by jointly training the generative and segmentation modules in an alternating fashion (details can be seen in Fig. 2). The segmentation module provides feedback to the generator, thus preventing changes in the shapes of the surgical instruments during translation. Simultaneously, the segmentation module learns from the data produced by the generator, thus seeing much more varied data. To provide an additional shape constraint on the latent space, we introduce a structural loss.
We show that our model consistently achieves significantly better results on both the non-surgical domain (with labels) and the surgical domain (with no labels), with a delta Dice (lower is better) of 0.9%, as opposed to 33%, 10% and 5% for U-Net, RASNet and TernausNet, respectively. This validates our method’s generalizability when compared to the SOTA.
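The rebuttal does not define delta Dice on this page. The sketch below assumes it is the absolute gap, in percentage points, between the mean Dice on the labelled domain and the mean Dice on the unlabelled domain, so that a smaller value indicates more even performance across domains. The function name and the input values are hypothetical and are not taken from the paper.

```python
def delta_dice(labelled_mean_dice, unlabelled_mean_dice):
    """Absolute gap between two mean Dice scores, in percentage points (assumed definition)."""
    return abs(labelled_mean_dice - unlabelled_mean_dice) * 100.0


# Hypothetical mean Dice scores on the two domains, for illustration only.
gap = delta_dice(0.90, 0.85)
print(f"delta Dice = {gap:.1f}%")
```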

*It is very nice to show a failure case. Yet, why it fails, the failure case on the right column of Fig. 3 is very similar with the case of the second column in Fig. 3. Why it fails is not clear.

Yes, although the images in columns 2 and 4 look similar, they have a crucial difference. In column 4, in the region of failure, the blood on the surgical instrument blends in with the background. In addition, this portion of the tool is very close to the edge of the endoscopic camera image, which is not well lit and creates a vignetting effect. This could be one of the reasons for the poor segmentation in this region. Please note, though, that our model generalizes better than the others, producing consistently good segmentation and fewer false positives across all four surgeries. It could be that the model strikes a trade-off in which it generalizes better at the expense of slightly higher segmentation error in some cases.

*The authors added an explicit latent space loss to preserve structural properties of the final scene. Namely, the structural loss function avoids changes on the surgical instruments. However, its impact is marginal.

Delta Dice (lower is better) decreases with the structural loss from 1.8% to 0.9% and from 19.0% to 16.8% in cases where the target domain is human surgeries. We argue that although this absolute improvement seems small quantitatively, it is essential. The numbers alone might not give a true picture here. Preserving small structural details adds only marginally to the quantitative results, but it is important for retaining the correct structure of the surgical instrument. An example of such a detail can be seen in Fig. 3, column 1, row 1.


