
Authors

Andrés Marafioti, Michel Hayoz, Mathias Gallardo, Pablo Márquez Neila, Sebastian Wolf, Martin Zinkernagel, Raphael Sznitman

Abstract

Cataract surgery is a sight-saving surgery that is performed over 10 million times each year around the world. With such a large demand, the ability to organize surgical wards and operating rooms efficiently is critical to deliver this therapy in routine clinical care. In this context, estimating the remaining surgical duration (RSD) during procedures is one way to help streamline patient throughput and workflows. To this end, we propose CataNet, a method for cataract surgeries that predicts in real time the RSD jointly with two influential elements: the surgeon’s experience, and the current phase of the surgery. We compare CataNet to state-of-the-art RSD estimation methods, showing that it outperforms them even when phase and experience are not considered. We investigate this improvement and show that a significant contributor is the way we integrate the elapsed time into CataNet’s feature extractor.

Link to paper

DOI: https://doi.org/10.1007/978-3-030-87202-1_41

SharedIt: https://rdcu.be/cyhQD

Link to the code repository

https://github.com/aimi-lab/catanet

Link to the dataset(s)

http://ftp.itec.aau.at/datasets/ovid/cat-101/


Reviews

Review #1

  • Please describe the contribution of the paper

    A method for remaining surgery duration (RSD) estimation is presented and evaluated on a public dataset. The method uses multi-task training of surgery phase, expertise, and RSD. The evaluation results show promising performance.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    1) RSD estimation is a relevant problem. 2) The multi-task learning approach is new for RSD estimation in cataract surgery. 3) The paper is well presented.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    1) In the evaluation, the proposed approach is compared to methods designed for cholecystectomy and laparoscopy. 2) For RSD estimation it is crucial that a method can be applied in real-time but there is no evaluation of the run-time performance.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
    • The evaluation is performed on a public dataset.
    • Configurations and settings of the training process are clear.
    • Code and instructive examples will be made available upon acceptance. Hence, the work is considered reproducible.
  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

    This work is interesting and the presentation is good. Also, the architecture of the deep learning framework is properly designed. However, there are still a few concerns I have with this work:

    1) In the evaluation, the method is compared to “competitors” that were not designed for RSD estimation in cataract surgery.
    2) The evaluation results are separated for senior and assistant surgeons. However, there are only two senior and two assistant surgeons in the entire dataset.
    3) There is no information regarding the run-time performance of the presented work, which is important since RSD estimation should be applied in real time. It can only be assumed that, due to the use of an LSTM, the method is rather slow.
    4) In cataract surgery, the Incision phase is very short (a few seconds only), and there are also other short phases (e.g., Tonifying and Antibiotics). How was stratified sampling with 8000 frames per phase performed for these phases in order to tackle class imbalance?

  • Please state your overall opinion of the paper

    Probably accept (7)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper is nicely presented and the proposed method is new. Although I have several concerns, this is the first work I know for RSD estimation in cataract surgery and it is reproducible.

  • What is the ranking of this paper in your review stack?

    1

  • Number of papers in your stack

    3

  • Reviewer confidence

    Very confident



Review #2

  • Please describe the contribution of the paper

    The paper presents a framework for predicting RSD in cataract surgery. They use information such as the observed surgical phases and the surgeon’s experience within the model to make predictions for RSD. Results on the publicly available dataset show promise, as the presented method outperforms the previous state of the art in most cases.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The problem of surgical phase recognition and RSD prediction is very clinically relevant. With the scale of cataract surgery across the globe, automated methods for predicting RSD can help tremendously in OR management/scheduling
    • Proposed model outperforms previous state of the art
    • Thorough evaluations are presented around model performance
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • My main concern with the paper is the lack of technical novelty - the proposed model is not a major step forward in terms of model architecture/training. Previously proposed methods have all used a CNN-RNN based architecture.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Authors mention that the code will be made available upon acceptance, so I do believe that the results from the paper should be reproducible based on some instructive examples.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

    The paper overall is well written and tackles a very clinically relevant problem. OR scheduling is a very important issue and automated ways to help OR teams in scheduling can be very beneficial. While the proposed method does outperform the compared previous state of the art, I felt the paper lacked a bit of technical novelty. The model proposed is based on a standard CNN-RNN approach which the previous methods have also used. However, the clinical relevance and thorough experimentation outweigh the lack in technical novelty for me.

  • Please state your overall opinion of the paper

    borderline accept (6)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper tackles a very important problem and proposes a model that works fairly well. This type of paper will attract clinical and technical audiences in the community.

  • What is the ranking of this paper in your review stack?

    3

  • Number of papers in your stack

    5

  • Reviewer confidence

    Very confident



Review #3

  • Please describe the contribution of the paper

    The manuscript presents a deep-learning pipeline for predicting remaining cataract surgery duration, surgeon’s experience and surgical phase from video frames.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The manuscript addresses a relevant problem, is well written and easy to follow
    • The introduction summarizes the current state of the art and open challenges well
    • Figures are clear and explicative
    • References are appropriate
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • The innovation against what is currently done in the literature seems rather limited
    • I cannot see why the surgeon’s experience has to be predicted for each frame
    • The choice of the competitors is not explained exhaustively
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Methodological details are sometimes missing (e.g., how is the elapsed time embedded?)

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
    • How was the minimum video length chosen?
    • Why wasn’t a leave-one-surgeon-out evaluation implemented?
    • How were the backbone models chosen?
    • Why were TimeLSTM and RSDNet chosen?
    • What are the remaining open challenges that still have to be addressed?
  • Please state your overall opinion of the paper

    probably reject (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    No major innovation is brought by the authors. The experimental protocol could have been stronger.

  • What is the ranking of this paper in your review stack?

    3

  • Number of papers in your stack

    5

  • Reviewer confidence

    Very confident




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The paper proposed a CNN-LSTM based method for the prediction of remaining surgery duration in cataract surgery. The problem is clinically relevant. The paper is well-written but lacks technical novelty. Important methodological details and justification of several experimental choices are missing. These should be addressed in the rebuttal.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    9




Author Feedback

We thank the reviewers for their insightful comments and overall positive reviews. In particular, we are excited to see that all reviewers see our work as having an important impact for the cataract surgery community.

Technical novelty It was noted that our work does not introduce a strong technical novelty. While we agree that our proposed method relies on a CNN-RNN based architecture that has already been used for RSD, our work identifies different contributions within this backbone that make RSD estimation effective for cataract surgery. Specifically, we introduce a concatenation of images with the elapsed time, which strongly improves performance over existing and recent state-of-the-art methods that typically merge elapsed time with CNN features. That is, we show that the way information is integrated in the CNN-RNN backbone is critical.

Overall, our work provides a very detailed study of how to effectively do RSD estimation for cataract surgery. We combine many aspects related to the topic, spanning phase recognition and surgical skill, and show how to combine these different elements to achieve state-of-the-art performance. For example, even though the experience of the surgeon greatly changes the duration of the surgery, we found that our model could perform just as well without detecting this skill level. As far as we know, this is the first work to introduce an RSD method for cataract surgery.

Treatment of experience level R3 wonders why we predict the experience level on every frame. This has the great advantage of being a proxy for skill assessment throughout the surgery and not just a single aggregated number per surgery. We already elaborated on this in the text and in the conclusions, but will clarify further in our revision.

R1 notes that we separate the evaluation results for senior and assistant surgeons even though the dataset used only has 4 surgeons. We report results per experience level (and aggregated) because the error measure indirectly depends on the duration of the surgery, which depends on the experience of the surgeon. We will add this to the text, stating that senior and assistant surgeons have a mean surgery duration of 5.6 mins and 11.8 mins, respectively.

Real-time R1 asks if our approach is real-time. Inference of our method with a batch size of 1 on a small laptop GPU (GeForce MX250) runs at 29.09 fps. Given that we sample the videos at 2.5 fps, our method can easily be applied at over 10 times real-time speed (29.09 / 2.5 ≈ 11.6). We will add this result to the revision.

Remaining challenges [R3] The biggest is studying the generalisation of the method to data from new clinics. We will add this to the conclusions.

Methodological details [R3] How is the elapsed time embedded? This is already explained in Sec. 2.1 (2nd paragraph), but we will further clarify, as we concatenate the elapsed time with the input images. This is one of the key differences of our method compared to others.
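The concatenation the rebuttal describes can be illustrated with a minimal numpy sketch. This is an assumption-laden illustration, not the paper's implementation: the function name, image size, and the normalisation constant `max_duration_s` are all hypothetical, chosen only to show the idea of broadcasting the scalar elapsed time to an extra image plane before the CNN.

```python
import numpy as np

def append_elapsed_time_channel(frame: np.ndarray, elapsed_s: float,
                                max_duration_s: float = 20 * 60) -> np.ndarray:
    """Concatenate a normalised elapsed-time plane to an RGB frame.

    `max_duration_s` is a hypothetical normalisation constant, not a value
    taken from the paper.
    """
    h, w, _ = frame.shape
    # Broadcast the scalar elapsed time to a constant (h, w, 1) plane.
    t_plane = np.full((h, w, 1), elapsed_s / max_duration_s, dtype=frame.dtype)
    # The CNN then receives a 4-channel input: RGB + elapsed time.
    return np.concatenate([frame, t_plane], axis=-1)

frame = np.zeros((224, 224, 3), dtype=np.float32)
x = append_elapsed_time_channel(frame, elapsed_s=120.0)
print(x.shape)  # (224, 224, 4)
```

The design point the rebuttal makes is that the time signal enters at the input, so every convolutional layer can condition on it, rather than being fused with CNN features after the fact.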

[R3] How is the minimum video length chosen? We did not choose a minimum video length, but used every complete video from the cataract101 dataset.

[R3] How were the baselines selected? These were selected as the closest and most recent existing methods that perform RSD. They just happen to be for cholecystectomy and laparoscopy surgery. Without these, there would be no baseline, as none exist for cataract surgery.

[R1] Stratified sampling with 8000 frames per phase? To clarify, we do not sample 8000 frames per phase for each video sequence but rather over the whole dataset (which ranges from 39,000 to 286,000 frames depending on the phase). This avoids the problem R1 refers to.

[R3] leave-one-out experience experiment. Since the used dataset only contains 4 physicians, we do not think that experience prediction can be properly evaluated. Therefore, we do not report any metrics for experience prediction.

We will clarify all of these in the revised manuscript.




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    Major concerns have been addressed in the rebuttal. Though technical novelty is limited, the application is clinically relevant and remains underexplored to date. The camera-ready must incorporate all provided justifications.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    9



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The paper proposes a framework for RSD prediction in cataract surgery by also predicting surgical phase and surgeon experience. Despite limited technical novelty in methodology, the topic of RSD prediction is of interest to the CAI community and the new application towards cataract surgery is clinically relevant. The paper is well-written and the model is thoroughly validated. Concerns from reviewers regarding comparison baseline selection, run-time performance, and experimental details are addressed in rebuttal.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    5



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The novelty and contribution of this work was still unclear after the rebuttal.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Reject

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    15


