
Authors

Mitchell Doughty, Karan Singh, Nilesh R. Ghugre

Abstract

We present SurgeonAssist-Net: a lightweight framework making action-and-workflow-driven virtual assistance, for a set of predefined surgical tasks, accessible to commercially available optical see-through head-mounted displays (OST-HMDs). On a widely used benchmark dataset for laparoscopic surgical workflow, our implementation competes with state-of-the-art approaches in prediction accuracy for automated task recognition, and yet requires 7.4× fewer parameters, 10.2× fewer floating point operations per second (FLOPS), is 7.0× faster for inference on a CPU, and is capable of near real-time performance on the Microsoft HoloLens 2 OST-HMD. To achieve this, we make use of an efficient convolutional neural network (CNN) backbone to extract discriminative features from image data, and a low-parameter recurrent neural network (RNN) architecture to learn long-term temporal dependencies. To demonstrate the feasibility of our approach for inference on the HoloLens 2 we created a sample dataset that included video of several surgical tasks recorded from a user-centric point-of-view. After training, we deployed our model and cataloged its performance in an online simulated surgical scenario for the prediction of the current surgical task. The utility of our approach is explored in the discussion of several relevant clinical use-cases. Our code is publicly available at https://github.com/doughtmw/surgeon-assist-net.

Link to paper

DOI: https://doi.org/10.1007/978-3-030-87202-1_64

SharedIt: https://rdcu.be/cyhRj

Link to the code repository

https://github.com/doughtmw/surgeon-assist-net

Link to the dataset(s)

http://camma.u-strasbg.fr/datasets

Reviews

Review #1

Please describe the contribution of the paper

The paper presents a lightweight deep learning model to be deployed on HoloLens 2 for online recognition and display of surgical phases.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

This is a nice, complete piece of engineering work, from model creation to software implementation to phantom evaluation. It cites many references and reports the technical aspects in detail.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

The clinical significance is lacking. Why would the surgeon want to know which phase he/she is performing and what the next phase is? This is basic information and I can’t see any added value here for the surgeon. The technology is cool and shows its capability to do deep learning inference on the HoloLens. But it seems it will not provide useful information to the surgeon.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

Reproducibility is good. The paper contains technical details of how they train the deep learning model and they will release the relevant code. One part of the validation is based on a public dataset so that part can be reproduced.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

The paper is well written so I don’t have many comments. The ones I have are here:

1) One of the benefits of the proposed method is the decreased inference time. But for the paper’s application, which is to display the phases, it doesn’t need short inference time, because usually a phase will last long enough that it doesn’t matter if the inference time is short or not.

2) From Table 1 and Fig. 4, it seems that the user-centric tasks designed are relatively simple, have less complicated phases, and lack a realistic surgical background.

3) For Tables 2 and 3, it’s better to use BOLD font for the best performance in each category rather than for the proposed approach.
Please state your overall opinion of the paper

borderline reject (5)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

This recommendation is made based on the balance between the major strengths and weaknesses as listed above.
What is the ranking of this paper in your review stack?

3
Number of papers in your stack

5
Reviewer confidence

Very confident

Review #2

Please describe the contribution of the paper

This work presents a machine learning framework for enhancing surgical guidance via AR. Their system utilizes a CNN and an RNN to discern and predict the current surgical task performed by the user. Similar work was accomplished before, but the authors’ framework improves on what exists by requiring fewer model parameters and less execution time.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- Novel framework that is competitive compared to state-of-the-art.
- Reduction of overhead for prediction (# of parameters and FLOPS) makes framework more readily available for clinical use.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- Evaluation is evident but only for the two datasets used (Cholec80 and user-centric dataset). More work is needed to generalize the utility of the framework.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The authors present a framework that can be implemented to reproduce similar results for other procedures.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
- There is a lack of description regarding what is done on the side of the AR headset. How is the SDK specific to the HoloLens 2 utilized to accomplish your results in Figure 4?
- Tables 2 and 3: bold the value in the column that triumphs, not that of your method only. This will better indicate the comparison.
Please state your overall opinion of the paper

Probably accept (7)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
- Scientific method appears to be sound. Authors anchored their work by that of others and continuously compared their work to the state-of-the-art.
- The developed framework has a direct impact on task prediction approaches in surgical procedures using optical see-through AR headsets.
What is the ranking of this paper in your review stack?

3
Number of papers in your stack

4
Reviewer confidence

Somewhat confident

Review #3

Please describe the contribution of the paper

The authors have developed a framework that can be used with an HMD to detect the phases of an operation and provide the wearer with the appropriate matching information. They have evaluated the framework and tested it against other existing models. In an initial feasibility study, they have also tried phase detection on a simulated surgical task using the HoloLens 2.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The topic is relevant. Support during an operation without overloading the user with unnecessary information can contribute to patient safety, especially in training situations. The paper is well written and understandable, even for someone with only some modelling background.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

As I understand it, only one example was used in the feasibility study. It would have been possible to create one or two more example data sets with relatively little additional effort and thus achieve a higher degree of validity. As I understand it, the surgical information provided is currently limited to the phase the user is currently in, and the phases to follow. This is very far away from real surgical guidance. But I acknowledge the very important first steps towards context awareness the authors made here.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

I am no modelling expert, but it seems to me that the methodology is described in enough detail to allow reproduction of the study. I am not sure if the description of the self-made program for the HoloLens is sufficient.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
- I am missing some conclusions in the abstract. For instance: What more would be needed to implement this in a clinical setting? Where might be room for improvement?
- Please check if the abbreviations are introduced correctly. At least for OST-HMD and RNN this is not the case in the main part of the manuscript.
- Since the current implementation is far removed from useful surgical guidance, the authors could share some ideas on where they want to take this framework in the conclusions.
Please state your overall opinion of the paper

accept (8)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

This is very interesting work and, from a clinical point of view, also relevant. However, it is still a long way from being clinically applicable.
What is the ranking of this paper in your review stack?

3
Number of papers in your stack

5
Reviewer confidence

Confident but not absolutely certain

Review #4

Please describe the contribution of the paper

This paper works towards context-awareness in endoscopic surgery. The specific project is to overlay the predicted workflow onto the video through a HoloLens device. The network is designed to be lightweight, in order to achieve real-time prediction. The concept is good; however, the work is not solid yet.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- This paper aims at providing AI-enabled context-awareness during surgery, with the use of AR for display. The idea and concept are novel and important. The introduction also describes the motivation, value, and related work well.
- The lightweight design is reasonable and is a lot faster than existing networks.
- The display is implemented on the HoloLens, and not many existing works have managed to do so.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- However, the work is not solid yet; the question of clinical relevance is not answered.
- Implementation details of HoloLens app should be further elaborated.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

More details on the HoloLens implementation are needed.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
- How would the display of workflow prediction onto the screen be helpful for surgeons?
- Why also display the prediction probability and inference time? Would these be useful for surgeons during the procedure? These questions of clinical relevance are important and need to be clarified.
- Any user study with clinicians with the developed system?
- Evaluation numbers in the Table should report standard deviation
- The performance drop of the proposed lightweight network in comparison with existing “heavy” networks is still large. This trade-off needs discussion.
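
As an illustration of the request to report standard deviations, the following is a minimal sketch of how a table cell could be formatted as mean ± sample standard deviation over per-video scores. The function names and the accuracy values are made up for illustration; they are not results or code from the paper.

```python
from statistics import mean, stdev


def summarize(scores):
    """Return (mean, sample standard deviation) of per-video scores."""
    return mean(scores), stdev(scores)


def format_cell(scores, digits=1):
    """Format scores as 'mean ± std' for a results table."""
    m, s = summarize(scores)
    return f"{m:.{digits}f} ± {s:.{digits}f}"


# Made-up per-video accuracies in percent; NOT results from the paper.
per_video_accuracy = [80.0, 85.0, 90.0]
cell = format_cell(per_video_accuracy)
print(cell)
```

Computing the statistic per video (rather than per frame) is an assumption here; the appropriate unit depends on how the evaluation splits are defined.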
Please state your overall opinion of the paper

borderline reject (5)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

I appreciate the efforts made in this paper, towards enabling context-awareness in surgery relying on AR using HoloLens. However, the implementation of this paper is not solid enough. Therefore, I would like to rate at borderline reject.
What is the ranking of this paper in your review stack?

3
Number of papers in your stack

5
Reviewer confidence

Very confident

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

This MICCAI submission, titled “SurgeonAssist-Net: Towards context-aware head-mounted display-based augmented reality for surgical guidance”, was reviewed by 4 reviewers with varying confidence levels/seniority. It should be noted that the reviewers with more experience/confidence in their reviews recommended borderline reject. Among the papers in their stacks, all reviewers consistently ranked this paper in the middle to lower half of their assignments.

Fundamentally, this is a paper about a surgical context-aware system and NOT about an augmented reality system for surgery. The scientific basis for it is a CNN with the capability to predict the surgical phase in “real-time” (~10fps) within the limited hardware capability of the Microsoft HoloLens 2. As the clinical significance is not demonstrated, this AC recommends inviting the authors to submit a rebuttal in response to the reviewers’ comments.
What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

7

Author Feedback

We are grateful to the reviewers for their time and detailed critique. The main concern raised was study motivation: demonstrating the clinical significance of our work in the broader perspective of surgical guidance.

Head-mounted display (HMD)-based augmented reality (AR), when effectively implemented, has been shown to enhance the performance of targeted surgical tasks across many medical domains by the improved visualization of intraprocedural target paths and structures. In a typical surgical guidance scenario, virtual models are created from preoperative medical imaging data and are aligned with intraoperative target structures to assist in guiding a user to target sites. Aside from alignment accuracy, a primary limitation of AR-guided approaches (the “effectively implemented” caveat), is the reliance upon a user to manually control the appearance and presentation of virtually augmented entities, thereby adapting the visualization to their current surgical context. This is tedious and detracts from their focus on the surgical task. Our work on surgical task prediction is thus critical and foundational in ensuring that the automated augmentation of virtual models meets the current information needs. In the revised paper, we will clarify that surgical task prediction serves as a prerequisite to displaying the optimal virtual information to the user. We will also discuss several concrete clinical scenarios involving surgical guidance, user training, and performance evaluation, where the SurgeonAssist-Net framework can be readily incorporated:

Model-based Guidance: The predicted task context can control the choice and presentation of the augmented virtual models. For example, in general surgery, as a surgeon picks up a scalpel and the incision phase of a procedure is detected, the HMD would display a relevant virtual model, aligned with the patient, overlaying the target site for surgical entry. Throughout the procedure, our automatic surgical phase detection would enable different virtual models and information relevant to the surgical phase to be optimally selected and presented, without user intervention.

Training: Task prediction can be used to guide a surgery fellow or medical student, wearing the HMD, in practicing a general surgery task on a cadaver. Their active learning can be reinforced by presenting them with a task history of phases performed, or of upcoming surgical actions, given their present surgical step. Additional relevant information, in the form of visual cues, text, or audio, could be presented in tandem with the detected surgical task to enhance the training experience.

Evaluation: Task analytics can provide surgeons with quantitative data on a surgical procedure. For example, a surgeon performing a less frequent procedure could wear the HMD while re-training and be provided with chronology and analytics of the time spent in each surgical phase. This information, when compared to peers, could serve to suggest focus areas for improvement.

Figure 4 may have introduced a narrative contrary to the overall theme of this work. Figure 4 shows a virtual user-interface on the HoloLens 2 that includes the current predicted surgical phase, prediction probability, and inference time, alongside a flow-diagram of the current surgical task layout. The intention of this figure was not to demonstrate the optimal augmentation layout for a typical clinical use case; but instead, to show the quantitative performance of our approach. If accepted, we will modify this figure to clearly demonstrate the clinical intent of this work and include an image displaying several different surgical phase predictions and the optimally selected virtual models for guidance.

We would also like to address the comment requesting additional software details. Though we are unable to further elaborate due to space constraints, we will be publicly releasing the software/code used for the training and deployment of our framework on GitHub.
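
To make the "Evaluation" scenario concrete, the following is a minimal sketch of how a chronology and per-phase time totals might be derived from a stream of per-frame phase predictions. It is an illustration only and not the authors' implementation: the function names, the phase labels, and the 10 frames-per-second rate are assumptions chosen for the example.

```python
from itertools import groupby


def phase_segments(predictions, fps):
    """Collapse per-frame labels into (phase, start_seconds, duration_seconds) runs."""
    segments = []
    frame = 0
    for phase, run in groupby(predictions):
        n = len(list(run))  # number of consecutive frames with this label
        segments.append((phase, frame / fps, n / fps))
        frame += n
    return segments


def time_per_phase(predictions, fps):
    """Sum the time spent in each phase across all of its runs."""
    totals = {}
    for phase, _start, duration in phase_segments(predictions, fps):
        totals[phase] = totals.get(phase, 0.0) + duration
    return totals


# Hypothetical predictions: labels and frame rate are illustrative assumptions.
frames = ["incision"] * 30 + ["suturing"] * 50 + ["incision"] * 20
segments = phase_segments(frames, fps=10)
totals = time_per_phase(frames, fps=10)
print(segments)
print(totals)
```

In practice, raw per-frame predictions can flicker between labels, so some form of temporal smoothing would likely be needed before computing such statistics; that step is omitted here.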

Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

The authors have indeed clarified the issue of clinical significance. It should be noted that the authors have agreed to release their work as open-source software on GitHub, thus the significance and potential impact to the CAI community is also increased accordingly.

The remaining issue is the amount of revision that would be required to bring this manuscript up to par with the MICCAI standard. The authors provided a comprehensive discussion in the rebuttal on how to address these issues.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept
What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

10

Meta-review #2

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

The paper presents a novel lightweight framework for online task recognition used for context-aware virtual assistance on a head-mounted display (HoloLens 2), demonstrating near real-time performance. The paper is well-motivated, demonstrating a good end-to-end engineering effort with a focus on application and execution time, and is validated against the state of the art with the Cholec80 dataset and a new user-centric phantom dataset. The main concerns of the reviewers regarding clinical significance and relevance of the work have been addressed by the rebuttal, and the authors promise release of code upon acceptance to address questions regarding software implementation details on the HoloLens side.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept
What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

9

Meta-review #3

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

The paper proposes a context-aware solution for surgical assistance through an HMD. This area has started attracting attention in recent years, especially because of the availability of AR devices (e.g., HoloLens). The authors have satisfactorily addressed major concerns raised by the reviewers through the rebuttal. I recommend this paper for presentation at MICCAI. The authors should include the supporting justifications provided in the rebuttal in the camera-ready version.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept
What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

7


SurgeonAssist-Net: Towards Context-Aware Head-Mounted Display-Based Augmented Reality for Surgical Guidance