Authors

Matthias Seibold, Armando Hoch, Daniel Suter, Mazda Farshad, Patrick O. Zingg, Nassir Navab, Philipp Fürnstahl

Abstract

In this work, we propose a method utilizing tool-integrated vibroacoustic measurements and a spatio-temporal learning-based framework for the detection of the insertion endpoint during femoral stem implantation in cementless Total Hip Arthroplasty (THA). In current practice, the optimal insertion endpoint is intraoperatively identified based on surgical experience and dependent on a subjective decision. Leveraging spectrogram features and time-variant sequences of acoustic hammer blow events, our proposed solution can give real-time feedback to the surgeon during the insertion procedure and prevent adverse events in clinical practice. To validate our method on real data, we built a realistic experimental human cadaveric setup and acquired acoustic signals of hammer blows during broaching the femoral stem cavity with a novel inserter tool which was enhanced by contact microphones. The optimal insertion endpoint was determined by a standardized preoperative plan following clinical guidelines and executed by a board-certified surgeon. We train and evaluate a Long-Term Recurrent Convolutional Neural Network (LRCN) on sequences of spectrograms to detect a reached target press fit corresponding to a seated implant. The proposed method achieves an overall per-class recall of 93.82±5.11% for detecting an ongoing insertion and 70.88±11.83% for identifying a reached target press fit for five independent test specimens. The obtained results open the path for the development of automated systems for intra-operative decision support, error prevention and robotic applications in hip surgery.

Link to paper

DOI: https://doi.org/10.1007/978-3-030-87202-1_43

SharedIt: https://rdcu.be/cyhQF

Link to the code repository

https://caspa.visualstudio.com/CARD%20public/_git/AudioFemoralStem

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper presents a system for acoustic analysis of hammer strikes during femur implant insertion to detect the optimal insertion endpoint based on spectral analysis of piezo-microphone recordings. The proposed approach is to perform traditional log mel-spectral analysis and then use a ResNet-50-based neural network to classify whether the optimal insertion depth has been reached.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The authors report more accurate results than one previously proposed method in a 5-specimen leave-one-out study. These results are promising.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The results are promising but very preliminary and on a very limited dataset.

    The method seems somewhat incomplete. Taken as a whole, sensitivity and specificity measured across all timepoints is reasonably accurate, however, given the level of noise (false positive “optimal” classifications that occur much earlier in time than the ground truth for each case) it is unclear how useful or actionable this information will be in a clinical setting.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    It appears reproducible

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

    A more complete description of how this method would be integrated into the clinical workflow would improve the work, as would novel technical developments that improve its performance.

  • Please state your overall opinion of the paper

    probably reject (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The limited study for this application combined with the moderate technical novelty make this paper less competitive compared to other MICCAI submissions.

  • What is the ranking of this paper in your review stack?

    4

  • Number of papers in your stack

    5

  • Reviewer confidence

    Very confident



Review #2

  • Please describe the contribution of the paper

    This paper demonstrates a technique to optimize the insertion depth of femoral stem implants. These implants must be hammered into the femoral bone, and a potential complication arises if the surgeon hammers too deep: the femur can fracture.

    The proposed technique uses a microphone attached to the insertion tool to measure the acoustic vibrations of each hammer blow. A classifier is trained to categorize each hammer blow as either insertion (“keep hammering”) or press fit (“stop hammering”). The results show a positive result of 94% classification of insertion and 71% classification of press-fit, which lays the groundwork for continued development in this direction.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The main strengths of this paper include:

    1. A creative way to measure hammering blows, which effectively captures the data using commodity sensors and does not suffer line of sight limitations.
    2. A potentially valuable ability to prevent clinical complications such as femur fracture.
    3. Clearly described development of the motivation, methodology, and results
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The main weaknesses of the paper include:

    1. The hammer-blow classes are treated as equal though they are fundamentally not – misclassifying an insertion blow as a press-fit blow is inconsequential even at 94% accuracy, while the reverse error (29% error rate) is potentially costly and damaging.
    2. The nature of the problem is asymmetric as there will be far more insertion blows than press-fit blows based on the application, so care must be taken so that the network is not overly biased to the majority class (insertion) and insensitive to the infrequent class (press-fit), particularly since the infrequent class is the more consequential class.
    3. The results are based on a small dataset of 5 cadavers so how the method scales to other instances is yet unknown.
  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Authors provide an excellent description on how one could recreate the experiments. The only details missing are the initial weights of the model (is it randomized or pretrained?) and the number of epochs that worked well.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

    Overall, authors have convinced me that theirs is a sound approach (so to speak) for classifying hammer blows. Even as the results are still preliminary with room for improvement, there is promise in continuing in this direction.

    Some strong advantages of the method include the use of relatively common acoustic sensors that can be integrated into existing devices without complicating the workspace. Furthermore, the measurements that can be obtained are relatively clean compared to acoustic sensors in air, and complete compared to optical sensors, which may have line-of-sight limitations. The network training method also appears to be robust against overfitting and the random variability that occurs in other data types, so performance may be more predictable. An alternative way to think about it is that the data has low dimensionality yet captures the necessary information, which is a huge bonus. On the more tempered side, the problem statement is relatively limited to a very specific class of surgeries, so one may not find so many methods to compete against.

    There are a few items that I can think of that would make this a stronger paper (whether for the present iteration or the next, possible overlap with above):

    1. I’m curious as to how much the described data augmentations helped performance. It would be interesting to know to what extent they helped, what could be excluded without consequence, and what other augmentations might be better.
    2. For reproducibility, how was the network initialized (is there a good set of weights for transfer learning?) and how long was the network trained for?
    3. Dataset balancing: by nature of the application there will be very many insertion blows to each press-fit blow, which would tend to bias the network to be overly sensitive to one class and insensitive to the other. The ratio of classes in the dataset should be reported. The unfortunate reality is that the insensitive class is the more critical one, though this is not uncommon in medicine (e.g., cancer detection in medical images).
    4. More diversity of data and a qualitative/intuitive explanation of how the hammer blow classes differ
    5. Recasting the application, and the corresponding network training setup, around the asymmetry between insertion and press-fit blows. A few examples of what I mean: 5a. “Almost there” alert so that even if the network is not quite certain of the blow type, it can provide a robust estimate about how many insertion blows remain until likely reaching the press-fit regime 5b. Focus on specificity rather than sensitivity, so the surgeon will tend to be more conservative about their hammering in order to prevent complications 5c. Change the way the results are reported to account for the asymmetry
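The asymmetry raised in items 3 and 5 can be made concrete with a small sketch. It assumes a classifier that outputs a press-fit probability per hammer blow, and it picks the highest decision threshold that still reaches a required press-fit recall. The function name `pick_threshold`, the scores, and the labels are hypothetical illustrations and are not taken from the paper or its repository.

```python
def pick_threshold(scores, labels, min_press_fit_recall):
    """Return the highest threshold whose press-fit recall meets the target.

    scores: hypothetical press-fit probabilities, one per hammer blow.
    labels: 1 for a press-fit blow, 0 for an insertion blow.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("at least one press-fit blow is required")
    # Try thresholds from strictest to most permissive.
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= thr)
        if tp / n_pos >= min_press_fit_recall:
            return thr
    return min(scores)


def insertion_recall(scores, labels, thr):
    """Fraction of insertion blows that stay below the threshold."""
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    return sum(1 for s in negatives if s < thr) / len(negatives)


# Toy data (made up for illustration only).
scores = [0.1, 0.2, 0.35, 0.4, 0.6, 0.55, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1, 1, 1]

thr = pick_threshold(scores, labels, min_press_fit_recall=0.8)
print(thr, insertion_recall(scores, labels, thr))
```

Raising the required press-fit recall lowers the threshold, which generally costs insertion recall; reporting both numbers at the chosen operating point is one way to account for the asymmetry.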
  • Please state your overall opinion of the paper

    accept (8)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    I am entering a recommendation of accept based on an assessment that this is a preliminary result that shows promise for further development.

  • What is the ranking of this paper in your review stack?

    2

  • Number of papers in your stack

    6

  • Reviewer confidence

    Very confident



Review #3

  • Please describe the contribution of the paper

    The paper presents a learning based approach to detection of the insertion endpoint during femoral stem implantation in Total Hip Arthroplasty. The paper proposes a method and data collection approach to automate the current subjective nature of the decision making process, for optimal insertion depth.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The paper has clinical relevance.
    2. The approach to data collection.

    These two are the most important in my opinion since the annotations are challenging.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • The paper does not present results from standard practice to compare with the results from the presented approach. Additionally, some X-ray images showing the reached target vs. the ground truth would be a good addition to understand the results better.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    There is no mention of code sharing.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

    The presentation of results could be done better with additional x-ray images. The motivation of the problem can also do with some polishing with images to demonstrate the need for additional support during surgery.

  • Please state your overall opinion of the paper

    Probably accept (7)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The approach is elegant and the application seems to be of practical value. So I would recommend the paper.

  • What is the ranking of this paper in your review stack?

    4

  • Number of papers in your stack

    4

  • Reviewer confidence

    Somewhat confident




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The paper proposes an acoustic-based spatio-temporal method for analysing hammer strikes for femur implant insertion. The application is clinically relevant and the paper is well-written. The reviewers raised concerns related to the experimental setup details and validation (e.g. results being preliminary, a class imbalance issue, the dataset being very small (5 cadavers)). No visual or qualitative results or illustration is presented which could have helped in motivating the problem. These should be addressed.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    7




Author Feedback

Dear reviewers and editors,

Thank you for constructive (R1, R2, R3), positive and supportive (R2, R3) reviews and the opportunity to provide this feedback.

First, we want to address reviewer #1’s comments about the limited sample size and therefore preliminary nature of the study: Because of ethical and financial reasons, we consider the sample size of five human cadaveric hip specimens to be sufficient for showing the technical feasibility of the proposed method. We have obtained an ethical approval for a limited number of cadavers as a trade-off between costs (over 10000€ cadaver cost for current study), ethical considerations, and dataset size. This kind of pilot study with limited specimens is the right ethical practice before conducting larger ex-vivo or in-vivo studies. The herein proposed study will lay the foundation for a larger follow-up research project including an in vivo study which enables gathering of larger amounts of real data. Nevertheless, as pointed out by all three reviewers, the validation shows well the potential of the proposed method. The limited sample size is therefore a justified limitation, which has been stated in the discussion section of the submitted manuscript and is in accordance with the guidelines provided by MICCAI for CAI papers (https://miccai2021.org/en/REVIEWER-GUIDELINES.html) considering the challenges of performing such cadaver experiments in general and in Covid conditions in particular.

Regarding the clinical relevance, we emphasize the number of complications of the conventional procedure which motivates the development of automated support systems. In current clinical practice, the optimal endpoint for the insertion of femoral stem implants in hip surgery is determined in a subjective way and cannot be resolved by an optical navigation system. In the presented work, we propose the first fully automated method to assess the press-fit of femoral stem implants. The final system could then inform the surgeon to stop the insertion process when the target press-fit is reached. We will add a sentence to make this clearer in the final version. As confirmed by reviewers #2 and #3, the technical novelty is the first application of spatio-temporal learning for acoustic signals in the medical domain and the introduction of a sensorized smart instrument which enables effective and noise-robust capture of the relevant signals. Furthermore, our novel approach is low-cost and easy to integrate into existing surgical tools and as explained above, it can get integrated into the workflow and impact the patient outcome.

We furthermore want to thank reviewer #2 for the valuable suggestions that will help us to improve the quality of follow-up studies. We will add a short explanation about the random initialization of the network to the manuscript text. Using early stopping, the network was trained for an average of 7 epochs during five-fold cross validation. The class ratio in our dataset is 1245 (increasing) to 550 (target fit) sequences, which is not highly imbalanced. However, to address this valid comment, we will add this information in the final version of the manuscript, including a statement that investigating the influence of class imbalance is subject to future work.
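As a quick check of the figures quoted above, the following sketch computes the class ratio from the reported sequence counts (1245 and 550) and derives inverse-frequency class weights. The weighting scheme and the helper name `inverse_frequency_weights` are illustrative assumptions; the rebuttal does not state that the authors applied any class weighting.

```python
def inverse_frequency_weights(counts):
    """Weights proportional to 1/count, scaled so the count-weighted mean is 1."""
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}


# Sequence counts as reported in the rebuttal.
counts = {"increasing": 1245, "target_fit": 550}

ratio = counts["increasing"] / counts["target_fit"]
weights = inverse_frequency_weights(counts)

print(f"class ratio: {ratio:.2f} : 1")
for label, w in weights.items():
    print(f"{label}: weight {w:.3f}")
```

The ratio comes out at roughly 2.3 to 1, so under this assumed scheme an error on a target-fit sequence would count a little more than twice as much as an error on an increasing sequence.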

For the presented work and the conducted experiments, we assessed the target press-fit based on the preoperatively planned broach size fully seated in the anatomy and the target press-fit confirmed by an experienced surgeon. Therefore, it is not possible to assess radiographs of the reached target vs. the ground truth (which would correspond to the preoperative plan), as suggested by reviewer #3. However, for a better illustration of the problem in clinical practice, we will provide a radiograph of a periprosthetic fracture in Figure 1 on the right side without taking up additional space.

To facilitate reproducibility, we have made the code fully public; the link to the repository will be provided in the final version.




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors have addressed the main concerns raised by the reviewers. The comparison results with [20] must be included in Table 1 and both precision and recall for each class and fold must be reported alongside their method’s results. The supporting justifications provided in the rebuttal should be incorporated in the camera ready as well.
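The per-class, per-fold reporting requested here could be computed along the following lines. This is a minimal sketch in plain Python; the function name, the label strings, and the toy predictions are hypothetical and do not come from the paper.

```python
def per_class_precision_recall(y_true, y_pred, labels):
    """Return {label: (precision, recall)} for one test fold."""
    results = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # Undefined ratios (no predictions or no samples) are reported as NaN.
        precision = tp / (tp + fp) if (tp + fp) else float("nan")
        recall = tp / (tp + fn) if (tp + fn) else float("nan")
        results[label] = (precision, recall)
    return results


# Toy fold (made up): 6 insertion blows and 4 press-fit blows.
y_true = ["insertion"] * 6 + ["press_fit"] * 4
y_pred = ["insertion"] * 5 + ["press_fit"] + ["press_fit"] * 3 + ["insertion"]

fold_metrics = per_class_precision_recall(y_true, y_pred, ["insertion", "press_fit"])
for label, (precision, recall) in fold_metrics.items():
    print(f"{label}: precision={precision:.2f} recall={recall:.2f}")
```

Running this once per held-out specimen would give one row per fold and class, which could then be summarized as a mean and standard deviation across folds.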

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    7



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    I believe the rebuttal addresses most of the concerns. The limited evaluation is still a major concern. However, I also agree that having a large cadaver study is very costly. A follow-up clinical evaluation study, as mentioned in the rebuttal, would further strengthen this work. Since the application of spatio-temporal learning in orthopedic procedures is new and the approach is low cost and has the potential to be used in conjunction with surgical tools, I support the acceptance of this preliminary work. I also believe the work will generate healthy discussions among CAI practitioners and clinicians who are attending the MICCAI conference.

    The authors promise to make the code publicly available. I strongly suggest making the collected data available with the code as it will be required for the reproducibility of the research.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    3



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This work aims at improving a real-world clinical practice that is long overdue to be enhanced taking advantage of recent technology developments. Despite reviewers’ concern on small dataset size, I also agree with the authors that for such pilot work, this is sufficient. Overall, I find this work interesting and VERY relevant to MICCAI theme.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    6


