
Authors

Elvis C. S. Chen, Burton Ma, Terry M. Peters

Abstract

Ultrasound probe calibration remains an area of active research, but the science of validation has not received proportional attention in the current literature. In this paper, we propose a framework to improve, assess, and visualize the quality of probe calibration. The basis of our framework is a heteroscedastic fiducial localization error (FLE) model that is physically quantifiable and is used to i) derive an optimal calibration transform in the presence of heteroscedastic FLE, ii) assess the quality of a particular instance of probe calibration using a registration circuit, and iii) visualize the distribution of target registration error (TRE). The novelty of our work is the extension of the registration circuit to Procrustean point-line registration, and a demonstration that it produces a quantitative metric that correlates with true TRE. By treating ultrasound calibration as a heteroscedastic errors-in-variables regression instead of a least-squares regression, a more accurate calibration can be consistently obtained. Our framework has direct implications for many calibration techniques using point- and line-based calibration phantoms.
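
The core idea of the abstract can be illustrated with a minimal, generic Monte Carlo sketch: assign each fiducial its own anisotropic FLE covariance and propagate that error through a registration into a TRE estimate at a target. The sketch below uses an ordinary least-squares (Kabsch) point registration as a baseline, not the paper's point-line heteroscedastic errors-in-variables solver; all fiducial positions, covariances, and names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def kabsch(P, Q):
    """Ordinary least-squares rigid registration mapping rows of P onto Q."""
    Pc, Qc = P - P.mean(0), Q - Q.mean(0)
    U, _, Vt = np.linalg.svd(Pc.T @ Qc)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = Q.mean(0) - R @ P.mean(0)
    return R, t

# Hypothetical ground-truth fiducials (mm); the "true" calibration is identity.
fiducials = rng.uniform(-40, 40, size=(8, 3))
target = np.array([0.0, 0.0, 60.0])            # a target point of interest

# Heteroscedastic FLE: a different anisotropic covariance per fiducial,
# e.g. in-plane variances that grow with position, tiny out-of-plane variance.
covs = [np.diag([0.2 + 0.01 * abs(f[0]), 0.1 + 0.02 * abs(f[1]), 1e-4])
        for f in fiducials]

tre = []
for _ in range(2000):                           # Monte Carlo trials
    noisy = np.array([rng.multivariate_normal(f, C)
                      for f, C in zip(fiducials, covs)])
    R, t = kabsch(noisy, fiducials)             # LS registration ignoring FLE
    tre.append(np.linalg.norm(R @ target + t - target))

print(f"mean TRE at target: {np.mean(tre):.3f} mm")
```

The paper's point is that the registration step should weight each fiducial by its full covariance rather than treating all fiducials equally, as the plain Kabsch solution above does; the heteroscedastic errors-in-variables solver is the principled way to do so.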

Link to paper

DOI: https://doi.org/10.1007/978-3-030-87202-1_35

SharedIt: https://rdcu.be/cyhQx

Link to the code repository

https://github.com/chene/ARQOPUS

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

A novel method for the computation and quality assessment of ultrasound probe calibration is introduced. While existing methods assume that the error on the localization of fiducials within the image is homoscedastic, it is shown here that this assumption doesn’t hold for commonly used phantoms (such as tracked stylus or wire phantoms). By virtue of this, the authors propose to employ mathematical tools which are shown to have optimal behavior with respect to heteroscedastic noise. The superior performance of the method is shown via Monte Carlo simulations.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    This is in my opinion an outstanding paper. It provides an excellent contribution, as well as deep insights into an important problem for the community. The clarity of the exposition is excellent. The authors promise to provide an open-source implementation of their method.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

Although the text is very clear, it is also very dense and could be obscure to a reader coming from a different community. If space allows, the first part of the introduction could be made more extensive.

  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Excellent: the implementation will be released open-source, and the method is not data-driven.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

    The paper is innovative as well as instructive. Well done!

    I think a remark about the possible extension to calibration methods based on image registration would be a very interesting addition, if the space allows it.

  • Please state your overall opinion of the paper

    strong accept (9)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The paper opens a new path toward finding a better solution to a fundamental problem.

  • What is the ranking of this paper in your review stack?

    1

  • Number of papers in your stack

    5

  • Reviewer confidence

    Very confident



Review #2

  • Please describe the contribution of the paper

The paper describes a method to calibrate a US probe tracked with an external sensor, for freehand 3D ultrasound. The method assumes a phantom made of lines and takes into account the intrinsic heteroscedasticity of the landmark detection. This is validated through Monte Carlo simulation and shown to improve over the homoscedastic least-squares method.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Taking into account heteroscedasticity makes full sense for US calibration. This is a contribution that will enable more accurate calibrations, also because one will be allowed to capture images that are farther from orthogonal to the target lines. The authors demonstrate an excellent knowledge of the problem and literature and the validation is convincing.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

Some aspects could have been investigated: the impact of inaccurate manual determination of the covariance, covariance axes not aligned with the image axes, and actual TRE measurements by 3D reconstruction of target shapes.

  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The method is described in enough detail to be reproduced by a motivated and educated student.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

    Clarity: The text is very clear, and well paced.

    References: are complete as far as I am concerned.

    Method: The paper describes a method to calibrate a US probe tracked with an external sensor, for freehand 3D ultrasound. The method assumes a phantom made of lines and takes into account the intrinsic heteroscedasticity of the landmark detection.

The location of the landmark in each image is identified by manually placing a bounding box. This step is not straightforward and is prone to observer variability. How sensitive is the proposed calibration to inaccurate annotation?

These areas are assumed to be axis-aligned boxes, but the covariance axes might not coincide with the image axes. Could the authors comment on that?

The out-of-plane covariance component is assumed to be small, just enough to avoid a singular covariance matrix. However, the US signal is affected by neighboring out-of-plane objects (e.g., correlated speckle). Would that component not take some definite value, dependent on the horizontal covariance?

An interesting extension of the notion of a registration circuit is proposed, with error measures adapted to lines. Experiments involve Monte Carlo simulations initialized with actual data from an Aurora EM tracker and a tracked needle. Here again, it would have been interesting to perform multiple landmark annotations to examine the influence of inaccuracy in covariance determination. Least-squares and heteroscedastic methods are compared, and the latter is shown to be statistically significantly better. The TRE could be mapped over the whole image. I was surprised to find this ellipsoid shape, since I would expect the landmark covariance to be smaller near the probe (the top of the image, I guess). Does this distribution represent reality, or is it the result of second-order approximations in the statistics?

Experiments with actual 3D reconstruction of shapes, and thereafter actual TRE measurements, would have helped to definitively convince the reader of the method's effectiveness.

    Typos:

    • p.4 line3: singual -> singular
    • p.4, 3 lines below eq (1): was take to be the point -> taken
    • p.6, line 6: was used in the our extension -> in our
    • p.6 4 lines below table 1: we only the considered -> no the
    • p.6 5 lines below: inFigure3b -> space missing
    • p.6 2 lines below: gradually increases -> increasing
    • hyphenation issues on pages 5 and
  • Please state your overall opinion of the paper

    strong accept (9)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The paper is very clear, and the method has the potential to be very useful in practice, even though experiments in actual settings would have been appreciated.

  • What is the ranking of this paper in your review stack?

    1

  • Number of papers in your stack

    5

  • Reviewer confidence

    Confident but not absolutely certain



Review #3

  • Please describe the contribution of the paper

    The paper presents a new ultrasound calibration and evaluation method.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper is well written and has sufficient references. The paper is novel because it incorporates FLE in the calibration framework.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

The current work involves a lot of manual work. How much time and effort is needed to perform such a calibration is not discussed, and how easily the proposed method can be used by other groups is unknown.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

Since the authors will release their code, the work has good reproducibility.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

Can the proposed method work with a stylus of another (non-needle) shape?

How does the manually drawn bounding box affect the calibration result?

  • Please state your overall opinion of the paper

    Probably accept (7)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The paper does not have many weaknesses, so it should probably be accepted. On the other hand, the topic may not be very interesting to many MICCAI members.

  • What is the ranking of this paper in your review stack?

    1

  • Number of papers in your stack

    5

  • Reviewer confidence

    Somewhat confident




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

This paper was overestimated by Reviewers 2 and 3, who did not state exactly what the main strengths of this paper are in their reviews. Reviewer 4 raised some weaknesses and questions. From my point of view, I agree with Reviewer 4 that US calibration is a classical problem that many current publications already address. The authors just extended the current Procrustean point-line registration method; in this respect, the technical novelty is limited. On the other hand, the calibration method involves considerable manual interaction. This meta-reviewer would question what the advantages of the proposed method are. Moreover, the experimental results were insufficient; in particular, there is no comparison to currently available US calibration methods in this paper. Additionally, what is the clinical requirement for US calibration in surgical procedures? Some current US calibration methods already work quite well.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    4




Author Feedback

We thank the reviewers for their insightful reviews. Our paper is indeed very dense (R2), which may prevent readers from a different community (R2) from fully appreciating the innovation (R2) and deep insights (R2/R3) we hope to convey to the CAI community. In addition to writing the paper to be instructive (R2) and clear (R2/R3/R4), we will release an open-source implementation to facilitate the integration of our framework with other Procrustean registrations (R2) in CAI.

We clarify that we are not introducing “yet another” ultrasound (US) probe calibration technique (R3/R4/AC). Instead, we propose a framework to quantitatively assess the fitness of a calibration. As suggested by the title of our paper and the first two sentences of the abstract, we advance the “science of validation” by correctly accounting for heteroscedastic FLE (R2/R3/R4), which is used to 1) improve, 2) assess, and 3) visualize calibration quality. We formulated a quantitative metric for an instance of calibration, shown to correlate with true TRE via an extensive Monte Carlo simulation (R2/R3), without the need to perform extensive validation experiments ([15,22,23]). Our approach differs from TRE error propagation techniques that provide an “expected” TRE for registration problems under specific classes of FLE ([3,6,9,10,19]). Our strength (AC) is the universal applicability to all US probe calibration techniques based on Procrustean (point- and line-based) registration, which are the dominant methods in the current literature ([3,5,15,16,17,18,22,23,24,25,27,28], page 8). Our novel contributions (AC) are the extension of [20,21] to line-based HEIV and the extension of [7,8] to a line-based registration circuit. Our novel incorporation of heteroscedastic FLE (R2/R3/R4) improves the calibration quality for all Procrustean-based methods (R3). It has further implications, including for image registration (R2), and opens a new path toward solving this fundamental problem (R2).
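
For readers unfamiliar with the term: as we understand the generic concept referenced here as [7,8], a registration circuit composes independently estimated registrations around a closed loop, and the residual deviation from identity serves as a consistency metric when no ground truth is available. The sketch below shows only this generic loop-closure idea, not the paper's line-based extension; the example transforms are made up.

```python
import numpy as np

def loop_residual(transforms):
    """Compose 4x4 rigid transforms around a closed loop and return the
    rotational (deg) and translational deviation from identity."""
    T = np.eye(4)
    for Ti in transforms:
        T = Ti @ T
    rot_err = np.degrees(np.arccos(np.clip((np.trace(T[:3, :3]) - 1) / 2, -1, 1)))
    return rot_err, np.linalg.norm(T[:3, 3])

def rigid(axis, angle_deg, t):
    """Build a 4x4 rigid transform from an axis-angle rotation (Rodrigues)."""
    axis = np.asarray(axis, float) / np.linalg.norm(axis)
    a = np.radians(angle_deg)
    K = np.array([[0, -axis[2], axis[1]],
                  [axis[2], 0, -axis[0]],
                  [-axis[1], axis[0], 0]])
    T = np.eye(4)
    T[:3, :3] = np.eye(3) + np.sin(a) * K + (1 - np.cos(a)) * K @ K
    T[:3, 3] = t
    return T

# Hypothetical circuit A->B, B->C, C->A; the last link carries a small error,
# so the composition deviates slightly from identity.
T_ab = rigid([0, 0, 1], 30.0, [10, 0, 0])
T_bc = rigid([0, 1, 0], 45.0, [0, 5, 0])
T_ca = np.linalg.inv(T_bc @ T_ab) @ rigid([1, 0, 0], 0.5, [0.2, 0, 0])
print(loop_residual([T_ab, T_bc, T_ca]))
```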

As our method is not data-driven (R2), but is rather a framework and not a concrete manifestation of a calibration technique, it would be improper to compare our framework against a known calibration technique (AC). As we discussed comprehensively on page 2, “while an instance of the probe calibration can be validated against a known calibration obtained via an alternative means, it is difficult to separate contribution of errors between the calibration approach to be validated from the calibration technique against which is being compared”, and “As is the case for every registration problem, such a ground truth calibration may never be obtained”. The issues of Target Localization Error (TLE) and unknown ground truth are exactly the compounding motivations for our work. We believe a simulation-based validation, drawn from real calibration data, is more appropriate for demonstrating the framework.

The demonstration of this framework as described in our paper relies on manual US fiducial segmentation and bounding-box placement, used as a surrogate for FLE. We discussed how this step may be automated ([13], page 8); in fact, we have presented an automatic US needle segmentation method for this exact purpose (not cited here), and we are currently developing a fully automatic, video-based US probe calibration technique. We separate our theoretical contribution (this MICCAI paper) from a concrete manifestation of our framework (in preparation) because the framework has wider implications for all Procrustean-based methods.

Re: R3’s comments on bounding-box alignment and beam profile: such bounding-box placements need not be axis-aligned; when they are oblique, our framework can account for full variance-covariance matrices. The ultrasound beam profile in the sagittal plane is concave, thinnest at the focal region and wider at the proximal/distal ends ([3,22]), contributing to heteroscedastic FLE.
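
To make the point about oblique bounding boxes concrete, here is a minimal sketch of one plausible way to turn a (possibly rotated) bounding box into a full 3x3 FLE covariance with a small out-of-plane variance so the matrix stays non-singular. The mapping from box extents to standard deviations and the out-of-plane value are assumptions made for illustration, not the authors' exact recipe.

```python
import numpy as np

def fle_covariance(width_mm, height_mm, theta_deg, out_of_plane_mm=0.1):
    """Illustrative FLE covariance from a (possibly oblique) bounding box:
    in-plane standard deviations taken as the box half-extents, rotated by
    the box orientation, plus a small out-of-plane variance."""
    sx, sy, sz = width_mm / 2.0, height_mm / 2.0, out_of_plane_mm
    c, s = np.cos(np.radians(theta_deg)), np.sin(np.radians(theta_deg))
    R = np.array([[c, -s, 0.0],
                  [s,  c, 0.0],
                  [0.0, 0.0, 1.0]])           # in-plane rotation of the box
    return R @ np.diag([sx**2, sy**2, sz**2]) @ R.T

# Axis-aligned box (theta = 0) versus an oblique one (theta = 30 degrees):
print(fle_covariance(2.0, 1.0, 0.0))
print(fle_covariance(2.0, 1.0, 30.0))
```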




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

The rebuttal does not help me understand what the novelty and contribution of this work are.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    5



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

The authors propose a method to quantify the quality of ultrasound calibration. This is a very important, neglected problem, and they propose a very elegant solution. The paper is well written and clear, and together with the code release it will be a useful tool for the community. I therefore recommend accept.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    4



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The rebuttal has convincingly addressed the concerns from the AC and R4. In particular, the question regarding the experimental comparison is well-addressed. The paper is an interesting contribution and will be useful to the community.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    2


