
Authors

Maria Tirindelli, Christine Eilers, Walter Simson, Magdalini Paschali, Mohammad Farid Azampour, Nassir Navab

Abstract

Medical Ultrasound (US), despite its wide use, is characterized by artefacts and operator dependency. Those attributes hinder the gathering and utilization of US datasets for the training of deep neural networks used for computer-assisted intervention systems. Data augmentation is commonly used to enhance model generalization and performance. However, common data augmentation techniques, such as affine transformations, do not align with the physics of US and, when used carelessly, can lead to unrealistic US images. To this end, we propose a set of physics-inspired transformations, including deformation, reverb and signal-to-noise ratio, that we apply on US B-mode images for data augmentation. We evaluate our method on a new spine US dataset for the tasks of bone segmentation and classification.

Link to paper

DOI: https://doi.org/10.1007/978-3-030-87237-3_66

SharedIt: https://rdcu.be/cymbt

Link to the code repository

https://github.com/mariatirindelli/UltrasoundAugmentation

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper proposes a data augmentation strategy for ultrasound that mimics ultrasound physics better than classic augmentation strategies (rotation, flip, etc.) and can therefore be considered more realistic. In particular, the authors propose three strategies: deformation (due to ultrasound probe pressure and bone structures), reverberation (based on the bone centroid) and signal-to-noise ratio (also based on bone structures). The method is evaluated on two tasks, bone segmentation and bone classification. However, the results show a marginal improvement of 0.001 in Dice for segmentation and 0.0 in accuracy for classification.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • It is a novel and easy-to-implement data augmentation strategy that could potentially improve the outcomes of some clinical applications. Classic data augmentation techniques include rotation, translation, scaling, etc. However, these techniques may not be suitable for ultrasound, as the physics behind the creation of the images is very different. The strategies proposed in the paper, which are based on physics, could help improve the outcomes of tasks involving bone visualisation.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • Marginal improvement compared to classic data augmentation strategies. Authors report a dice score (standard deviation) of 0.625 (0.03) with a classical data augmentation approach and 0.626 (0.01) with the proposed deformation approach and 0.626 (0.02) with the proposed reverberation approach on ultrasound segmentation. They also report an accuracy of 0.883 (0.04) with a classical data augmentation approach and an accuracy of 0.883 (0.03) with the proposed reverberation approach.

    • Limited application: although the authors claim that they are proposing a general ultrasound augmentation (title, abstract, conclusions), this strategy is designed for bone detection and classification, as it requires bone masks as input. It is evaluated on bone-related tasks only (detection and classification). It may be difficult to use this data augmentation in other applications such as breast, prostate, cardiac, etc.

  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Some details are missing: image size, number of epochs. Code and data will be made available.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

    Reverberation: The paper proposes to include reverberation as a strategy to provide data augmentation. However, the question is, how realistic will the images be after adding reverberation artifacts? One would imagine that if reverberation occurs, the original image will contain the artifacts already.

    Results: Looking at the results, the improvement is marginal and therefore not convincing that the proposed technique works better than a classical approach for segmentation (0.625 (0.03) vs. 0.626 (0.01), respectively) or classification (0.883 (0.04) vs. 0.883 (0.03)). With such small improvement, a statistical analysis would be needed to claim that “The proposed transformations of Deformation and SNR outperform the classical augmentation”
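To put the quoted numbers in perspective, the gap between the two segmentation results can be expressed as a standardized effect size. The sketch below is only an illustration: it assumes the values in parentheses are standard deviations (as stated in the weaknesses above), assumes equal group sizes for the pooled standard deviation, and uses a hypothetical helper name `cohens_d`. It does not replace the significance test the reviewer asks for, because the number of runs behind each mean is not given.

```python
import math


def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Standardized mean difference, pooling the two standard deviations.

    Assumes both groups have the same (unknown) number of runs.
    """
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2.0)
    return (mean_a - mean_b) / pooled_sd


# Dice values quoted in this review: deformation 0.626 (0.01) vs. classical 0.625 (0.03).
d_deformation = cohens_d(0.626, 0.01, 0.625, 0.03)
print(f"effect size (deformation vs. classical): {d_deformation:.3f}")
```

Under these assumptions the difference is a small fraction of one pooled standard deviation, which is consistent with the reviewer's description of the improvement as marginal.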

    The authors may need to look to other datasets to show more convincing results and on how useful this technique could be.

    Minor: Figure 1 is not referenced in the text.

  • Please state your overall opinion of the paper

    reject (3)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Results do not demonstrate that the physics-based approach improves classic data augmentation approaches.

  • What is the ranking of this paper in your review stack?

    2

  • Number of papers in your stack

    3

  • Reviewer confidence

    Very confident



Review #2

  • Please describe the contribution of the paper

    This paper describes three physics-inspired transformations of ultrasound images that can be used to augment learning of DNN. The three are 1) Deformation based on a simple linear deformation above bone, and zero deformation below bone; 2) Reverberation which duplicates the bone echo at multiples of depths; 3) SNR which changes the multiples of bone/non-bone pixels. The new augmentation is shown to have some advantages on a classic DNN bone segmentation task, although the benefit of using all of the proposed augmentations is not clear.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The main strength of this paper is the widespread and acknowledged need to augment DNN learning with realistic transformations of ultrasound images, which are very different from the photographic image transforms used in other standard image DNN augmentations. I expect many ultrasound researchers in MICCAI would be interested in this paper, and it is likely to generate much debate. I would also look forward to such a debate.

    The paper is well written, very clear, good and appropriate references and honest description of the results and conclusions.

    I also appreciate the demonstrated bone segmentation task which is a growing area of research at MICCAI in a number of different clinical applications using ultrasound.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    An obvious weakness is that only three possible augmentation methods are proposed whereas real variations in ultrasound images go far beyond just simple models of deformation, reverberation and SNR. All of the many reconfigurable parameters on the console of an ultrasound scanner also change the images such as TGC, gain, focus, depth, THI, etc. The authors acknowledge this in the literature review but do not really justify the choice of these three variations to start.

  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    It would be very easy to replicate the methods in this paper. The three augmentation techniques are clearly described and are simple (although the simplicity is also a weakness because it doesn’t capture the true ultrasound physics)

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

    All three of the augmentation methods require a priori knowledge of the bone location. Since the goal of the DNN is to give the segmentation, this becomes a chicken-and-egg problem: which comes first? The authors should acknowledge the need for bone locations more clearly in all three augmentation methods and describe what happens when there is uncertainty/errors in that knowledge. Some justification should be offered for the three chosen augmentations.

    The authors should define the variable H in equation (1) and be more explicit about how the deformation is performed. It seems to me that Algorithm 1 simply scales linearly the depth of tissue above bone and keeps the bone location unchanged. But what about axial versus lateral motion? Is the tissue considered incompressible? Moreover, if the tissue is deformed, it would suggest a physically incorrect point spread function. Finally, why smooth the deformation field after modeling it, doesn’t that result in a deformation below bone that is physically incorrect?
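The reading of the deformation described in this comment (a displacement that varies linearly above the bone and is zero at and below it) can be sketched on a single image column. This is not the authors' Algorithm 1: the function names, the sign convention of the shift, the purely axial motion, and the linear interpolation are all assumptions made for illustration, and the smoothing step the reviewer questions is left out.

```python
def linear_displacement(n_rows, bone_row, max_shift):
    """Axial displacement per row (in pixels).

    max_shift at the probe surface (row 0), falling linearly to 0 at
    bone_row, and exactly 0 for every row at or below the bone.
    """
    if bone_row <= 0:
        raise ValueError("bone_row must be a positive row index")
    return [
        max_shift * (1.0 - r / bone_row) if r < bone_row else 0.0
        for r in range(n_rows)
    ]


def warp_column(column, displacement):
    """Backward-warp one column: out[r] samples the input at r + u(r),
    using linear interpolation and clamping at the image borders."""
    last = len(column) - 1
    out = []
    for r, u in enumerate(displacement):
        src = min(max(r + u, 0.0), float(last))
        lo = int(src)
        hi = min(lo + 1, last)
        frac = src - lo
        out.append((1.0 - frac) * column[lo] + frac * column[hi])
    return out


# Toy column: intensity increases with depth; the bone surface sits at row 4.
column = [0, 10, 20, 30, 40, 50, 60, 70]
field = linear_displacement(len(column), bone_row=4, max_shift=2.0)
warped = warp_column(column, field)
```

In this toy version, rows at and below the bone are left untouched, which is the behaviour the reviewer's questions about incompressibility and post-hoc smoothing are probing.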

    For reverberation, if the bright bone echo is repeated at a multiple of depth, doesn’t the augmentation overlap with existing reverberation effects? Also justify the kernel and sigma parameters, i.e. try to relate them to ultrasound physics to show how to select such parameters for other ultrasound transducers.
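The mechanism under discussion, repeating the bone echo at integer multiples of the bone depth, can be illustrated on one image column. This is a sketch under stated assumptions rather than the paper's implementation: the function name, the geometric `decay` factor, and the clipping value are hypothetical, and the kernel/sigma blurring mentioned above is omitted.

```python
def add_reverberation(column, bone_row, decay=0.5, max_value=255.0):
    """Add attenuated copies of the bone echo at 2x, 3x, ... the bone depth.

    decay is an assumed per-bounce attenuation; the k-th copy is scaled by
    decay ** (k - 1). Copies falling outside the column are dropped.
    """
    if bone_row <= 0:
        raise ValueError("bone_row must be a positive row index")
    out = [float(v) for v in column]
    echo = column[bone_row]
    k = 2
    while k * bone_row < len(column):
        gain = decay ** (k - 1)
        row = k * bone_row
        out[row] = min(max_value, out[row] + gain * echo)
        k += 1
    return out


# Toy column: uniform background of 10 with a bright bone echo at row 3.
column = [10] * 10
column[3] = 200
reverbed = add_reverberation(column, bone_row=3)
```

Because the copies are simply added to whatever is already in the image, any reverberation present in the original acquisition would be superimposed on the synthetic one, which is the overlap the reviewer asks about.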

    Also defend the range of augmentation ranges in Table 1.

    Also be more clear about the definition of “All” in Table 2: is it both Classical and the newly proposed augmentations?

    Minor point: remove unnecessary capitalization, especially in abstract

  • Please state your overall opinion of the paper

    accept (8)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    I am arguing in favour of this paper because I believe it is one of the first of likely many such papers on how to properly augment ultrasound images. This is of wide interest to all ultrasound AI/ML researchers. I’m not convinced the three proposed augmentations are the best, but it is a good start and I look forward to debating where to go next.

  • What is the ranking of this paper in your review stack?

    2

  • Number of papers in your stack

    5

  • Reviewer confidence

    Very confident



Review #3

  • Please describe the contribution of the paper

    In many computer vision tasks, data augmentation is applied to overcome the lack of training data. Conventional image modifications, such as random scaling, translations, rotations, and Gaussian noise additions, are applied to the dataset. However, the paper claims that these classical augmentations are based on the mechanisms behind optical cameras, which differ strongly from the principles of US. The authors suggest three augmentation techniques, using a set of US image modifications that consider realistic sources of variability in US. The proposed method augments the US image using deformation, reverberation, or signal-to-noise ratio. Finally, they evaluate the proposed methods on the tasks of bone segmentation and classification.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The unrealistic nature of conventional augmentation methods is recognized by the authors. I do agree that the existing augmentation methods do not consider the main characteristics of the medical US images.
    2. Deformation is an interesting idea to consider.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The paper describes a segmentation task and a classification task for US images containing bones to verify the novelty. However, the tasks performed have limited use in practice. In addition, the three proposed augmentation methods are valid only for such tasks and cannot be applied for general purposes.
    2. This paper proposes three main augmentation methods that utilize realistic sources of variability of US. However, it is questionable how realistic the implementation of each method is. In the method of implementing SNR tuning, it is not realistic to adjust the intensity in the image domain, since the actual B-mode image is generated from the RF data through the delay-and-sum (DAS) algorithm.
    3. The paper claims that classical transformations are unrealistic because they do not consider the physical properties of medical US imaging. However, as shown in the experimental results, classical methods are also effective for training DNNs. On the other hand, the suggested method did not show improvement in performance compared to the classical method. In particular, the proposed Deformation and SNR transformations demonstrated inferior performance in the classification task compared to the classical method.
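The image-domain intensity adjustment questioned in point 2 can be pictured with a small sketch. It is an assumption-laden illustration, not the paper's SNR transform: it supposes that bone pixels, selected by a binary mask, are multiplied by a gain while the background is left alone, and it uses the mean bone-to-background intensity ratio as a crude stand-in for SNR. All names and values are hypothetical.

```python
def scale_bone_intensity(image, bone_mask, gain, max_value=255.0):
    """Multiply pixels inside the bone mask by gain (clipped to
    [0, max_value]); pixels outside the mask are returned unchanged."""
    out = []
    for img_row, mask_row in zip(image, bone_mask):
        out.append([
            min(max_value, max(0.0, px * gain)) if m else float(px)
            for px, m in zip(img_row, mask_row)
        ])
    return out


def bone_to_background_ratio(image, bone_mask):
    """Mean intensity inside the mask divided by mean intensity outside."""
    bone, background = [], []
    for img_row, mask_row in zip(image, bone_mask):
        for px, m in zip(img_row, mask_row):
            (bone if m else background).append(px)
    return (sum(bone) / len(bone)) / (sum(background) / len(background))


# Toy 2x2 image: right column is "bone", left column is background.
image = [[10, 100], [20, 120]]
mask = [[0, 1], [0, 1]]
brighter = scale_bone_intensity(image, mask, gain=1.5)
ratio_before = bone_to_background_ratio(image, mask)
ratio_after = bone_to_background_ratio(brighter, mask)
```

Nothing in this sketch touches RF data or beamforming; it operates purely on B-mode pixel values, which is exactly the property the reviewer objects to.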
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors provide adequate information for reproducibility.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
    1. When conducting an experiment to evaluate data augmentation, it is necessary to specify the size of dataset created. In addition, no description is provided for the dataset creation for the evaluation of the classical augmentation method.
    2. For each task, there exist differences in the performance of deformation, reverberation, and SNR. It would help readers to understand why such differences exist. For example, there should be some explanation of why the Deformation and SNR transformations are beneficial for bone segmentation and Reverberation for the classification task.
  • Please state your overall opinion of the paper

    reject (3)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The motives and solutions presented in this paper are interesting, but the results are not impressive compared to the existing methods. In addition, the description of the experimental methods and variable control is insufficient.

  • What is the ranking of this paper in your review stack?

    5

  • Number of papers in your stack

    5

  • Reviewer confidence

    Very confident




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The work proposes an ultrasound (US) physics-based data augmentation method specifically in the context of bone imaging using ultrasound. US-based orthopedic procedures can benefit from the method to enlarge the small dataset size. All the reviewers find the physics-guided augmentation novel. However, they also raised concerns regarding the impact of the work as quantitative results did not show a significant improvement over the classical augmentation approach. I believe this work is important for ultrasound researchers at MICCAI and I would like to invite the authors for a rebuttal.

    I think the major weakness of the work is that the quantitative results do not show significant improvement over classical augmentation approaches (Table 2). A statistical significance test should be reported, especially for classical vs. proposed augmentations. Without this, the authors’ claim that ‘The proposed transformations of Deformation and SNR outperform the classical augmentation’ is not valid.
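One way such a test could be run is a paired sign-flip permutation test on per-fold (or per-run) scores obtained with the two augmentation schemes. The sketch below is a generic illustration and not something the paper or rebuttal reports: the function name is hypothetical, and the scores in the usage lines are placeholder integers chosen only to exercise the code, not results from the paper.

```python
from itertools import product


def paired_permutation_pvalue(scores_a, scores_b):
    """Two-sided exact sign-flip permutation test on paired scores.

    Under the null hypothesis the sign of each paired difference is
    arbitrary, so every sign assignment is enumerated (2 ** n of them)
    and the p-value is the share whose absolute summed difference is at
    least as large as the observed one.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs))
    hits = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        stat = abs(sum(s * d for s, d in zip(signs, diffs)))
        if stat >= observed - 1e-12:
            hits += 1
    return hits / total


# Placeholder data (NOT from the paper): five paired folds, A always 1 higher.
p_consistent = paired_permutation_pvalue([2, 3, 4, 5, 6], [1, 2, 3, 4, 5])
# Identical scores: no evidence of a difference.
p_identical = paired_permutation_pvalue([1, 2, 3], [1, 2, 3])
```

Exact enumeration is only practical for a small number of pairs; with many folds or runs, a random subset of sign assignments would normally be sampled instead.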

    It is also not clear how many images were generated by using the proposed augmentation, and classical augmentation method. This information is important and should be included. Please provide information on how many images were generated for each proposed augmentation, all augmentations (all three proposed augmentations), and the classical augmentation method.

    Why do the different physics-based augmentation methods perform differently? For example, deformation and SNR perform the same for bone segmentation, but reverberation lowers segmentation performance while improving classification. Does this mean the reverberation model is not accurate? Also, judging from Fig. 4, a bone reverberation artifact would not appear that deep in the imaging direction. Shouldn’t the reverberation artifacts be closer to the bone surface?

    Why were these three augmentations chosen?

    More explanation is required regarding the deformation under the bone (Rev 2) and the calculation of SNR not obeying the DAS image formation process (Rev 3). If the proposed method is based only on observations obtained from the image, this should be clearly explained, and perhaps the physics-based approach should be reworded as a bone feature appearance-based approach.

    A better justification should be provided about the parameters used (Table 1) in the proposed method (Rev2)

    Minor: Fig. 2: Normally for bone US data you wouldn’t perform a vertical flip, as that is against the concept of data collection. One would never collect bone ultrasound data where the bone image would be flipped like the one shown in Fig. 2-b.

    Classification: If the intended application of US is for imaging bones why would an operator collect an ultrasound scan without the appearance of a bone feature? Do the authors mean a quality control (or scan adequacy) where the classification would omit US scans with bad quality bone features?

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    11




Author Feedback

We thank the reviewers for their constructive feedback. We appreciate that they recognize the novelty of the approach, impact for the MICCAI community (MR1, R1, R2) and the clarity and reproducibility of our method (R2, R3).

Our major contribution is the physical basis of the augmentations compared to vision-inspired transformations directly transferred from the field of computer vision. As also shown in the paper, data augmentation does not in general have a major impact on the results. We therefore did not expect significant improvement of the final results, when using different augmentation methods. However, our augmentations reflect the physics of ultrasound acquisitions and we believe that this is the scientifically correct way of manipulating US data to generate augmentations. Physics-inspired augmentations prevent the network from being exposed to random, unrealistic data, allowing not only for better modeling, but also for better interpretability and understanding of the model behavior under extreme conditions. The proposed method could provide an interpretable failure case of the model in case of improper ultrasound acquisitions or processing, while the unrealistic traditional augmentation methods borrowed from other communities do not provide such scientific paths. We therefore believe that instead of only looking at the improvement in the outcome results, we also need to pay attention to the correctness of concept and scientific foundations of the design and deployment of augmentations. Interestingly, MR1 points out that vertical flips are not realistic for US imaging. However, such transformations are still employed in the literature [9]. These unrealistic transformations are one of the main reasons we believe our proposed physics-inspired augmentation is fundamentally better suited for US images. We also believe that our augmentations will have a stronger effect on larger datasets.

Regarding the depth of the reverb artefact, we would like to point out that we consider reverb artefacts arising from multiple reflections between the US probe and the bony tissue. The proposed method can be easily extended to other tissue interfaces like bone-fascial layers. We assume the “bone is a static body without deformation or transformation” (sec. 2.1), thus we set the deformation below it to 0.

Regarding the choice of the augmentation transformations, we would like to stress that this work focuses on physics-inspired transformations that are easily applicable to B-mode data. We acknowledge that other parameters such as TGC and frequency are relevant, but they would require access to more information than simply the B-mode data for defining complex models suitable for the RF domain. The main advantage of our method is that it can be applied to public US datasets that only provide B-modes, without access to a US console or corresponding RF data.

On the consistency between the generation of the SNR artefact and DAS image formation (MR1, R3), we would like to point out that the SNR augmentation aims to simulate variations in tissue echogenicity in the image domain as defined in [Janesick, J. (2007). Photon Transfer, SPIE]. Variations in the RF domain have a direct effect on the image domain, thus variations in the image domain are also realistic.

We agree that to specify the content of the paper, it could be renamed to “Physics-Inspired Ultrasound Augmentation: A first application in bone segmentation”.

The parameter values were empirically chosen. For classical augmentations, we use parameter values from [6-14]. Augmentations were generated on the fly during training with 30% probability.
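The on-the-fly scheme described here can be pictured as a wrapper that applies a transform to each training sample with a fixed probability. This is a minimal sketch, assuming each sample is augmented independently with p = 0.3; the names `maybe_augment`, `transform` and `rng` are hypothetical and not taken from the authors' repository.

```python
import random


def maybe_augment(sample, transform, p=0.3, rng=random):
    """Return transform(sample) with probability p, else the sample as is.

    rng can be the random module or a seeded random.Random instance.
    """
    if rng.random() < p:
        return transform(sample)
    return sample


# Usage with a stand-in transform that just tags the sample.
rng = random.Random(0)
draws = [maybe_augment("img", lambda s: s + "_aug", p=0.3, rng=rng)
         for _ in range(10000)]
augmented_fraction = draws.count("img_aug") / len(draws)
```

Under this reading there is no fixed count of generated images: roughly 30% of the samples seen in each epoch are transformed, so the total depends on the dataset size and the number of epochs.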

Concerning the application of bone classification, scan adequacy is indeed a possible use case. Other applications include robotic control and sonographer training.

Minor comments will be clarified in the paper.

Overall, the novelty of the method was acknowledged by all reviewers and we believe that this work offers new paths to the MICCAI community.




Post-rebuttal Meta-Reviews

Meta-review #1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The rebuttal addressed some of the major concerns. Although this is very important work in the right direction for US data augmentation, unfortunately, in its current form it cannot be accepted. Statistical significance is not reported in the rebuttal, and judging by the quantitative results presented in the work, there is no significant difference between the proposed and classical augmentation methods. As the authors mention, on a larger dataset the proposed work could potentially outperform classical augmentation. However, this is not shown in this work. Finally, it would be important to evaluate the proposed work against GAN-based data augmentation to further improve the strength of this work. I am looking forward to reading a more updated version of this work.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Reject

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    18



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    I agree this is an interesting and important application wrt. ultrasound image computing. I also would like to add that negative results should not be the reason to reject a technically sound paper.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    13



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This paper, titled “Rethinking Ultrasound Augmentation: A Physics-Inspired Approach” received polarizing recommendations from 3 senior reviewers:

    R1 strength - novelty weakness - incremental improvement (i.e. no significance) - limited application
    R2 strength - appropriateness to miccai - limited application - evaluation weakness - missing technical details
    R3 strength - novelty weakness - limited application - incremental improvement

    All agreed the writing quality is excellent and the approach interesting/novel. As summarized by the primary AC, the main limitation was the incremental improvement over the more classical/traditional methods; thus the significance of this work is not demonstrated. While the clarity was rated as excellent by all reviewers, the Primary AC raised certain key questions. The reproducibility was also rated as good by all reviewers.

    Based on my personal reading of the manuscript, reviews and meta-reviews, and the rebuttal, I support the decision to accept this paper. I agree with the authors’ rebuttal that the physics-based approach is perhaps the “scientific-correct” way for data augmentation, despite the incremental improvement based on the current implementation. I also agree with R2 that this is “one of the first of likely many papers” on physics-based data augmentation. This topic is appropriate for the whole MICCAI audience working with deep-learning approaches for US processing.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    6


