
Authors

Myeong-Gee Kim, SeokHwan Oh, Youngmin Kim, Hyuksool Kwon, Hyeon-Min Bae

Abstract

The attenuation coefficient (AC) of tissue in medical ultrasound has great potential as a quantitative biomarker due to its high sensitivity to pathological properties. In particular, AC is emerging as a new quantitative biomarker for diagnosing and quantifying hepatic steatosis. In this paper, a learning-based technique to quantify AC from pulse-echo data obtained through a single convex probe is presented. In the proposed method, ROI adaptive transmit beam focusing (TxBF) and envelope detection schemes are employed to increase the estimation accuracy and noise resilience, respectively. In addition, the proposed network is designed to extract accurate AC of the target region considering attenuation/sound speed/scattering of the propagating waves in the vicinities of the target region. The accuracy of the proposed method is verified through simulation and phantom tests. In addition, clinical pilot studies show that the estimated liver AC values using the proposed method are correlated strongly with the fat fraction obtained from magnetic resonance imaging (R^2=0.89, p<0.001). Such results indicate the clinical validity of the proposed learning-based AC estimation method for diagnosing hepatic steatosis.

Link to paper

DOI: https://doi.org/10.1007/978-3-030-87234-2_2

SharedIt: https://rdcu.be/cyl7X

Link to the code repository

https://github.com/MyeonggeeKim/Learning-based-attenuation-quantification-in-abdominal-ultrasound

Link to the dataset(s)

https://github.com/MyeonggeeKim/Learning-based-attenuation-quantification-in-abdominal-ultrasound

Reviews

Review #1

Please describe the contribution of the paper

In this paper, a deep learning-based method is proposed to quantify the attenuation coefficient from pulse-echo data acquired with a single convex ultrasound probe. As the attenuation coefficient can be considered a quantitative biomarker of hepatic steatosis, the proposed method could be an interesting alternative to invasive methods such as liver biopsy and to high-cost MRI. ROI adaptive transmit beam focusing and an envelope detection scheme are used to improve the accuracy of the method. The method is trained on simulated data only and is evaluated on simulated data, phantom data, and 30 subjects. The results on patient data are compared to proton density fat fraction (PDFF) values obtained from MRI, and a strong correlation is shown.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The use of a deep learning-based method to estimate hepatic steatosis from ultrasound data instead of liver biopsy or MRI is interesting and of clinical importance. Using k-Wave to generate a large amount of synthetic training data is interesting. Thorough ablation studies on simulation data are presented to assess the improvement gained from each aspect of the proposed method. The method is also compared to a baseline network. The correlation between the attenuation estimated by the method and PDFF from MRI is a really nice result. Finally, the paper is overall well written.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

The proposed method is not compared to other works that have used learning and non-learning-based techniques to assess the level of liver steatosis. For example: “Transfer learning with deep convolutional neural network for liver steatosis assessment in ultrasound images” from Byra et al., and “Hepatic steatosis assessment using quantitative ultrasound parametric imaging based on backscatter envelope statistics” from Zhou et al. A more careful review of the state of the art should be presented in the introduction.

The discussion of the results is not precise enough, which makes them difficult to appreciate. The quantitative results reported in the text do not correspond to what is reported in Table 1. The authors claim that “RAN-based networks experience only 0.0525 dB/cm/MHz drop in MAE (see Fig. 4 (b)) irrespective of the ROI depth”. However, a drop in MAE with respect to the ROI depth does appear. Why do the FCN results decrease when the envelope detection scheme is considered?

The phantom test could have been extended. Why are not all the methods from Table 1 tested on the phantom data? It is not clear how many ROIs are tested in the phantom cases. It is unfortunate that the results with respect to the ROI locations are not reported.
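
The depth-dependence question raised above could be checked by reporting MAE separately per ROI depth. The following is a minimal sketch of that bookkeeping, not the authors' evaluation code; the function names and all numeric values are hypothetical and invented for illustration only.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def mae_by_depth(samples: Iterable[Tuple[float, float, float]]) -> Dict[float, float]:
    """Group (roi_depth_mm, true_ac, predicted_ac) samples by depth and
    return the mean absolute error of the AC estimate for each depth."""
    errors = defaultdict(list)
    for depth, true_ac, pred_ac in samples:
        errors[depth].append(abs(pred_ac - true_ac))
    return {depth: sum(errs) / len(errs) for depth, errs in sorted(errors.items())}


def mae_spread(per_depth: Dict[float, float]) -> float:
    """Difference between the worst and best per-depth MAE."""
    return max(per_depth.values()) - min(per_depth.values())


# Toy values in dB/cm/MHz, invented for illustration (not results from the paper).
toy_samples = [
    (40.0, 0.50, 0.52), (40.0, 0.70, 0.66),
    (80.0, 0.50, 0.58), (80.0, 0.70, 0.60),
]
per_depth = mae_by_depth(toy_samples)
spread = mae_spread(per_depth)
```

A spread close to zero would support a claim of depth independence; a clearly non-zero spread is what the comment above points to.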
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

Training details and hyperparameters are nicely reported in the paper.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

Steering angles: In section 2.2, it is said that 7 angles are used, whereas Fig. 3 and section 3.2 suggest that only 5 are used. This is confusing.

It would have been nice to visually distinguish on Figure 6d the range of AC or PDFF values that correspond to normal liver, mild fatty liver and fatty liver.

The experiment description is repeated twice in the paper.

What does ADC (p. 2) stand for?

Eq 2. What is n?

It is not obvious that PAR stands for parameters in Table 1.

Typo: p7: methods are -> method is
Please state your overall opinion of the paper

accept (8)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The paper presents an interesting deep learning method to solve a clinically important problem. The method is sound and the ablation study on simulation data appreciated. The evaluation on phantom data and the 30 patients are also promising.
What is the ranking of this paper in your review stack?

1
Number of papers in your stack

4
Reviewer confidence

Very confident

Review #2

Please describe the contribution of the paper

This paper proposes a learning-based technique to quantify the attenuation coefficient from pulse-echo ultrasound data. The method has been thoroughly evaluated in simulation, phantom, and in-vivo studies. The results indicate the clinical validity of the proposed method.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The academic writing is fluent and easy to follow. The proposed method is carefully shown to have both clinical impact and engineering feasibility. ROI adaptive Tx beam focusing (TxBF) and ROI adaptive normalization (RAN) are the two key innovations in this study. In particular, the authors designed a separate encoding network to generate the trainable parameters in RAN, with the transmission beam pattern as its input. This brings the ultrasound scanning protocol into the deep network and improves AC prediction significantly. It is a good example of coupling AI theory with physical rules. In the evaluation, the authors employed ablation studies that sufficiently support the major contributions.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

1. FCN is employed as a baseline network in the ablation studies, but intuitively I would suggest replacing RAN with ordinary batch normalization; FCN might be too simple as a competing solution.
2. The scanning protocol is one-hot encoded into a vector P, meaning it has to be selected from a fixed set of options. This raises a question: should a mapping strategy be devised to project the physical probe setup onto a discrete P vector, or is it unavoidable to re-generate the simulation training set for each different physical probe setup? The authors did not address this.
3. The authors did not give the definition of the fatty liver stages, so it is unclear whether the samples in the in-vivo study reflect the population distribution.
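
The concern in point 2 can be illustrated with a minimal sketch of one-hot encoding over a fixed protocol set. The protocol tuples, the name `KNOWN_PROTOCOLS`, and the function below are hypothetical; the paper's actual protocol set and encoding are not described in this review.

```python
from typing import List, Tuple

# (steering angle in degrees, focal depth in mm); hypothetical values,
# chosen only to make the example concrete.
Protocol = Tuple[float, float]

KNOWN_PROTOCOLS: List[Protocol] = [(-10.0, 40.0), (0.0, 40.0), (10.0, 40.0)]


def one_hot_protocol(protocol: Protocol,
                     known: List[Protocol] = KNOWN_PROTOCOLS) -> List[int]:
    """Return a one-hot vector P selecting `protocol` among the known set.

    A protocol outside the set cannot be represented, which is the
    limitation the comment above refers to.
    """
    if protocol not in known:
        raise ValueError(f"protocol {protocol} is not in the fixed set")
    index = known.index(protocol)
    return [1 if i == index else 0 for i in range(len(known))]


p_vector = one_hot_protocol((0.0, 40.0))
```

With this kind of encoding, a probe setup outside the fixed set has no valid P vector, so a network trained on it cannot be conditioned on a new setup without extending the set and the training data.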
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The method and experimental results should be reproducible.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

It would be better for the authors to provide more details about encoding the physical probe setup into a discrete P vector.
Please state your overall opinion of the paper

Probably accept (7)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The academic writing is fluent, and the experiments and discussions are sufficient. The topic and quality of this submission are appropriate for MICCAI after minor modifications.
What is the ranking of this paper in your review stack?

2
Number of papers in your stack

4
Reviewer confidence

Very confident

Review #3

Please describe the contribution of the paper

The attenuation coefficient is an important tissue parameter to obtain. There have been tomographic approaches used to estimate this, but with limited success. The author here proposes a learning-based method to quantify the AC from pulse-echo data obtained through a single convex probe. A region of interest (ROI) is chosen; note that the author uses this acronym before defining it, which should be changed in the final submission. The study involved several levels of investigation. They employed the MATLAB-based k-Wave simulator to create a simulated data set at appropriate frequencies for abdominal scanning, and an appropriate NVIDIA card for acceleration.
They also designed a network with normalization to robustly estimate the attenuation coefficient and validated it against a fully connected network in silico, with phantoms, and in vivo, comparing their results to MR-based fat fraction estimation. This is a comprehensive validation of their architecture.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The architecture, after the appropriate cropping, is an encoder followed by concatenation followed by a regression layer.
The encoding block is 3 layers; each layer consists of 3x3 2D convolutions, 1x2 stride, and ReLU, with a residual network component to manage vanishing derivatives. This is equivalent to differentiation with removal of reflections. After concatenation, the regression layer consists of a 3x3 2D convolution, a normalization also determined from the data (see below), and again a leaky rectified linear unit (ReLU). The normalization is a scale and shift using ROI-determined parameters for each sensor. This normalization follows an initial shift by the mean and scale by the standard deviation, for each sensor. The scale and shift parameters are themselves determined from fully connected ReLU architectures, one for each parameter. Appropriate depths are chosen for the ROI for this application. A reasonable number of simulation phantoms were used for training: 3200, with 400 for validation and 400 for testing. The minimization objective functional is an L1 norm of the difference between the ground truth attenuation and the prediction of the network, with a Tikhonov-type L2 norm regularizing term. However, the term to which the L2 norm is applied is not made clear in the context. This must be corrected in the final publication but should not preclude the paper's inclusion in MICCAI. The ADAM optimization algorithm is used with appropriate initial learning and decay rates. The experimental results are nice. Phantoms and in-vivo tests were used. The ultrasound system was the Verasonics with a convex probe, and the results were compared with a 3T MR system.
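
The two-step normalization described above (standardize, then apply an externally supplied scale and shift) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name is hypothetical, it operates on a single channel held in a plain list, and the scale and shift are passed in as numbers rather than produced by the small fully connected networks mentioned in the text.

```python
import math
from typing import List


def roi_adaptive_norm(features: List[float], scale: float, shift: float,
                      eps: float = 1e-5) -> List[float]:
    """Standardize one channel to zero mean and unit variance, then apply
    a conditioning scale and shift supplied from outside the layer."""
    mean = sum(features) / len(features)
    variance = sum((x - mean) ** 2 for x in features) / len(features)
    std = math.sqrt(variance + eps)  # eps avoids division by zero
    return [scale * (x - mean) / std + shift for x in features]


# Toy channel and conditioning values, invented for illustration.
normalized = roi_adaptive_norm([1.0, 2.0, 3.0], scale=2.0, shift=0.5)
```

After the first step the channel has zero mean, so the mean of the output equals the supplied shift, and the supplied scale sets the output spread.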
The simulation test showed improvement over a fully convolutional network (FCN) baseline, especially with increasing depth of the ROI. Several tests were carried out with and without ablation, comparing their network with the FCN on both RF and envelope-detected (ENV) data. As expected, the ENV data had some stabilization properties. The phantom tests indicated that the RAN (ROI adaptive normalization, their method) outperformed the FCN consistently regardless of the depth.
The in-vivo test was also impressive, showing a strong correlation of the attenuation coefficient determined by their RAN network with the MRI-determined proton density fat fraction, a standard for measuring hepatic steatosis.
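
For reference, the strength of such a correlation is typically summarized by the coefficient of determination of a linear fit (the abstract reports R^2=0.89). The sketch below computes R^2 as the squared Pearson correlation; the function name is hypothetical and the data are invented, exactly linear toy values, not measurements from the study.

```python
from typing import Sequence


def r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """Squared Pearson correlation between x and y, which equals the
    coefficient of determination of a simple linear fit of y on x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return (s_xy * s_xy) / (s_xx * s_yy)


# Toy data: PDFF in percent and AC in dB/cm/MHz, invented for illustration.
toy_pdff = [2.0, 5.0, 10.0, 20.0]
toy_ac = [0.46 + 0.02 * p for p in toy_pdff]  # exactly linear by construction
toy_r2 = r_squared(toy_pdff, toy_ac)
```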
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

One point that will need correction in the final paper: the speed of sound for subcutaneous fat is given in the range 1400-1700 m/s. The lower bound is fine. The upper bound, however, is not representative of fat, which should have a speed of sound below about 1511 m/s. Other structures in the liver will have a higher speed of sound. This may be a misprint, but it should be corrected.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

Data are not available, but the method is clearly explained and some of the code is available. This is acceptable.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
Clearly written for the most part.
1. Please correct the upper range for fat unless you have reasons for giving fat such high values. Fat should generally be below 1511 m/s or so.
2. The minimization objective functional is an L1 norm of the difference between the ground truth attenuation and the prediction of the network, with a Tikhonov-type L2 norm regularizing term. However, the term to which the L2 norm is applied is not made clear in the context. This must be corrected in the final publication but should not preclude the paper's inclusion in MICCAI.
The paper is well constructed, and the inclusion of simulation (in silico), phantom, and in-vivo results/experimentation is very good. Explanations are good, with the exceptions above.
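
To make point 2 concrete, one possible reading of the objective is sketched below. Since the review notes that the target of the L2 term is not stated, this sketch assumes, purely for illustration, that it is applied to the network weights (weight decay); averaging the L1 term over samples is a further assumption, and the function and parameter names are hypothetical.

```python
from typing import Sequence


def objective(predicted_ac: Sequence[float], true_ac: Sequence[float],
              weights: Sequence[float], lam: float) -> float:
    """L1 data term plus an L2 (Tikhonov-type) penalty.

    ASSUMPTION: the penalty is taken over the network weights; the paper,
    as described in this review, does not specify this.
    """
    l1_term = sum(abs(p - t) for p, t in zip(predicted_ac, true_ac)) / len(true_ac)
    l2_term = sum(w * w for w in weights)
    return l1_term + lam * l2_term


# Toy numbers, invented for illustration only.
toy_loss = objective([0.5, 0.7], [0.6, 0.7], weights=[1.0, -2.0], lam=0.01)
```

Stating the regularized quantity explicitly in the paper, in whatever form the authors actually used, would resolve the ambiguity.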
Please state your overall opinion of the paper

accept (8)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The explanations are clear and there are 3 independent validations of the technology, the simulations, phantoms and in vivo. The results indicate improvement. The results
What is the ranking of this paper in your review stack?

2
Number of papers in your stack

5
Reviewer confidence

Very confident

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

The paper addresses an important clinical problem of diagnosing hepatic steatosis. This is done using ultrasound imaging and deep learning based attenuation estimation, and evaluated using simulated, phantom, and in-vivo images. All reviewers agree on the value of presenting this work at MICCAI, but they also pose points that need to be addressed in the final version.

Additionally, it would be good to report how sensitive the in-vivo results are to ROI selection and how the ROIs were selected for the given in-vivo study, as Fig. 6 shows varying locations for ROIs. Also, since this proposed regression method creates a single value, it would be good to elaborate why some ROIs appear to have varying color values.
What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

3

Author Feedback

We appreciate your time and valuable comments. We will update the final manuscript to cover the posed points.

Reviewer #1 – Q. Confusion over the number of steering angles used.
A. The number of steering angles used is 7.

Reviewer #1 – Q. The authors claim that “RAN-based networks experience only 0.0525 dB/cm/MHz drop in MAE (see Fig. 4 (b)) irrespective of the ROI depth”. However, a drop in MAE with respect to the ROI depth does appear.
A. There was a typo. In the final manuscript, the mentioned sentence will be replaced by “RAN-based networks experience only 0.0525 dB/cm/MHz drop in MAE (see Fig. 4 (b)) depending on the ROI depth.”

Reviewer #2 – Q. Concerns about re-generating the simulation trainset when physical probe settings are changed.
A. If the physical probe settings are changed, the trainset needs to be re-generated.

Reviewer #3 – Q. Please correct the upper range for fat unless you have reasons for giving fat such high values.
A. For the generality of the simulation dataset, the sound speed of the object representing subcutaneous fat (actually including skin and muscle) was set in the range of 1400 m/s to 1700 m/s.


Learning-based attenuation quantification in abdominal ultrasound