
Authors

Wenting Jiang, Yicheng Jiang, Lu Zhang, Changmiao Wang, Xiaoguang Han, Shuixing Zhang, Xiang Wan, Shuguang Cui

Abstract

Automatic segmentation of hepatocellular carcinoma (HCC) in Digital Subtraction Angiography (DSA) videos can assist radiologists in efficient diagnosis of HCC and accurate evaluation of tumors in clinical practice. Few studies have investigated HCC segmentation from DSA videos. The task is highly challenging due to motion artifacts during filming, ambiguous boundaries of tumor regions, and the high visual similarity of tumors to other anatomical tissues. In this paper, we introduce the problem of HCC segmentation in DSA videos and build our own DSA dataset. We also propose a novel segmentation network called DSA-LTDNet, which includes a segmentation sub-network, a temporal difference learning (TDL) module, and a liver region segmentation (LRS) sub-network that provides additional guidance. DSA-LTDNet proactively learns the latent motion information in DSA videos and boosts segmentation performance. All experiments are conducted on our self-collected dataset. Experimental results show that DSA-LTDNet increases the Dice score by nearly 4% compared to the U-Net baseline.
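As a rough, hypothetical sketch of the kind of design the abstract and reviews describe (three sub-networks jointly trained with a combined loss, with learned motion and liver-region cues guiding the tumor segmentation), one might write something like the following. The sub-network structures, fusion step, supervision targets, and loss weights are placeholders for illustration, not DSA-LTDNet's actual architecture.

```python
# Minimal, hypothetical sketch of jointly training three sub-networks with a
# combined loss (not the paper's actual DSA-LTDNet implementation).
import torch
import torch.nn as nn

def soft_dice_loss(logits, target, eps=1e-6):
    """Soft Dice loss on sigmoid probabilities for binary masks."""
    prob = torch.sigmoid(logits)
    inter = (prob * target).sum(dim=(2, 3))
    union = prob.sum(dim=(2, 3)) + target.sum(dim=(2, 3))
    return 1.0 - ((2.0 * inter + eps) / (union + eps)).mean()

class TinySegNet(nn.Module):
    """Placeholder stand-in for a U-Net-style sub-network."""
    def __init__(self, in_ch, out_ch=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(16, out_ch, 3, padding=1),
        )
    def forward(self, x):
        return self.net(x)

# Three sub-networks: temporal difference learning (TDL), liver region
# segmentation (LRS), and a final segmentation network that fuses their outputs.
tdl = TinySegNet(in_ch=1)
lrs = TinySegNet(in_ch=1)
seg = TinySegNet(in_ch=3)   # key frame + motion map + liver map, concatenated

params = list(tdl.parameters()) + list(lrs.parameters()) + list(seg.parameters())
opt = torch.optim.Adam(params, lr=1e-4)

# Dummy batch: key frame, frame-difference target, liver mask, tumor mask.
key_frame = torch.rand(2, 1, 128, 128)
fd_target = torch.rand(2, 1, 128, 128)
liver_gt = (torch.rand(2, 1, 128, 128) > 0.5).float()
tumor_gt = (torch.rand(2, 1, 128, 128) > 0.5).float()

motion = tdl(key_frame)                               # learned motion cue
liver = lrs(key_frame)                                # learned liver region
fused_in = torch.cat([key_frame, motion, liver], 1)   # naive channel fusion
tumor_pred = seg(fused_in)

# Combined loss: main tumor Dice plus small auxiliary terms (weights assumed).
loss = (soft_dice_loss(tumor_pred, tumor_gt)
        + 0.1 * nn.functional.mse_loss(torch.sigmoid(motion), fd_target)
        + 0.1 * soft_dice_loss(liver, liver_gt))
opt.zero_grad()
loss.backward()
opt.step()
```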

Link to paper

DOI: https://doi.org/10.1007/978-3-030-87240-3_2

SharedIt: https://rdcu.be/cyl5u

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper defines the task of hepatocellular carcinoma segmentation in DSA videos. A new dataset is collected and DSA-LTDNet is proposed.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper proposes the task of hepatocellular carcinoma segmentation in DSA videos, collects a new dataset for this task, and introduces several baselines along with a new method to improve performance.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The writing of the paper is poor and some of the sentences are hard to understand, e.g., “we firstly calculate the difference image between 2 adjacent frames in the last 15 frames of DSA videos and sum to gain the total pixel values.”.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Should be reproducible.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

    The writing of the paper can be improved.

  • Please state your overall opinion of the paper

    Probably accept (7)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
    1. The key frame is selected from the last 15 frames. Why not select it from all frames?
    2. The frame difference is calculated between frames K-9 and K. How did the authors choose the parameter value of 9?
    3. The authors choose 2 more consecutive frames for training. Is there any evidence that using these two additional frames will not cause problems?
    4. The authors claim that they did not use data augmentation due to the special anatomical knowledge of HCC and the liver area. However, during the capturing of a DSA video there will always be motion influence, so I do not think data augmentation would cause any problems.
  • What is the ranking of this paper in your review stack?

    2

  • Number of papers in your stack

    4

  • Reviewer confidence

    Confident but not absolutely certain



Review #2

  • Please describe the contribution of the paper

    The contributions of the paper are twofold. First, the authors develop a large database of hepatocellular carcinoma (HCC) angiography videos, annotated with segmentations on the key frames. Second, they propose a segmentation method that combines anatomical landmarks and temporal features to generate HCC segmentations.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The strengths of the paper can be summarized as follows:

    • Tackling of an important problem with relatively low attention
    • Use of a network architecture with three different segmentation networks jointly trained with a combined loss function
    • Use of spatial and temporal features
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Weaknesses

    • The validation could be improved. For instance, Table 1 reports average Dice scores, with the increment with respect to the baseline in parentheses. It would be a better evaluation to include the variability of the Dice scores. It is also unclear to me whether the improvement from the baseline (U-Net) to the proposed method is statistically significant; statistical tests could be used to establish this.
    • Presentation. The authors refer to their method as DSA-LTDNet, but this name is not used in the evaluation tables.
    • The use of temporal features through a U-Net that is trained against the frame difference is hard to follow. Do the authors train the U-Net using the frames as input and a difference between two frames as output? The network should be able to learn that function with one convolution. I believe the authors do something smarter, but it is hard to tell from the text.
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
    • The community could benefit from this data, should the authors want to make it available.
    • There is no information about the ethical committee approval for the dataset.
  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

    In order to improve the paper, potentially for a journal version, I would like to recommend that the authors:

    • Describe the dataset in more detail. How big are the tumors? How many are there on average?
    • Improve the validation. Report, beyond an aggregated Dice, a per-tumor Dice, and use statistical tests (see the sketch below).
    • Improve the explanation of the use of temporal features.
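    As a concrete illustration of the statistical test suggested above (not something reported in the paper), per-case Dice scores of the baseline and the proposed method could be compared with a paired Wilcoxon signed-rank test; the score arrays below are hypothetical.

```python
# Hypothetical example: paired significance test on per-case Dice scores.
import numpy as np
from scipy.stats import wilcoxon

# Per-case Dice scores for the same test cases (values are made up).
dice_baseline = np.array([0.62, 0.71, 0.68, 0.74, 0.65, 0.70, 0.73, 0.66])
dice_proposed = np.array([0.66, 0.74, 0.73, 0.75, 0.70, 0.72, 0.78, 0.69])

# Wilcoxon signed-rank test on the paired differences.
stat, p_value = wilcoxon(dice_proposed, dice_baseline)
print(f"mean Dice: baseline {dice_baseline.mean():.3f}, "
      f"proposed {dice_proposed.mean():.3f} +/- {dice_proposed.std():.3f}")
print(f"Wilcoxon statistic = {stat:.3f}, p = {p_value:.4f}")
```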

  • Please state your overall opinion of the paper

    Probably accept (7)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    It is an interesting, mostly unspoken, problem that can benefit from further work. The use of anatomical landmarks, motion features, and jointly trained segmentation networks is interesting.

  • What is the ranking of this paper in your review stack?

    1

  • Number of papers in your stack

    4

  • Reviewer confidence

    Very confident



Review #3

  • Please describe the contribution of the paper

    This paper proposes a deep convolutional network model to segment HCC in DSA videos. To boost segmentation performance, a temporal difference learning network is designed to capture the distinct motion information of tumors, and prior anatomical knowledge is introduced to locate tumor regions.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The topic of the paper is of great significance. The comparative experiment is reasonable. The amount of experimental data is sufficient.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The proposed method has serious logic problems. The TDL and LRS networks designed by the authors serve no purpose in the overall model. According to the authors' description, the results of these two networks are sent to the FFS network for subsequent segmentation. However, the gold standards for the outputs of these two networks are known, so why send the networks' approximations of the gold standard to the subsequent network rather than the gold standards themselves?

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The reproducibility of the article is low, and the authors should provide data and code.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

    The authors should add a set of segmentation results obtained by directly inputting FD, LM, and KF into the FSS network. If possible, the authors should also conduct experiments on public datasets.

  • Please state your overall opinion of the paper

    Probably reject (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The method's innovation is limited, and the proposed method has serious logic problems. Without a public dataset for validation, the experimental results are not reliable.

  • What is the ranking of this paper in your review stack?

    3

  • Number of papers in your stack

    4

  • Reviewer confidence

    Confident but not absolutely certain




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The reviewers agreed that the paper has some merits, but all reviewers think the writing of the paper is quite poor and the experimental validation needs to be further improved.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    5




Author Feedback

We thank all reviewers for the constructive suggestions. We are happy that our work is evaluated as ‘interesting and significant’ in idea (R2&R3), ‘novel and interesting’ in solution (R2), ‘reasonable’ in comparative experiment (R3), and most importantly, as ‘an interesting, mostly unspoken, problem that can benefit from further work’ (R2).

Writing & Reproducibility (R1&R2&R3): We will polish our writing and clarify the content in the revised version. We are working to make our data and code public.

Key frame selection (R1): Radiologists' experience implies that tumors do not appear before the last 15 frames of DSA videos. Thus, we select key frames from the last 15 frames.

Why K-9 (R1): Experimental results led us to choose 9. Results for parameter values of 5, 7, 9, 11, and 13 are as follows: 71.68%, 71.79%, 72.32%, 71.06%, 70.80%. The model with K-9 gives the best result, so we use K-9. We believe 9 works best because it yields a larger difference than 5 or 7 and less noise than 11 or 13.

2 more consecutive key frames (R1): We selected 2 more consecutive frames for data augmentation and asked an experienced radiologist to confirm and annotate all chosen frames. We also conducted an experiment using a single chosen key frame from each DSA video for training; the result (67.87%) is slightly worse than the current baseline.

Data augmentation (R1): We tried the usual data augmentation methods; our experiments show that none of them improved segmentation. Spatial augmentation methods such as flipping or rotation are not suited to our task: radiologists' experience indicates that livers and liver tumors are usually located in the upper left part of DSA images, and flipping or rotation would break this spatial structural prior. In addition, experiments with color transformations (e.g., contrast transform, 70.81%) are comparable to the current baseline without augmentation.

Tumor size & statistical test (R2): The range of tumor sizes is [6.5, 217] pixels and the average is 69.37 pixels. For the test results of DSA-LTDNet, the average per-tumor Dice is 72.65%, which is better than that of the baseline (69.23%). We divided tumors into 3 size ranges, [9.5, 90], [90, 160], and [160, 217]; the corresponding average Dice scores for the 3 ranges are 67.53%, 88.32%, and 91.32%.

TDL (R2&R3): We tried three commonly used ways to represent frame motion in DSA videos: optical flow, frame difference, and background subtraction. Table 2 shows their results, and baseline+FD performs best (71.73%). Motion information such as FD is simple and direct, whereas humans can extract more complex motion information. Thus, we propose a TDL network to learn better motion implicitly. TDL without supervision is difficult to converge (69.90%). We initialize TDL guided by FD and fine-tune it with a relatively small loss in DSA-LTDNet. Baseline+TDL with supervision (72.32%) achieves the best result in Table 2.

LRS (R3): If we replace LRS with the ground-truth liver masks, the result is 72.13%, similar to baseline+LRS (72.01%). In that case, doctors would need to spend extra time and effort delineating ‘gold standards’ during clinical use. To meet practical needs, an end-to-end approach that requires no extra manual annotation at inference time is the mainstream for CAD systems, as in our architecture.
Logic problems & method innovation (R3): Our method's novelties lie in: 1) adopting anatomical priors and learnable temporal features to segment HCC; 2) using an end-to-end network architecture with three sub-networks jointly trained with a combined loss function, both of which are appreciated by R2.

Directly inputting FD, LM and KF into the FSS network (R3): If we only input KF (the key frame) into FSS, it is exactly the baseline (70.75%). Baseline+FD (calculated), Baseline+TDL with supervision (FD learned), Baseline+LM (annotated), Baseline+LRS (LM learned), and Baseline+FD+LM (both calculated) are as follows: 71.73%, 72.32%, 72.13%, 72.01%, 72.34%.

Public dataset (R3): To our knowledge, there is no public DSA dataset.
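For readers trying to follow the key-frame selection and frame-difference steps discussed above (R1's questions and the corresponding responses), the sketch below shows one plausible reading: a key frame is chosen among the last 15 frames by the summed adjacent-frame difference, and the frame difference fed to the model is computed between frames K-9 and K. The selection rule and all function names are assumptions for illustration, not the authors' released code.

```python
# Hypothetical sketch of key-frame selection and frame-difference computation
# for a DSA video, following one plausible reading of the paper's description.
import numpy as np

def select_key_frame(frames: np.ndarray, window: int = 15) -> int:
    """Pick a key frame among the last `window` frames.

    For each adjacent pair in the window, the absolute difference image is
    computed and summed to a single value; the frame with the largest summed
    difference is returned (the actual selection rule in the paper may differ).
    """
    n = frames.shape[0]
    start = max(n - window, 1)
    scores = {k: np.abs(frames[k] - frames[k - 1]).sum() for k in range(start, n)}
    return max(scores, key=scores.get)

def frame_difference(frames: np.ndarray, k: int, offset: int = 9) -> np.ndarray:
    """Frame-difference image between frame k and frame k - offset (default 9)."""
    return np.abs(frames[k].astype(np.float32) - frames[k - offset].astype(np.float32))

# Dummy DSA video: 40 frames of 256 x 256 grayscale values.
video = np.random.rand(40, 256, 256).astype(np.float32)
k = select_key_frame(video)
fd = frame_difference(video, k)
print(f"selected key frame index: {k}, frame-difference range: "
      f"[{fd.min():.3f}, {fd.max():.3f}]")
```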




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors have clarified the key novelties of this paper (i.e., the main concern of R3) in the rebuttal letter: adopting anatomical priors and learnable temporal features to segment HCC, and using an end-to-end network architecture with three sub-networks jointly trained with a combined loss function. In addition, the authors clarified the misunderstandings of R3 which led to a low score. The claimed novelties and the clarifications are reasonable to the AC.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    7



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This paper proposes a segmentation method for HCC. The method uses anatomical landmarks and temporal features extracted from angiography videos. The method is interesting, and the topic is relevant. A large, annotated database for HCC has been developed for evaluation. Reviewers raised concerns about clarity and novelty; in the rebuttal, these concerns were satisfactorily addressed.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    4



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The manuscript is valuable and definitely has merit over the state of the art. The authors have properly responded to the criticism by Reviewer #3, which, in my view, is well grounded and could serve to improve the manuscript, but is not sufficient to reject a good-quality manuscript. Aggression and a demeaning tone in reviews or rebuttals are always out of place.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    5


