
Authors

Yingda Xia, Jiawen Yao, Le Lu, Lingyun Huang, Guotong Xie, Jing Xiao, Alan L. Yuille, Kai Cao, Ling Zhang

Abstract

Pancreatic cancer is a relatively uncommon but highly deadly cancer. Screening the general asymptomatic population is not recommended, due to the risk that a significant number of false-positive individuals may undergo unnecessary imaging tests (e.g., multi-phase contrast-enhanced CT scans) and follow-ups, greatly adding to health care costs with no clear patient benefit. In this work, we investigate the feasibility of using a single-phase non-contrast CT scan, a cheaper, simpler, and safer substitute, to detect resectable pancreatic masses and classify each detection as pancreatic ductal adenocarcinoma (PDAC), other abnormalities (nonPDAC), or normal pancreas. This task is usually poorly performed by general radiologists and even pancreatic specialists. With pathology-confirmed mass types and knowledge transfer from contrast-enhanced CT to non-contrast CT scans as supervision, we propose a novel deep classification model with an anatomy-guided transformer. After training on a large-scale dataset of 1321 patients (450 PDACs, 394 nonPDACs, and 477 normal), our model achieves a sensitivity of 95.2% and a specificity of 95.8% for the detection of abnormalities on the holdout testing set of 306 patients. The mean sensitivity and specificity of 11 radiologists are 79.7% and 87.6%. For the 3-class classification task, our model outperforms the mean radiologist by absolute margins of 25%, 22%, and 8% for PDAC, nonPDAC, and normal, respectively. Our work sheds light on a potential new tool for large-scale (opportunistic or designed) pancreatic cancer screening, with significantly improved accuracy, lower test risk, and cost savings.

Link to paper

DOI: https://doi.org/10.1007/978-3-030-87240-3_25

SharedIt: https://rdcu.be/cyl5Z

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    The authors developed an automatic method to classify pancreatic ductal adenocarcinoma (PDAC), non-PDAC tumors, and normal pancreas.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The problem is clinically important

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. What do they mean by “non-PDAC” tumors? As different non-PDAC tumors may appear differently in radiologic imaging, what is the logic behind grouping different non-PDAC tumors into one class?
    2. Training performance is missing.
    3. The authors have used a two-stage algorithm, where in the 1st stage abnormal was distinguished from normal, and in the 2nd stage abnormal was categorized into PDAC vs. non-PDAC. I am curious how the authors handled cases misclassified by the 1st stage when evaluating the 2nd stage.
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
    1. More details on the parameters are required, as well as how changes to the parameters affect performance.
    2. Since the lesions were manually segmented, the annotations may suffer from inter-observer variability.
  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
    1. See items 4 and 6.
    2. Please report whether there is a statistically significant difference between the best human reader and the algorithm.
  • Please state your overall opinion of the paper

    borderline accept (6)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The problem is relevant, but much of the information needed for clinical application is missing.

  • What is the ranking of this paper in your review stack?

    2

  • Number of papers in your stack

    4

  • Reviewer confidence

    Very confident



Review #2

  • Please describe the contribution of the paper

    This paper investigates the feasibility of using non-contrast CT scans (instead of multi-phase contrast CT scans) to detect and classify pancreatic lesions. The method includes a coarse localization step and a classification module, which is based on a vision transformer.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The paper is easy to follow
    • Large and diverse evaluation dataset
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • Some conclusions and claims do not appear well supported.
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The paper appears to provide enough details to reproduce the results.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
    • Something that catches the eye: segmentation of the pancreas and pancreatic lesions, as well as their classification, is a popular research topic, and many different methods have been developed for these tasks. However, the Related Work section appears to primarily cite methods, often very similar, developed by one particular group (7 out of 13 citations). Since some of those papers describe extensions or modifications of the same approach, I suggest editing the Related Work section to introduce the reader to a more diverse set of methods.
    • Page 6, reader study: a one-sentence description of the reader study does not provide sufficient detail to understand the study design, the tasks, the instrumentation, the independent, dependent, and confounding variables, and more. For example, how was the task formulated? Were the radiologists using the same PACS viewers they use in practice? What were their experience levels? Were they aware of the age and gender of each patient? These demographic characteristics are considered important diagnostic clues. While I understand that this study was not the main objective of the paper, the claims and conclusions that the authors drew from it do not appear well supported.
    • Page 6: “A detection is considered successful if the intersection over the ground truth is > 0%”: please describe which metric was used to measure the overlap (e.g., Dice, IoU), and please elaborate on why such a low threshold (essentially 0) was chosen. Is a one-voxel overlap between two 3D masks of 8,000 voxels each (20×20×20) considered successful?
  • Please state your overall opinion of the paper

    Probably accept (7)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
    • Some conclusions and claims do not appear well supported.
  • What is the ranking of this paper in your review stack?

    3

  • Number of papers in your stack

    3

  • Reviewer confidence

    Very confident



Review #3

  • Please describe the contribution of the paper

    The authors developed a method for the screening of different pancreatic tumors.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The problem is demanding

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    See item 7

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    No

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
    1. The authors classified PDAC tumors, non-PDAC tumors, and normal pancreas. What cases are included in the category of non-PDAC tumors? Only PNET, or PNET plus cysts?
    2. Please provide a justification for grouping the different non-PDAC entities into a single category, because different non-PDAC tumors may appear differently on CT.
    3. More details about the vision transformer are required, particularly how it was used for the current application.
    4. Training performance is missing.
  • Please state your overall opinion of the paper

    probably reject (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
    1. Limited novelty.
    2. Lack of information for implementation of the method
  • What is the ranking of this paper in your review stack?

    4

  • Number of papers in your stack

    5

  • Reviewer confidence

    Very confident



Review #4

  • Please describe the contribution of the paper

    1. This is the first work aimed at detecting pancreatic cancer in non-contrast CT scans. 2. The manuscript proposes an Anatomy-Aware Hybrid Transformer for the segmentation and classification of pancreatic cancer.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    1. The clinical contribution of the research is novel. 2. The adopted dataset is large enough to demonstrate the effectiveness of the proposed method.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Only one radiologist performed the manual annotations of pancreatic cancer, so the dataset cannot exclude intra-observer error in the annotation process.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Well.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

    1. The manuscript claims to be “outperforming the mainstream ‘Segmentation for Classification’ paradigm”, but only “S4C with UNet” is compared; more experiments on S4C variants should be conducted. 2. The format of the references should be consistent, including the abbreviation of conference names and the capitalization of journal names.

  • Please state your overall opinion of the paper

    Probably accept (7)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    1. The overall logic of the manuscript is very good, and the description of the proposed method is clear. 2. The performance of the Anatomy-Aware Transformer exceeds that of an experienced radiologist in the detection and classification of pancreatic masses.

  • What is the ranking of this paper in your review stack?

    1

  • Number of papers in your stack

    2

  • Reviewer confidence

    Confident but not absolutely certain




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This work presents a Transformer-based approach to classify pancreatic cancer using non-contrast CT images. Specialists mainly rely on contrast CT for the diagnosis of pancreatic cancer, so this is an interesting work proposing to use non-contrast images. Overall, the manuscript is well written and easy to follow. The authors evaluated their proposed approach on a relatively large dataset with a good mix of normal and abnormal cases. I have a few major concerns, as follows:

    1) The proposed approach heavily relies on the segmentation of the pancreas and the mass. However, the authors did not provide any report on the accuracy and efficiency of the segmentation module.
    2) Pancreatic cancer detection has been studied in the past, and numerous automated approaches have been proposed for tackling it. Nevertheless, the authors selected only a couple of baselines for comparison.
    3) Based on the results provided in Table 1, there does not seem to be a significant difference between the selected baseline and the proposed approach.
    4) Since this approach is proposed to replace contrast CT, it would be interesting to see a comparison between the proposed approach and radiologists'/specialists' performance using contrast CT. I am not sure it is fair to conclude that the AI outperforms humans, since the humans were not trained to perform diagnosis using non-contrast CT.
    5) An ablation study is recommended, adding the STD to the performance results, to evaluate the sensitivity of the proposed approach w.r.t. the data split.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    4




Author Feedback

We thank all reviewers and AC for their constructive feedback.

The main contributions of this work are: 1) For the first time, non-contrast CT (NCCT) is proposed and validated as an effective imaging modality for full-spectrum taxonomy of pancreatic mass/disease screening using deep learning. This sheds light on new computing tools for large-scale opportunistic or designed pancreatic cancer screening, with (significantly) improved accuracy, lower test risk, and cost savings. 2) We transfer the manual annotations from contrast-enhanced CT (CECT) to NCCT as supervision, to train an anatomy-aware hybrid transformer as our model. 3) Our framework is trained/validated on a large-scale dataset of 1627 patients and outperforms a group of expert (abdominal/pancreas) radiologists by evidently large margins. 4) Note that this work is not aimed at replacing the CECT-based differential diagnosis protocol (confirming pancreatic tumor findings), but at complementarily screening/detecting/flagging suspicious patients using NCCT only and referring them for CECT pancreatic follow-up exams, to jointly improve cancer patient care.

In the following, we will address the main comments raised for better clarity.

  1. Segmentation performance. Segmentation masks are generated by registration from CECT, where radiologists can see the mass more easily, so they are not perfectly aligned with the NCCT. Note that our main task is to detect positive patients at the patient level, and is thus robust to imperfect segmentation. The cross-modality mapped (surrogate) masks serve as supervision for the segmentation branch, trained jointly with the classification loss, which is supervised by pathology-confirmed labels. DSC scores between our predictions and the transferred (imperfect) masks on the test set are 0.81/0.45 for pancreas/tumor. As a reference, on Task07 of the MSD challenge (CECT with manual annotations), the winning nnU-Net model obtained 0.80/0.52 for pancreas/tumor, respectively.
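  For readers unfamiliar with the DSC figures quoted above, a minimal illustrative sketch of the Dice similarity coefficient between two binary masks follows (this is a standard formula, not the authors' released implementation):

```python
import numpy as np

def dice(pred: np.ndarray, ref: np.ndarray) -> float:
    """Dice similarity coefficient between two binary masks.

    DSC = 2 * |pred ∩ ref| / (|pred| + |ref|), in [0, 1].
    """
    pred = pred.astype(bool)
    ref = ref.astype(bool)
    denom = pred.sum() + ref.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(pred, ref).sum() / denom
```

  For example, a prediction covering two voxels that overlaps a one-voxel reference in exactly one voxel gives DSC = 2·1/(2+1) ≈ 0.67.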

  2. SOTA performance on CECT. Zhu et al. [28, MICCAI 2019] reported 94.1% sensitivity (PDAC) and 98.5% specificity (normal) on CECT. Our method achieved 95.2% and 95.8% (PDAC and nine other types of nonPDAC versus normal) on NCCT, showing comparable performance. Due to the space limit, we mainly compare against the previous SOTA approach [28].

  3. Performance improvements over the baseline. In Table 1, the two-class “abnormal vs. normal” classification is our main metric for screening purposes. We achieved a 3.8% absolute increase in sensitivity while maintaining the same specificity as our reimplemented baseline of [28].
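  As a reference for how the patient-level sensitivity/specificity figures discussed throughout are defined, here is a minimal sketch (illustrative confusion-matrix ratios under the convention 1 = abnormal/positive, 0 = normal/negative; the example labels are made up, not the paper's data):

```python
def sens_spec(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical predictions on five patients:
sens, spec = sens_spec([1, 1, 1, 0, 0], [1, 1, 0, 0, 1])
# sens = 2/3 (two of three abnormal cases flagged), spec = 1/2
```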

  4. Comparison with human performance on CECT. Contrast-enhanced CT scans were developed because the low image contrast of NCCT is not ideal for human vision. As reported in [Annals of Surgery 2009; DOI: 10.1097/SLA.0b013e3181b2fafa; Sec. “Results”, Parag. 1], on CECT the pancreatic cancer diagnostic sensitivity/specificity (PDAC vs. nonPDAC+normal) of radiologists is 82%/66%, while ours on NCCT is 79%/89% under the same two-class setting, demonstrating the strong potential of using NCCT. NCCT is widely used as the first-line imaging modality to detect/screen various lesions, especially in low- and middle-income countries, likely due to the absence of a reliable supply chain of contrast agents and insufficient expertise (Lancet Oncology 2021; doi.org/10.1016/S1470-2045(20)30751-8).

  5. Adding the STD of performance. We will add this information in the revised manuscript. Our preliminary results show that the STDs of sensitivity and specificity are less than 1%.

  6. Missing details of nonPDAC. NonPDAC includes nine subtypes of abnormalities [24, CVPR 2021]. PDAC has the highest priority among all pancreatic abnormalities, with a 5-year survival rate of ~10% [Grossberg et al., CA 2020], and is the most common type (about 90% of all pancreatic cancers). This is the main reason we currently group all abnormalities into the two classes of PDAC and nonPDAC. This notation will be added in the revision.




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    Authors adequately addressed reviewers’ concerns and comments in their rebuttal response. The tackled problem and the proposed solution are both very relevant to MICCAI and in my opinion, of interest to MICCAI audience.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    3



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The 3.8% improvement is based on the authors' reimplementation, which may not be very accurate due to optimized parameter settings. Also, I do not think adding another column or two for comparison with other SOTA methods would be challenging, despite the limited space. The authors also ignored other critical issues in their rebuttal, including the statistical analysis, the training performance, and the effect of misclassified instances from the first stage. They focused on the contributions of the work, which were not questioned by the AC or the reviewers.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Reject

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    13



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    I found the paper to be very clear and quite interesting, providing a detailed description of the framework and focusing on the radiologic comparison in the screening process using non-contrast CT, which is clinically relevant. The rebuttal appropriately answered several of the important questions and criticisms. I agree that some statements, such as outperforming radiologist performance, can be somewhat inappropriate, given that there is no clear benchmark here; however, I believe this paper will lead to interesting discussions on interpretation and workflow integration.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    1


