Authors
Golara Javadi, Samareh Samadi, Sharareh Bayat, Samira Sojoudi, Antonio Hurtado, Silvia Chang, Peter Black, Parvin Mousavi, Purang Abolmaesumi
Abstract
Motivation: Accurate detection of prostate cancer using ultrasound data is a challenging yet highly relevant clinical question. A significant roadblock for training accurate models for cancer detection is the lack of high-resolution histopathology labels that correspond to the presence of cancer in the entire imaging or biopsy planes. Histopathology reports only provide a coarse representation of cancer distribution in an image region; the distribution of cancer itself is only approximately reported, making labels generated from these reports very noisy. Method: We propose a multi-constraint optimization method in a co-teaching framework with two deep neural networks. These networks are simultaneously and jointly trained, where each network updates itself using the data identified by its peer network as less noisy. We add two constraints to the conventional co-teaching framework, based on the statistics of cancer distribution and the noisy nature of the labels. Results: We demonstrate the effectiveness of the proposed learning methodology on a challenging ultrasound dataset with 380 biopsy cores obtained from 89 patients during systematic prostate biopsy. Our results show that our proposed multi-constraint optimization method leads to significant improvements in area under the curve and balanced accuracy over the baseline co-teaching method for the detection of prostate cancer.
Link to paper
DOI: https://doi.org/10.1007/978-3-030-87237-3_65
SharedIt: https://rdcu.be/cymbs
Link to the code repository
N/A
Link to the dataset(s)
N/A
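For context, the co-teaching mechanism the abstract refers to (two networks that each select small-loss, presumably less noisy, samples for their peer) can be summarized with a minimal sketch. Names and the loss choice are illustrative; this is the generic co-teaching step, not the authors' implementation, which additionally applies the two constraints discussed in the reviews and rebuttal below.

```python
import torch
import torch.nn.functional as F

def co_teaching_step(net_a, net_b, opt_a, opt_b, x, y, keep_ratio):
    """One generic co-teaching update on a mini-batch (x, y).

    Each network ranks the samples by its own loss and hands the keep_ratio
    fraction with the smallest loss (treated as "less noisy") to its peer,
    which then updates only on that subset.
    """
    n_keep = max(1, int(keep_ratio * x.size(0)))

    # Rank samples by per-sample loss for each network (no gradients needed here).
    with torch.no_grad():
        loss_a = F.cross_entropy(net_a(x), y, reduction="none")
        loss_b = F.cross_entropy(net_b(x), y, reduction="none")
    idx_a = torch.argsort(loss_a)[:n_keep]  # small-loss samples chosen by A, to train B
    idx_b = torch.argsort(loss_b)[:n_keep]  # small-loss samples chosen by B, to train A

    # Cross-update: A learns from B's selection and vice versa.
    opt_a.zero_grad()
    F.cross_entropy(net_a(x[idx_b]), y[idx_b]).backward()
    opt_a.step()

    opt_b.zero_grad()
    F.cross_entropy(net_b(x[idx_a]), y[idx_a]).backward()
    opt_b.step()
```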
Reviews
Review #1
- Please describe the contribution of the paper
A prostate cancer classification network for ultrasound imaging is proposed. The weakly supervised problem of prostate cancer detection is reformulated as a multi-constraint optimization problem. A co-teaching framework and an adaptive weighted loss are combined to analyze sequences of ultrasound images.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- Combining co-teaching method and an adaptive weighted loss to manage noisy labels.
- The cancer involvement information is used for adaptive weighted loss to reflect the uncertainty associated with the reported histopathology.
- In Fig. 3, the portion of the red regions on each colormap is correlated with the involvement value for that core.
- In table 1, the proposed method outperforms the baseline co-teaching method.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- There is no explanation of why temporal ultrasound data (in particular, 200 frames) is chosen as the input to the network.
- The performance improvement from the max constraint, proposed as the main technique, is not significant: the AUCs of co-teaching (baseline) and of co-teaching with the max constraint are 0.61 and 0.64, respectively.
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
The provided information is limited, and thus reproducibility is limited.
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
- Lack of description of the network architecture. Details of the CNN and dense layers described in Figure 1 should be provided. In addition, the dimension of the input data should be described.
- The reason for using temporal ultrasound data as the input to the classification network should be supported by evidence.
- Details on how the adaptively weighted loss based on the estimated involvement probability is applied should be provided.
- In Section 2.2, the ROI is set to 2 mm by 18 mm. However, the ROI presented in Figure 3 is different. Details on how the colormap with a variable ROI size is obtained should be provided.
- The results of a conventional prostate cancer classification model should be added as an ablation study.
- Details on the measurement hardware should be provided.
- Please state your overall opinion of the paper
borderline reject (5)
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
It is reasonable to use co-teaching and an adaptive weighted loss based on involvement information to solve the problem of prostate cancer classification. However, since no comparison with existing prostate cancer classification networks is provided, the superiority of the proposed method is unclear. In addition, the effectiveness of the max constraint presented as the main contribution seems very limited. The details of the proposed method are limited in general, for example: 1) how the weight of the loss is adjusted, 2) why the proposed network architecture is chosen, and 3) why temporal data is used as an input to the network.
- What is the ranking of this paper in your review stack?
2
- Number of papers in your stack
5
- Reviewer confidence
Very confident
Review #2
- Please describe the contribution of the paper
This paper applies co-teaching to the prostate cancer detection task, with several improvements. The histopathology report is used as a coarse label for learning, and this coarse label is used to constrain the model through a max constraint and an involvement constraint. Compared with standard co-teaching (baseline) on a private dataset, the authors claim that significant improvements can be observed.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- Interesting and novel cross-modality learning paradigm
- Good performance for the PC detection task
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- The details of the dataset are missing, which makes this paper less convincing.
- Though co-teaching is a strong baseline, many other cross-modality learning approaches could be used; these are not discussed in this paper.
- The proposed method has several hyper-parameters, such as lambda and the set of layers. However, the authors do not explain how they selected these hyper-parameters. What was the search range?
- It would be great if the authors improved the table and figures. The analysis with Fig. 2 looks interesting, but some missing information may make it difficult for the reader to fully understand the flowchart.
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
No code or data link is provided.
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
- The image size input to the neural network does not seem to be given in the paper.
- The batch size is set to 2024; is this a typo?
- Are different feature-generator models used, such as ResNet or DenseNet?
- The paper mentions that any loss function can be used. It would be interesting to compare other loss functions in order to evaluate the generalization of the proposed improvements.
- Please state your overall opinion of the paper
borderline accept (6)
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The paper is well organized, but the implementation details are missing.
- What is the ranking of this paper in your review stack?
2
- Number of papers in your stack
4
- Reviewer confidence
Confident but not absolutely certain
Review #3
- Please describe the contribution of the paper
The paper presents a weakly supervised learning method to train a network using noisy labels.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The contribution of the paper is worth publishing at MICCAI. The proposed loss function sounds reasonable.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
The paper is difficult to follow, and the description of the method is difficult to understand. More visual results could be added.
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
They provided the details of the training but no code/demo is provided.
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
1. Eq. 1: why is max(y^) = y~?
2. One line before Eq. 2: “R(e) is the ratio of a batch that would be selected in each epoch and increases”. I think it decreases, not increases.
3. Please clarify whether the network is 1D or 2D; the ROI is 2D, but 1D convolutions are used.
4. The description of the loss function is difficult to follow.
5. Please include an image similar to Fig. 3 that compares the previous co-teaching method with this method.
- Please state your overall opinion of the paper
Probably accept (7)
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
I think the contribution and the novelty are good, but the paper is difficult to follow, especially the loss function they used.
- What is the ranking of this paper in your review stack?
1
- Number of papers in your stack
3
- Reviewer confidence
Somewhat confident
Primary Meta-Review
- Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.
Reviewer 3 gave a high score but without strong justification, while Reviewer 1 raised concerns about the interpretation of the performance. Therefore, I would like to invite the authors to a rebuttal, in particular to clarify the results demonstrating the significance, both clinically and statistically, of the added value with respect to this relatively small dataset. In addition, please address the comments from Reviewers 1 and 2.
- What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).
4
Author Feedback
The reviewers noted that our approach is a novel learning paradigm, which achieves good performance for prostate cancer detection. They asked for clarification of the results demonstrating their statistical significance with respect to dataset size, comparison with existing methods, and additional details in terms of implementation, the network structure, and the hardware for data acquisition.
CLARIFICATION OF RESULTS The main optimization technique proposed in our work is a combination of two constraints: a) a max constraint that limits the detected cancer ratio in benign cores, which is crucial in our optimization problem because it helps increase the specificity of the model (low specificity is also a well-known issue in the detection of prostate cancer with other imaging modalities such as multi-parametric MRI [1]); and b) an involvement constraint, which uses the involvement reported by pathology to constrain the fraction of detected cancer in each biopsy core. In an ablation study (Figure 4), we show the effect of each of these constraints. Starting with a baseline AUC of 61% with co-teaching, the results improved to 64% by incorporating the max constraint. The results further improved to 70% by applying the involvement constraint, which further shrinks the search space for the max constraint.
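As a rough illustration only (the exact loss formulation, weighting, and thresholds are given in the paper; the function below, its arguments, and the weight lam are hypothetical), the two constraints can be viewed as penalty terms added to the classification loss of each biopsy core:

```python
import torch

def constrained_core_loss(base_loss, p, core_label, involvement, lam=1.0):
    """Hypothetical sketch: add a max constraint and an involvement constraint
    to the classification loss of one biopsy core.

    base_loss    : classification loss already computed for this core
    p            : torch tensor of predicted cancer probabilities for the core's ROIs
    core_label   : 0 for a benign core, 1 for a cancerous core (pathology)
    involvement  : fraction of the core reported as cancerous by pathology (0..1)
    lam          : weight of the constraint terms (assumed hyper-parameter)
    """
    loss = base_loss
    if core_label == 0:
        # Max constraint: a benign core should contain no confident cancer
        # prediction, so penalize the largest predicted cancer probability.
        loss = loss + lam * p.max()
    else:
        # Involvement constraint: the mean predicted cancer probability is a
        # soft estimate of the detected cancer fraction; push it toward the
        # involvement reported by pathology.
        loss = loss + lam * (p.mean() - involvement).abs()
    return loss
```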
DATASET SIZE Our work is in the context of image-guided intervention, where significant effort is normally invested to experimentally acquire the data. Our dataset of 89 patients was obtained over a period of two years and is the largest temporal ultrasound dataset for prostate cancer to date from sextant biopsy patients. We do not consider our dataset small within this context.
COMPARISON WITH EXISTING METHODS The main approaches used in the weak supervision literature are: (a) incomplete supervision, when there are many unlabeled samples; (b) inexact supervision, when labels are available only at image-level vs. pixel-level; and (c) inaccurate supervision, when labels are noisy to some extent [2]. Our proposed approach is a remedy to inaccurate supervision, with key contributions that address the specific problems related to prostate cancer. Our weak supervision approach differs from cross-modality learning methods in the literature since our focus is on learning from the noisy labels of one modality rather than learning from two image modalities.
A review of the literature shows that temporal ultrasound data is promising for the detection of prostate cancer [3,4] compared to methods that use only a single ultrasound image. As a result, we use this protocol as the basis for our data analysis. We have compared our learning paradigm against recent studies [3,4] on similar datasets. Our results indicate that the proposed method yields a statistically significant improvement in performance. We will add these details to the paper. The clinical standard for evaluating the performance of an imaging technology for prostate cancer detection is set by the PROMIS study (740 patients, 11 clinical sites) [1], which compared mp-MRI with ultrasound-guided sextant biopsy. In that study, the reported sensitivity and specificity of mp-MRI for the detection of aggressive cancer were 88% and 45%, respectively, vs. 48% and 99% for sextant biopsy. In comparison, with temporal ultrasound, at a sensitivity of 87%, the specificity of our proposed network is 50%.
ADDITIONAL DETAILS The reviewers asked for additional implementation details including the network structure and the hardware for data acquisition, which we will add to the paper and to supplementary material in the form of a figure. We will expand the description of the figure captions to further elaborate on the details of the methodology.
[1] Ahmed, H., et al. The Lancet, vol. 389, pp. 815-822.
[2] Zhou, Z. National Science Review, vol. 5, pp. 44-53.
[3] Javadi, G., et al. MICCAI 2020, pp. 524-533.
[4] Javadi, G., et al. IJCARS, vol. 15, pp. 1023-1031.
Post-rebuttal Meta-Reviews
Meta-review # 1 (Primary)
- Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.
Overall, this study described an important and interesting application with a promising learning-based approach. The rebuttal argued that 89 patients is not a small dataset, citing difficulties in data acquisition; this is an irrelevant argument, as the reviewers questioned the size from a modern deep-learning perspective, in which a large dataset is required for training. However, the study showed convincingly improved results on this small dataset. I also acknowledge the complex clinical issues surrounding prostate cancer diagnosis, where pinpointing the clinical relevance of a single fixed detection algorithm can be challenging, or arguably beyond the scope of this technical paper. The rebuttal compares the results with those from the PROMIS trial, which adopted a much more saturated biopsy sampling scheme; in my opinion this is an oversimplification, considering the difference in the quality of the reference standard. Nonetheless, this work presents state-of-the-art results in applying learning-based methods to assist prostate cancer diagnosis on temporal ultrasound imaging, with a clear and structured presentation; therefore, I recommend sharing it with the MICCAI community.
- After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.
Accept
- What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).
8
Meta-review #2
- Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.
The reviewers raise questions about the use of temporal ultrasound data, the significance of the results, and the small sample size. The rebuttal provides clarification on these issues. The proposed method combines co-teaching and an adaptive weighted loss, which is interesting and practical. While the dataset size seems small, the authors state that it is the largest temporal ultrasound dataset for prostate cancer from sextant biopsy patients. Taking into account the reviewers' and authors' information, I recommend accepting this paper.
- After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.
Accept
- What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).
8
Meta-review #3
- Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.
The authors' statement about the max constraint in the Experiments and Results section on pages 8-9, “the Max constraint, the network is forced strongly to predict higher probability of benign than cancer for all benign signals”, raises a concern about network bias. Also, results with the involvement constraint as the main technique would be another point to examine, to see which of the constraints has more effect or dominates. For the comparison with the methods in [3] and [4] to be fair, it should be performed on the same datasets, not on similar datasets as mentioned in the authors' rebuttal.
- After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.
Reject
- What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).
12