
Authors

Hongwei Li, Fei-Fei Xue, Krishna Chaitanya, Shengda Luo, Ivan Ezhov, Benedikt Wiestler, Jianguo Zhang, Bjoern Menze

Abstract

Radiomic representations can quantify the properties of regions of interest in medical image data. Classically, they account for pre-defined statistics of shape, texture, and other low-level image features. Alternatively, deep learning-based representations are derived from supervised learning but require expensive annotations from experts and often suffer from overfitting and data imbalance issues. In this work, we address the challenge of learning representations of 3D medical images for an effective quantification under data imbalance. We propose a self-supervised representation learning framework to learn high-level features of 3D volumes to complement existing radiomics features. Specifically, we demonstrate how to learn image representations in a self-supervised fashion using a 3D Siamese network. More importantly, we deal with data imbalance by exploiting two unsupervised strategies: a) sample re-weighting, and b) balancing the composition of training batches. When combining our learned self-supervised feature with traditional radiomics, we show significant improvement in brain tumor classification and lung cancer staging tasks covering MRI and CT imaging modalities.

Link to paper

DOI: https://doi.org/10.1007/978-3-030-87196-3_4

SharedIt: https://rdcu.be/cyl1u

Link to the code repository

https://github.com/hongweilibran/imbalanced-SSL

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper proposes a self-supervised representation learning framework using a 3D siamese network. In this framework, two unsupervised strategies, sample re-weighting (RE) and sample selection (SE), are used to deal with data imbalance.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The unsupervised strategies to deal with data imbalance are novel. This is particularly interesting since the proposed strategies do not require any label information.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. Extending the original 2D network to 3D Siamese network is not novel.

    2. The baselines in the experiments are limited. Most baselines are the components of the proposed framework. Only one baseline is from previous work.

    3. The proposed method uses an unsupervised clustering method to deal with the data imbalance problem, but it is unclear whether cluster membership is related to the sample classes. The analysis is not strong enough.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    I have concerns about the reproducibility related to the k-means section.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
    1. Add analysis about the relationship between the cluster membership with sample classes.

    2. Add previous works as baselines.

    3. The results of Trad.+3DSiam+RE method and Trad.+3DSiam+SE method on two datasets are inconsistent. Which method should we choose for a new dataset?

  • Please state your overall opinion of the paper

    borderline accept (6)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The proposed RE and SE strategies to deal with the data imbalance problem are interesting. But there are concerns about the experiments and analysis; see above for reference.

  • What is the ranking of this paper in your review stack?

    4

  • Number of papers in your stack

    1

  • Reviewer confidence

    Confident but not absolutely certain



Review #2

  • Please describe the contribution of the paper

    The paper tackles the task of extracting meaningful data representations from 3D medical data. For this, the authors propose to use a 3D Siamese network (an extension of an existing 2D approach), which gets two differently augmented versions of the same sample as input and tries to reduce the distance between the resulting feature vectors. To tackle class imbalance, the learned representations are also clustered with the k-means algorithm and re-weighted or reselected during the training pipeline. The learned representations are shown to synergize with radiomic features and to increase their performance on two exemplary tasks in the CT and MRI domains.
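The clustering-based balancing described above can be sketched in a few lines. This is a minimal illustration, assuming per-sample embeddings have already been extracted; the names (`embed`, `n_clusters`, `per_cluster`) and the concrete batch size are illustrative, not from the paper.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
embed = rng.normal(size=(100, 16))   # stand-in for learned 3D features

# Cluster the (unlabeled) representations with k-means.
n_clusters = 4
labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(embed)
cluster_sizes = np.bincount(labels, minlength=n_clusters)

# (a) Re-weighting (RE): samples in small clusters get larger weights.
weights = embed.shape[0] / cluster_sizes[labels]

# (b) Balanced batch selection (SE): draw the same number of samples
# from every cluster to compose a training batch.
per_cluster = 8
batch_idx = np.concatenate([
    rng.choice(np.where(labels == c)[0], size=per_cluster, replace=True)
    for c in range(n_clusters)
])
```

Both strategies require no labels: the cluster assignments act as a proxy for the (unknown) class distribution.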

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • Overall the paper is well written.
    • The Siamese network approach already exists for 2D applications, but showing its potential in 3D is a nice take away.
    • Radiomics suffer from being influenced by technical variation, but the way the pipeline is set up for the Siamese network, the resulting representations are designed to be robust to some of these influences. Therefore, the learned representations and Radiomics make a good combination (although I missed this argumentation in the paper).
    • Due to the self-supervised nature of the approach it can be applied to almost any problem.
    • The authors also showed the potential of their approach on data sets with limited size.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • The two different strategies to tackle class imbalance perform differently well for different tasks and configurations. Therefore, a take-away which method to use or at least a discussion and comparison of the two strategies is missing.
    • The evaluation metrics chosen are hindering comparison to related work.
    • A linear SVM might not be the perfect choice as a classifier. As far as I am aware RF or XGBoost mostly perform best in Radiomics based classification tasks.
    • As visualized in Fig. 1 the rotation augmentation was performed after cropping which leads to cropping artefacts. First rotating and then extracting the region of interest would’ve been cleaner.
    • Calling the resulting feature representations “Radiomic Representations” is a stretch in my opinion.
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The methods are well described and the used data sets are public. Therefore, I see no problems with the reproducibility.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

    As mentioned in S3 it would be nice to have some argumentation in the direction that the resulting features are more robust against technical variance. As mentioned in S4 a discussion regarding the different imbalance settings is missing. The References section does not follow the guidelines (especially regarding the number of authors; some references include an “et al.” after listing 10 authors).

  • Please state your overall opinion of the paper

    accept (8)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    For me the strengths of the paper outweigh the weaknesses by far. In my opinion it introduces some very interesting concepts to the domain of medical image processing and the paper is well written.

  • What is the ranking of this paper in your review stack?

    1

  • Number of papers in your stack

    7

  • Reviewer confidence

    Very confident



Review #3

  • Please describe the contribution of the paper
    • The paper highlights the importance of data imbalance in self-supervised training of medical images, extending a 2D Siamese approach to 3D. To solve this issue, they propose to first cluster the data via k-means and then either weight the samples depending on the size of the cluster they belong to or sample a balanced batch from the clusters. They show this balanced self-supervised training approach is superior to naive self-supervised learning. They also show that representations obtained by balanced self-supervised learning complement 3D radiomics, and that together they outperform even supervised learning in some applications.
  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • Good ablation studies that highlight the importance of balancing batches in self-supervised training.
    • The two proposed methods for balancing the batch are not novel but do the job.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • Why is there a need to use the tumor segmentation in BraTS? Why couldn’t you use the whole volume? You could of course resize it the same way you do for NLST.
    • It seems the data is prepared in a way that helps k-means. For BraTS the segmentation masks are used, which removes a lot of noise from the healthy tissues, and for NLST the lung is extracted, which also helps a lot with clustering. I wonder what the results would be without these pre-processing steps. Keep in mind that in the real world we may not have access to such detailed data curation techniques.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
    • Some implementation details are mentioned.
    • The authors do not provide error bars over the reported results, nor do they mention whether they average over multiple runs with different seeds to obtain the reported results.
  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

    Please refer to the weaknesses. I recommend the authors also try running their method on the full frame without any targeted cropping. In the section “RE/SE Module to Handle Imbalance”, they mention “we assign it a weight of N/fj”. Please be more specific about where the weight is applied. I assume it is in the loss function, but this has to be stated more explicitly.
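On the reviewer’s question of where the N/fj weight enters: a minimal NumPy sketch of the most natural reading, i.e. the weight scales each sample’s contribution to a SimSiam-style negative-cosine Siamese loss. The function name and the exact loss form are assumptions for illustration, not taken from the paper.

```python
import numpy as np

def weighted_siamese_loss(p, z, cluster_ids, cluster_sizes):
    """Negative cosine similarity between two views, re-weighted per sample.

    p, z: (B, D) feature vectors from the two augmented views.
    cluster_ids: (B,) k-means cluster index j of each sample.
    cluster_sizes: (K,) size f_j of each cluster; N = cluster_sizes.sum().
    """
    cos = np.sum(p * z, axis=1) / (
        np.linalg.norm(p, axis=1) * np.linalg.norm(z, axis=1)
    )
    per_sample = -cos                                     # SimSiam-style term
    w = cluster_sizes.sum() / cluster_sizes[cluster_ids]  # weight N / f_j
    return np.mean(w * per_sample)
```

With identical views (cosine similarity 1) and two equally sized clusters of size 2, every weight is 4/2 = 2, so the loss is −2.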

    • One common approach in medical imaging when data is very limited is to use models pretrained on ImageNet. It would be good to compare how the representations extracted from a pretrained ImageNet model compare to those from self-supervised learning when used together with PyRadiomics.
  • Please state your overall opinion of the paper

    borderline accept (6)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper highlights the importance of balancing batches when training self-supervised learning models. I think it can generate discussion at the conference.

  • What is the ranking of this paper in your review stack?

    2

  • Number of papers in your stack

    5

  • Reviewer confidence

    Very confident




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    All reviewers recognize the importance and impact of the approach proposed to deal with data imbalance, as well as the complementarity between handcrafted radiomics features and learned representations. Potential weaknesses of the paper are the use of a linear SVM only and comparisons to a limited set of baselines.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    5




Author Feedback

We thank the reviewers for their valuable comments. We would like to address them in the camera-ready version. Below we respond to the reviewers’ comments in several aspects.

(1) Analysis of the relationship between the cluster centers and sample classes. Thanks for the valuable comments. We will add this analysis to the Supplementary.

(2) Comparison and discussion of the two strategies (R1, R3) Thanks for the constructive comments. We will discuss the two strategies in the Results section.

(3) Choice of classifier (R3) We will add the results of a random forest classifier to the Results section.

(4) Reference format (R3) We will address it in the camera-ready version.

(5) Error bars for the results (R4) Thanks for the constructive comments. We will add the mean and standard deviation over multiple runs with different seeds.

(6) Reproducibility of the k-means section (R1). We will release the code of the two modules.

(7) Clarity on “RE/SE Module to Handle Imbalance” (R4) We will clarify it in the camera-ready version.


