
# Authors

Seung Yeon Shin, Sungwon Lee, Ronald M. Summers

# Abstract

We present a novel unsupervised domain adaptation method for small bowel segmentation based on feature disentanglement. To make the domain adaptation more controllable, we disentangle intensity and non-intensity features within a unique two-stream auto-encoding architecture, and selectively adapt the non-intensity features that are believed to be more transferable across domains. The segmentation prediction is performed by aggregating the disentangled features. We evaluated our method using intravenous contrast-enhanced abdominal CT scans with and without oral contrast, which are used as source and target domains, respectively. The proposed method showed clear improvements in terms of three different metrics compared to other domain adaptation methods that are without the feature disentanglement. The method brings small bowel segmentation closer to clinical application.

SharedIt: https://rdcu.be/cyl4a


# Reviews

### Review #1

• Please describe the contribution of the paper

The key contribution of the paper is a new learning approach for unsupervised domain adaptation. The idea is to disentangle intensity and non-intensity features into two separate encoders through a clever learning framework. The non-intensity representations are then adapted to a new domain using unlabelled data and adversarial training both on feature and output level.

The non-intensity features seem more suitable for adaptation to new domains. The application is on small bowel segmentation in non-contrast CT (target) learned on contrast CT annotations (source domain).

• Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The paper has many strengths with a clever and well motivated learning framework, very thorough and comprehensive analysis (ablation study), and good set of sensible baselines for experimental comparison. The results are promising with some interesting observations such as the finding that supervision on the target domain with little data is inferior to domain adaptation from a different domain where more data is available. This underlines the importance of the proposed approach.

• Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

No major weaknesses. Some related work is missing (see details below) and another weakness is related to the small amounts of test data on the target domain. Not entirely clear if this sample size is sufficient to conduct the statistical tests.

• Please rate the clarity and organization of this paper

Excellent

• Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

Good level of detail is provided for the methodology, and it should be possible to re-implement the approach. The data does not seem to be available, so the experiments won’t be reproducible.

• Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

This is a very solid paper with an interesting methodology and very thorough analysis and experimentation. The use of gradient images is simple yet effective and I was very pleased to see that key comparisons had been carried out to study the contribution of individual components and choices.

The fact that the non-intensity features are better for domain adaptation and that intensity features did not yield improvements are interesting findings that might be valuable in other applications.

Given the large number of hyper-parameters (weights of different losses), it will be difficult to assess how robust the learning framework will be under different experimental settings. A sensitivity analysis would further strengthen the paper.

A key reference (Kamnitsas et al. IPMI 2017) is missing which may be the first work to propose adversarial training for unsupervised domain adaptation in medical image segmentation: https://link.springer.com/chapter/10.1007/978-3-319-59050-9_47

accept (8)

• Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

Solid paper ticking most boxes including a good methodological contribution, thorough analysis and experimentation, and clear presentation.

• What is the ranking of this paper in your review stack?

1

• Number of papers in your stack

5

• Reviewer confidence

Very confident

### Review #2

• Please describe the contribution of the paper

The main contribution of this work is a deep domain adaptation approach for segmenting bowels in contrast and non-contrast CT scans. The authors seem to address the problem of whole bowel segmentation. Most prior works that are focused on radiotherapy treatments typically consider a small portion of the bowel that is adjacent to the treated target. Hence, in terms of the application, there is some novelty here.

• Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The paper tackles the problem of segmenting the whole bowel - small and large bowel, ileum, and all organ structures contiguous with the bowel. This is a challenging problem due to very large deformations and variabilities. In this respect this work is novel. Also, the paper considers the problem of segmenting on both contrast and non-contrast CT scans.

• Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

The method used in the work needs to be explained better. The main methodological contribution as stated by the authors is the feature disentanglement approach. However, the way the method is implemented, it doesn’t quite seem like a disentanglement approach.

• Please rate the clarity and organization of this paper

Poor

• Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The methods as described in the paper are hard to replicate.

• Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

The method used in the work needs to be explained better. The main methodological contribution as stated by the authors is the feature disentanglement approach. However, the way the method is implemented, it doesn’t quite seem like a disentanglement approach. In deep learning, the disentanglement, say between domain-invariant content (anatomical structures) and style (domain-specific features/textures), is derived using the specific encoders. Whereas in this case, this seems to be done by basically using the intensity and the gradient images (considered as non-intensity). These are then passed through separate encoders. This is not feature disentanglement. Also, the rationale for using the intensity encoder to consider every single pixel intensity on its own doesn’t make sense. There is often a lot of appearance variability even in the same domain (say due to differences in contrast uptake, malignancy), so how much this branch helps is unclear. Even the authors say that this branch is optional. So all the work is basically done by the gradient images. But then this is just a different representation derived from the images that is used for segmentation. This then is not a disentanglement approach. Also, how this accomplishes domain-invariant segmentation should be explained better. The losses used should be explained in more detail, and the rationale for those losses should also be explained.

The paper also lacks rigor in citing prior works. There were disentanglement-based methods for multiple abdominal organ segmentation even in last year’s MICCAI. See for example Jiang et al., “Unified Cross-Modality Feature Disentangler for Unsupervised Multi-domain MRI Abdomen Organs Segmentation” and also “PSIGAN: Joint Probabilistic Segmentation and Image Distribution Matching for Unpaired Cross-Modality Adaptation-Based MRI Segmentation”. Another work that considers unsupervised domain adaptation is by Dou Qi, “Unpaired multi-modal segmentation via knowledge distillation” in TMI 2020.

strong reject (2)

• Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The method as described is not a feature disentanglement approach and looks to be incorrect. Overall, the paper needs to be written much more clearly. The results are not compelling, and it is hard to understand why they work because the method is unclear.

• What is the ranking of this paper in your review stack?

8

• Number of papers in your stack

5

• Reviewer confidence

Very confident

### Review #3

• Please describe the contribution of the paper

This paper introduces domain adaptation from oral contrast to non-oral contrast CT scans for small bowel segmentation.

• Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The target problem, small bowel segmentation, can be difficult and time consuming. However, it is not easy to see high value in the domain adaptation from oral contrast to non-oral contrast scans.

• Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

Technical novelty is weak. In the case of intestines, tracks can be more important than overall volume segmentation. From this point of view, the problem they try to solve, domain adaptation from oral contrast to non-oral contrast, is not very attractive and may not be difficult.

• Please rate the clarity and organization of this paper

Satisfactory

• Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

There is no mention of code or data publication.

• Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

reject (3)

• Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

Technical novelty.

• What is the ranking of this paper in your review stack?

4

• Number of papers in your stack

5

• Reviewer confidence

Very confident

# Primary Meta-Review

• Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

This paper has the highest variance in my stack. While R1 sees no major weaknesses and commends the clever learning framework and thorough evaluation, R2 finds the method poorly explained and doubts whether the approach can be considered disentanglement. The latter point appears to be the main reason for their very low score. R3 is also critical of the work, citing a lack of technical novelty and doubting the clinical relevance. Unfortunately, review 3 is rather short and mostly does not justify its claims. I am therefore giving it low weight in my recommendation.

Due to the high variability I would like to give the authors the opportunity to address the negative points in a rebuttal. Most importantly, please discuss the points raised by R2 about whether the approach can be considered disentanglement.

• What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

13

# Author Feedback

We appreciate the reviewers and the area chair for their time and valuable comments. We address the issues raised regarding the validity of feature disentanglement and importance of the problem.

Regarding the validity of feature disentanglement in the proposed method (R#2, ‘This is not feature disentanglement.’), we think that feature disentanglement can be achieved in various ways, as done in the previous works presented in the second-to-last paragraph of Section 1. It can be achieved by: 1) applying desired prior assumptions on the disentangled features [14], 2) modifying the network architecture to induce each branch to extract different types of features [2], and/or 3) applying appropriate transformations to the input of each branch (as exemplified in ‘Unsupervised Part-Based Disentangling of Object Shape and Appearance’, CVPR’19). Our approach shares the same principles as 2) and 3).

Regarding the comments on the roles of the intensity and non-intensity encoders (R#2, ‘Rationale for using the intensity ~’ and ‘All the work is basically done by the gradient ~’), we think there were a number of misconceptions about the overall framework. In the proposed method, disentanglement is achieved using a unique auto-encoding architecture paired with augmented input, not by either one alone, as explained in Fig. 2 and Section 2.2. While the intensity features are extracted by processing each voxel independently using 1x1x1 kernels, the non-intensity features are extracted by feeding the gradient image, instead of the original image, to the non-intensity encoder. In addition to the constraints applied to either the kernel or the input type in each encoder, the reconstruction performed in the proposed network is also an essential component. Fig. 2 of our supplementary material shows the reconstruction results from either the intensity or the non-intensity features. It implies that the non-intensity features carry more information on the small bowel.
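The two input constraints described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `gradient_magnitude` and `pointwise_conv` are hypothetical, the finite-difference gradient magnitude is an assumed stand-in for the paper's "gradient image", and the random volume stands in for a CT scan. The sketch only checks two properties that follow from the description: a gradient input is unaffected by a global intensity offset, and a 1x1x1 kernel maps every voxel independently of its neighbours.

```python
import numpy as np

def gradient_magnitude(volume):
    """Finite-difference gradient magnitude of a 3D volume (assumed operator)."""
    gz, gy, gx = np.gradient(volume.astype(np.float64))
    return np.sqrt(gx ** 2 + gy ** 2 + gz ** 2)

def pointwise_conv(volume, weights, bias):
    """1x1x1 convolution from 1 input channel to C output channels.

    volume: (D, H, W); weights, bias: (C,). Returns (C, D, H, W).
    Each output voxel depends only on the input voxel at the same location.
    """
    return weights[:, None, None, None] * volume[None] + bias[:, None, None, None]

rng = np.random.default_rng(0)
ct = rng.normal(size=(4, 5, 6))      # toy stand-in for a CT volume
w = rng.normal(size=3)               # hypothetical 1x1x1 kernel weights
b = rng.normal(size=3)               # hypothetical biases

# Property 1: a global intensity shift leaves the gradient input unchanged.
grad_same = np.allclose(gradient_magnitude(ct), gradient_magnitude(ct + 100.0))

# Property 2: shuffling voxels and then applying the 1x1x1 kernel gives the
# same result as applying the kernel and then shuffling, i.e. no spatial
# context is used by the intensity stream.
perm = rng.permutation(ct.size)
shuffled = ct.ravel()[perm].reshape(ct.shape)
conv_then_shuffle = pointwise_conv(ct, w, b).reshape(3, -1)[:, perm].reshape(3, *ct.shape)
pointwise_is_local = np.allclose(pointwise_conv(shuffled, w, b), conv_then_shuffle)

print(grad_same, pointwise_is_local)
```

Under these assumptions, the intensity stream can only encode per-voxel values, while the gradient-fed stream sees spatial structure but not absolute intensity levels, which is consistent with the separation the rebuttal describes.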

More specifically on the comment (R#2, ‘Rationale for using the intensity ~’), based on the observation that the intensity features are less transferable across domains, adaptation is performed exclusively on the non-intensity features with the help of the disentanglement in the proposed network. It is optional to use the intensity features for segmentation prediction, as explained in Fig. 2 and Section 2.3. This is justified by ‘Ours + int. feat.’ in Table 1, which does not improve the performance compared to the one without the intensity features (‘Ours’). This coincides with the subsequent comment of R#2 that ‘how much this branch helps is unclear’, so we do not consider it a remaining criticism.

More specifically on the comment (R#2, ‘All the work is basically done by the gradient ~’), all the networks in Fig. 2 are essential for the proposed method except for the optional connection explained above. ‘feat. & out. level DA w/ grad.’ in Table 1 corresponds to the proposed network without the intensity encoder and reconstruction decoder, which is evidently without disentanglement, and we suspect it is the configuration that the reviewer may have confused with the proposed method. It showed a worse result, as explained in the second paragraph of Section 3.1.

Regarding the clinical relevance of the problem (R#3, ‘the problem which they try to solve ~’), it is clinically important to be able to identify the small bowel on scans either with or without oral contrast, since scans could be done either way. This first attempt to develop an unsupervised domain adaptation method for small bowel segmentation becomes more important considering the high difficulty of labeling the small bowel. We are aware that extracting the path would be more difficult than segmentation. However, this is not a reason to discount the clinical benefit of this work or its technical novelty. The segmentation is still useful for detecting lesions and blockages, and for distinguishing the bowels from adjacent lesions in the mesentery.

# Post-rebuttal Meta-Reviews

## Meta-review # 1 (Primary)

• Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

After reading the rebuttal, I believe the major criticisms of R2 to be answered. There appear to have been a number of misconceptions which were adequately clarified by the authors. I am therefore following the “accept” recommendation of R1, who thinks this is a clever learning framework, with a thorough evaluation, good baselines, and promising results.

• After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept

• What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

6

## Meta-review #2

• Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

The response is clear. The authors have addressed the main concerns raised by the reviewers.

• After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept

• What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

4

## Meta-review #3

• Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

This paper presented a framework for unsupervised domain adaptation. The authors answered the ‘disentanglement’ question, but not convincingly. Additionally, I think the question of R#3 concerns the difficulty of the problem in practice, whereas the authors discussed its clinical relevance.

• After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Reject

• What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

12