
# Authors

Ping Wang, Jizong Peng, Marco Pedersoli, Yuanfeng Zhou, Caiming Zhang, Christian Desrosiers

# Abstract

In this paper, we present a Context-aware Virtual Adversarial Training (CaVAT) method for producing anatomically plausible segmentation. The proposed method uses reinforcement learning to include any non-differentiable constraint in a semi-supervised setting. Specifically, we incorporate complex anatomical constraints into the VAT framework, boosting the network’s robustness to adversarial examples that maximize both prediction divergence and constraint violation. Experiments on a well-known cardiac segmentation dataset show the effectiveness of our method in terms of segmentation accuracy and improved constraint satisfaction.

SharedIt: https://rdcu.be/cyhL7

# Reviews

### Review #1

• Please describe the contribution of the paper

The paper describes a method that leverages constraints for semi-supervised training of a deep net to segment images. The crux of the proposed method is an auxiliary loss function, which detects constraint violations in segmentations obtained for images perturbed using the adversarial approach. This loss is stochastic, with gradients computed using the Reinforce rule.

The method is generic and could in theory be applied to any task with structured output and any constraint. In the paper, it is demonstrated to improve the performance of a deep net trained to segment cardiac images from the ACDC data set, when only 3-5% of the training labels are used, with the constraint that each class should form a single connected component.
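To make the constraint mentioned above concrete, here is a minimal sketch of a 0/1 reward that checks whether each class forms a single connected component. It is not taken from the paper: the function names, the use of 4-connectivity, and the choice to treat an absent class as satisfying the constraint are all assumptions made for illustration.

```python
from collections import deque

def count_components(mask, label):
    """Count 4-connected components of `label` in a 2-D list-of-lists mask."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    components = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != label or seen[r][c]:
                continue
            # New unvisited pixel of this class: flood-fill its component.
            components += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny][nx] == label and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return components

def connectivity_reward(mask, labels):
    """Return 1 if every listed class has at most one component, else 0.

    Assumption: a class that is absent from the mask counts as satisfied.
    """
    return int(all(count_components(mask, k) <= 1 for k in labels))

good = [[0, 1, 1],
        [0, 1, 0],
        [2, 2, 0]]
bad = [[1, 0, 1],
       [0, 0, 0],
       [2, 2, 0]]
print(connectivity_reward(good, labels=(1, 2)))  # 1
print(connectivity_reward(bad, labels=(1, 2)))   # 0
```

Such a function is a black box from the point of view of gradient-based training: it returns an integer and has no useful derivative with respect to the network output, which is why a score-function estimator is needed to train against it.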

• Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

1) The idea of using the Reinforce rule to minimize the expected number of constraint violations is simple and generic, yet potentially very powerful. 2) The paper is very clear and well written.

• Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

A) it is a pity that the method is only demonstrated with one constraint. B) … and on one data set. C) The paper does not discuss why the image needs to be perturbed in an adversarial way to run the loss l_cons. It is also not shown experimentally what happens when l_cons is used on the original image. D) I miss a discussion of the limitations of this approach. E) Just for reference, a comparison to another method of semi-supervised segmentation would be nice. This is a minor weakness though, because the advantage of the proposed method is that it accommodates non-differentiable constraints.

• Please rate the clarity and organization of this paper

Excellent

• Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The description of the method is clear enough for me to attempt to implement it.

• Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

I develop here on some weaknesses listed in point 4. ad A) could you suggest other constraints/reward designs for which the method could be used? ad C) what would happen if you evaluated the l_cons loss for r_u=0 and without the VAT loss? Is the adversarial perturbation really necessary in your method? ad D) Are there constraint types for which the loss is ineffective?

Additional question: Q1. one non-obvious weakness is that l_cons only reinforces samples (patches, in this case) that satisfy the constraints. What about penalizing samples that violate the constraints?

Minor comments: i. The role of the subscript i in (7) is not clear until the next section. ii. There is an interrupted sentence at the beginning of 3.2. iii. I cannot find the number of samples used (m) in the paper.

Probably accept (7)

• Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

My recommendation is influenced by two main arguments:

• a simple, generic, well described idea, very compelling
• the method was only demonstrated to work with a single constraint type
• What is the ranking of this paper in your review stack?

2

• Number of papers in your stack

5

• Reviewer confidence

Confident but not absolutely certain

### Review #2

• Please describe the contribution of the paper

This paper presents a semi-supervised learning method which incorporates segmentation constraints into virtual adversarial training to produce anatomically-plausible segmentation. The method is validated on the cardiac image segmentation task.

• Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1. The idea to incorporate segmentation constraint to produce anatomically-plausible segmentation is reasonable.
• Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
1. The rationale of the proposed method was not clearly elaborated. For example, why is reinforcement learning used in the method, and what is its particular advantage?

2. The novelty of the proposed method is incremental over VAT.

3. It seems this paper has not been well prepared. (1) No introduction of the background in the abstract. (2) Incomplete sentences/paragraphs in Section 3.2, which is in fact a crucial part of the method.

4. Semi-supervised medical image segmentation has been a popular topic, with a number of methods proposed recently, including at MICCAI 2020. However, the authors did not compare with any of the recent SOTA methods.

5. How can the authors demonstrate that their method can produce more anatomically-plausible segmentations? Is there any evidence showing the connection between non-connectivity and anatomical plausibility?

6. The method is only validated on one dataset, which may lead to biased conclusions.

• Please rate the clarity and organization of this paper

Poor

• Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The authors claim to release the code.

• Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
1. In fact this paper is closely related to semi-supervised learning, hence the keyword “semi supervised” should be included in the title.

2. A figure could be provided to give a clear illustration for the proposed method.

reject (3)

• Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

My rating is mainly from two perspectives:

1. The technical novelty of this paper is incremental and the use of reinforcement learning has not been well motivated.
2. The paper lacks comparisons with SOTA and the evaluation is only conducted on one dataset.
• What is the ranking of this paper in your review stack?

4

• Number of papers in your stack

4

• Reviewer confidence

Very confident

### Review #3

• Please describe the contribution of the paper

This paper proposes a semi-supervised method with an additional non-differentiable constraint attached to the VAT framework, to enhance model robustness and accuracy. The authors validate its performance on cardiac image segmentation.

• Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

++ The problem they tackled, i.e., how to exploit the shape priors (e.g., the anatomical constraints) in semi-supervised learning, is interesting.

++ A new usage of reinforcement learning, i.e., using reinforcement learning to solve the non-differentiable loss term.

• Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

For methods, there are several ambiguous explanations, which make the paper hard to follow. – In the 2nd term in Eq. 3, how could the constraint violation term lead to better anatomical shapes? – As someone who is not familiar with reinforcement learning, I do not quite understand why using reinforcement learning could solve the non-differentiable loss. Since you used reinforcement learning (RL), what do other commonly used RL terms, including environment and state, refer to in this context? – It would be greatly helpful if there were a brief figure to illustrate your framework and insights. – There may be a typo in Eq. 4. It should be $x_{u}+r_{u}$ instead of $x+r_{u}$, right?

For experiments, I noticed that the improvements over VAT are not significant, i.e., <1% in Dice.

• Please rate the clarity and organization of this paper

Satisfactory

• Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

Since several components are coarsely described with no code available, I am not positive about it.

• Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

It would be great if the authors could elaborate more on several details of the method.

1. Regarding the 2nd term in Eq. 3: (1) How would the constraint violation term help the network learn more complex or better anatomical shapes? (2) Why could using reinforcement learning solve the non-differentiable loss?

2. Regarding Eq. 6, how is the discrete segmentation mask sampled? What exactly does the constraint mean in order to set the reward function to 1 or 0?

3. In the 3rd paragraph of Section 3.2, you mention that you use a constraint to force the segmentation regions to have a single connected component. I was wondering whether it would generalize well to other applications. For cardiac segmentation of organ structures it should be fine, as segmented regions like the LV, Myo and RV should indeed each be one single piece. However, for some lesions, like myocardial infarction regions, which are naturally scattered, is the proposed constraint also applicable?

4. It would be greatly helpful if there were a brief figure to illustrate your framework and insights.

5. In the experiments, I am curious about the performance of VAT + the shape constraint in [28].

probably reject (4)

• Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
1. The explanation of the key component, i.e., the constraint violation, is ambiguous and hard to follow.
2. The performance gains are not significant, i.e., <1% in Dice.
• What is the ranking of this paper in your review stack?

4

• Number of papers in your stack

5

• Reviewer confidence

Somewhat confident

# Primary Meta-Review

• Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

Strengths

• The idea of using reinforcement learning to include non-differentiable constraint to produce anatomically-plausible segmentation is highly compelling.
• The paper is very clear and well written.

Weaknesses

• The method is only validated on one dataset.
• The authors did not compare with any of the recent SOTA methods.
• Novelty of the proposed method is incremental over VAT, with respect to all metrics.

Regarding the remark of Rev#2 « Novelty of proposed method is incremental over VAT. i.e., <1% in Dice. »: I completely agree. Thus I advise the authors not to « boldify » the top 1 result unless statistical significance has been demonstrated; for example, 77.40 and 77.85 are likely to be statistically similar given the standard deviation. However, let me acknowledge that the increment in terms of Dice is not what the authors put forward in the text, as they mainly comment on the non-conn metric. That said, I believe even the experiments with VAT/Entropy min bring interesting insight into how to use adversarial examples to enforce constraints in segmentation.

Despite some justified limits noted by the reviewers, I think the paper has some merit from a methodological perspective, since proposing a framework that allows non-differentiable constraints to be incorporated into segmentation networks is a hot topic and of great importance. Also, the paradigm of reinforcement learning has received scant attention for image segmentation models. I believe the authors can clarify some key questions.

The questions to address in the rebuttal are inspired by all reviewers’ concerns: 1) It would be good to highlight the benefits of using reinforcement learning in this context, for readers unfamiliar with this topic. In this regard, please also make explicit « Why using reinforcement learning could solve the non-differential loss? » A figure would definitely help readers unfamiliar with RL. 2) Please explain why the method can produce more anatomically-plausible segmentations. 3) Please discuss the limitations of this approach. 4) Please discuss why the image needs to be perturbed in an adversarial way to run the loss l_cons. It is also not shown experimentally what happens when l_cons is used on the original image.

Other important points, as noted by Rev#2:

• This paper is related to semi-supervised learning, hence keyword “semi supervised” should be included in the title: the superiority of the method outside of a semi-supervised setting has not been shown, so
• Please remove the unfinished sentence of section 3.2 « In this paper, we consider a well-known and broadly »

In addition of course, if space is left, the authors can also correct any misunderstanding and address additional points raised by the reviewers.

• What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

8

# Author Feedback

• Novelty of our method (R2)

As underlined by the Meta and R1, we propose a generic and efficient approach for adding any segmentation constraint on top of any segmentation network. For R2’s comment, we argue that our method is not an extension of VAT, but rather uses adversarial training to generate examples violating given constraints without having prior knowledge of these constraints. Our goal is not directly improving the segmentation Dice but instead reducing anatomically-impossible predictions by the network. Experiments show our method to yield a statistically greater constraint satisfaction (non-conn) than VAT and Entropy minimization.

• Benefits of RL (M,R2,R3)

The link between RL and our method is two-fold. 1) As the policy network in RL, the segmentation network predicts label probabilities at pixels. However, the agent’s action in RL and segmentation labels in our work are discrete. 2) Like the reward function in RL, the segmentation constraints in our work can be added on a generic learning strategy, without requiring prior knowledge on that constraint or a differentiable loss function. We achieve these goals via the Reinforce algorithm that is commonly used in policy gradient RL methods.
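The point about non-differentiable rewards can be illustrated with a small sketch of the score-function (Reinforce) estimator. This is a generic toy, not the authors' loss: the 1-D "mask" of four independent Bernoulli pixels, the contiguity reward, and all function names are assumptions chosen so that the Monte Carlo estimate can be compared against an exact gradient obtained by enumerating every mask.

```python
import itertools
import math
import random

def reward(mask):
    """Black-box 0/1 reward: 1 if the foreground pixels form one contiguous run.

    Assumption: an empty mask gets reward 0.
    """
    ones = [i for i, v in enumerate(mask) if v == 1]
    return int(bool(ones) and ones[-1] - ones[0] + 1 == len(ones))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def reinforce_gradient(logits, m, rng):
    """Monte Carlo estimate of d E[reward] / d logits from m sampled masks."""
    probs = [sigmoid(z) for z in logits]
    grad = [0.0] * len(logits)
    for _ in range(m):
        mask = [1 if rng.random() < p else 0 for p in probs]
        r = reward(mask)
        for i, (y, p) in enumerate(zip(mask, probs)):
            # For independent Bernoulli pixels, d log P(mask) / d logit_i = y_i - p_i.
            grad[i] += r * (y - p) / m
    return grad

def exact_gradient(logits):
    """Same gradient by enumerating all masks (feasible only for tiny inputs)."""
    probs = [sigmoid(z) for z in logits]
    grad = [0.0] * len(logits)
    for mask in itertools.product((0, 1), repeat=len(logits)):
        p_mask = math.prod(p if y else 1 - p for y, p in zip(mask, probs))
        for i, (y, p) in enumerate(zip(mask, probs)):
            grad[i] += p_mask * reward(mask) * (y - p)
    return grad

logits = [1.0, -0.5, 0.8, -1.2]
estimate = reinforce_gradient(logits, m=20000, rng=random.Random(0))
exact = exact_gradient(logits)
print([round(g, 3) for g in estimate])
print([round(g, 3) for g in exact])
```

The reward is only ever evaluated on sampled discrete masks, never differentiated; the gradient flows through the log-probability of the sample instead. That is the sense in which a constraint can be added without a differentiable loss.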

• Anatomically-plausible segmentation of method (M,R2,R3)

Previous work such as [3] showed that a high Dice does not always imply a good segmentation as perceived by clinicians. For instance, a segmented region with 90% Dice may still contain holes or disconnected regions that are anatomically impossible. Unlike approaches focusing solely on accuracy measures like Dice, our method also considers complex topological constraints like connectivity, which cannot be easily modeled in a loss function. We use adversarial training to concentrate the learning on examples violating the constraints, so the network can learn to avoid making such incorrect predictions on new examples. We demonstrate our method on a hard-to-model constraint with high applicability: connectivity. Results show that our method gives fewer disconnected pixels than the other tested approaches (paired t-test p < 0.05).

• Need for adversarial perturbation (R1,R3)

The constraint loss based on the Reinforce algorithm produces a non-zero gradient only if the constraint is violated. Since the constraint can be complex and is not known beforehand, there is no guarantee that it will be violated during training, and hence no guarantee that the network will learn to avoid this violation. In our method, we use adversarial training to focus on examples with a non-zero gradient for the constraint loss, thus speeding up the learning. Meanwhile, the standard VAT term increases the network’s robustness to noise. To answer R1’s question, we tested our method without adversarial training (i.e., r_u=0) and found it gave a similar Dice but a higher HD and constraint violation (non-conn).
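For readers unfamiliar with how a virtual adversarial perturbation r_u is obtained, the sketch below shows one power-iteration step of the standard VAT direction search on a toy model. Everything here is an assumption for illustration: the linear-logistic "network", the names `predict` and `vat_perturbation`, and the values of `eps` and `xi`. It covers only the prediction-divergence part of the perturbation, not the constraint-violation part described in the paper.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, x):
    """Toy binary 'network': foreground probability of a linear-logistic model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))

def kl_bernoulli(p, q):
    """KL divergence between Bernoulli(p) and Bernoulli(q)."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def vat_perturbation(w, x, eps, xi=1e-3, rng=None):
    """One power-iteration step of the VAT direction search, scaled to norm eps."""
    rng = rng or random.Random(0)
    # Start from a random unit direction and probe the model a tiny step away.
    d = [rng.gauss(0.0, 1.0) for _ in x]
    norm_d = math.sqrt(sum(v * v for v in d))
    probe = [xv + xi * dv / norm_d for xv, dv in zip(x, d)]
    p, q = predict(w, x), predict(w, probe)
    # Analytic gradient for this toy model: d KL(p || q) / d probe = (q - p) * w.
    g = [(q - p) * wi for wi in w]
    norm_g = math.sqrt(sum(v * v for v in g))
    return [eps * v / norm_g for v in g]

w = [2.0, -1.0, 0.5]
x = [0.3, 0.1, -0.4]
r_adv = vat_perturbation(w, x, eps=0.1)
x_adv = [a + b for a, b in zip(x, r_adv)]
print(kl_bernoulli(predict(w, x), predict(w, x_adv)))
```

In this toy case the resulting direction is simply parallel to the weight vector; for a real segmentation network the gradient would come from backpropagation rather than a closed form. Setting the perturbation to zero corresponds to the r_u=0 ablation discussed above.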

• Limitations of our method (M,R1)

Limitations include: 1) the Reinforce algorithm requires sampling several segmentations from the predicted probabilities otherwise the optimization may be unstable; 2) although our method allows adding any constraint, evaluating some constraints during training may be computationally expensive.
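The first limitation can be seen in a toy setting: the variance of a score-function gradient estimate shrinks roughly in proportion to the number of samples m that are averaged. The sketch below is not from the paper; the single Bernoulli "pixel", the reward equal to the sampled label, and the values p = 0.2 and m = 16 are assumptions picked to keep the example small.

```python
import random
import statistics

def single_sample_gradient(p, rng):
    """Score-function estimate of d E[y] / d p for y ~ Bernoulli(p), reward = y."""
    y = 1 if rng.random() < p else 0
    score = y / p - (1 - y) / (1 - p)  # d log P(y) / d p
    return y * score

def averaged_gradient(p, m, rng):
    """Average m independent single-sample estimates."""
    return sum(single_sample_gradient(p, rng) for _ in range(m)) / m

def estimator_variance(p, m, trials, rng):
    """Empirical variance of the m-sample estimator over repeated trials."""
    return statistics.pvariance([averaged_gradient(p, m, rng) for _ in range(trials)])

rng = random.Random(0)
p = 0.2
var_m1 = estimator_variance(p, m=1, trials=2000, rng=rng)
var_m16 = estimator_variance(p, m=16, trials=2000, rng=rng)
print(var_m1, var_m16)
```

For this toy the single-sample variance is 1/p - 1 in theory, so averaging 16 samples should reduce it by about a factor of 16. The cost is m evaluations of the constraint per training step, which ties into the second limitation when the constraint itself is expensive to evaluate.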

• Other datasets and SOTA (R1,R2)

Our method can be added as a plug-in on top of any segmentation approach. We further demonstrate its advantage by adding it on two SOTA approaches, Co-training (Co-T) and Mean Teacher (MT), and testing it on the PROMISE12 dataset (5% labeled):

| Method | Dice | HD | Non-conn |
|---|---|---|---|
| Ent. min | 50.21 (1.93) | 13.52 (3.95) | 30.33 (6.46) |
| VAT | 50.96 (3.42) | 13.44 (3.57) | 31.96 (2.23) |
| Co-T | 48.92 (2.74) | 9.72 (0.92) | 33.09 (4.06) |
| MT | 67.97 (1.88) | 11.72 (0.76) | 11.33 (3.10) |
| CaVAT | 57.45 (1.23) | 15.60 (2.25) | 21.53 (1.31) |
| Co-T + CaVAT | 63.01 (1.62) | 4.21 (0.72) | 16.10 (3.49) |
| MT + CaVAT | 70.87 (1.70) | 10.17 (0.60) | 9.67 (2.24) |

• Paper title (R2)

We will modify the title to include the term semi-supervised.

# Post-rebuttal Meta-Reviews

## Meta-review # 1 (Primary)

• Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

This paper presents novel and powerful ideas to constrain image segmentation, as acknowledged by the reviewers. There were issues regarding the motivation, the impact and the limitations of the method. In my opinion, the authors have addressed these issues in their rebuttal, hence this paper deserves acceptance for its methodological interest.

• After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept

• What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

3

## Meta-review #2

• Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

I believe the paper is quite borderline, with an interesting RL based application to shape constraints for segmentation, but with substantial drawbacks: clarity of exposition, limitation in experiments (few baselines), limitations in quality of results, etc.

It seems to me that the Primary MR values the RL application to anatomical constraints and would like to see the paper accepted. I mostly agree, and believe the rebuttal helps clarify some of the issues, including adding new results on a new dataset with new baselines.

I am leaning towards a conditional accept, provided that the authors include everything they added in the rebuttal with a careful analysis, even in cases where their method does not improve statistically on VAT. The main reason for acceptance is the uniqueness of the method-task combination, which makes it worthwhile to discuss at MICCAI.

• After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept

• What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

12

## Meta-review #3

• Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

The authors added a context-aware virtual adversarial training module to the semi-supervised segmentation pipeline, aiming to better preserve the anatomical structure. The problem tackled is interesting and well motivated. The experimental results are also promising. The AC agrees that the Dice score should not be the only metric to focus on. However, there are several limitations in the current version: more explanation of RL is needed, and the module is only tested on one dataset. A lot of edits are needed to address these concerns. Recommend to reject based on the current version.

• After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Reject

• What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

15