
Authors

Shota Harada, Ryoma Bise, Hideaki Hayashi, Kiyohito Tanaka, Seiichi Uchida

Abstract

Ulcerative colitis (UC) classification, which is an important task for endoscopic diagnosis, involves two main difficulties. First, endoscopic images with the annotation about UC (positive or negative) are usually limited. Second, they show a large variability in their appearance due to the location in the colon. Especially, the second difficulty prevents us from using existing semi-supervised learning techniques, which are the common remedy for the first difficulty. In this paper, we propose a practical semi-supervised learning method for UC classification by newly exploiting two additional features, the location in a colon (e.g., left colon) and image capturing order, both of which are often attached to individual images in endoscopic image sequences. The proposed method can extract the essential information of UC classification efficiently by a disentanglement process with those features. Experimental results demonstrate that the proposed method outperforms several existing semi-supervised learning methods in the classification task, even with a small number of annotated images.

Link to paper

DOI: https://doi.org/10.1007/978-3-030-87196-3_44

SharedIt: https://rdcu.be/cyl2V

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    The authors propose a practical semi-supervised learning method for UC classification by newly exploiting two additional features, the location in the colon (e.g., left colon) and the image capturing order, both of which are often attached to the individual images in endoscopic image sequences. The proposed method can extract the essential information for UC classification efficiently through a disentanglement process with those features. Experimental results demonstrate that the proposed method outperforms several existing semi-supervised learning methods in the classification task, even with a small number of annotated images.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Some strengths of the paper are listed below:

    1. The authors use disentangled representation learning for endoscopic image classification, which shows promising results.
    2. To compensate for the limited UC-labeled data, the authors introduce order-guided learning, which uses temporal ordering information to model the relationship between temporally adjacent images.
    3. Order-guided learning can obtain effective features for classifying UC from unlabeled images by considering the relationship between temporally adjacent images.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The paper needs to be revised: some text in the experimental results is not visible, which makes it difficult to comment on them.
    2. There is no explanation of how the training and testing datasets were selected for the experiment.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    NA

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

    The authors have contributed a paper entitled “Order-Guided Disentangled Representation Learning for Ulcerative Colitis Classification with Limited Labels”. The paper is very interesting and very well written. I would like to offer the following comments:

    1. Please check the manuscript carefully before producing the final version. Some important details of the experimental results are not shown and appear empty.
    2. The organization of the dataset is not clear. How the authors divided the endoscopic image sequences into training and testing sets needs to be clearly stated.
    3. The software/tools used to obtain the reported performance are not stated; this information is important for assessing the results.
    4. It would be really interesting to see how this method works on other abnormalities of the GI tract.
  • Please state your overall opinion of the paper

    Probably accept (7)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The authors have provided good technical work for the proposed methodology and have clarified the techniques and the datasets used. The result analysis and further investigation provide the desired validation of the work.

  • What is the ranking of this paper in your review stack?

    4

  • Number of papers in your stack

    5

  • Reviewer confidence

    Confident but not absolutely certain



Review #2

  • Please describe the contribution of the paper

    The paper presents an order-guided disentangled representation learning SSL framework for ulcerative colitis classification. The experiments demonstrate that the proposed method outperforms several SSL methods.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • A novel SSL framework for ulcerative colitis classification using location and temporal information. The framework utilizes disentangled representation learning to separate UC features and the location framework.

    • The provided experiments demonstrate the efficacy of the method over the existing SSL methods.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • One of the main weaknesses of this work is that the proposed method is neither well motivated nor well explained, and thus it is hard to understand the novelty of the whole framework.

    • The provided experiments are limited in assessing the efficacy of the model.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The reproducibility is poor. The paper has not provided much details on how someone can implement their framework and reproduce the results.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
    • The paper picks up an interesting problem of endoscopic image classification in the limited-label setting. However, the proposed framework is not well motivated, and some key aspects of the model are not well discussed. Some questions include: Is having two networks B_u and B_loc, along with the L^d and L_adv losses, enough to ensure disentanglement? Usually, there is an unbounded number of generative factors in a medical imaging dataset, so how can we know that these two networks have captured what they are supposed to capture? Disentanglement work usually includes a separate analysis that probes the network, which is missing in this work. I think such an analysis would be constructive for understanding the model’s disentangling ability.
      1. The intuition behind the use of ordinal loss is interesting, and the schematic of Fig 2 C is equally helpful. But how can we know that this behavior is achieved with the provided loss? One suggestion would be to look at the embedding of the images used in Fig 3 to confidently answer that the model has learned the ordering of the images in the sequence.
    • Why was only one label ratio used in the experiments? SSL papers often use multiple label ratios. If the framework behaves similarly at other label ratios, that would further support its efficacy.

    • The use of disentangled representations for SSL setup has been explored in the medical imaging community before. Authors are encouraged to discuss such works. e.g., https://arxiv.org/pdf/1907.09607.pdf

    Overall, the paper can be improved a lot in terms of motivating the problem and clarity in writing. It isn’t easy to understand the work in the current version. I encourage authors to focus on the writing in their revised version.

  • Please state your overall opinion of the paper

    borderline reject (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper lacked some key analysis required to understand the model’s behavior, and the evaluation for SSL is performed only on a single label ratio setting.

  • What is the ranking of this paper in your review stack?

    3

  • Number of papers in your stack

    6

  • Reviewer confidence

    Confident but not absolutely certain



Review #3

  • Please describe the contribution of the paper

    The authors proposed a novel method for ulcerative colitis (UC) classification with limited labels. The proposed method utilizes additional UC location and temporal information to improve the learning and disentangling of UC-only features that therefore can improve UC classification.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    + Research motivations and contributions are well described and easy to follow.
    + I appreciate the use of disentanglement to make the model focus on learning the disease-related features. I believe it is a very good contribution for general medical images as well.
    + A good amount of experiments in various forms makes the evaluation strong, including baseline comparisons, an ablation study, and some visual results of UC prediction on temporal sequences.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    - A formal summary of the overall objective may be necessary, especially which losses are used for labeled data (UC label and location label) and for unlabeled data. If I understand correctly, all the data have L_seq, and the data with a UC label additionally have the disentanglement losses. Then what about the data with a location label but no UC label? Do the disentanglement losses other than L^c_u apply to them? Even without the UC label, the adversarial losses should be helpful for disentanglement, right? Or are they just ignored, so that only the data with both a UC label and a location label are considered?
    - It seems that FixMatch, which the authors claim as the SOTA semi-supervised baseline, is actually worse than a simple fully supervised baseline when R=0.1 (a huge drop in F1 and slightly lower accuracy). Should we expect the semi-supervised method to be better when labels are limited? This may make readers question whether the authors tuned FixMatch correctly with the changed details and made a fair comparison with the SOTA semi-supervised methods.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    method: almost clear
    dataset: a private dataset
    evaluation: clear

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

    1. For Tables 1 and 2, it would be better to also report the variance over multiple runs, since some numbers are really close.
    2. For the ablation study, it would be good to add a setting that only has “order”.

  • Please state your overall opinion of the paper

    Probably accept (7)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    A good and complete paper. Novel and promising method. No obvious flaws.

  • What is the ranking of this paper in your review stack?

    2

  • Number of papers in your stack

    5

  • Reviewer confidence

    Very confident




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This paper received divergent review ratings. The authors should address the issues and concerns listed by the reviewers in box 4 (weaknesses) and the questions raised in box 7 (detailed comments). In particular, (1) please explain the motivation of the paper as requested by R#2 and R#3, (2) please explain the experimental setting/results as requested by R#1, R#2 and R#3, (3) please provide a summary of the overall objective as requested by R#3.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    6




Author Feedback

We would like to thank all the reviewers for their insightful comments and positive evaluation. For instance, Reviewer 1 (R1) commented that the paper is interesting and showed promising results for ulcerative colitis (UC) classification. R2 and R3 commented that the paper proposed a novel method. We would like to address the concerns raised by the reviewers as follows.

  1. About the motivation of the paper (Meta review, R2, and R3)
     [Background] UC classification has two difficulties. First, the appearance of endoscopic images varies drastically with various factors, such as the location where the images are taken. Second, attaching ground truth (GT) is difficult even for a specialist, and thus we cannot expect enough training samples with GT, so the task often becomes semi-supervised.
     [Naïve solution] Disentanglement is a reasonable choice to relax the first difficulty. It can separate the features useful for UC classification from the unnecessary features (representing the location information and other factors). However, the second difficulty prevents disentanglement from achieving its expected performance.
     [Our motivation] Our main idea is to introduce the temporal ordering information of endoscopic image sequences, which can be obtained automatically. Simply speaking, since temporally adjacent frames tend to share the same UC label, the lack of labels can be compensated for by bringing adjacent frames closer together in feature space, even if some frames in an image sequence have no UC label. Based on this idea, we developed a new disentanglement method (with an “ordinal loss”) that can utilize the temporal prior.

  2. About the experimental setting (Meta review, R1, and R2)
     [Dataset details] Some of the dataset descriptions are not visible because the dataset was collected from a specific facility: we intentionally anonymized the name of the facility so as not to violate the rules of double-blind review.
     [Train/test separation] As explained in Section 4.1, random splitting is used.
     [Label ratio] We used various other label ratios in addition to R=0.1. For example, when R=0.2 and 0.3, the accuracy of the proposed method was 87.53 and 85.88, respectively. Since these do not differ greatly from the result at R=0.1, we omitted them due to the page limit.

  3. About the fairness of the comparative method (Meta review and R3)
     R3 is concerned that the SOTA semi-supervised baseline FixMatch was worse than a simple fully supervised baseline and that our modification of FixMatch was unfair. We modified FixMatch because the original is optimized for the CIFAR dataset. The original code uses a strong perturbation, since the feature distribution of each class is well separated in that dataset. On our data, this strong perturbation has a negative effect because our data are much more complex. Therefore, we used a weak perturbation instead of the original one. Indeed, the performance values of the original FixMatch are 70.59, 47.13, 56.52, 90.66, and 76.63, which are worse than those of the modified version reported in Table 1.

  4. About a summary of the overall objective (Meta review and R3)
     Disentangled representation learning in the proposed method aims to separate the image features into UC-dependent and location-dependent features. These features are obtained via multi-task learning of UC and location classification. In our problem setup, partially UC-labeled images, complete location labels, and the order information within sequences are given as training data. To address this setting, we introduced order-guided learning, based on the characteristic that temporally adjacent images tend to share the same UC label. More precisely, for the UC-labeled images, the sum of the classification losses L^c_u, L^c_loc and the disentanglement losses L^adv_u, L^adv_loc, L^d_u, L^d_loc is used. For UC-unlabeled images, we introduced the ordinal loss L_seq instead of L^c_u and ignored the disentanglement losses L^adv_u, L^d_u.
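
The loss bookkeeping described in point 4 can be illustrated with a short Python sketch. This is not the authors' implementation. Several things are assumptions made only for illustration: the function and key names (`ordinal_loss`, `total_loss`, `"c_u"`, `"seq"`, and so on), the unit weight on every term, and the triplet-style hinge used for L_seq. The rebuttal states only that temporally adjacent frames are brought closer together in feature space, so the hinge is one plausible form and not a confirmed definition.

```python
from typing import Dict, Sequence

# Loss terms active for each kind of image, read from the rebuttal text.
# Equal (unit) weighting of the terms is an assumption of this sketch.
UC_LABELED_TERMS = ("c_u", "c_loc", "adv_u", "adv_loc", "d_u", "d_loc")
UC_UNLABELED_TERMS = ("seq", "c_loc", "adv_loc", "d_loc")


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def ordinal_loss(z_i: Sequence[float], z_j: Sequence[float],
                 z_k: Sequence[float], margin: float = 1.0) -> float:
    """Hypothetical triplet-style hinge for frames in temporal order i < j < k.

    The loss is zero when the nearer frame j is closer to i in feature
    space than the farther frame k, by at least `margin`.
    """
    near = squared_distance(z_i, z_j)
    far = squared_distance(z_i, z_k)
    return max(near - far + margin, 0.0)


def total_loss(terms: Dict[str, float], has_uc_label: bool) -> float:
    """Sum the loss terms that apply to one image.

    UC-labeled images use the classification and disentanglement terms;
    UC-unlabeled images swap L^c_u for L_seq and drop L^adv_u and L^d_u.
    """
    active = UC_LABELED_TERMS if has_uc_label else UC_UNLABELED_TERMS
    return sum(terms[name] for name in active)


if __name__ == "__main__":
    # Toy 2-D features for three frames taken in temporal order.
    z1, z2, z3 = [0.0, 0.0], [1.0, 0.0], [3.0, 0.0]
    terms = {"c_u": 1.0, "c_loc": 1.0, "adv_u": 1.0, "adv_loc": 1.0,
             "d_u": 1.0, "d_loc": 1.0, "seq": ordinal_loss(z1, z2, z3)}
    print("labeled:", total_loss(terms, has_uc_label=True))
    print("unlabeled:", total_loss(terms, has_uc_label=False))
```

In this reading, location labels are available for every image, so the location-related terms stay active whether or not a UC label exists, which bears on R3's question about images that have a location label but no UC label.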




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This paper studies the ulcerative colitis classification problem with a semi-supervised learning method that exploits the location in the colon and the image capturing order. The motivation makes sense and is implemented by the presented method. Experimental results support the proposed method. The authors’ rebuttal has largely addressed the questions and concerns raised by the reviewers. I therefore recommend acceptance of this paper.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    5



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This paper proposes a practical semi-supervised learning method for UC classification by newly exploiting two additional features, the location in the colon (e.g., left colon) and the image capturing order, both of which are often attached to the individual images in endoscopic image sequences. The idea is interesting. Two reviewers give high marks, and the other has concerns about the motivation and the experimental setting.

    In the rebuttal, the authors clearly illustrate the motivation of this paper and clarify the experiment setting. Therefore, acceptance is recommended.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    9



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The reviews are contradictory, and the confidence and expertise levels of the reviewers are not high. However, it seems that the motivation for the paper has not been written well, and I am not convinced that it has been explained well even in the rebuttal. Secondly, it looks like many details about the datasets, theoretical reasoning and experiments are missing. So although this paper addresses an interesting topic, it is still not ready for publication.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Reject

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    5


