Authors

Qi Bi, Shuang Yu, Wei Ji, Cheng Bian, Lijun Gong, Hanruo Liu, Kai Ma, Yefeng Zheng

Abstract

With the rapidly growing number of people affected by various retinal diseases, there is a strong clinical interest in fully automatic and accurate retinal disease recognition. The unique characteristics of how retinal diseases are manifested on fundus images pose a major challenge for automatic recognition. To better tackle these challenges, we propose a local-global dual perception (LGDP) based deep multiple instance learning (MIL) module that integrates the instance contribution from both the local scale and the global scale. The proposed module consists of a local pyramid perception module (LPPM) that emphasizes the key instances at the local scale, and a global perception module (GPM) that provides a spatial weight distribution at the global scale. Extensive experiments on three major retinal disease benchmarks demonstrate that our framework outperforms many state-of-the-art deep MIL methods, especially for recognizing the pathological images. In addition, the proposed module is validated on multiple backbones and achieves substantively superior performance, indicating its effectiveness and generalization capability. Last but not least, the proposed deep MIL module can be conveniently embedded into any convolutional backbone in a plug-and-play manner and effectively boosts performance.

Link to paper

DOI: https://doi.org/10.1007/978-3-030-87237-3_6

SharedIt: https://rdcu.be/cyl9L

Link to the code repository

N/A

Link to the dataset(s)

N/A

Reviews

Review #1

Please describe the contribution of the paper

The paper proposes a deep-learning approach to automatic retinal disease screening. It introduces a standardized deep MIL scheme for retinal disease recognition, which the authors claim can be adapted to any convolutional neural network (CNN). The scheme is built around a local-global dual perception based multiple instance learning (MIL) module.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

1- The paper proposes a fully automatic retinal disease screening approach based on deep learning techniques, which is a novel approach.
2- The suggested approach can be conveniently adapted to any CNN backbone in a plug-and-play manner and substantively boosts the performance by a large margin.
3- The proposed module could effectively boost the recognition performance, which may empower diagnostic approaches for some retinal diseases.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

1- The authors should explain why their approach outperforms the other benchmark approaches covered in the literature review.
2- The literature review needs improvement to cover more current state-of-the-art approaches in the field of retinal disease diagnosis.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

Some current papers exist in the same area of research, but I am unable to confirm reproducibility.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

1- Some statistical analysis of the accuracy and performance of the suggested ML-based approach should be provided.
2- The literature review needs to be comprehensive to establish the completeness and novelty of the research work introduced in the paper.
3- The MIL-based approach should be described in more detail so that researchers new to the topic can understand the criteria and foundations behind this research work.
Please state your overall opinion of the paper

accept (8)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The paper is reasonably well organized. It is technically sound, with only minor comments. The research topic addresses an unmet need.
What is the ranking of this paper in your review stack?

3
Number of papers in your stack

5
Reviewer confidence

Somewhat confident

Review #2

Please describe the contribution of the paper

The authors propose a novel approach for automatic retinal disease screening from fundus images. The proposed framework is based on local-global dual perception (LGDP) combined with multiple instance learning to integrate the instances from both local and global scales. The proposed framework was validated on two public datasets and one private dataset to evaluate its performance in the screening of diabetic retinopathy, glaucoma and age-related macular degeneration from fundus images. Different metrics were computed to compare the performance of the proposed framework to state-of-the-art methods.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

Overall, the paper is clear, well structured and easy to follow.

Results are clearly discussed, the approach is tested on multiple datasets and provides a consistent performance improvement on each dataset. An ablation study is conducted to justify the introduction of the LPPM and the GPM branches. Table 3 indicates that the proposed framework can be embedded in different CNN architectures, which is also an interesting contribution.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- The MIL formulation might give a misleading sense of originality. In the paper, each pixel in the last feature maps is considered as an instance. Two types of aggregation are then done across instances (a local one with the LPPM, followed by a weighted sum in the local-global perception fusion). This is not different from any classical CNN classification formulation. Moreover, here, instances are not permutation-invariant, which is usually assumed in the MIL approach.

In this regard, the main contribution is the LPPM module and particularly the top-k max-pooling operator. However, Table 1 indicates that the GPM (which is just a 1x1 convolution) consistently (but marginally) outperforms the LPPM alone. It is therefore slightly unclear why the combination of both would provide such a boost in performance, and further experiments should be conducted, in particular on multi-class problems (not just binary).
- It is unfortunate that the SOTA comparison is mostly limited to two public datasets. Diabetic retinopathy has been widely researched, and numerous recent papers have proposed methodologies tested on APTOS. In addition, there are many more DR-related datasets (FGADR, Messidor, EyePACS…) that could have been used to evaluate the proposed approach.
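
The permutation-invariance point can be illustrated with a small sketch. It is not the authors' LPPM: it assumes that "top-k max-pooling" means averaging the k largest instance scores inside each non-overlapping window and then averaging over windows, and the function names and toy score maps are hypothetical. It only shows that a window-based pooling depends on where instances sit, whereas a global mean over the bag does not.

```python
from statistics import mean


def topk_mean(values, k):
    """Mean of the k largest values (an assumed reading of top-k max-pooling)."""
    return mean(sorted(values, reverse=True)[:k])


def local_topk_pool(score_map, window, k):
    """Pool a 2-D instance score map over non-overlapping window x window
    patches, then average the per-window top-k responses into one bag score."""
    height, width = len(score_map), len(score_map[0])
    pooled = []
    for r in range(0, height - window + 1, window):
        for c in range(0, width - window + 1, window):
            patch = [score_map[r + i][c + j]
                     for i in range(window) for j in range(window)]
            pooled.append(topk_mean(patch, k))
    return mean(pooled)


def flatten(score_map):
    return [v for row in score_map for v in row]


# Toy 4x4 maps of instance scores for one class (hypothetical numbers).
# Both maps contain exactly the same scores, only arranged differently.
clustered = [
    [0.9, 0.8, 0.0, 0.0],
    [0.7, 0.6, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
scattered = [
    [0.9, 0.0, 0.8, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.7, 0.0, 0.6, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]

# A global mean ignores the arrangement of instances ...
global_clustered = mean(flatten(clustered))
global_scattered = mean(flatten(scattered))

# ... while the window-based top-k pooling does not.
local_clustered = local_topk_pool(clustered, window=2, k=2)
local_scattered = local_topk_pool(scattered, window=2, k=2)
```

Under these assumptions the two global means coincide, while the two local scores differ, which is the sense in which spatially structured pooling departs from the usual permutation-invariant MIL setting.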
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The private AMD dataset is not well described.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

The LPPM module, and particularly the top-k max-pooling operator, is an interesting contribution.
Please state your overall opinion of the paper

Probably accept (7)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

Extensive experiments were performed to compare the performance of the proposed framework to state-of-the-art methods.
What is the ranking of this paper in your review stack?

1
Number of papers in your stack

5
Reviewer confidence

Very confident

Review #3

Please describe the contribution of the paper

In this paper, the authors proposed a local-global dual perception deep MIL module for retinal disease recognition. Instance responses from both local and global scales are considered and integrated, so as to better tackle the challenge of how various pathologies are presented on the retinal image.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The challenges of fundus image disease classification are well discussed and analyzed.
- Applying multiple instance learning (MIL) to fundus image disease classification is reasonable.
- Based on MIL, the authors proposed a local-global dual perception. The proposed module can be adapted in a plug-and-play manner.
- Experiments are conducted on two publicly available datasets and one private dataset.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- The related work on applying multiple instance learning to retinal fundus images is missing.
- The proposed GPM is a little simple.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

/
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
1. What are the detailed implementations of the SOTA deep MIL methods, including [25,16,21,22]?
2. The proposed GPM is a little simple, since it only contains a Conv1 layer and a ReLU layer. Also, the motivation for designing the GPM in this manner is missing.
3. The related work on applying multiple instance learning to retinal fundus images is missing. It would be better to introduce the related work to support the contributions as claimed.
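
For readers unfamiliar with the structure being discussed, a convolution layer with a 1x1 kernel followed by a ReLU can be sketched as below. This is a minimal illustration, not the authors' GPM: the single output channel, the sum-to-one normalization, and all names and numbers are assumptions made for the example.

```python
def conv1x1_relu(feature_map, kernel, bias=0.0):
    """Apply a single-output-channel 1x1 convolution followed by ReLU.

    feature_map: H x W x C nested lists; kernel: C weights.
    A 1x1 convolution is just a per-position dot product over channels.
    """
    return [[max(0.0, sum(x * w for x, w in zip(pixel, kernel)) + bias)
             for pixel in row]
            for row in feature_map]


def normalize(weight_map, eps=1e-8):
    """Rescale non-negative responses so they sum to (almost exactly) one.
    This normalization is an assumption, not taken from the paper."""
    total = sum(v for row in weight_map for v in row) + eps
    return [[v / total for v in row] for row in weight_map]


# Hypothetical 2x2 feature map with 2 channels per position.
features = [
    [[2.0, 1.0], [0.0, 3.0]],
    [[1.0, 1.0], [4.0, 0.0]],
]
kernel = [1.0, -1.0]  # hypothetical learned weights

raw = conv1x1_relu(features, kernel)   # [[1.0, 0.0], [0.0, 4.0]]
weights = normalize(raw)               # one importance weight per instance
```

Each spatial position receives one non-negative weight, so the layer acts as a spatial importance map over instances despite its small parameter count.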
Please state your overall opinion of the paper

Probably accept (7)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The proposed method is well-motivated and the experiments are sufficient.
What is the ranking of this paper in your review stack?

1
Number of papers in your stack

5
Reviewer confidence

Very confident

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

This paper proposes a retinal disease classification method using multiple instance learning. The contribution resides in a global+local scale pyramidal module.

One reviewer appreciates the possible versatility of the method in other CNN architectures and invites the authors to cite more recent work on retinal diagnosis, but without suggesting references.

A second reviewer also appreciates the plug-n-play nature of the MIL contribution, but questions the originality of the MIL formulation.

A third reviewer further notes the plug-n-play advantage, but also wants more references on fundus segmentation.

All reviewers agree on a well-written, well-evaluated method, with a highlighted appreciation of the plug-n-play nature of the proposed global+local module. Two note a lack of related references on retinal diagnosis. The method is described as simple, which in my opinion is an advantage in this context and merits being emphasized in different wording.

For these reasons, this paper constitutes a notable contribution to the MIL literature, with an easily reusable local+global dual perception module that has been extensively evaluated. Recommendation is towards Early Acceptance.

Oral: All three reviewers have recommended an oral podium as well as a young scientist award. I believe the potential impact of this dual module can contribute to the field by further popularizing MIL approaches.
What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

3

Author Feedback

We appreciate the meta-reviewer and reviewers’ effort in reviewing our submission, along with the positive comments. Still, it is important to clarify some issues raised by the reviewers. Please see below for details.

#To meta-reviewer.
Q1: We will cite more recent references on retinal diagnosis in our camera-ready manuscript.
Q2: Although deep MIL relies on convolution operations, it is quite different from a classic CNN in terms of the problem formulation, the granularity of feature representation, and the generation of the probability distribution. For details, please refer to our reply to the 1st comment of Reviewer 2.
Q3: We will cover references on fundus segmentation in our camera-ready manuscript.

#To R1.
Q1: We will provide more details on statistical analysis. However, we would like to stress that all of our experiments on the MIL-based approaches are conducted under the same parameter settings, and five-fold cross-validation results are reported, which is fair enough to demonstrate their different performances.
Q2: We will cover more references on retinal diagnosis and fundus image processing in our camera-ready version.
Q3: The details of the MIL formulation and methodology have been provided in our supplementary materials.

#To R2.
Q1: It is actually different from the classic CNN, for the following reasons. (1) For the last feature maps, in a classic CNN the channel number is usually quite large, such as 512. However, in MIL, the instance representation must be converted from the last feature maps. Specifically, if there are N categories, the instance representation has only N channels, which is far fewer than in classic CNN feature maps. (2) For the aggregation, in classic CNNs the last feature map is fed into several fully connected layers, which then generate the probability distribution. However, in MIL, the aggregation can directly produce the probability distribution. In this way, the fully connected layers of classic CNNs, which usually hold a large number of parameters and are prone to over-fitting, are removed. (3) For the granularity of feature representation, each instance corresponds to an N-dimensional vector that describes its response on each of the N categories. This is also different from existing CNNs.
Q2: (1) The motivation and objective of our work is to recognize retinal diseases, not to grade them into different classes. Hence, it is more appropriate to formulate this task as a binary classification rather than a multi-class classification. The validation of our method on other DR multi-class tasks is our future work. (2) Compared with traditional CNNs, which preserve the global semantic information, our GPM assigns different weights to the instances and highlights the feature responses of key instances, while our LPPM highlights the responses in each local window. Hence, both our GPM and LPPM lead to a performance gain. Moreover, as the combination of GPM and LPPM highlights the feature responses of key instances from both the global and the local perspective, the performance gain is larger. We will provide this discussion in our camera-ready version.
Q3: Indeed there are many other DR-related benchmarks, but they focus on the retinal disease grading task. As the objective of our work is to recognize the disease, rather than grade it into several different patterns, it may not be appropriate to directly utilize such benchmarks. Adapting our MIL to such tasks and validating its effectiveness can be our future work.

#To R3.
Q1: To the best of our knowledge, this is the first work to investigate deep MIL for retinal recognition. We will cover more references on classic MIL and its application in retinal images.
Q2: As the objective of our GPM is to acquire a spatial weight matrix for the importance of each instance, we solve this by using a convolution layer. More complicated network structures could easily be used for further performance gain, which is interesting for further investigation.
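
The aggregation described in the reply to R2 (N-channel instance scores turned directly into a bag-level probability distribution, with no fully connected layers) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the weighted-sum fusion, the final softmax, the function names and the toy numbers are all hypothetical.

```python
import math


def aggregate_bag(instance_scores, spatial_weights):
    """Weighted sum of per-instance class scores.

    instance_scores: H x W x N nested lists (one N-dim vector per instance).
    spatial_weights: H x W nested lists (one importance weight per instance).
    Returns an N-dimensional bag-level score vector.
    """
    n_classes = len(instance_scores[0][0])
    bag = [0.0] * n_classes
    for score_row, weight_row in zip(instance_scores, spatial_weights):
        for scores, weight in zip(score_row, weight_row):
            for c in range(n_classes):
                bag[c] += weight * scores[c]
    return bag


def softmax(logits):
    """Numerically stable softmax (assumed final normalization)."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical 2x2 bag with N = 2 classes: [normal, diseased].
# Three instances look normal; one shows a strong disease response.
scores = [
    [[2.0, 0.0], [2.0, 0.0]],
    [[2.0, 0.0], [0.0, 4.0]],
]
uniform = [[0.25, 0.25], [0.25, 0.25]]
focused = [[0.1, 0.1], [0.1, 0.7]]  # weight concentrated on the key instance

probs_uniform = softmax(aggregate_bag(scores, uniform))
probs_focused = softmax(aggregate_bag(scores, focused))
```

In this toy case, uniform weighting lets the three normal-looking instances dominate, while concentrating weight on the key instance flips the bag-level prediction to the diseased class, which is the intuition behind weighting instances before aggregation.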

Local-global Dual Perception based Deep Multiple Instance Learning for Retinal Disease Classification