
Authors

Nkechinyere N. Agu, Joy T. Wu, Hanqing Chao, Ismini Lourentzou, Arjun Sharma, Mehdi Moradi, Pingkun Yan, James Hendler

Abstract

Radiologists usually observe anatomical regions of chest X-ray images as well as the overall image before making a decision. However, most existing deep learning models only look at the entire X-ray image for classification, failing to utilize important anatomical information. In this paper, we propose a novel multi-label chest X-ray classification model that accurately classifies the image finding and also localizes the findings to their correct anatomical regions. Specifically, our model consists of two modules, the detection module and the anatomical dependency module. The latter utilizes graph convolutional networks, which enable our model to learn not only the label dependency but also the relationship between the anatomical regions in the chest X-ray. We further utilize a method to efficiently create an adjacency matrix for the anatomical regions using the correlation of the label across the different regions. Detailed experiments and analysis of our results show the effectiveness of our method when compared to the current state-of-the-art multi-label chest X-ray image classification methods while also providing accurate location information.

Link to paper

DOI: https://doi.org/10.1007/978-3-030-87240-3_77

SharedIt: https://rdcu.be/cyl6U

Link to the code repository

https://github.com/DIAL-RPI/AnaXNet-Anatomy-Aware-CXR-Classification

Link to the dataset(s)

N/A

Reviews

Review #1

Please describe the contribution of the paper

This paper develops a neural network approach to chest X-ray multi-label classification and bounding-box localization of abnormalities using two modules: a detection module and a graph convolutional anatomical dependency module.

The method is evaluated on the Chest ImaGenome dataset, which uses frontal chest X-rays from the MIMIC-CXR dataset. The method compared favorably with alternative approaches (Faster R-CNN, GlobalView and CheXGCN).
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The problem addressed is significant and the methodology is applicable in other applications. The graph-based formulation is interesting and novel. The results are impressive. Table 3 shows excellent performance. The approach involved a number of technically challenging aspects. The promised availability of the curated dataset is also a strength. Figure 2 and Table 4 (should be a figure) are nice illustrations of the approach.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

Statistical significance of the results should be shown. Result failures should be shown.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The method appears to be reproducible.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

Statistical significance and failures help place the results in context and would be a welcome addition.
Please state your overall opinion of the paper

accept (8)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The strengths of this paper are the novelty of the method and the quality of the results. Overall, it is well-executed and well explained.
What is the ranking of this paper in your review stack?

1
Number of papers in your stack

5
Reviewer confidence

Confident but not absolutely certain

Review #2

Please describe the contribution of the paper

This paper proposes a classification framework for multi-label CXR findings based on global and local visual features. ROI features are extracted by Faster R-CNN, and the global feature is then obtained by a GCN operating on the ROI features. Global and local features are combined by an attention-like mechanism.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1. Comprehensive experiments were conducted to prove the effectiveness of the proposed method.
2. The proposed algorithm can achieve better performance than other methods.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
1. The writing is not rigorous enough, and the method section may need to be rewritten.
2. More example images such as Fig.2 are necessary.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

no comments
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
1. In equation (2), should the summation index be m rather than i?
2. In equation (5), Ri and Qi should be concatenated, not multiplied.
3. Some symbols in equations (2), (4) and (5) are not described.
4. Table 1 should not be on page 4, before the Dataset heading.
5. The global view based on AnaXNet should be moved to the experiments section rather than the conclusion.
Please state your overall opinion of the paper

Probably accept (7)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

This is an interesting work and the validation experiment is comprehensive.
What is the ranking of this paper in your review stack?

2
Number of papers in your stack

4
Reviewer confidence

Confident but not absolutely certain

Review #3

Please describe the contribution of the paper

This paper targets chest X-ray diagnosis by co-modeling anatomies and disease types. Specifically, anatomical structures are localized with a detection model, and their corresponding features are extracted and modeled with self-attention in a GNN. This model design enables explicit regularization of location-related disease occurrence and reasoning about disease location.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The work proposes to co-model anatomies and diseases for the task of disease classification. The method has some novelty: diseases are commonly anatomy dependent, so co-modeling the two aspects can potentially serve as regularization and reduce false positives. The experimental results on a large-scale dataset seem to demonstrate its effectiveness in comparisons against classification- and detection-type baseline models.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
Although the work is interesting, a few issues need to be clarified.
1. The co-occurrence matrix utilizes a filtering threshold to avoid overfitting to rare occurrences. How does this design impact the modeling of long-tailed events? Is it possible to model the co-occurrence as a probability rather than 0/1?
2. The anatomy detection results are used to crop the feature maps so that the model focuses on the relevant areas. Can such hard cropping cause information loss for disease classification? Are there any failure cases? Also, what is the accuracy of the anatomy detection? Can errors in anatomy detection affect the final classification? If so, how could such issues be resolved in the future?
3. Related to the above point: some related works (although in the domain of semantic segmentation) that model a soft relationship can be discussed for further work. https://arxiv.org/abs/2102.01256 https://ieeexplore.ieee.org/document/7493497
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The dataset and models are explained well enough for the work to be reproduced.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

Refer to weakness section for details.
Please state your overall opinion of the paper

borderline reject (5)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

This work presents an interesting and promising approach for improving diagnostic classification by co-modeling it with anatomies, motivated by the fact that diseases are mostly anatomy dependent. The experimental results on a large-scale dataset reflect its potential, although more existing attention-based approaches could be compared for verification. The issues raised above should be addressed to clarify the contributions and shed light on future research.
What is the ranking of this paper in your review stack?

3
Number of papers in your stack

8
Reviewer confidence

Confident but not absolutely certain

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

The overall evaluation of this work is good. As shown below, Reviewer 1 carefully summarized the contributions of this work. The average score is also aligned with the early-accept threshold. R3 mentioned some limitations, which do not seem fatal. The paper also covers the related literature very well.

The problem addressed is significant and the methodology is applicable in other applications. The graph-based formulation is interesting and novel. The results are impressive. Table 3 shows excellent performance. The approach involved a number of technically challenging aspects. The promised availability of the curated dataset is also a strength.
What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

2

Author Feedback

We are grateful to the reviewers and area chairs for their hard work, and positive reception of this work. We would like to clarify a few points:

1) R1 has requested more examples in Fig 2. The choice here was mainly due to space limitations. We will make an effort to fit one more example in the final version.

2) We have fixed a few issues raised by R2: a) “In equation (2), the summation’s index maybe not i but m?” The index is fixed. b) “In equation (5), Ri and Qi should be concatenation but not multiplicated.” Fixed, missed the symbol ; c) “Some symbols in equations (2), (4) and (5) were not described.” The symbols (variables) are either described in the text above because they are steps of the model, e.g., Qi, Zi, Ri, or they are new parameters (weights) that are described after each equation. d) “Table 1. should not be on page 4 and before the Dataset heading.” e) “Global view based on AnaXNet should be moved in the experiment part, not in the conclusion part.” Both d and e are stylistic preferences. In the updated version, Table 1 is on the same page as the start of Section 3.1 (Dataset).

3) R4: “Related to the above point: some related works (although in the domain of semantic segmentation) that model a soft relationship can be discussed for further work.” We will add the listed related work to references.

4) R4: “The co-occurrence matrix utilizes a filtering threshold to avoid overfitting to rare occurrences. How does this design impact the modeling of long-tailed events? Is it possible to model the co-occurrence as a probability rather than 0/1?” Indeed, as the reviewer correctly points out, there exist several options for constructing an adjacency matrix, including weighted, exponentially weighted or dynamic adjacency matrices [1]. There is an inherent trade-off between computational complexity and accuracy, as weighted adjacency matrices densely connect all nodes. On the other hand, simpler binary versions are more computationally efficient, but how well long-tailed dependencies are captured and how much overfitting occurs depend on the filtering threshold. Similarly, there exist several options for the GNN model [1]. We note that further exploration of the aforementioned design choices is beyond the scope of this work.

5) R4: “The anatomy detection results are utilized to crop the feature maps to enforce the model to focus on related areas. Can such hard cropping makes information loss for the disease classification?” “Also, what is the accuracy for the anatomy detection? Can the errors of anatomy detection impact the final classification? If so, how can such issues possibly resolved in the future?” Table 2 shows Intersection over Union (IoU) scores calculated between the automatically extracted anatomical bounding box (Bbox) regions and a set of single manual ground truth bounding boxes for 1000 CXR images. When a part of the image is not included in any of the detected Bboxes, some disease features could be missed. An additional global feature extractor can help with this issue. Nevertheless, in our current version, the overall accuracy of AnaXNet is superior to that of GlobalView, which shows that the impact of this issue is limited.
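
For readers unfamiliar with the IoU score mentioned in point 5, the following is a minimal sketch of how the IoU between one detected box and one ground-truth box is typically computed. The function name `box_iou` and the `(x_min, y_min, x_max, y_max)` box format are assumptions made for illustration; they are not taken from the paper or its code repository.

```python
def box_iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Each box is (x_min, y_min, x_max, y_max); this format is an
    assumption for illustration, not the paper's data format.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union area = sum of both areas minus the overlap counted twice.
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    # Degenerate (zero-area) inputs get an IoU of 0 rather than dividing by zero.
    return inter / union if union > 0 else 0.0
```

A per-region score over a set of images would then be an average of such values over the matched detected and annotated boxes for that anatomical region.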

[1] Zhang, Ziwei, Peng Cui, and Wenwu Zhu. “Deep learning on graphs: A survey.” IEEE Transactions on Knowledge and Data Engineering (2020).
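
The trade-off between binary and weighted adjacency matrices discussed in point 4 can be illustrated with a small sketch. It assumes the adjacency is derived from conditional co-occurrence frequencies of binary labels, which is one common construction and not necessarily the exact one used in the paper; the function name `cooccurrence_adjacency` and its parameters are hypothetical.

```python
import numpy as np


def cooccurrence_adjacency(labels, threshold=0.5, binary=True):
    """Build an adjacency matrix from a binary label matrix.

    labels: (num_samples, num_nodes) array of 0/1 indicators.
    Entry [i, j] estimates P(node j positive | node i positive).
    With binary=True the estimate is thresholded to 0/1 (the filtering
    step); with binary=False the probabilities are returned as weights.
    This is an illustrative construction, not the paper's exact method.
    """
    labels = np.asarray(labels, dtype=float)

    # counts[i, j] = number of samples where nodes i and j are both positive.
    counts = labels.T @ labels
    # The diagonal holds how often each node is positive on its own.
    occurrences = np.diag(counts).copy()

    # Conditional probability per row; rows of never-positive nodes stay zero.
    prob = np.divide(
        counts,
        occurrences[:, None],
        out=np.zeros_like(counts),
        where=occurrences[:, None] > 0,
    )

    if binary:
        return (prob >= threshold).astype(float)
    return prob
```

In this sketch, a rare co-occurrence whose estimated probability falls below `threshold` disappears entirely from the binary matrix, while the weighted variant keeps it as a small edge weight at the cost of a denser graph.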


AnaXNet: Anatomy Aware Multi-label Finding Classification in Chest X-ray