Authors

Canfeng Lin, Huisi Wu, Zhenkun Wen, Jing Qin

Abstract

Malaria is one of the main threats to global health. Manual examination of thick and thin blood smears is the current gold standard for diagnosing malaria. However, it has extremely low throughput and is susceptible to human bias; hence, automated detection tools are in high demand in practice. Developing an automated detection algorithm is quite challenging due to (1) the wide range of variations in bright field microscopy images, and (2) more importantly, the severe class imbalance problem in this task. While the recently proposed balanced group softmax (BGS) is able to alleviate the problem of class imbalance to some extent, the crucial prerequisite for its success is that the samples can be correctly categorized into different classes. We present a novel importance-aware BGS (IaBGS) to address the class imbalance problem and thereby improve the detection performance. Our main idea is to introduce a relation module (RM) before the group softmax module in the network to learn the relationships between different cells. We then compute the feature of a cell by considering the relationships between this cell and other cells in the input image, with different cells having different learned weights. In the RM, we leverage both the appearance features and locations to calculate the feature of each cell, taking full advantage of the relationships to obtain more discriminative features for BGS. In this way, the proposed IaBGS is able to solve the class imbalance problem more effectively and accurately, and thereby achieve better detection performance. We conducted extensive experiments on a famous dataset to evaluate the proposed IaBGS. Experimental results demonstrate the effectiveness of the proposed approach, which consistently outperforms state-of-the-art methods. Code will be released upon publication.

Link to paper

DOI: https://doi.org/10.1007/978-3-030-87237-3_44

SharedIt: https://rdcu.be/cymaW

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper focuses on the automatic detection of infected and uninfected cells in the blood smear images. To relieve the class imbalance problem, this paper introduced an importance-aware balanced group softmax for loss calculation. Comparative experiments were conducted on the BBBC041 dataset.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    This paper introduced an importance-aware balanced group softmax for loss calculation to relieve the class imbalance problem in detection.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The paper focused on the class imbalance problem but introduced a relation module. Is there any relation between them?

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    reproducible

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
    1. The paper focused on the class imbalance problem but introduced a relation module. Is there any relation between them?
    2. The author mentioned that the previous methods are incapable of tackling the wide range of variations in different microscopy images. How about the generalization performance of the proposed model?
    3. The writing needs to be improved, especially in Sec. 2.1. There are also some errors.
    4. How to solve the occluded cells’ detection as mentioned in the conclusion?
  • Please state your overall opinion of the paper

    borderline accept (6)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper focused on the class imbalance problem but introduced a relation module. Is there any relation between them? The writing needs to be improved, especially in Sec. 2.1.

  • What is the ranking of this paper in your review stack?

    1

  • Number of papers in your stack

    2

  • Reviewer confidence

    Very confident



Review #2

  • Please describe the contribution of the paper

    The authors propose a novel automatic method for detecting cells in malaria blood smears, based on the importance-aware balanced group softmax, which is realized by introducing the RM before the balanced group softmax module.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The authors introduce the problem very well. Overall, the work is fascinating. The authors propose a new way to handle class imbalance in malaria parasite detection.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    I am quite certain that the bibliography misses some important scientific manuscripts in this field. Just to cite a few:

    • https://www.sciencedirect.com/science/article/pii/S193152441730333X
    • https://link.springer.com/chapter/10.1007/978-3-030-13835-6_7
    • https://www.mdpi.com/1424-8220/18/2/513
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors describe their method well and state that they want to release the code upon publication. It is certainly reproducible.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

    In Methodology, the authors state they employ FRCNN with Resnet-50 as the backbone as the baseline method. Following, they operate with multiple scales. Why and how many? In 2.3, the authors resize the image to 1333x800 to reduce memory usage. Why do they use this resolution? Please explain. Again, they perform a random flip. Why? Is it a normalization strategy to avoid overfitting? Please motivate the choice.

    In general, the article wants to face the issue of class imbalance but, reading through their manuscript, I cannot discover how the tested dataset is imbalanced and in what manner. Please, add some details regarding this problem. Otherwise, it seems that class imbalance is not an issue at all.

    Overall, the work is fascinating. I also suggest the authors add a few lines to describe how they intend to face different staining problems if applying this method to another dataset, as stated in the introduction.

    Moreover, it could be interesting to test this approach to more complicated datasets (e.g., MP-IDB), presenting challenging coloration scenarios.

    The manuscript contains several English-language issues (e.g., in Fig. 1, “When training, (who?) only need…”).

    Finally, I suggest fixing the numbering order of the citation.

  • Please state your overall opinion of the paper

    Probably accept (7)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    In Methodology, the authors state they employ FRCNN with Resnet-50 as the backbone as the baseline method. Following, they operate with multiple scales. Why and how many? In 2.3, the authors resize the image to 1333x800 to reduce memory usage. Why do they use this resolution? Please explain. Again, they perform a random flip. Why? Is it a normalization strategy to avoid overfitting? Please motivate the choice.

    In general, the article wants to face the issue of class imbalance but, reading through their manuscript, I cannot discover how the tested dataset is imbalanced and in what manner. Please, add some details regarding this problem. Otherwise, it seems that class imbalance is not an issue at all.

  • What is the ranking of this paper in your review stack?

    1

  • Number of papers in your stack

    4

  • Reviewer confidence

    Very confident



Review #3

  • Please describe the contribution of the paper

    The authors introduce a new way to deal with strong class imbalance during training: importance-aware balanced group softmax. They show that their approach outperforms other models in terms of detecting and classifying cells infected with the malaria parasite from normal red blood cells.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper clearly motivates the approach with the prevalent problem of class imbalance in biomedical data. The solution is nicely described and executed on a publicly available data set. The relation module is a novel idea and seems to improve accuracy.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The method should be applied to other imbalanced biomedical data sets. Tables 1 and 2 lack variances. Please run the experiments multiple times and report means and standard deviations, so that the significance of the differences can be estimated.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Data is publicly available. Code is not provided.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

    Some things have to be better described: What is in the others category? Why multi-scale feature extraction? Fig 1 could be nicer.

  • Please state your overall opinion of the paper

    Probably accept (7)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    A good idea for an important problem. Should be applied to > 1 datasets to show usefulness

  • What is the ranking of this paper in your review stack?

    1

  • Number of papers in your stack

    5

  • Reviewer confidence

    Confident but not absolutely certain



Review #4

  • Please describe the contribution of the paper

    The authors propose a ‘relation module’, which adds a loss tied to nearby objects in the input image, to address class imbalance. They combine this with a CNN using Group Softmax and evaluate on a malaria image dataset.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The issue of mitigating class imbalance effects when training models is very important.

    The proposed Relation Module is an interesting idea, and is well-explained.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The paper (in common with many ML papers on malaria) does not address vital domain issues and use-cases that are central to malaria diagnostics.

    The malaria dataset (more generally, datasets with strong class imbalances) appears ill-suited to evaluate the proposed relation module.

    Please see Comments for details.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The author checklist does not match the contents of the paper (eg no error estimates for results).

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

    There are two categories of comments: malaria-related and ML-related.

    Malaria-related: The paper (in common with many ML papers on malaria) does not address vital domain issues and use-cases central to malaria. So either the malaria motivation content should be removed and the paper can focus on the ML method; or the paper should honor malaria use-cases and needs. It is of course fine to use a malaria dataset to evaluate a method, independent of any intent to impact malaria diagnostics.

    If the goal is to develop methods that will concretely improve malaria diagnosis and treatment, please consult the following references for info about (a) use-cases and their associated performance specifications, and (b) important malaria-specific ML issues: (a) WHO’s “Malaria Microscopy Quality Assurance Manual V2”; WHO’s “Malaria Diagnosis Guide”; CDC’s information (https://www.cdc.gov/malaria/diagnosis_treatment/clinicians2.html); Ashley, E., et al. “Spread of artemisinin resistance in plasmodium falciparum malaria”. New England J of Medicine; White “The parasite clearance curve” Malaria J (2011). (b) Poostchi et al “Image analysis and machine learning for detecting malaria” 2018; Mehanian et al. “Computer-Automated Malaria Diagnosis and Quantitation Using Convolutional Neural Networks” ICCV 2017; Delahunt et al. “Fully-automated patient-level malaria assessment on field-prepared thin blood film microscopy images” IEEE GHTC 2019 (also on arXiv); Horning et al. “Performance of a fully-automated system on a WHO malaria microscopy evaluation slide set” Malaria J 2021; Linder et al “A Malaria Diagnostic Tool Based on Computer Vision Screening and Visualization of Plasmodium falciparum Candidate Areas in Digitized Blood Smears” PLoS One 2014; Manescu et al “Expert-level automated malaria diagnosis on routine blood films with deep neural networks” Wiley 2020.

    I believe ref [24] has been misunderstood, based on the sentence.


    ML-related:

    The special dataset conditions required to make the Relation Module appropriate for use are not sufficiently addressed. In particular:

    Is the relation module relevant for this dataset? The relation module is “considering the relationships between this cell and other cells in the input image”. However, in thin blood films malaria-infected cells are typically rare (for vivax, 1 per 1000 uninfected cells). So there will usually be no other infected cells in the input image. This will be generally true for any dataset with large class imbalances. Also, infected cells are spread roughly randomly through a blood film, so the location of another infected cell would not have relevance. Perhaps malaria blood films are not good datasets on which to evaluate the proposed relation module. The relation module requires a dataset where class objects have spatial correlations, e.g. “moving cars (class 1) always contain humans (class 2)”.

    Are weakly-supervised methods appropriate comparisons for a fully-supervised method?

    AP and mAP have various definitions. Please specify what definition you are using.

    In the malaria context, the goal is to count infected cells (and identify the malaria species). Is bounding box overlap an appropriate measure of success at this? Also, is averaging over classes (mAP) appropriate in class imbalanced situations?

    The combination of RBCs and leukocytes into one class (sec 3.1) is interesting, because leukocytes resemble vivax late-stage parasites but they are totally unlike RBCs.

    Are there confidence intervals / error estimates for Tables 1 and 2?

    Does the statement (page 2) “These models still cannot solve the problem of class imbalance well, resulting in their performance is still not satisfactory” accurately reflect the results of the cited methods? Are any failures in performance traceable to class imbalances, or are they due to other issues?

    Note: The ranked ordering given below is a dummy value, and meaningless, as I did not compare the various papers.

  • Please state your overall opinion of the paper

    reject (3)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The dataset is not appropriate to the method, and what datasets the method would be appropriate for is not described. The malaria-specific motivation (pgs 1-2) does not reflect actual needs in malaria diagnostics.

  • What is the ranking of this paper in your review stack?

    3

  • Number of papers in your stack

    5

  • Reviewer confidence

    Very confident




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The paper aims to detect malaria cells from blood smears, particularly under the challenge of imbalanced training data. The reviewers have some major concerns on the techniques and experiments, including:

    1. The connection among the Relation Module, the importance-aware balanced group softmax, and the class imbalance problem is not explained clearly. In addition to showing the results, the paper needs more detail on the motivation and design to explain why and how the method works.
    2. The method is only tested on one dataset. The generalization capability of the method is not validated and compared. On the experimental dataset, the severity of the class imbalance and how the proposed method overcomes it are not analyzed in detail. How the class imbalance problem affects the other methods compared to the proposed one is not clear.

    The reviewers also raised questions and suggestions on the related work, the biological applications, and many details of the design and implementation. Please consider addressing them in the rebuttal. In addition to the above major questions and comments from the reviewers, the AC suggests simulating datasets with different levels of imbalance, by intentionally adding or deleting data from some classes and on different datasets, so that the effectiveness of the technique in overcoming the class imbalance problem can be better validated.
  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    6




Author Feedback

We appreciate the affirmative comments and constructive suggestions from all reviewers. While well recognizing the significance and technical novelty of our paper, they also raised some questions and concerns. We believe we can address all major questions and concerns in the final version.

(1) Why and how the RM works (R1 and R4) While BGS is able to alleviate the problem of class imbalance, the prerequisite for its success is that we can obtain representative features of samples so that they can be correctly categorized into different classes. Unfortunately, due to severe class imbalance, the features of samples in the infrequent class (the infected class) are easily diluted or ignored. The RM is proposed to extract more representative features for these samples by considering the relation between one cell and other cells in the input image, with different cells having different learnt weights. With the RM, the BGS can more effectively deal with class imbalance, as demonstrated in the experiments. The RM leverages both appearance features and location information to determine the relation between cells and then generates a new feature map by integrating the relation into the original feature map generated from the RPN. The new feature map is fed into the BGS to help tackle class imbalance. We’ll further elaborate why and how the RM works in the final version.
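
The weighting idea described in (1) can be illustrated with a small sketch. It is not the authors' implementation: the function names, the dot-product appearance term, the Gaussian distance term, and the parameters `scale` and `sigma` are all assumptions that stand in for the learnt projections of a real relation module. It only shows how a per-cell feature can be augmented with a weighted sum over the other cells, where the weights depend on both appearance and location.

```python
import math

def relation_weights(features, centers, scale=1.0, sigma=50.0):
    """Return row-normalised weights; weights[i][j] is how much cell j informs cell i.

    features: list of per-cell appearance vectors (lists of floats)
    centers:  list of (x, y) cell centres in pixels
    scale, sigma: hypothetical stand-ins for learnt parameters
    """
    n = len(features)
    weights = []
    for i in range(n):
        logits = []
        for j in range(n):
            # Appearance term: similarity of the two feature vectors.
            appearance = scale * sum(a * b for a, b in zip(features[i], features[j]))
            # Location term: nearby cells receive a larger weight.
            dx = centers[i][0] - centers[j][0]
            dy = centers[i][1] - centers[j][1]
            geometry = -(dx * dx + dy * dy) / (2.0 * sigma * sigma)
            logits.append(appearance + geometry)
        # Softmax over all cells in the image (numerically stabilised).
        top = max(logits)
        exps = [math.exp(v - top) for v in logits]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

def relate(features, centers, scale=1.0, sigma=50.0):
    """Add the relation-weighted context to each cell's own feature."""
    weights = relation_weights(features, centers, scale, sigma)
    dim = len(features[0])
    related = []
    for i, own in enumerate(features):
        context = [
            sum(weights[i][j] * features[j][k] for j in range(len(features)))
            for k in range(dim)
        ]
        related.append([o + c for o, c in zip(own, context)])
    return related

# Toy input: cells 0 and 1 look alike and sit close together; cell 2 is far away.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
centers = [(10.0, 10.0), (20.0, 12.0), (300.0, 280.0)]
weights = relation_weights(features, centers)
related = relate(features, centers)
```

In this toy input, cell 0 draws almost all of its context from itself and cell 1, and almost none from the distant, dissimilar cell 2.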

(2) Details on the class imbalance problem of the dataset (R2) In the dataset, there are 77523 uninfected cells, accounting for 96.77% of all cells, and 2590 infected cells, accounting for 3.23%, which shows the severity of the class imbalance problem in this task. As pointed out in the literature, imbalanced training data will seriously unbalance the weight norms of deep learning models. With the standard softmax, this imbalance will be further amplified as it handles all classes together (our experiments also demonstrate this). We’ll integrate these details and discussions in the final version.
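
The percentages quoted in (2) follow directly from the two cell counts. The short check below uses only those two numbers from the rebuttal; the variable names are illustrative.

```python
# Cell counts quoted in the rebuttal.
uninfected = 77523
infected = 2590

total = uninfected + infected
uninfected_share = 100.0 * uninfected / total
infected_share = 100.0 * infected / total
imbalance_ratio = uninfected / infected  # uninfected cells per infected cell

print(f"total cells: {total}")
print(f"uninfected share: {uninfected_share:.2f}%")
print(f"infected share: {infected_share:.2f}%")
print(f"uninfected per infected cell: {imbalance_ratio:.1f}")
```

The counts give roughly 30 uninfected cells for every infected cell.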

(3) Generalization ability (R1 and R3) While we have not conducted experiments on other datasets, we have performed extensive experiments to evaluate the proposed model. The application has a wide range of variations, but with the proposed model, we achieved the best performance on the dataset, which corroborates its capability in dealing with samples with large variations. As suggested, we’ll attempt to apply it to more datasets with class imbalance problems and report the results in the final version.

(4) Simulated scenarios proposed by AC Thanks for the valuable suggestions and it is really a good way to evaluate its capability. However, in the dataset, there are usually multiple classes with different numbers in one image. We have to filter the images and then add or delete them to simulate different proportions, which is quite time-consuming. We’ll endeavor to incorporate the results in the final version if we can complete the experiments.
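
One way the image-level filtering mentioned in (4) could be approached is sketched below. This is only an assumption about how such a simulation might be set up, not a procedure from the paper: the per-image dictionaries, their keys, and the toy counts are hypothetical. Because each image contains several classes, the sketch removes whole images that contain infected cells until the infected share drops to a target value.

```python
import random

def infected_share(images):
    """Fraction of all cells that are infected across a list of images."""
    infected = sum(img["infected"] for img in images)
    total = infected + sum(img["uninfected"] for img in images)
    return infected / total if total else 0.0

def subsample_to_share(images, target_share, seed=0):
    """Drop images containing infected cells until the share is at or below target.

    images: list of dicts with hypothetical keys 'uninfected' and 'infected'
            holding per-image cell counts.
    """
    rng = random.Random(seed)
    kept = list(images)
    rng.shuffle(kept)  # avoid always removing the same images first
    idx = 0
    while idx < len(kept) and infected_share(kept) > target_share:
        if kept[idx]["infected"] > 0:
            kept.pop(idx)  # remove the whole image, then re-check the share
        else:
            idx += 1
    return kept

# Toy per-image counts, invented purely for illustration.
toy_images = [
    {"uninfected": 60, "infected": 4},
    {"uninfected": 55, "infected": 0},
    {"uninfected": 70, "infected": 6},
    {"uninfected": 65, "infected": 1},
    {"uninfected": 58, "infected": 0},
]
reduced = subsample_to_share(toy_images, target_share=0.01)
```

Increasing the infected share would need the opposite operation (removing images without infected cells), which follows the same pattern.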

(5) Results in Tab 1 and Tab 2 (R3) They are means of multiple experiments and we’ll add variances in the final version.

(6) Occluded problem (R1) The RM employs both the appearance feature and location information to extract features, which, in some sense, are helpful to distinguish occluded cells.

(7) “other” category (R3) If the “other” category is not set, the classifier may generate multiple category labels. In a group, the “other” category refers to any category that is not in the current group.
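
The role of the “other” category in (7) can be made concrete with a small sketch of a grouped softmax. The group layout and class names below are hypothetical and chosen only for illustration; the paper's actual grouping is not given here. Each group runs its own softmax over its classes plus one extra “others” slot, so a group can abstain when the sample belongs to a class in a different group.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical grouping: the frequent class and the rare classes are separated.
GROUPS = {
    "frequent": ["uninfected"],
    "rare": ["infected_a", "infected_b"],
}

def group_targets(label, groups):
    """Per-group training target: the label itself, or 'others' if it lives elsewhere."""
    return {
        name: (label if label in classes else "others")
        for name, classes in groups.items()
    }

def group_softmax(logits, groups):
    """Apply a separate softmax inside each group.

    logits: dict mapping group name to a list with one logit per class
            in that group, followed by one logit for 'others'.
    """
    probs = {}
    for name, classes in groups.items():
        values = softmax(logits[name])
        probs[name] = dict(zip(classes + ["others"], values))
    return probs

targets = group_targets("infected_a", GROUPS)
probs = group_softmax(
    {"frequent": [0.2, 2.0], "rare": [3.0, 0.5, 0.1]},
    GROUPS,
)
```

Without the “others” slot, the single-class group would always assign probability 1 to its only class, so every sample would receive a label from that group as well as from its true group.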

(8) Weakly supervised (R4) We do compare the proposed model with a weakly-supervised method (MOFF in Table 2).

(9) AP and mAP (R4) All AP and mAP refer to AP50 and mAP50. The mAP is the average of AP in all categories.
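
As a rough illustration of the metrics named in (9), the sketch below computes AP at an IoU threshold of 0.5 for one class and averages per-class AP into mAP. Several details are assumptions, since the rebuttal does not specify them: greedy matching by descending score, all-point interpolation of the precision-recall curve, and a single pooled list of detections. The per-class AP values passed to the last function are placeholders, not results from the paper.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def average_precision(detections, ground_truth, iou_thr=0.5):
    """AP for one class; detections are (score, box), ground_truth is a list of boxes."""
    if not ground_truth:
        return 0.0
    matched = set()
    hits = []
    for _, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(ground_truth):
            if idx in matched:
                continue  # each ground-truth box can be matched only once
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= iou_thr:
            matched.add(best_idx)
            hits.append(1)  # true positive
        else:
            hits.append(0)  # false positive
    precisions, recalls, tp = [], [], 0
    for rank, hit in enumerate(hits, start=1):
        tp += hit
        precisions.append(tp / rank)
        recalls.append(tp / len(ground_truth))
    # Precision envelope: make precision non-increasing as recall grows.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap

def mean_average_precision(per_class_ap):
    """mAP as the plain average of AP over categories."""
    return sum(per_class_ap.values()) / len(per_class_ap)

gt_boxes = [(0, 0, 10, 10), (20, 20, 30, 30)]
dets = [(0.9, (0, 0, 10, 10)), (0.8, (50, 50, 60, 60)), (0.7, (20, 20, 30, 30))]
ap50 = average_precision(dets, gt_boxes)
```

Because mAP weights every category equally, a rare class with few instances influences the score as much as a frequent one.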

In summary, as mentioned by reviewers, this paper proposes a novel and interesting method to relieve the class imbalance problem, which is challenging in many medical image computing tasks. We believe the proposed method will inspire many readers in the MICCAI community and has the potential to be impactful.




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    In the feedback, the authors addressed some major questions and concerns from the reviewers. Why and how the RM works need to be elaborated in the final version, along with the dataset details and experiments on other datasets.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    11



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    There are mixed reviews. The main issue is the relatedness between the relation module and the objective of handling class imbalance. There are many other questions around evaluation. Clinical use cases specific to malaria are asked about as well. The rebuttal has clarified the design motivation of the relation module. It is reasonable to consider relations for improving the detection performance. The rebuttal, however, did not address R4’s question on clinical use cases in malaria. It is understandable that a MICCAI paper would focus on method development. However, the final version should revise the introduction to reflect the reviewer’s concern.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    8



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The paper proposes a modification of the softmax loss for handling class imbalance in malaria cell detection and the method obtains competitive results on a public dataset. The rebuttal responds adequately to most of the reviewers’ concerns. It provides additional experimental details and clarifies the role of the proposed relation module to address class imbalance. Additional references and context on malaria should be included in the document, as requested by R2 and R4.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

    7


