Authors

Jiangpeng Yan, Hanbo Chen, Kang Wang, Yan Ji, Yuyao Zhu, Jingjing Li, Dong Xie, Zhe Xu, Junzhou Huang, Shuqun Cheng, Xiu Li, Jianhua Yao

Abstract

Segmentation of whole slide images (WSIs) is an important step for computer-aided cancer diagnosis. However, due to their gigapixel dimensions, WSIs are usually cropped into patches for analysis. Processing high-resolution patches independently may leave out the global geographical relationships and suffer from slow inference speed, while using low-resolution patches can enlarge receptive fields but lose local details. Here, we propose a Hierarchical Attention Guided (HAG) framework to address the above problems. In particular, our framework contains a global branch and several local branches to perform prediction at different scales. Additive hierarchical attention maps are generated by the global branch with sparse constraints to fuse multi-resolution predictions for better segmentation. During inference, the sparse attention maps are used as certainty guidance to select important local areas with a quadtree strategy for acceleration. Experimental results on two WSI datasets highlight two merits of our framework: 1) effectively aggregating multi-resolution information to achieve better results, and 2) significantly reducing the computational cost to accelerate prediction without decreasing accuracy.

Link to paper

DOI: https://doi.org/10.1007/978-3-030-87237-3_15

SharedIt: https://rdcu.be/cyl9U

Link to the code repository

N/A

Link to the dataset(s)

N/A

Reviews

Review #1

Please describe the contribution of the paper

For region segmentation of WSIs, this paper proposes a Hierarchical Attention Guided framework that fuses low-resolution and high-resolution information.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

This paper generates hierarchical, sparse attention maps to effectively aggregate multi-resolution information. The experimental results show that the weighted selection guide map can reduce the inference area and improve the prediction process while maintaining segmentation performance.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

1. The abstract and introduction frame the task as WSI segmentation, but the experiments are not run on WSIs; they are run on small patches selected from WSIs.
2. The paper does not give the detailed strategy used to obtain the small images from the Camelyon2016 WSIs.
3. The experimental results are not compared with other existing segmentation methods on the two public datasets.
4. The sum of Cancerous images and Noncancerous images in Table 1 is inconsistent with the number of Images.
Please rate the clarity and organization of this paper

Satisfactory
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The paper would be somewhat difficult to reproduce. The experiments are conducted on 2048*2048 patches taken from WSIs, not on the WSIs themselves, and the paper does not say how these images were obtained.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

The authors should explain the relationship between the WSIs and the images, and how these images were selected from the WSIs.
Please state your overall opinion of the paper

borderline reject (5)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The method proposed in this paper has some new ideas, but the experiments are not run on WSIs, so it is not clear whether the proposed method would achieve similar results on WSIs.
What is the ranking of this paper in your review stack?

2
Number of papers in your stack

1
Reviewer confidence

Confident but not absolutely certain

Review #2

Please describe the contribution of the paper

A Hierarchical Attention Guided (HAG) framework to keep geographical relationships in WSIs was proposed. The framework contains a global branch and several local branches to perform prediction at different scales.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1. An important problem of keeping global information in WSIs is tackled;
2. The paper is well written;
3. Testing on publicly available Camelyon16 and HCC datasets;
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
1. I’d suggest expanding on related work; there has been some effort in developing multi-resolution networks as well as networks for preserving the global context, for example: a) HookNet: multi-resolution convolutional neural networks for semantic segmentation in histopathology whole-slide images https://arxiv.org/abs/2006.12230 b) Multi-Resolution Networks for Semantic Segmentation in Whole Slide Images https://arxiv.org/abs/1807.09607 c) I also agree that [19] summarises some of the related work quite well, but I guess expanding on the work specific to the publication would be beneficial too.
2. If possible, adding the confidence intervals in Table 2 would be beneficial;
3. Potentially, there are other cancer types where the approach would show more advantages because global relationships are even more important than in Camelyon16/HCC, for example prostate cancer classification/segmentation (e.g. the PANDA dataset)?
4. I think adding a description for macro/meso/micro/non-selective/selective fusion in one place not too far from Table 2 itself will improve readability and interpretability of the results.
5. Adding the performance of a baseline or state-of-the-art model to Table 2 for Camelyon16/HCC could help to interpret the results. Does ‘micro branch’ represent such a model?
6. Maybe compare the proposed methods to some multi-resolution approaches from the related work?
Please rate the clarity and organization of this paper

Satisfactory
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

It is good that the methods are tested on public data. However, sharing the source code would greatly improve reproducibility.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
1. I’d suggest expanding on related work; there has been some effort in developing multi-resolution networks as well as networks for preserving global context, for example: a) HookNet: multi-resolution convolutional neural networks for semantic segmentation in histopathology whole-slide images https://arxiv.org/abs/2006.12230 b) Multi-Resolution Networks for Semantic Segmentation in Whole Slide Images https://arxiv.org/abs/1807.09607 c) I also agree that [19] summarises some of the related work quite well, but I guess expanding on the work specific to the publication would be beneficial too.
2. If possible, adding the confidence intervals in Table 2 would be beneficial;
3. Potentially, there are other cancer types where the approach would show more advantage because global relationships are even more important than in Camelyon16/HCC, for example prostate cancer classification/segmentation (e.g. the PANDA dataset)?
4. I think adding a description for macro/meso/micro/non-selective/selective fusion in one place not too far from Table 2 itself will improve readability and interpretability of the results.
5. Adding the performance of a baseline or state-of-the-art model to Table 2 for Camelyon16/HCC could help to interpret the results. Does ‘micro branch’ represent such a model?
6. Maybe compare the proposed methods to some multi-resolution approaches from the related work?
Please state your overall opinion of the paper

Probably accept (7)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

An important problem of keeping global information in WSIs is tackled and tested on publicly available Camelyon16 and HCC datasets.
What is the ranking of this paper in your review stack?

4
Number of papers in your stack

5
Reviewer confidence

Somewhat confident

Review #3

Please describe the contribution of the paper

This paper describes a method for tissue segmentation in H&E-stained histologic images using a multi-scale framework in combination with an attention mechanism, aiming to increase segmentation performance as well as reduce inference time. The proposed method was applied to two distinct datasets and shows good segmentation performance. The results also show that, by using the selective fusion approach, computational cost can be significantly reduced without degrading segmentation performance.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The proposed approach for selective fusion from multi-scale images to reduce the computational cost is novel.
- The paper is well-written and easy to follow in most parts.
- The selected topic is of high relevance for the research community in the field and the proposed approach can be beneficial in practice.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- Comparison to other SOTA methods is not provided in the results section.
- While Figure 2 shows the generic workflow of the proposed algorithm, the detailed model architecture and the attention mechanism should be added to the manuscript (maybe as an appendix) to help readers understand the method.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

I think in order to reproduce the results, the code should be released.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
- As far as I could understand, 2048x2048 images were used as the original high-resolution images in the paper, not the actual WSIs. So a cropping strategy is involved from the beginning, which may not be optimal. A justification for this choice should be given in the paper.
- It would be interesting to see the results without the proposed attention mechanism (i.e. P = (Pmacro + Pmeso + Pmicro)/3).
- Please compare the segmentation performance with other SOTA approaches applied to the Camelyon16 dataset.
- “To solve the problem, here we propose a Hierarchical Attention Guided (HAG) framework (Fig.1).” -> I think it should be Fig.2.
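
For concreteness, the uniform-averaging ablation suggested above could be sketched as follows. This is only an illustration under stated assumptions, not the paper's implementation: the function name `average_fusion` is hypothetical, each prediction is a 2D grid of foreground probabilities, and the macro, meso and micro maps are assumed to have already been resampled to a common resolution.

```python
from typing import List, Sequence

ProbMap = List[List[float]]  # 2D grid of per-pixel foreground probabilities


def average_fusion(maps: Sequence[ProbMap]) -> ProbMap:
    """Fuse prediction maps with equal weights: P = (P_1 + ... + P_n) / n.

    Assumption: all maps share the same height and width, i.e. they were
    resampled to a common resolution before fusion.
    """
    if not maps:
        raise ValueError("at least one prediction map is required")
    rows, cols = len(maps[0]), len(maps[0][0])
    for m in maps:
        if len(m) != rows or any(len(row) != cols for row in m):
            raise ValueError("all prediction maps must have the same shape")
    n = len(maps)
    # Average the predictions pixel by pixel, with no attention weighting.
    return [[sum(m[r][c] for m in maps) / n for c in range(cols)]
            for r in range(rows)]


# Toy 1x2 maps standing in for the macro, meso and micro predictions.
p_macro = [[0.0, 0.5]]
p_meso = [[0.5, 0.5]]
p_micro = [[1.0, 0.5]]
fused = average_fusion([p_macro, p_meso, p_micro])
```

Comparing the attention-guided fusion against this equal-weight baseline would isolate how much of the gain comes from the attention maps rather than from simply combining scales.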
Please state your overall opinion of the paper

accept (8)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The proposed approach for tissue segmentation in histologic images is novel and can be used in practice. However, the quality of the paper can still be improved (please refer to my comments).
What is the ranking of this paper in your review stack?

2
Number of papers in your stack

5
Reviewer confidence

Very confident

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

This paper proposes a Hierarchical Attention Guided (HAG) framework for WSI segmentation. The framework contains a global branch and several local branches to perform predictions at different scales. However, important comparison results are missing: the authors should compare with state-of-the-art WSI segmentation methods and add some ablation experiments, such as the results without the proposed attention mechanism (i.e. P = (Pmacro + Pmeso + Pmicro)/3). In the final version, a clear description of the experimental setting should be added, and the description of the fusion should also be clarified. The sum of Cancerous images and Noncancerous images in Table 1 is inconsistent with the number of Images.
What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

3

Author Feedback

In this work, we want to draw the audience’s attention to the fact that, in the clinical practice of WSI analysis, generating results in a timely manner is as important as accuracy, while most existing studies focus mainly on accuracy. We are glad that the reviewers found that our work “has some new ideas” (R1), is “important” (R2), and “can be beneficial in practice” (R3). Nevertheless, this work is still an early attempt at solving the speed vs. accuracy problem in this area and can be further improved. We appreciate all the constructive comments from the reviewers and clarify below the main revisions we will make to address these concerns:

1: About the baseline settings. R2 asked whether the ‘micro branch’ represents a baseline model and advised us to improve the readability and interpretability of Table 2. R3 and the meta-reviewer also suggested comparing the segmentation performance with other SOTA approaches applied to the Camelyon16 dataset and adding more ablation experiments if possible.

• We apologize that Table 2 was not more readable in the original version. As we roughly categorize current WSI analysis algorithms into downsampling-based and patch-based methods, the ‘macro branch’ itself can be regarded as a typical downsampling-based baseline, and the ‘meso/micro branch’ can be regarded as two typical patch-based baselines with different resolutions. We will add a clear description of this setting to the caption of Table 2 in our final version. We also thank R2 for recommending other multi-resolution WSI analysis solutions (HookNet and MRN); we will discuss them in the introduction. Limited by the space of a conference paper, we did not provide an overall comparison study in this version; further comparison experiments will be included in our future extended version.

2: About the preprocessing of the WSI dataset. R1 asked how we selected WSI patches as images, questioned whether these patches are equivalent to the actual WSIs, and noted the inconsistency in Table 1. R3 understood our strategy of using 2048x2048 images but also asked for a justification of the cropping strategy. The meta-reviewer also asked for a clear description of the experimental setting.

• We thank R1 for pointing out the typos in Table 1; we will correct them in the final version, along with the figure reference typo pointed out by R3.

• Due to the gigapixel size of actual WSIs and the limited memory of GPUs, it is a common strategy to use high-resolution image patches in WSI-related research. The following cropping strategy is adopted to generate the images used in our experiments: a sliding window of size 2048x2048 with a step of 1792 (256 pixels of overlap) is used to crop the image. We filter out background crops by treating pixels with chroma (max[RGB] - min[RGB]) larger than 64 as foreground pixels and keeping only the crops that contain more than 5% foreground pixels. Notably, as the non-cancerous region in the Camelyon16 dataset is much larger than the cancerous region, we use a randomly sampled subset of the non-cancerous patches. We will add this description in the final version.

• Although we did not conduct experiments on whole/raw WSIs, we do not agree with R1 that the input crop is “small”. We also want to correct R1’s comment about “the two public datasets”: we clearly pointed out that the HCC dataset was collected by our collaborators.
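
As an illustration only, the cropping rule described above could be sketched as follows. This is a minimal sketch, not the code used for the paper: the function names are hypothetical, windows that would extend past the image edge are simply dropped (edge handling is not specified above), and a crop is represented as a flat sequence of (R, G, B) tuples.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each channel in 0..255


def window_origins(length: int, window: int = 2048, step: int = 1792) -> List[int]:
    """Start offsets of a sliding window along one image axis.

    Assumption: only full windows are kept; a trailing partial window is
    dropped because the edge handling is not described in the text.
    """
    if length < window:
        return []
    return list(range(0, length - window + 1, step))


def chroma(pixel: Pixel) -> int:
    """Chroma as defined in the text: max[RGB] - min[RGB]."""
    return max(pixel) - min(pixel)


def foreground_fraction(pixels: Sequence[Pixel], threshold: int = 64) -> float:
    """Fraction of pixels whose chroma is larger than the threshold."""
    if not pixels:
        return 0.0
    return sum(chroma(p) > threshold for p in pixels) / len(pixels)


def keep_crop(pixels: Sequence[Pixel], threshold: int = 64,
              min_fraction: float = 0.05) -> bool:
    """Keep a crop only if more than 5% of its pixels are foreground."""
    return foreground_fraction(pixels, threshold) > min_fraction


# A 4096-pixel axis yields two full windows, overlapping by 2048 - 1792 = 256.
origins = window_origins(4096)

# Toy crop: 6 strongly coloured (tissue-like) pixels and 94 near-white ones.
toy_crop = [(200, 100, 180)] * 6 + [(240, 240, 240)] * 94
is_kept = keep_crop(toy_crop)
```

In this sketch the 2D window grid is the Cartesian product of `window_origins(width)` and `window_origins(height)`; the random subsampling of non-cancerous Camelyon16 patches mentioned above is not shown.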

Other valuable suggestions will be carefully considered in our future work, for example R2’s point that there are other cancer types (such as prostate cancer) where global relationships are even more important and our approach might show more advantage. We are grateful that this work received early acceptance, and we would like to discuss the potential of the HAG framework with the community.

Hierarchical Attention Guided Framework for Multi-resolution Collaborative Whole Slide Image Segmentation