
Authors

Patrick Godau, Lena Maier-Hein

Abstract

Shortage of annotated data is one of the greatest bottlenecks in biomedical image analysis. Meta learning studies how learning systems can increase in efficiency through experience and could thus evolve as an important concept to overcome data sparsity. However, the core capability of meta learning-based approaches is the identification of similar previous tasks given a new task - a challenge largely unexplored in the biomedical imaging domain. In this paper, we address the problem of quantifying task similarity with a concept that we refer to as task fingerprinting. The concept involves converting a given task, represented by imaging data and corresponding labels, to a fixed-length vector representation. In fingerprint space, different tasks can be directly compared irrespective of their data set sizes, types of labels or specific resolutions. An initial feasibility study in the field of surgical data science (SDS) with 26 classification tasks from various medical and non-medical domains suggests that task fingerprinting could be leveraged for both (1) selecting appropriate data sets for pretraining and (2) selecting appropriate architectures for a new task. Task fingerprinting could thus become an important tool for meta learning in SDS and other fields of biomedical image analysis.
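
The idea of comparing tasks in a shared fixed-length space can be illustrated with a minimal sketch. It is not the authors' embedding: it assumes that each task has already been reduced to a list of per-sample feature vectors (the toy values below are invented), uses mean pooling as a stand-in fingerprint, and uses cosine distance as a stand-in similarity measure. All function and task names are hypothetical.

```python
import math


def fingerprint(features):
    """Mean-pool per-sample feature vectors into one fixed-length vector.

    Assumption: mean pooling stands in for whatever embedding a real
    system would use; the output length does not depend on task size.
    """
    dim = len(features[0])
    n = len(features)
    return [sum(f[i] for f in features) / n for i in range(dim)]


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def most_similar(new_fp, known_fps):
    """Return the name of the known task whose fingerprint is closest."""
    return min(known_fps, key=lambda name: cosine_distance(new_fp, known_fps[name]))


# Toy tasks with different numbers of samples (invented values).
task_a = [[1.0, 0.0, 0.0], [0.8, 0.2, 0.0]]
task_b = [[0.0, 1.0, 0.0], [0.0, 0.9, 0.1], [0.1, 0.8, 0.1]]
new_task = [[0.9, 0.1, 0.0]]

known = {"task_a": fingerprint(task_a), "task_b": fingerprint(task_b)}
closest = most_similar(fingerprint(new_task), known)
```

Because every fingerprint has the same length, tasks with two, three, or one sample can be compared directly, which is the property the abstract emphasizes.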

Link to paper

DOI: https://doi.org/10.1007/978-3-030-87202-1_42

SharedIt: https://rdcu.be/cyhQE

Link to the code repository

N/A

Link to the dataset(s)

https://www.it.pt/AutomaticPage?id=3459

https://nearlab.polimi.it/medical/dataset/

https://derm.cs.sfu.ca/Download.html

https://endovissub2017-roboticinstrumentsegmentation.grand-challenge.org/Home/

https://data.mendeley.com/datasets/c7fjbxcgj9/2

https://datasets.simula.no/nerthus/

https://doi.org/10.7303/syn21903917

https://datasets.simula.no/hyper-kvasir

https://camma.u-strasbg.fr/datasets

https://challenge2019.isic-archive.com/data.html

https://ftp.itec.aau.at/datasets

https://vision.stanford.edu/aditya86/ImageNetDogs/

https://pytorch.org/vision/stable/datasets.html

Reviews

Review #1

Please describe the contribution of the paper

The authors propose converting tasks, characterized by video-based imaging data and labels, into vector representations and using these to quantify task similarity. The aim is to enable meta-learning (approaches that help anticipate the best learning methods for new problems based on previous performance on similar tasks). The authors investigate the ability of their representation to describe the similarity between tasks, as well as its usefulness for selecting pre-training tasks for future problems and for predicting the best model for a new problem. The paper evaluates four approaches for quantifying the similarity of the tasks, both on the basis of procedure/image type (e.g. laparoscopy vs. colonoscopy) and problem type (e.g. instrument counting vs. artefact detection). Although meta-learning approaches have been explored in other fields, including medical imaging, this appears to be the first study applying these principles in the context of surgical data science.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- New context: Although, as the authors recognize, others have taken preliminary steps to apply meta-learning principles to medical images (Cheplygina et al. (2017) – https://arxiv.org/pdf/1706.03509.pdf), this appears to be the first time that these concepts have been investigated in the space of surgical data science, particularly in the context of video-based medical imaging applications. These principles have the potential to stimulate other future works in the MICCAI community and the
- Interest to MICCAI community: Given the rich surgical data science community involved in MICCAI and the breadth of learning models being explored, the application of meta-learning to this space would be of interest to many members of the MICCAI community.
- Multi-approach comparison: The inclusion of four different embedding/distance evaluation methods adds to the rigour of analysis.
- Organization: The paper tells a clear story and was well organized with many sub-headings to guide the reader. Equations were well explained within the space limitations.
- Supplemental information: The details provided in the supplementary file enhance the understanding and reproducibility of this work.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- Discussion and limitations: Given the length constraints, there was insufficient space for a comprehensive Discussion to be included in the paper in addition to sufficient methodological detail. Therefore, although there is some description of the novelty and significance of the work, the discussion of results is fairly superficial. The major weakness of this paper is the lack of a discussion of limitations of the approach, as well as future practicality and directions.
- Statistical conclusions: Although some plots have been provided, the paper draws some conclusions regarding the correlation between the task similarity and pretraining task selection/model selection that are difficult for the reader to readily infer visually. Additional supporting information regarding the strength of the correlations observed, comparative performance of the four approaches, and trendline information is necessary to accept the conclusions.
- Level of detail: Given the multi-approach investigation and analysis included in this paper, many aspects of the work were not able to be described in detail within the length constraints. In particular, although the strategy for selecting hyperparameters is very broadly described, many of these values are missing in the paper. A more comprehensive description of the datasets and how tasks were grouped would also aid in reader understanding. Similarly, further description of the network architectures would be necessary to ensure reproducibility of the work, as the present description is minimal.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

Overall, the reproducibility of this work is reasonable, provided that the authors follow through with the code-sharing items indicated in the reproducibility checklist. Specifically, the authors have indicated that training and evaluation code, as well as pre-trained models, will be made available. In addition, all datasets used were publicly available and are detailed in the Supplemental File. Some of the most important hyperparameters are described in the text, though many of these values are missing owing to the length limitations and the large number of analyses performed, and the strategy for hyperparameter selection is briefly described.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
- Title: I feel that the title could be modified to be more reflective of the work presented. Specifically, I would recommend referring to Surgical Data Science for Endoscopic Biomedical Images rather than “Medical Image Analysis” broadly.
- 2.1 Task Fingerprinting: I would recommend moving the first two sentences of this section into the last paragraph of the Introduction to clarify the focus on endoscopic vision tasks without muddying the task representation.
- Vague words: Many places in the methodology use vague terminology, such as “generally” in the sentence beginning “For the second kind of embedding we generally proceed as follows:…”, in places where the steps should either have been taken in all cases or not at all. It would be clearer if these vague terms were removed from the Methods section.
- Comparison plots: As the pretraining and model selection plots for the other approaches aside from FED are only available in the supplementary material, which is not easily accessible for readers, it would help the reader’s understanding to include some of these results in the main paper. In particular, it is difficult to immediately understand a clear benefit of FED over the other approaches. Given the length constraints, this may be achievable by overlaying the trendlines for the other methods onto the plots in Figure 4 with a clear colour legend, in addition to the details below.
- Statistical conclusions: Additional supporting information regarding the strength of the correlations observed and trendline information is necessary for the readers to understand the conclusions that were drawn for H2 and H3 in particular. This information should be elaborated on briefly in the text or may be possible to add directly onto the existing plots, leveraging this space.
- Discussion and limitations: Although I recognize the length constraints, it is critical to acknowledge the limitations of the work. There appear to be a few lines of space remaining at the end of the paper, and I strongly recommend using this space to include a brief description of these limitations.
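
The kind of correlation quantification requested in the comments above could, for example, be reported as a Spearman rank correlation between task distance and the benefit of pretraining. The sketch below is one possible way to compute it and is not taken from the paper: the distance and gain values are invented, and the function names are hypothetical.

```python
import math


def average_ranks(values):
    """Assign 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented example: distance to each candidate pretraining task and the
# accuracy gain observed after pretraining on it.
task_distance = [0.1, 0.4, 0.7, 0.9]
accuracy_gain = [0.30, 0.22, 0.15, 0.05]
rho = spearman(task_distance, accuracy_gain)
```

A strongly negative coefficient would indicate that closer tasks tend to give larger gains; a significance test (for instance a permutation test) would still be needed before drawing conclusions.
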
Please state your overall opinion of the paper

accept (8)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

Although this work has some weaknesses related to the lack of detail provided and discussion of limitations, I believe that this work would be of notable interest to the MICCAI community and is sufficiently novel with regards to the application of meta-learning techniques to surgical data science as a new context, warranting acceptance.
What is the ranking of this paper in your review stack?

1
Number of papers in your stack

4
Reviewer confidence

Confident but not absolutely certain

Review #2

Please describe the contribution of the paper

To address the scarcity of training data relative to the size of deep models, particularly in the area of Surgical Data Science, the authors propose a standardized description of a task, the fingerprint, and use the similarity between tasks both for selecting appropriate data sets for pretraining and for selecting appropriate architectures for a new task.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1. The concept of using fingerprint to assess the similarity of tasks in the field of Surgical Data Science is new.
2. The authors proposed a new approach to task description.
3. The authors confirmed the usefulness of the proposed concept for both selecting appropriate data sets for pretraining and selecting appropriate architectures for a new task.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

Because the authors present a new concept, it has not yet been validated in detail.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The article is legible both in formal and visual terms. It includes state-of-the-art references.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

The authors proposed an interesting idea that may, in consequence, help to overcome some of the limitations of machine learning in the field of Surgical Data Science.
Please state your overall opinion of the paper

accept (8)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The article deals with the important issue of the effective use of limited resources in machine learning to effectively solve various types of problems in the field of Surgical Data Science.
What is the ranking of this paper in your review stack?

3
Number of papers in your stack

4
Reviewer confidence

Confident but not absolutely certain

Review #3

Please describe the contribution of the paper

The paper presents an approach in the area of meta learning for (bio-)medical image analysis. In doing so, the authors tackle the problem of quantifying task similarity with a concept that they refer to as task fingerprinting.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The main strength is the evaluation, which is quite comprehensive with 26 different tasks.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

Even though the authors did a comprehensive evaluation, the results still appear quite superficial to me. That all the ‘scopic’ tasks cluster together, and that pathology and artefact tasks form separate clusters, seems no surprise. At such a high level, I would expect that a histogram would maybe already be able to deliver some results. The authors should comment on that.
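
The histogram baseline this reviewer has in mind might look like the following sketch. It is an illustration only and was not evaluated in the paper: it assumes 8-bit grayscale images given as flat lists of pixel values, pools them into one normalized intensity histogram per data set, and compares data sets by L1 distance. The function names and pixel values are hypothetical.

```python
def histogram_descriptor(images, bins=8):
    """Pool all pixels of a data set into one normalized intensity histogram.

    Assumption: each image is a flat list of integers in the range 0..255.
    """
    counts = [0] * bins
    total = 0
    for image in images:
        for pixel in image:
            # Map the intensity 0..255 onto a bin index 0..bins-1.
            counts[min(pixel * bins // 256, bins - 1)] += 1
            total += 1
    return [c / total for c in counts]


def l1_distance(p, q):
    """Sum of absolute differences between two histograms (0 to 2)."""
    return sum(abs(a - b) for a, b in zip(p, q))


# Invented toy data sets: one dark, one bright.
dark_set = [[0, 10, 20, 30]]
bright_set = [[200, 220, 240, 255]]

dark_hist = histogram_descriptor(dark_set)
bright_hist = histogram_descriptor(bright_set)
distance = l1_distance(dark_hist, bright_hist)
```

Such a descriptor ignores the labels entirely, so it could at most separate imaging domains; whether it would also separate problem types is exactly the question the authors are asked to comment on.
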
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The methods are described in detail and hence can be reproduced. I still hope that the authors make their source code available with the paper. For evaluation, the authors used publicly available tasks/images (Table 2 Supplementary Material), which is a plus in my opinion for reproducibility.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

The authors should definitely dive deeper into this fingerprint-based task selection approach. It would be very interesting to see whether such an approach can support the selection of an architecture for a ‘sub-task’, for example, whether it can make a suggestion with regard to a certain type of tumor. In my opinion, this is the level at which most researchers currently struggle.
Please state your overall opinion of the paper

borderline reject (5)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

To me it seems that the results are still too high-level and not surprising. In its current state, the presented approach would not support me in selecting an appropriate architecture for a new task, because the task differences are already too obvious. However, I think this is a very promising direction and, with a deeper investigation, this could be a strong work.
What is the ranking of this paper in your review stack?

2
Number of papers in your stack

3
Reviewer confidence

Confident but not absolutely certain

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

This paper presents a new method, called task fingerprinting, for meta learning in bio/medical image analysis. This new method converts imaging data and labels to a fixed-length vector representation and then quantifies the task similarity in the fingerprinting space.

The key strengths include: 1) This may be the first time that meta learning is applied in the space of surgical data science. Thus, this paper would be of interest to many members of the MICCAI community. 2) The concept of using fingerprint to assess the similarity of tasks in the field of Surgical Data Science is new. 3) The evaluation is quite comprehensive with 26 different tasks.

The key weaknesses include: 1) The discussion is insufficient. 2) Many technical details are missing.

Due to the length limitation, these weaknesses may be acceptable. Considering the topic, the strengths, and the agreements among three reviewers, I would recommend “accept”.
What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

2

Author Feedback

We would like to thank all reviewers for their valuable feedback and suggestions! To address their major concerns, we would like to use the additional 1/2 page space as follows:

- enhancing the analysis by incorporating correlation quantification and statistical significance
- discussing limitations and future directions of the approach
- extending the technical details regarding the used models

RE Rev. 3: We thank you for the extremely thorough assessment and agree that our work is solely an initial proof-of-concept and that an even broader validation has to be done to verify applicability and refine hyperparameters. The key (and non-obvious) takeaway from our work for us was that the proposed task fingerprints do actually encode a wealth of relevant information on the underlying task. We were surprised that we were able to leverage the concept for three complementary aspects: semantic representation of data, weight transfer and model choice.

All in all, we are convinced that the suggestions by the reviewers will increase the quality of our paper.


Task Fingerprinting for Meta Learning in Biomedical Image Analysis