
# Authors

Sathyanarayanan N. Aakur, Sai Narayanan, Vineela Indla, Arunkumar Bagavathi, Vishalini Laguduva Ramnath, Akhilesh Ramachandran

# Abstract

The emergence of novel pathogens and zoonotic diseases like the SARS-CoV-2 have underlined the need for developing novel diagnosis and intervention pipelines that can learn rapidly from small amounts of labeled data. Combined with technological advances in next-generation sequencing, metagenome-based diagnostic tools hold much promise to revolutionize rapid point-of-care diagnosis. However, there are significant challenges in developing such an approach, the chief among which is to learn self-supervised representations that can help detect novel pathogen signatures with very low amounts of labeled data. This is particularly a difficult task given that closely related pathogens can share more than 90% of their genome structure. In this work, we address these challenges by proposing MG-Net, a self-supervised representation learning framework that leverages multi-modal context using pseudo-imaging data derived from clinical metagenome sequences. We show that the proposed framework can learn robust representations from unlabeled data that can be used for downstream tasks such as metagenome sequence classification with limited access to labeled data. Extensive experiments show that the learned features outperform current baseline metagenome representations, given only 1000 samples per class.

SharedIt: https://rdcu.be/cyl6x


# Reviews

### Review #1

• Please describe the contribution of the paper

The authors propose to classify pathogens from metagenome samples by constructing a graph from k-mer sequences and a pseudo-image from k-mer co-occurrences. They use node2vec to describe each k-mer and a CNN autoencoder to embed pseudo-images. Information is fused via an attention mechanism. Such approaches are of high interest to detect novel pathogens.

• Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The authors propose an interesting approach that converts genomic sequence data into an image such that established image-processing techniques can be applied. Results demonstrate the proposed approach outperforms competing methods by a large margin.

• Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

Experimental evaluation does not include cross-validation. It is unclear how MG-Net performs classification, because it is trained entirely unsupervised (using a reconstruction loss).

• Please rate the clarity and organization of this paper

Satisfactory

• Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

Reproducibility generally appears to be very good, although details about node2vec and VGG networks are missing, which I assume have been pre-trained.

• Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

Major Issues
------------
- Overall, the paper presents a very interesting approach to learn from metagenome data, but the experimental evaluation could be improved. Most importantly, no cross-validation has been performed and hence an estimate of the performance variance is missing. In addition, it would be helpful to provide an overall performance measure across all classes, e.g. balanced accuracy.

- Section 2.3: The authors explain that MG-Net is trained entirely unsupervised (using reconstruction loss), but in the experiments, it is evaluated with respect to classification performance. How this is done remains a mystery to me.

- Equation (3): X_i^g is the average of node2vec embeddings, but X_i^lc is a set of feature maps. These can generally not be combined with element-wise multiplication. Please clarify.

- Section 2.4: "Empirically, we find that having a k-mer length of 5 and stride of 10 provides the best results." Since no cross-validation was performed, I am concerned that the authors tuned their method extensively on the test data, leading to inflated performance. In particular, if hyper-parameters of baseline methods are left at their defaults and are not optimized, the performance comparison would be biased in favor of MG-Net.

- Table 1: I am very surprised that the increase from 5000 to "all" results in a huge spike for P. multo. and T. pyoge., despite having fewer than 100 samples. How can this be explained?

- Section 3: It is not entirely clear what the inputs to baseline methods are. I assume that graph kernel methods use the graph from Section 2.1. Are the LR, SVM, MLP, and DL methods using the pseudo-image as input? It appears that there is no baseline that combines graph and co-occurrence data. Adding a table that explains which data is used for each baseline would clarify this.

- Section 3.1 and Table 1: I struggle to understand how the results without training data came to be. In particular, what role the Hungarian algorithm plays, and how the number of clusters was selected.

Minor Issues
------------
- Phi in equation (1) is never used again. I am assuming that it is related to f in eq (2).

- Section 2.1: I believe the indices of X_i^G should range from 1 to l, not n.

- Section 2.3: It is not clear whether VGG and/or node2vec networks have been pre-trained or have been trained from scratch.

- Table 1: Only 2 pathogens have more than 500 samples, which makes the comparison rather confusing. Maybe selecting the same proportion per pathogen would be less misleading.

- Table 1: The two pathogens with smallest number of samples have the lowest performance. This could indicate the number of samples is insufficient. This would make for an interesting discussion.

- Table 2: Were the average scores across pathogens computed by macro or micro averaging? An extended table with per-pathogen performance would be very interesting.

- Table 3: It seems that the graph-based information contributes the most to the performance increase. It would be interesting to add a graph convolutional neural network as a deep-learning baseline using only structural information.


borderline accept (6)

• Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The overall approach is interesting and novel, but the lack of cross-validation leaves me guessing whether the presented results are representative or overly optimistic.

• What is the ranking of this paper in your review stack?

1

• Number of papers in your stack

5

• Reviewer confidence

Somewhat confident

### Review #2

• Please describe the contribution of the paper

This paper proposed MG-Net, a self-supervised representation learning framework that leverages multi-modal context using pseudo-imaging data derived from clinical metagenome sequences. The framework produces robust representations from unlabeled data with only limited access to labeled data.

• Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

1) MG-Net representation is a novel idea and shows better performance across pathogen and host classes when compared with other metagenome representations, including Node2Vec

2) MG-Net can be combined/integrated with other classifiers and shows better classification performance

• Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

1) The sample dataset is limited for demonstrating the accuracy of the approach. 2) The authors interchangeably used "representation" and "framework", which caused confusion in understanding MG-Net. 3) In Section 3.1, it says "We evaluate under different settings to assess the robustness of the proposed framework under limited data and limited supervision." This doesn't clearly explain what all the different settings considered are.

• Please rate the clarity and organization of this paper

Good

• Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

No source/data provided

• Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

It is recommended to show performance results from more extended datasets.

accept (8)

• Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The paper proposed a novel representation, MG-Net, and showed better classification accuracy than other metagenome representations

• What is the ranking of this paper in your review stack?

1

• Number of papers in your stack

3

• Reviewer confidence

Very confident

### Review #3

• Please describe the contribution of the paper

The authors aim to provide a method to learn representations from unlabeled data, applied to the problem of meta-genomic analysis.

• Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The paper's application domain is bioinformatics / genomics; however, the basic concept (pseudo-imaging + structural reasoning) might be applicable to other domains in medical imaging (for example spectroscopy) where learning robust representations from unlabeled data would be very interesting.

• Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

The network structure (VGG encoder plus mirrored VGG decoder) is reminiscent of a U-Net with the long skip connections removed. Why was no U-Net-like structure applied here? Because of problems coupling the effect of the skip connections with the "feed-in" of the structural representation at the bottleneck? Without the skips, one would expect quite "blurry" reconstructions?

“Only 1000 samples per class” is a low number in metagenomics; unfortunately, it is already a rather high number in other fields. Thus, in general terms, when thinking about translating the approach to other imaging tasks, it remains unclear whether this is possible so far.

Compared to the long introduction, the core contributions in Sections 2.2 and 2.3 are described very briefly and are thus difficult to follow and less convincing than they could be.

• Please rate the clarity and organization of this paper

Satisfactory

• Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The authors checked “yes” for every question of the reproducibility checklist. I didn’t expect this from the paper, as I couldn’t find information there on providing code, models, etc. (Note: I found a sentence in the abstract :)) Still, reproducibility remains very difficult to judge. Technical reproducibility should be given once code, models, and parameters are released.

• Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

Fig. 2 gives a visual example of a “highly distinct pattern”. I agree that the shown “circular” pattern is very distinct for the human visual system, with its tendency to “close” forms, but why would it be any different for a neural network compared to any other random repeating pattern? In particular, given that this pattern crosses the diagonal / contains all points twice due to the symmetries.

The pseudo-images are symmetric, thus wasting 50% of the memory. Any thoughts on that?

You state (in 2.3) “We obtain a robust representation by using attention-based reason mechanism given by …”. Why is this mechanism yielding a robust representation? Any proof, hint, empirical evidence? Section 2.3 is very short and would profit a lot from more elaborate explanations.

probably reject (4)

• Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

I had a really hard time reading this paper, maybe because it is slightly out of the “typical” scope, but I guess also due to the very short description of the main contributions. The nice basic idea thus remains a bit vague to me, and the paper in its current form was not able to convince me of the method, although I see good potential in principle that it could work.

• What is the ranking of this paper in your review stack?

2

• Number of papers in your stack

3

• Reviewer confidence

Somewhat confident

# Primary Meta-Review

• Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

This is a novel idea and application, especially for MICCAI. Reviewers commented on the novelty. The lack of cross-validation is an important concern.

Further, the testing was done on small samples.

The authors should pay attention to the reviewers’ comments, especially where there are notational inconsistencies or a lack of explanation, the subscripts of the summations, etc.

The authors should also comment on why precision/recall is low for samples such as Mycoplasma bovis, B. trehalosi, etc. This is not discussed.

• What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

3

# Author Feedback

We thank the reviewers (R1, R2, and R3) for taking the time to review our submission and providing thoughtful, constructive feedback and suggestions for improving the paper. We provide responses to the questions below. We could not include many details due to insufficient space. As suggested by R3, we will reduce the introduction and related work sections to accommodate the information below in the final version.

Architecture (R1, R3): Our network structure follows a classic encoder-decoder format and is indeed inspired by U-Net. However, our goal is to learn robust representations from limited labeled metagenome data rather than reconstruction/segmentation. Adding skip connections as in U-Net would allow the network to “cheat” and not learn robust, “compressed” representations. We augment the features from the CNN with structural features from the node2vec representations using Equation 3, where we flatten the feature maps so that they match the dimensions for element-wise multiplication in the attention mechanism. We consider the resulting representation robust due to the competitive performance of just k-means (Table 1, row 1) and a linear classifier (Table 2, row 1), which outperforms other structural and image-based representations across all baselines. Attention has been shown to be robust for combining multi-modal inputs [1]. We will verify and correct any notational inconsistencies in the final version.

Training (R1): Our framework is pre-trained in an unsupervised manner since the number of labeled metagenome sequences can be limited. We pre-train the network with the reconstruction loss and fine-tune for classification using a 3-layer neural network (DL) classifier. We use other, simpler classifiers to evaluate the effectiveness of the learned representations. Details are in Section 2.4. All networks are trained from scratch during the pre-training phase.

Baselines (R1): We use a mix of baselines to compare our approach. In Table 2, all baselines (linear classifier, SVM, etc.) are evaluated with both our MG-Net features and other graph-based and sequence-based baselines. In Table 3, we provide ablations of our approach with the deep neural network using only image features (autoencoder only), image + structural features without the MG-Net architecture (autoencoder + structural priors), and only structural features (from node2vec).

Evaluation (R1, R2): In Table 1, we evaluate the end-to-end model with the deep neural network with varying levels of labeled sequences. When no labeled data is available (the first row), we use k-means to group the sequences into ground-truth clusters and use the Hungarian method to align predictions with labels for evaluation [2]. This is used to evaluate the robustness of the learned representations for segmenting unknown, unseen data. We report the micro-average in both Tables 2 and 3 since the data is imbalanced. We could not provide a class-wise breakdown for all baselines due to space constraints, although Table 1 reports it for our full model.

Scalability (R1, R3):

Hyperparameter Tuning (R1): We use one subject from the training set as the validation set to tune the hyperparameters (presented in Table 3). We use a grid search to obtain these hyperparameters, as stated in the supplementary material. We will move this information to the main paper.

Cross-validation results (R1): For completeness, we present the 5-fold cross-validation results (std in parentheses) for our framework below. We will update Table 2 with these numbers in the final version.

| Model | Host | Pathogen |
| --- | --- | --- |
| Linear | 0.858 (0.009) | 0.376 (0.011) |
| LR | 0.971 (0.002) | 0.524 (0.009) |
| SVM | 0.973 (0.007) | 0.532 (0.01) |
| MLP | 0.979 (0.006) | 0.536 (0.007) |
| DL | 0.98 (0.005) | 0.631 (0.009) |

[1] Lu, Jiasen, et al. “ViLBERT: Pretraining task-agnostic visiolinguistic representations for vision-and-language tasks.” NeurIPS 2019.
[2] Ji, Xu, et al. “Invariant information clustering for unsupervised image classification and segmentation.” ICCV 2019.
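To illustrate the flattening step from the Architecture response (feature maps reshaped so they match the dimensions of the structural embedding for element-wise multiplication in attention), here is a minimal numpy sketch. All shapes, the softmax placement, and the final attended product are assumptions for illustration only, not the paper's exact Equation 3.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
# Hypothetical shapes: a CNN bottleneck of 8 feature maps of size 4x4,
# and a node2vec-derived structural vector assumed projected to the same length.
feat_maps = rng.standard_normal((8, 4, 4))   # stands in for X_i^lc
x_g = rng.standard_normal(8 * 4 * 4)         # stands in for X_i^g

x_lc = feat_maps.reshape(-1)                 # flatten so dimensions match
attn = softmax(x_g * x_lc)                   # element-wise product inside attention
fused = attn * x_lc                          # one plausible attended representation
print(fused.shape)                           # (128,)
```

The key point the rebuttal makes is only the dimension matching: after flattening, both operands are vectors of the same length, so element-wise multiplication is well defined.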

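The zero-label evaluation described in the Evaluation response (k-means clusters aligned to ground-truth labels with the Hungarian method before scoring) can be sketched as follows. `cluster_accuracy` is a hypothetical helper written for illustration, not code from the paper.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def cluster_accuracy(y_true, y_pred):
    """Align cluster IDs to class labels via the Hungarian method, then score."""
    n = int(max(y_true.max(), y_pred.max())) + 1
    cost = np.zeros((n, n), dtype=int)
    for t, p in zip(y_true, y_pred):
        cost[p, t] += 1                      # confusion counts: cluster p vs. label t
    row, col = linear_sum_assignment(-cost)  # negate to maximize matched counts
    mapping = dict(zip(row, col))
    aligned = np.array([mapping[p] for p in y_pred])
    return (aligned == y_true).mean()

# Clusters 0 and 1 are permuted relative to the labels; alignment recovers them.
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([1, 1, 0, 0, 2, 2])
print(cluster_accuracy(y_true, y_pred))      # 1.0
```

This matches the common protocol for evaluating unsupervised clustering against labels (as in reference [2] of the rebuttal): the cluster IDs are arbitrary, so accuracy is computed under the best one-to-one cluster-to-label assignment.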
# Post-rebuttal Meta-Reviews

## Meta-review # 1 (Primary)

• Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

The authors have responded to the reviewer criticisms. The discussion of the precision/recall was missing from the rebuttal. However, the idea is novel and the application is interesting.

• After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept

• What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

5

## Meta-review #2

• Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

This paper proposed a self-supervised representation learning method that makes use of multi-modal context using pseudo-imaging data derived from clinical metagenome sequences. While the studied problem is interesting, as recognized by the reviewers, the presented neural network structure essentially follows the idea of U-Net. Thus, this paper is a typical borderline paper, with strength in terms of the studied problem and weakness in terms of methodology. The rebuttal addressed most questions/concerns raised by the reviewers. All things considered, my evaluation of this paper is neutral, very slightly leaning toward positive.

• After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept

• What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

10

## Meta-review #3

• Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

This paper proposes a really cool idea, and with the clarifications in the rebuttal, I think it would make for a good contribution to MICCAI. Reviewer 3 brings up the issue of scope. I, however, believe that this is very much of interest to the MICCAI community, and updates to the paper would make it clearer for the reader.

• After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept

• What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

6