Authors

Christoph Reich, Tim Prangemeier, Christian Wildner, Heinz Koeppl

Abstract

Time-lapse fluorescent microscopy (TLFM) combined with predictive mathematical modelling is a powerful tool to study the inherently dynamic processes of life on the single-cell level. Such experiments are costly, complex and labour-intensive. A complementary approach, and a step towards in silico experimentation, is to synthesise the imagery itself. Here, we propose Multi-StyleGAN as a descriptive approach to simulate time-lapse fluorescence microscopy imagery of living cells, based on a past experiment. This novel generative adversarial network synthesises a multi-domain sequence of consecutive timesteps. We showcase Multi-StyleGAN on imagery of multiple live yeast cells in microstructured environments and train on a dataset recorded in our laboratory. The simulation captures underlying biophysical factors and time dependencies, such as cell morphology, growth, physical interactions, as well as the intensity of a fluorescent reporter protein. An immediate application is to generate additional training and validation data for feature extraction algorithms or to aid and expedite development of advanced experimental techniques such as online monitoring or control of cells. Code and dataset are available at https://git.rwth-aachen.de/bcs/projects/tp/multi-stylegan.

Link to paper

DOI: https://doi.org/10.1007/978-3-030-87237-3_46

SharedIt: https://rdcu.be/cymaY

Link to the code repository

https://git.rwth-aachen.de/bcs/projects/tp/multi-stylegan

Link to the dataset(s)

https://git.rwth-aachen.de/bcs/projects/tp/multi-stylegan

Reviews

Review #1

Please describe the contribution of the paper

This paper focuses on simulating temporal sequences of live yeast cells in microstructured environments. Multi-StyleGAN is proposed for multi-domain data generation.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

A novel model Multi-StyleGAN is proposed for multi-domain data generation.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

The experimental results are incomplete and lack comparisons.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

reproducible
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
1. How many samples are used to verify the effectiveness of the proposed method, no mention.
2. What is the meaning of IS, FID, FVD in table 1? Why choose these metrics? The Inception score below and in the table 1 seems to be inconsistent.
3. No quantitative comparison with existing methods, such as StyleGAN and StyleGAN2.
Please state your overall opinion of the paper

reject (3)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

No quantitative comparison with existing methods, such as StyleGAN and StyleGAN2.
What is the ranking of this paper in your review stack?

2
Number of papers in your stack

2
Reviewer confidence

Very confident

Review #2

Please describe the contribution of the paper

This manuscript presents a novel method for synthesizing a two-channel (brightfield + fluorescence) time-lapse microscopy sequence. The presented approach is promising, even though the authors currently limit themselves to synthesizing very short sequences of length 3.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

• Very elaborate framework for synthesizing two-channel time-lapse data.
• Resulting images look realistic.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

• The generated sequences are rather short, only 3 time points long.
• The topic of this research is very specific; reading this manuscript requires extensive knowledge of GANs.
• It would also be interesting, just for completeness, to see a quantitative comparison to other similar methods, even though they performed poorly.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The synthesized data set and the code will be made publicly available. The implementation and the hyperparameters are thoroughly described. The validation part requires slight clarification.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
1. The authors should explain, for non-experts, why they resorted to generating such short sequences having 3 time points and what it would take to be able to synthesize longer, more realistic ones.
2. It would still be interesting to see the quantitative results of the other similar methods, just for completeness, even though they produced unrealistic results.
3. I find the part about calculating the IS, FID and FVD scores rather confusing, as it is not entirely clear when the authors are referring to the real data and when to the generated data. Also, at the beginning they mention that one image is uniformly sampled, and in the conclusion they state that the validation metrics were calculated over the entire data set length.
4. Page 3, Figure 2, caption: “A brightfied sample witha…” → “A brightfied sample with a…”; “…two yeast cell…” → “…two yeast cells…”.
5. Page 6: “…encoder is consists…” → “…encoder consists…”.
6. Page 7: “…sharp and divers…” → “…sharp and diverse…”.
7. Page 8: “…with in the loop cell segmentation…” → “…with cell segmentation in the loop…”.
8. Figure S1, caption: “…between source sequence A and B.” → “…between source sequences A and B.”; “…vector of souorce…” → “…vector of source…”.
Please state your overall opinion of the paper

Probably accept (7)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

This is an interesting, well-worked-through submission presenting a promising approach. On the downside, the manuscript is very technical and requires an in-depth understanding of GANs.
What is the ranking of this paper in your review stack?

2
Number of papers in your stack

5
Reviewer confidence

Confident but not absolutely certain

Review #3

Please describe the contribution of the paper

In this study, a Multi-StyleGAN model was proposed to simulate time-lapse fluorescence microscopy imagery of living cells based on real image data. The model can simulate the changes in cell morphology, growth, and interactions. It is an interesting study.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

Novel application of GAN models to simulate real time-lapse image data, with the potential to capture the biological behaviors of cells.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

The evaluation of the model's capability to capture the biological behaviors of cells and their associations with different experimental conditions was not well designed or conducted.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

code is available via GitHub.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

It is a novel application of GAN models to simulate real time-lapse image data, with the potential to capture the biological behaviors of cells. The simulation results look interesting. However, it would be better to quantitatively evaluate the simulation model in terms of cell behaviors under different experimental conditions.
Please state your overall opinion of the paper

Probably accept (7)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

Novel application of GANs to the simulation of time-series cell images to study potential cell behaviors under different conditions.
What is the ranking of this paper in your review stack?

2
Number of papers in your stack

4
Reviewer confidence

Very confident

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

The reviewer recommendations are somewhat conflicting. The topic and proposed approach are interesting and the paper is clear. There are some points of concern that need to be addressed before the paper can be further considered. A recurring reviewer comment is the lack of quantitative comparison with existing methods.
What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

5

Author Feedback

We thank the AC and the reviewers for their comments and have begun editing the manuscript to resolve the questions raised. Seeing as there is some conflict in the reviews (3-7-7), we present our response in two parts. First, we address the key questions shared between reviews. Second, we respond to all three points raised by R1.

Key question(s):

The key question is that of a quantitative comparison with existing methods (raised by R1, R2 and in the meta-review). We have added the requested quantitative comparison to Table 1 and the Results section, in which our Multi-StyleGAN outperforms the other methods (values listed below). This supports our original finding that Multi-StyleGAN generates realistic sequences (Results, Discussion, Conclusion), while the results of the other methods are ‘qualitatively unrealistic and not biophysically sensible’ (Methodology ¶1 and Fig. S2). The added metrics for the existing methods are:

Multi-StyleGAN: see original Table 1

StyleGAN2: FID (BF) 200.54, FID (GFP) 224.77, FVD (BF) 45.63, FVD (GFP) 35.22

StyleGAN2 3D: FID (BF) 76.03, FID (GFP) 298.76, FVD (BF) 14.75, FVD (GFP) 31.48

We also agree that ‘reading this manuscript requires extensive knowledge of GANs’ (R2 4.2), in particular for the metrics (R1 7.2), and we have begun making changes to make it clearer and more accessible. We limited the study to short sequences (R2 7.1) for three reasons: 1) to the best of our knowledge, Multi-StyleGAN is the first method for multi-domain sequence synthesis, 2) this was already quite challenging, and 3) the availability of multi-domain sequence data for the presented application is limited. Investigating various experimental conditions (R3) is beyond the scope of this study (outlined in Discussion ¶3 as future work). We merely demonstrate that the simulation captures underlying biophysical factors and time-dependencies.

Comments to Reviewer 1:

The recommendation made by R1 is in relatively stark contrast to R2 and R3. The key concern is the apparent lack of (quantitative) comparison of results (expressed in 4, 7.3 and 9 of R1’s comments). The original manuscript contains the following comparisons:

quantitative comparison with the real dataset benchmark (Results ¶3 and Table 1),

qualitative comparison with the real dataset (Fig. 5),

visual comparison with SotA methods (described in Methodology ¶1 and Fig. S2).

We have also added the requested quantitative comparison; see the general response above.

R1 made three recommendations for improvement:

7.1: ‘How many samples are used […], no mention.’ This was in the original manuscript, albeit not clear enough:

Methodology: ‘metrics were computed over the whole dataset length’ (P. 6 line 3f.),

Dataset: ‘8148 sequences’ (Page 4, line 7).

We have clarified this and added a statement on the number of sequences (8148) to the Results.

7.2i: ‘What is the meaning of IS, FID, FVD in table 1? Why choose these metrics?’ These measure image quality and diversity relative to the training dataset; we chose widespread metrics to ‘facilitate future comparisons’ (Methodology ¶5). FID is the most widespread SotA metric for individual images, e.g. [9, 21 - 23, 33, 34]. FVD is the related measure for sequences [37]. A key advantage of the inception score (IS) is that it can characterise the real dataset, providing a benchmark for comparison (Table 1 and Results ¶3). We will further clarify this in the manuscript.

7.2ii: ‘The Inception score below and in the table 1 seems to be inconsistent.’ These are two separate sets of inception scores. The IS in Table 1 describes the performance of the algorithm, while that below characterises the real dataset. The original manuscript describes this in the Results (¶3) and in Table 1 (footnote). We have edited both for clarity.

7.3: ‘quantitative comparison with existing methods’: See key question above. The added quantitative comparison supports our original finding: Multi-StyleGAN significantly outperforms the SotA.
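
As an illustration of the inception score (IS) discussed under 7.2, the following is a minimal sketch of the standard definition, IS = exp(E_x[KL(p(y|x) || p(y))]). It is not the authors' evaluation code. It assumes that the class probabilities p(y|x) have already been produced by a classifier (in practice typically a pretrained Inception network); the function name `inception_score` and the toy probability vectors are hypothetical.

```python
import math


def inception_score(probs):
    """Compute the inception score from per-image class probabilities.

    probs: list of lists, one probability vector p(y|x) per image,
           each summing to 1 (assumed to come from a classifier).
    """
    n = len(probs)
    k = len(probs[0])
    # Marginal class distribution p(y): average of p(y|x) over all images.
    marginal = [sum(p[j] for p in probs) / n for j in range(k)]
    # Mean KL divergence KL(p(y|x) || p(y)); zero-probability terms contribute 0.
    mean_kl = 0.0
    for p in probs:
        mean_kl += sum(
            pj * math.log(pj / marginal[j]) for j, pj in enumerate(p) if pj > 0
        )
    mean_kl /= n
    return math.exp(mean_kl)


if __name__ == "__main__":
    # Toy example (hypothetical values): confident and diverse predictions.
    print(inception_score([[1.0, 0.0], [0.0, 1.0]]))
    # Toy example (hypothetical values): identical, uninformative predictions.
    print(inception_score([[0.5, 0.5], [0.5, 0.5]]))
```

Identical predictions for every image give the minimum score of 1, while confident predictions spread evenly over k classes give k, which is why the score is read as a joint measure of quality and diversity. Because it needs only one set of images, it can be evaluated on the real dataset as well as on generated samples.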

Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

This paper proposes a new method for simulating live-cell time-lapse fluorescence microscopy images based on a generative adversarial network that can synthesize a multi-domain sequence of consecutive time steps. The method is showcased by simulating yeast cells in microstructured environments and is trained on a data set recorded by the authors in their laboratory. The primary application of the proposed method is to generate additional data for training and validation of cell image analysis methods. A main concern is the lack of quantitative comparison with existing methods, which the authors adequately address in their rebuttal. With the promised revisions and clarifications, I believe the paper is acceptable for MICCAI 2021.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept
What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

8

Meta-review #2

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

This is a very nicely presented paper, with supporting documents, videos, etc. The reviewers have mixed opinions, and the lack of comparisons is a major concern. The rebuttal reports a baseline with StyleGAN, which, however, is better than the proposed method for some metrics. This seems to be a borderline paper. I am just not overly excited about the technical contribution or the presented motivation of the need and use in practice, but I would not strongly argue for rejection either.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Reject
What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

10

Meta-review #3

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

This paper studied an interesting problem of synthesizing multi-domain time-lapse microscopy sequences. I appreciate that the authors provided additional comparisons to StyleGAN2 and StyleGAN2 3D in the rebuttal. Regarding the evaluation metrics, IS and FID are commonly used for 2D natural images, but FVD is not widely adopted in the literature. I understand there are no perfect evaluation metrics available, so the usage of these metrics is reasonable. However, it would be more appropriate for the authors to fine-tune the Inception-V3 feature extractor used in calculating IS/FID on the cell dataset, as I believe the pretrained network was trained on ImageNet for natural images. I recommend accepting the paper, but the authors should try to address all concerns in the final version.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept
What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

10

Multi-StyleGAN: Towards Image-Based Simulation of Time-Lapse Live-Cell Microscopy