
Authors

Lasse Hansen, Mattias P. Heinrich

Abstract

3D registration remains one of the big challenges in medical imaging, especially when dealing with highly deformed anatomical structures such as those encountered in inter- or intra-patient registration of abdominal scans. In a recent MICCAI registration challenge (Learn2Reg) deep learning based network architectures with inference times of <2 seconds showed great success for supervised alignment tasks. However, in unsupervised settings deep learning methods have not yet outperformed their conventional algorithmic counterparts based on continuous iterative optimisation (and probably won’t as they share the same objective function (image metric)). This finding has brought us to revisit conventional optimisation schemes and investigate an iterative message passing approach that enables fast runtimes (using iterative optimisation with only few displacement candidates) and high registration accuracy. We conduct experiments on three challenging abdominal datasets ((pre-aligned) inter-patient CT, intra-patient MR-CT) and carry out an in-depth evaluation with a set of selected comparison methods. Our results clearly indicate that optimisation based methods are highly competitive both in accuracy and runtime when compared to Deep Learning methods. Moreover, we show that semantic label information (when available) can be efficiently exploited by our approach (cf. weakly supervised learning). Data and code will be made publicly available to ensure reproducibility and accelerate research in the field of 3D medical registration (https://github.com/lasseha/iter_lbp).

Link to paper

DOI: https://doi.org/10.1007/978-3-030-87202-1_20

SharedIt: https://rdcu.be/cyhP1

Link to the code repository

https://github.com/lasseha/iter_lbp

Link to the dataset(s)

https://learn2reg.grand-challenge.org

Reviews

Review #1

Please describe the contribution of the paper

An optimization technique denoted “loopy belief propagation” (LBP) is applied to 3D deformable medical image registration problems. It achieves good accuracy and attractive runtimes on both mono-modal and multi-modal abdominal CT-CT and CT-MRI registration scenarios.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

A sparse optimization technique is used together with state of the art feature descriptors, in the modern and very popular PyTorch environment as computational engine. Competitive performance is achieved, and some interesting findings presented - which might be useful for researchers in the field of medical image registration. It is refreshing to see classical iterative optimization schemes both explained and implemented in terms of the nowadays most popular Python-based deep learning engine - in a way helping to bridge the gap between those fundamentally different approaches.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

A clinical problem is neither introduced nor solved. The kind of registration problems addressed here are moderately difficult and implemented in many commercial solutions nowadays at acceptable computation times. Rather, they were picked for a competition among scientists to figure out how Deep Learning can best be applied to image registration problems. In fact, the manuscript does not stand independently - information about the Learn2Reg challenge is required for context, e.g. it often refers to the challenge's various contestants.

The results in Table 1 put the presented method pretty much in the same ballpark as other methods; I cannot see significant differences. The computation times are impressive, though.

The terminology is confusing, which partly stems from the fact that classical image registration literature is mixed with a modern deep learning framework; still a lot of clarification would be possible here. In particular, I am confused about the “probabilistic discrete optimization using a sparse variant of loopy belief propagation for the joint regularized cost function”. Does this not just boil down to stochastic optimization? If yes, how is it still discrete if random samples are combined - and if it is discrete, would it not be better to make it continuously randomized?

Either way, this is a powerful sparse optimization - so why would one just use it to solve a local registration in 0.2 seconds? You should instead solve much more challenging problems, utilizing the optimization efficacy for more exhaustive search domains, getting impressive results where other methods fail or take too long!

Sparse & randomized optimization can yield impressive results, often at the price of lower precision/reproducibility - this is not evaluated at all in the paper.

You mention several times that you propose an “elegant” strategy - please rephrase, as this is not a scientific term.
Please rate the clarity and organization of this paper

Satisfactory
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

Fully reproducible indeed, as public data is used and the full source code disclosed, already at the time of anonymous submission. This is as open as research can get, and hence quite useful & collaborative.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

This manuscript is a good (meta-)submission for the next Learn2Reg challenge, as people who have participated before would understand the context necessary here.

To stand as independent scientific work, the method description should be made more clear & concise, and put into a fuller context of existing work (of stochastic & sparse optimization). In addition, it should be applied and evaluated on a medical problem where it can really unfold its potential.
Please state your overall opinion of the paper

borderline reject (5)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

Manuscript is hard to understand without the specialized context it uses; no clinical problem significantly improved upon.
What is the ranking of this paper in your review stack?

1
Number of papers in your stack

3
Reviewer confidence

Confident but not absolutely certain

Review #2

Please describe the contribution of the paper

The paper presents a conventional image registration method based on iterative message passing approach. The optimization approach uses a sparse variant of loopy belief propagation for the joint cost function.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The paper was written clearly.
- Experiments were done with both good qualitative and quantitative details.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- Some details are missing for the proposed method (see detailed comments below).
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The authors state that they will release the data and code.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
- “Our method comprises a spatially randomly distributed sampling of control points (keypoints) ….” How are the keypoints selected?
- What is defined by variable “q”?
- How are the candidate displacements dynamically determined?
- In equation computing outgoing messages, what is “d”?
- What is a “coarse and convex” binary mask and how is it obtained?
- Hausdorff distance should be computed for evaluation.
- What is axis parallel selection of motion vectors?
- There are typos throughout the paper, which should be corrected.
Please state your overall opinion of the paper

borderline accept (6)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

Missing details for the proposed method.
What is the ranking of this paper in your review stack?

3
Number of papers in your stack

5
Reviewer confidence

Very confident

Review #3

Please describe the contribution of the paper

This article revisits deformable registration using discrete optimisation and MIND, by extending deeds by Heinrich to a probabilistic discrete optimisation using a sparse variant of loopy belief propagation. This is compared against a number of approaches, both conventional (drop by Glocker) and deep learning based (VoxelMorph) - the latter enhanced by replacing the U-Net with a two-stream architecture and MIND loss function, as well as PDD-Net, also with MIND loss, and variants of their own method. Experiments are run on public data so that a direct comparison to Learn2Reg league table methods can be made. The method is shown to be extremely fast and accurate.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- elegant extension of deeds, and extension of Voxelmorph and PDDNet to MIND
- comparison against relevant state-of-art on public data, with good results
- well written paper
- fully reproducible with code, data, …
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

- novelty is a bit limited to a “tweak” in the optimisation scheme
- results are not fully convincing, i.e. VoxelMorph+ is faster for one dataset, PDD-Net more accurate for another…
- only benchmarking, no clinical application or insight as to how good/fast you need to be
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

Code is provided, public data from Learn2Reg challenge used, pseudo-algorithms provided - paper can be easily adopted and tried by others
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

This is a nice paper which is also very dense: The proposed method forms an improvement over deeds, and is tested in various settings. Other state-of-the-art methods are expanded to work with the same similarity metric (MIND) for comparability - this on its own is a contribution already. The results are nice but not conclusive as there is no clear winner, and the challenge leader board had some higher accuracy reported.

Overall the paper very nicely revisits discrete optimisation for deformable registration as a serious standing alternative to the more hyped up deep learning based methods, but is more a practical benchmark paper than presenting new scientific methods or insights.

A discussion on the advantage of not requiring training, and consequently not mixing data between patients, is missing.
Please state your overall opinion of the paper

borderline accept (6)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

High practical value, high reproducibility, some lack in innovation but well executed
What is the ranking of this paper in your review stack?

3
Number of papers in your stack

7
Reviewer confidence

Very confident

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.
Summary: A sparse variant of loopy belief propagation is applied to 3D deformable medical image registration. It achieves good accuracy and attractive runtimes on both mono-modal and multi-modal abdominal CT-CT and CT-MRI registration scenarios.

Positives:
- Elegant approach.
- Competitive performance is achieved in comparison to SOTA baselines.
- Implemented in the PyTorch environment.
- Clearly written, but readers need knowledge of Learn2Reg challenge.
- Experiments were done with both good qualitative and quantitative details.
- Fully reproducible.
Negatives:
- Results not entirely convincing because it doesn’t come first in all cases, either in terms of speed or accuracy. More discussion of the results may be needed to clarify the benefits in terms of a trade-off between speed and accuracy.
- Should apply the proposed method to more challenging problems that might really unfold its potential, but this would be for future work.
- Some confusing terminology. The method description should be made more clear & concise, put into a fuller context of existing work (particularly regarding its relationship to stochastic optimisation).
What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

7

Author Feedback

We thank all reviewers for their informative assessment of our work. All reviewers positively comment on the interesting findings of outperforming learning-based registration in both accuracy and run-time with a new variant of classical optimisation that combines stochastic search with discrete optimisation. Helping to “bridge the gap between those fundamentally different approaches” and reestablishing “discrete optimisation for deformable registration as a serious standing alternative to the more hyped up deep learning based methods” is seen as a strength. There is also praise for releasing the full source code already for review and for benchmarking the performance on public datasets; to quote reviewer #1: “This is as open as research can get”. The only reviewer that did not directly recommend acceptance (borderline reject) still ranked our paper as the best in their stack.

Nevertheless there are some concerns that we would like to address in this rebuttal and (if accepted) a revised manuscript.

Motivation: We agree with reviewer #1 that the clinical motivation could be better introduced and less focused on the Learn2Reg challenge description. While this is not the direct focus of the presented work, enabling deformable multi-modal fusion (of thorax and abdomen) has numerous medical applications, e.g. for aligning pre-interventional scans for image-guided (radio)therapy and multimodal diagnostics. Inter-subject CT registration may be of lesser direct clinical use, but can enable statistical modelling of variations of abdominal organs for abnormality detection and provide a canonical atlas space. We strongly believe that our method has great potential for even more challenging tasks in the future.

Results: Reviewer #1 notes the relatively small differences in Dice scores across comparison algorithms and Reviewer #4 points out our method does not perform best in every case. We firstly want to highlight that we performed rather extensive hyper-parameter tuning for all other methods to enable a fair competition. It could also be argued that Dice is not ideal for measuring subtle differences in registration accuracy. We hence computed extra accuracy results with Hausdorff distances as suggested by R#4 and evaluated them for the experiment CT-CT (centre column of Table 1) to highlight differences between a state-of-the-art continuous optimiser (Adam) and our proposed iterative LBP. We find initial values for the average 95th percentile Hausdorff distance of 30.14 vox, which are reduced by Adam to 26.1 vox. Our proposed method can substantially and statistically significantly (Wilcoxon p < 0.0002) improve upon this with an error of only 17.3 vox. To that end, our method outperforms both continuous and discrete SOTA registration.
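For readers unfamiliar with the metric quoted above, the following is a minimal sketch of a symmetric 95th percentile Hausdorff distance between two sets of surface points. It is not taken from the authors' code: the function name `hd95` and the point-set input format are assumptions made for illustration, and other conventions exist (for example, taking the percentile over the pooled distances of both directions).

```python
import numpy as np

def hd95(points_a, points_b):
    """Symmetric 95th percentile Hausdorff distance (hypothetical helper).

    points_a, points_b: array-likes of shape (N, D) and (M, D), e.g. voxel
    coordinates of two organ surfaces. Returns a distance in the same units.
    """
    a = np.asarray(points_a, dtype=float)
    b = np.asarray(points_b, dtype=float)
    # Pairwise Euclidean distances, shape (N, M).
    dists = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    # Distance from each point to its nearest neighbour in the other set.
    a_to_b = dists.min(axis=1)
    b_to_a = dists.min(axis=0)
    # Take the 95th percentile per direction, then the larger of the two.
    return max(np.percentile(a_to_b, 95), np.percentile(b_to_a, 95))
```

Using a percentile instead of the maximum makes the measure less sensitive to a few outlier surface points, which is presumably why it is preferred here over the plain Hausdorff distance.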

Our method ranks first in accuracy in three out of four experiments (and second in the other) among up to 6 competitive algorithms (two of them deep learning algorithms) and is second fastest. VoxelMorph+ is at least 0.4 seconds faster, but we clearly outperform it in quality (up to 11 percentage points improved Dice overlap).

Method: We would like to clarify some potentially confusing description that may have led to a lower appreciation of our method contribution: while there are certainly links to stochastic gradient descent, our proposed optimisation clearly benefits from the discrete setting of the search that finds the combinatorial minimum of a wide range of displacements. In contrast to commonly used loopy belief propagation (which iterates only over steps of message passing) we introduce a second and important outer iteration over the capture ranges of displacements. It is to the best of our knowledge the first that couples a discrete belief propagation optimisation with a sparsely distributed and locally adaptive solution space of potential displacements for each control point.
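The two nested loops described above can be illustrated with a toy sketch. This is not the authors' implementation: it is a 1D min-sum belief propagation on a small keypoint graph, with an assumed fixed set of three candidate offsets per node, a quadratic regulariser, and hypothetical names (`iterative_lbp`, `data_cost`, `ranges`, `alpha`). The inner loop passes messages; the outer loop re-centres the candidates on the current estimate and shrinks the capture range.

```python
import numpy as np

def iterative_lbp(data_cost, edges, n_nodes, ranges=(4.0, 2.0, 1.0),
                  n_msg_iters=5, alpha=0.1):
    """Toy 1D min-sum LBP with an outer loop over capture ranges.

    data_cost(i, d): unary cost of displacement d at node i (assumed callable).
    edges: list of undirected (i, j) pairs connecting neighbouring nodes.
    """
    offsets = np.array([-1.0, 0.0, 1.0])  # assumed candidate pattern
    n_cand = len(offsets)
    disp = np.zeros(n_nodes)
    neighbours = [[] for _ in range(n_nodes)]
    for i, j in edges:
        neighbours[i].append(j)
        neighbours[j].append(i)

    for r in ranges:  # outer iteration: shrinking capture range
        # Candidates are centred on each node's current displacement.
        cand = disp[:, None] + r * offsets[None, :]
        unary = np.array([[data_cost(i, d) for d in cand[i]]
                          for i in range(n_nodes)])
        # msgs[(j, i)] is the message from node j to node i.
        msgs = {(j, i): np.zeros(n_cand)
                for i in range(n_nodes) for j in neighbours[i]}
        for _ in range(n_msg_iters):  # inner iteration: message passing
            new_msgs = {}
            for (j, i) in msgs:
                # Cost at sender j, excluding what i previously sent to j.
                h = unary[j] + sum((msgs[(k, j)] for k in neighbours[j]
                                    if k != i), np.zeros(n_cand))
                # Quadratic pairwise penalty, shape (cands of i, cands of j).
                pair = alpha * (cand[i][:, None] - cand[j][None, :]) ** 2
                m = (pair + h[None, :]).min(axis=1)
                new_msgs[(j, i)] = m - m.min()  # normalise for stability
            msgs = new_msgs
        # Belief = unary cost plus all incoming messages; pick the minimum.
        belief = unary.copy()
        for (j, i), m in msgs.items():
            belief[i] += m
        disp = cand[np.arange(n_nodes), belief.argmin(axis=1)]
    return disp
```

With a data term such as `lambda i, d: (d - 3.0) ** 2` on a three-node chain, the estimate moves towards a displacement of 3 as the range is refined. Because each outer step only evaluates a handful of candidates per node, the cost per iteration stays small, which is the efficiency argument made in the rebuttal.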

All minor suggestions will be incorporated and remaining typos and grammatical errors will be thoroughly fixed.

Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

Reviewers had concerns about not targeting a particular specific clinical application, but given the widespread use of image registration, the authors explain how any advance like this would be useful. Accuracy was better overall than the other methods, and only 0.4 seconds slower.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept
What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

8

Meta-review #2

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

I appreciate the rebuttal from the authors. Overall, I believe this is an interesting paper, which looks more at extending optimization in registration strategies.

Overall, the main concern is that the results do not show a clear decisive win for this strategy, but of course that is not the only insight to be delivered for a paper. I think all reviewers agree that the idea is creative and somewhat unstandard, which makes it worthy of discussion. There are also clearly some advantages in the methods, which I believe warrant a more careful discussion, which the authors promise to do (including some details given in their rebuttal).
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept
What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

7

Meta-review #3

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.
This paper may have merit to be included in MICCAI, but some practice in reasoning and justifying the work needs attention, to the degree that the acceptance may need a re-consideration.
- reinforcing aggressively the positive comments using the entire first long paragraph in a rebuttal seems inappropriate.
- In abstract, “and probably won’t as they share the same objective function” is speculative, as the performance of a registration algorithm, learning or not, is also a function of data (training and testing in the case of learning based method).
- using Learn2Reg results, in which most tasks have only several dozens of training data, to claim learning-based methods “often with a degradation in alignment quality.”, is biased without taking into account many other positive results reported in literature.
- why were not all the Learn2Reg tasks included? MR-to-US is a typical case that could question some of the over-generalised claims in the paper, e.g. “metric-supervised methods may be considered more general.” This claim ignores the many other applications in which such robust metrics may not even exist.
- why the proposed method is “elegant” needs more explanation.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Reject
What is the rank of this paper among all your rebuttal papers? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

30


Revisiting iterative highly efficient optimisation schemes in medical image registration