
Authors

Diya Sun, Yungeng Zhang, Yuru Pei, Tianmin Xu, Hongbin Zha

Abstract

Deformable image correspondence is crucial in many areas of medical image research. Existing deep learning-based registration and correspondence models mostly learn a nonlinear voxel-wise mapping function between volumetric images by metric space alignments in the spatial domain, without addressing the intrinsic structure correspondence. Thus, the registration requires prior affine transformation or landmark annotations to handle high-frequency perturbations due to pose and structural variations. This paper presents a novel and efficient correspondence framework via low-dimensional spectral mapping to handle the intrinsic correspondence of anatomical structures. We devise a novel multi-path graph convolutional network (GCN)-based embedding approximation module, relieving the time complexity of the eigendecomposition-based spectral embedding of volumetric images. We present a descriptor learning module that surpasses descriptor selection and hand-crafted descriptors. Experimental results demonstrate the efficacy of the core modules, i.e., the image embedding approximation and descriptor learning, for volumetric image correspondence and the atlas-based registration of craniofacial anatomical structures. The proposed approach achieves correspondence accuracy comparable to that of state-of-the-art deep registration models, while being resilient to pose and shape perturbations.

Link to paper

DOI: https://doi.org/10.1007/978-3-030-87202-1_16

SharedIt: https://rdcu.be/cyhPX

Link to the code repository

N/A

Link to the dataset(s)

N/A

Reviews

Review #1

Please describe the contribution of the paper
The paper addresses medical image registration and, in particular, presents results on a cone-beam computed tomography (CBCT) dataset, where it significantly outperforms the baseline solution. The authors propose a fusion of standard techniques and deep learning:
- superpixels and U-Net model for feature extraction
- eigenvector decomposition for spectral embedding approximation
- Spectral Map-based correspondence model - SMNet
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

I really appreciate the thoughtful combination of standard computer vision methods with deep learning, rather than blindly using a black-box DNN as a solution for everything. The authors also present strong results; nevertheless, they could include more state-of-the-art methods in their comparison.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

I do not have any
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The paper seems to have included all parameter/configuration details.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

I do not have any here
Please state your overall opinion of the paper

strong accept (9)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The paper is well written and easy to follow, and the method fusion looks interesting to me.
What is the ranking of this paper in your review stack?

1
Number of papers in your stack

5
Reviewer confidence

Confident but not absolutely certain

Review #2

Please describe the contribution of the paper

This paper presents an unsupervised learning-based framework for 3D image registration. SLIC algorithm is used to summarize image information as a supervoxel graph, thus formulating the 3D image registration problem as finding optimal correspondence between two sets of supervoxels. To this end, relevant image-based features are extracted from an auto-encoding step, which are then used to derive supervoxel features using a spectral embedding module that approximates the graph spectral decomposition. These features are finally considered to solve the supervoxel matching problem. Experiments are conducted on craniofacial CBCT images.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

Several works have already demonstrated the potential of spectral decomposition as an image representation for efficient image registration. The poor scalability of this decomposition to 3D images is a strong limitation to its use for medical images. The proposed framework combines a module that approximates spectral decomposition with a module that extracts multi-scale features, both learned without supervision within a pairwise supervoxel matching formulation, and it seems to be really original.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

The description of the method is sometimes quite hard to follow.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

meet the standard requirement
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

The main point that hampers the quality of the paper is the 8-page limit. Ideally, more space would have allowed for larger figures and a more detailed description of the method. Unfortunately, I don’t really have any suggestions to help the authors improve the presentation while respecting the page constraints.
Please state your overall opinion of the paper

Probably accept (7)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The contribution is original and interesting. A strong point is also related to the proposed strategy to train the model in an unsupervised way.
What is the ranking of this paper in your review stack?

2
Number of papers in your stack

2
Reviewer confidence

Very confident

Review #3

Please describe the contribution of the paper
- The paper proposes an efficient method for medical image registration which is robust to pose and shape perturbation.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The formulations of the objective functions seem very reasonable and deliver the core ideas proposed by the authors.
- I like the idea of finding correspondence in the spectral domain, which is much simpler than finding super-pixel-wise correspondence in the native graph space.
- The paper is very well written and easy to follow.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- In the introduction, the notion of a graph (i.e., the graph Laplacian) appears out of the blue without any explanation. It should be clearly explained how the graph is constructed and why one needs such a representation before mentioning the virtue of diagonalizing the matrix.
- The SLIC algorithm should be explained in more detail, as the graph construction is critical in the framework. How are the super-pixels constructed and what do the edges mean?
- The authors consistently claim the need for efficiency; it would make more sense if the authors included an example, perhaps in the intro, e.g., how many super-pixels are typically needed for a specific application?
- Why do the authors want to ensure orthonormality in Z and why is it important? What happens if Z is not orthonormal? (This is a very important aspect but is not properly explained in the manuscript.)
Please rate the clarity and organization of this paper

Excellent
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The authors have agreed to release their code upon the acceptance of this paper.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

Perhaps if MICCAI allowed just one more page, I think this manuscript could have been much stronger. Please see my comments in the weakness and try to incorporate as much as possible in the later version.
Please state your overall opinion of the paper

accept (8)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

Sufficient novelty for a well-known problem. Good presentation of the ideas and the work done by the authors.
What is the ranking of this paper in your review stack?

1
Number of papers in your stack

8
Reviewer confidence

Very confident

Review #4

Please describe the contribution of the paper

The authors propose an efficient and effective deformable image registration method based on spectral mapping. The low-dimensional spectral embedding descriptors provide more semantically meaningful matching than hand-crafted descriptors from the raw volumetric images. To reduce the computational burden of spectral mapping, an efficient multi-path graph convolutional network is introduced. The proposed method is extensively validated on simulated and clinical CT data.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The main strength of the proposed method is its robustness in cases of large deformation due to pose and structural variations. This is clinically valuable, as it does not require preprocessing such as affine registration and simplifies the workflow. In addition, a novel spectral constraint for orthonormality is introduced in (3), which takes diagonality into account and allows faster approximation.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

The main weakness of this paper is the lack of detail about how to optimize the deformable registration cost in (5) (e.g., linear least squares on a very large matrix). This requires large memory, particularly for a 512x512x512 volume with sub-millimeter voxel size (e.g., brain cortical surface matching). In addition, experimental results are reported for only one set of hyperparameters, and no sensitivity analysis is provided.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The authors will provide the training code for reproducibility. If the authors release the annotated synthetic and clinical dataset, it would be valuable to the community.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

I suggest that the authors test the proposed spectral embedding-based deformable image registration on brain cortical surface matching, where anatomical variability poses more of a challenge than craniofacial structure mapping. In addition, the authors need to provide details about the optimization of the cost function, particularly in terms of convergence. A sensitivity analysis of the hyperparameters would make the experimental results more convincing. The memory and computational requirements for larger volumes (e.g., 512x512x512) could also be discussed.
Please state your overall opinion of the paper

borderline accept (6)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The paper is easy to read and clinically well motivated. All mathematical concepts are clearly described with well-defined notation. The experimental results on synthetic and real datasets are convincing. The robustness on challenging deformable registration cases due to pose and anatomical variability deserves recognition. The novel multi-path GCN allows fast approximation of the spectral embedding. The authors need to discuss the optimization of the cost function in more detail and provide a sensitivity analysis of the hyperparameters.
What is the ranking of this paper in your review stack?

2
Number of papers in your stack

4
Reviewer confidence

Confident but not absolutely certain

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

This paper proposes a graph convolutional approach to the volumetric image correspondence problem that avoids the costly spectral eigendecomposition of a volumetric image.

One reviewer just loves the method with no concern to report.

A second reviewer appreciates the faced challenge of handling 3D eigendecomposition, by relying instead on supervoxel graphs in an unsupervised setting.

A third reviewer further highlights the original approach of using supervoxel graphs, but wishes for more methodological details.

All three reviewers have a general consensus on the originality of the proposed approach, namely relying on a supervoxel graph to rethink the volume registration problem as a graph correspondence problem. This is novel and supported by a clear gain in registration performance on an application of cranio-facio correspondence.

For these reasons, Recommendation is towards Early Acceptance.

Oral: All three reviewers have also recommended an oral podium and a young scientist award. Due to the originality of rethinking the registration problem, combined with a very graphic application of cranio-facio maps, the authors should be invited to give an oral presentation and be included in the award list.
What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

2

Author Feedback

We thank the reviewers for their efforts in reviewing our paper and for their constructive comments. All reviewers recognize our contribution, which takes advantage of descriptor learning and the multi-path GCN-based embedding approximation framework to surpass descriptor selection and relieve the time complexity of spectral embedding for volumetric correspondence.

Responses to reviewer #4: Supervoxel graph and graph Laplacian. The supervoxel graph G(S, E) is constructed from the supervoxel decomposition, where S denotes the set of supervoxel nodes and E the set of edges connecting neighboring supervoxels. The graph Laplacian matrix is defined as L = D - A, where A denotes the symmetric affinity matrix and D the diagonal degree matrix. To avoid the computationally expensive embedding of the supervoxel graph by eigendecomposition of the graph Laplacian matrix, we proposed a GCN-based model that approximates this eigendecomposition. Note that if the approximated spectral bases Z are close to the eigenvectors of L, the product Z^T L Z tends to a diagonal matrix whose diagonal entries are related to the eigenvalues. The GCN-based approximation therefore uses a matrix diagonalization constraint, which requires Z^T L Z to be a diagonal matrix. Since the graph Laplacian matrix is symmetric, it has orthogonal eigenvectors, so we also imposed orthonormality constraints on Z to simulate the eigendecomposition. We agree that a discussion of the effects of orthonormality in Z would be helpful.
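The two constraints described in this response can be illustrated on a toy graph. The sketch below is not the authors' loss (Eq. 3 is not reproduced on this page); it only measures how far a candidate basis Z is from making Z^T L Z diagonal and Z^T Z the identity. The function names, the path-graph example, and the squared-error form of the residuals are assumptions made for illustration. The path graph is used because its Laplacian eigenvectors are known in closed form (cosine vectors), so no eigensolver is needed.

```python
import math

def path_graph(n):
    """Binary affinity matrix A of a path graph with n nodes (toy stand-in for a supervoxel graph)."""
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        adj[i][i + 1] = adj[i + 1][i] = 1.0
    return adj

def laplacian(adj):
    """Graph Laplacian L = D - A, with D the diagonal degree matrix."""
    n = len(adj)
    return [[(sum(adj[i]) if i == j else 0.0) - adj[i][j] for j in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def path_eigenvectors(n, k):
    """First k unit-norm Laplacian eigenvectors of the path graph, as an n x k matrix Z."""
    cols = []
    for m in range(k):
        v = [math.cos(math.pi * m * (i + 0.5) / n) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in v))
        cols.append([x / norm for x in v])
    return transpose(cols)

def spectral_residuals(lap, z):
    """Return (diagonalization residual, orthonormality residual, diagonal of Z^T L Z).

    Hypothetical squared-error penalties: off-diagonal energy of Z^T L Z,
    and squared distance of Z^T Z from the identity.
    """
    zt = transpose(z)
    ztlz = matmul(matmul(zt, lap), z)
    ztz = matmul(zt, z)
    k = len(ztz)
    diag_res = sum(ztlz[i][j] ** 2 for i in range(k) for j in range(k) if i != j)
    orth_res = sum((ztz[i][j] - (1.0 if i == j else 0.0)) ** 2
                   for i in range(k) for j in range(k))
    return diag_res, orth_res, [ztlz[i][i] for i in range(k)]

# Exact eigenvectors give (numerically) zero residuals; the diagonal holds eigenvalues.
L = laplacian(path_graph(6))
Z = path_eigenvectors(6, 3)
diag_res, orth_res, eigenvalues = spectral_residuals(L, Z)
```

With exact eigenvectors both residuals vanish and the diagonal of Z^T L Z holds the eigenvalues 2 - 2cos(pi m / n); perturbing any entry of Z makes both residuals positive, which is what a learned approximation would be penalized for under constraints of this kind.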

SLIC-based supervoxel decomposition. We used the SLIC algorithm [1] to decompose the CBCT into visually homogeneous regions, where k-means clustering iterations were applied to the feature vectors of voxels. The SLIC-based decomposition has two parameters, i.e., the number of supervoxels and the compactness, or strength of the spatial regularization. In the experiments, CBCTs with a resolution of 128 × 128 × 128 and an isotropic voxel size of 1.56 mm × 1.56 mm × 1.56 mm are decomposed into 15000 supervoxels, without considering the air background.
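To make the graph construction concrete, the following sketch derives the edge set E from a supervoxel label volume, connecting two supervoxels whenever they share a voxel face. It assumes a label volume is already available (the SLIC step itself is not shown); the 6-connectivity rule, the background label 0 for air, and the function name are illustrative assumptions, and the affinity weights used in the paper may differ from this binary adjacency. The last lines only restate the numbers from the response as arithmetic.

```python
from itertools import product

def supervoxel_edges(labels, background=0):
    """Edges between face-adjacent supervoxels.

    labels: nested list indexed [z][y][x] holding integer supervoxel ids.
    Voxels labelled `background` (assumed here to mark air) are ignored.
    """
    nz, ny, nx = len(labels), len(labels[0]), len(labels[0][0])
    edges = set()
    for z, y, x in product(range(nz), range(ny), range(nx)):
        a = labels[z][y][x]
        if a == background:
            continue
        # Checking only the three forward neighbours visits each voxel face once.
        for dz, dy, dx in ((1, 0, 0), (0, 1, 0), (0, 0, 1)):
            zz, yy, xx = z + dz, y + dy, x + dx
            if zz < nz and yy < ny and xx < nx:
                b = labels[zz][yy][xx]
                if b != background and b != a:
                    edges.add((min(a, b), max(a, b)))
    return edges

# Tiny 1 x 2 x 4 label volume: three supervoxels and one background voxel.
toy_labels = [[[1, 1, 2, 2],
               [3, 3, 2, 0]]]
toy_edges = supervoxel_edges(toy_labels)

# Sizes quoted in the response, restated as arithmetic.
num_voxels = 128 ** 3              # voxels in one CBCT volume
num_supervoxels = 15000            # graph nodes
upper_bound_mean_size = num_voxels / num_supervoxels  # upper bound, since air is excluded
dense_laplacian_entries = num_supervoxels ** 2        # entries if L were stored densely
```

Because air voxels are excluded, the average supervoxel is smaller than `upper_bound_mean_size`. The dense entry count also shows why a sparse Laplacian matters: each supervoxel touches only a handful of neighbors, so almost all of those entries are zero.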

Responses to reviewer #5: Optimization and memory complexity. The proposed framework is implemented in TensorFlow, and the network parameters are optimized with the Adam algorithm by minimizing the loss function (Eq. 5). The learning rate is set to 1e-5. Each minibatch consists of two CBCTs with a resolution of 128 × 128 × 128. Training takes 16.7 hours for 20 epochs and requires approx. 9.8 GB of GPU memory. The model has approx. 28.5M parameters. The 3D U-net [4]-based module is promising for extracting multi-scale features of large volumes (e.g., 512x512x512) in a patch-wise manner. The spatial complexity of the spectral embedding approximation module depends on the size of the sparse graph Laplacian matrix. The proposed SMNet has the potential to be generalized to high-resolution volumetric images.

Dataset and parameter analysis. The proposed model has been trained and validated on 408 clinically obtained craniofacial CBCTs. The hyperparameters \gamma_1 and \gamma_2, which weight the correspondence term and the constraints on spectral maps and bases, are set to 0.5 and 2, respectively. We agree that validation on other deformable registration tasks, such as brain cortical surface matching, and a sensitivity analysis of the hyperparameters would be helpful.

Spectral Embedding Approximation and Descriptor Learning for Craniofacial Volumetric Image Correspondence