
Authors

Lei Ma, Daeseung Kim, Chunfeng Lian, Deqiang Xiao, Tianshu Kuang, Qin Liu, Yankun Lang, Hannah H. Deng, Jaime Gateno, Ye Wu, Erkun Yang, Michael A.K. Liebschner, James J. Xia, Pew-Thian Yap

Abstract

Facial appearance changes with the movements of bony segments in orthognathic surgery of patients with craniomaxillofacial (CMF) deformities. Conventional bio-mechanical methods, such as finite element modeling (FEM), for simulating such changes, are labor intensive and computationally expensive, preventing them from being used in clinical settings. To overcome these limitations, we propose a deep learning framework to predict post-operative facial changes. Specifically, FC-Net, a facial appearance change simulation network, is developed to predict the point displacement vectors associated with a facial point cloud. FC-Net learns the point displacements of a pre-operative facial point cloud from the bony movement vectors between pre-operative and post-operative bony models. FC-Net is a weakly-supervised point displacement network trained using paired data with strict point-to-point correspondence. To preserve the topology of the facial model during point transform, we employ a local-point-transform loss to constrain the local movements of points. Experimental results on real patient data reveal that the proposed framework can predict post-operative facial appearance changes remarkably faster than a state-of-the-art FEM method with comparable prediction accuracy.
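The abstract does not give the formula of the local-point-transform loss, so the following is only a sketch of one plausible neighborhood-consistency penalty in that spirit, not the authors' definition. It assumes that each point's displacement should stay close to the displacements of its k nearest pre-operative neighbors; the function names and the choice of a mean Euclidean norm are illustrative assumptions.

```python
import math

def k_nearest(points, i, k):
    """Indices of the k points closest to points[i], excluding i itself."""
    dists = sorted((math.dist(points[i], q), j) for j, q in enumerate(points) if j != i)
    return [j for _, j in dists[:k]]

def apply_displacements(points, displacements):
    """Move every point by its predicted displacement vector."""
    return [tuple(p + d for p, d in zip(pt, dv)) for pt, dv in zip(points, displacements)]

def local_consistency_penalty(points, displacements, k=2):
    """Mean norm of the displacement difference between each point and its
    k nearest neighbors (hypothetical stand-in for a local-transform loss).
    Since (p_i' - p_j') - (p_i - p_j) = d_i - d_j, this measures how much
    the relative offsets between neighbors change."""
    total, count = 0.0, 0
    for i in range(len(points)):
        for j in k_nearest(points, i, k):
            total += math.dist(displacements[i], displacements[j])
            count += 1
    return total / count if count else 0.0

# Toy data, not patient data.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
rigid_shift = [(1.0, 2.0, 3.0)] * 4
moved = apply_displacements(points, rigid_shift)
penalty = local_consistency_penalty(points, rigid_shift, k=2)
```

Under this assumed form, a rigid translation of the whole cloud incurs no penalty, while a displacement that moves one point away from its neighbors does.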

Link to paper

DOI: https://doi.org/10.1007/978-3-030-87202-1_44

SharedIt: https://rdcu.be/cyhQZ

Link to the code repository

N/A

Link to the dataset(s)

N/A

Reviews

Review #1

Please describe the contribution of the paper

The paper tackles the difficult challenge of predicting face deformation based on bone deformation.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The prediction is claimed to have comparable accuracy to FEM but be significantly faster.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

Other methods, such as dimension reduction or shape space, are also much faster once the initial work is done. A comparison would make the paper much stronger.

40 training samples is usually too few to avoid bias in gender, age, and race.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

Any combination of norms requires magic weighting. The paper does mention default values for mu, lambda, and k.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

The measures L_p and L_LPT are very similar and should be contrasted. K_i vs k_i is ill-defined. A typo?
Please state your overall opinion of the paper

borderline accept (6)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The paper lacks a deeper analysis of what aspects are to be captured by applying a neural net, compared to existing highly efficient methods like dimension reduction or shape space methods.

The application of parallel models of bone and tissue appears competently executed but does not represent a novel approach.
What is the ranking of this paper in your review stack?

2
Number of papers in your stack

3
Reviewer confidence

Confident but not absolutely certain

Review #2

Please describe the contribution of the paper

The authors estimate the deformation of a patient’s face using a deep neural network. The method is as accurate as FEM methods and much faster.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

To the best of my knowledge, the method is novel. The validation was done well.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

The clinical application is not 100% clear to me. Can we use it to change the surgical planning for example?
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The method is clear and can be reproduced
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

It would be interesting to investigate the spatial distribution of the error and to identify cases where the prediction may be inadequate. In addition, it would be interesting to quantify or estimate the accuracy that is clinically required for this application.
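The per-landmark analysis suggested here could be sketched as follows. This is a minimal illustration, assuming predicted and ground-truth landmarks are available as named 3D coordinates in millimetres; the landmark names, the coordinates, and the default 2 mm threshold (the figure mentioned in Review #3) are placeholders, not values from the paper.

```python
import math

def landmark_errors(predicted, ground_truth):
    """Euclidean error (mm) per named landmark."""
    return {name: math.dist(predicted[name], ground_truth[name]) for name in ground_truth}

def flag_inadequate(errors, threshold_mm=2.0):
    """Names of landmarks whose error exceeds the threshold, sorted alphabetically."""
    return sorted(name for name, err in errors.items() if err > threshold_mm)

# Hypothetical toy coordinates, not patient data.
predicted = {"landmark_a": (0.0, 0.0, 0.0), "landmark_b": (3.0, 4.0, 0.0)}
ground_truth = {"landmark_a": (0.0, 0.0, 1.0), "landmark_b": (0.0, 0.0, 0.0)}

errors = landmark_errors(predicted, ground_truth)
flagged = flag_inadequate(errors)
```

Grouping the flagged landmarks by facial region across patients would give a first view of where the prediction tends to fail.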
Please state your overall opinion of the paper

accept (8)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

A fast and accurate surrogate to the FEM method is impressive and justifies publication in MICCAI.
What is the ranking of this paper in your review stack?

2
Number of papers in your stack

5
Reviewer confidence

Confident but not absolutely certain

Review #3

Please describe the contribution of the paper

This paper addresses the problem of simulating facial appearance changes following orthognathic surgery using deep learning. A point displacement network takes as input the patient’s pre-operative facial point cloud and the pre-operative bony point cloud with displacement vectors according to the surgical plan. It outputs a per-point displacement vector which, applied to the pre-operative point set, yields a prediction of the surgical outcome. The network achieves results comparable with a finite element model, while being significantly faster.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

• The paper tackles a very interesting task, which, to the best of my knowledge, has not been explored in the context of deep learning yet.
• As far as I know, formulating the problem of facial appearance change simulation as a point displacement problem is novel.
• The paper is well structured and easy to read.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

• The proposed FC-Net is compared to a prototype FEM-based simulation, which seems to be focused on a very specific problem. A comparison to other FEM methods (e.g., also commercial software) is not made, neither in terms of time consumption nor in terms of accuracy.
• It is not clear whether the results are clinically acceptable. Related work suggests that an error below 2 mm is required for clinical significance of the method.
• The evaluation seems somewhat shallow. The influence of individual aspects of the methodology is not evaluated or discussed.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

Reproducibility of the paper is lacking in some aspects. Which framework was used to implement the method? Concerning the dataset, how many surgeons were involved in creating it? Was intra- and interobserver variability considered? Which test was used to check statistical significance?
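For context on the question about the statistical test: since both methods are evaluated on the same cases, a paired t-test would be one natural choice, although the reviews do not say which test the paper used. The sketch below computes the paired t statistic and its degrees of freedom with the standard library only; the function name and the error values are hypothetical.

```python
import math
from statistics import mean, stdev

def paired_t_statistic(errors_a, errors_b):
    """Return (t, degrees_of_freedom) for a paired t-test on two equal-length samples."""
    if len(errors_a) != len(errors_b) or len(errors_a) < 2:
        raise ValueError("need two samples of equal length with at least 2 pairs")
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    sd = stdev(diffs)  # sample standard deviation of the paired differences
    if sd == 0:
        raise ValueError("all paired differences are identical; t is undefined")
    n = len(diffs)
    return mean(diffs) / (sd / math.sqrt(n)), n - 1

# Hypothetical per-case mean errors in mm for two methods (toy values).
method_a = [1.0, 2.0, 3.0, 4.0]
method_b = [0.0, 2.0, 1.0, 3.0]
t_value, dof = paired_t_statistic(method_a, method_b)
```

The resulting statistic would then be compared against a t distribution with the returned degrees of freedom to obtain a p-value.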
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html

• While the proposed FC-Net is evaluated against one FEM-based simulation method, a comparison to other existing methods, e.g. [3-6], is not made. Some of these works report higher accuracies. Further, a discussion of the clinical relevance of the results is missing. Some of these studies suggest that only errors below 2 mm are clinically acceptable.
• One of the major reported advantages of the proposed method is that it is time saving. However, it remains a bit vague which parts in traditional FEM pipelines are so labor intensive and time consuming. The reference method seems to tackle a very specific problem, requiring manual mesh creation. Is this the same for other pipelines as well? Was the FEM method implemented on the same computing infrastructure as the FC-Net?
• I wonder how large the influence from integrating the bony movement vectors is. The architecture of the proposed network is heavily influenced by p2p-net [1]; in particular, a single directional branch of p2p-net is adapted, as introduced in [2] (this work could be referenced / credited). Therefore, the novelty of the architecture lies mostly in the integration of the bony movement vector, but it is unclear how much it contributes to the overall prediction accuracy.
• I am not familiar with self-attention in such a context. A more detailed explanation of the configuration of this module and the intuition behind it would be beneficial to understand its effects. Again, it is unclear how much it contributes to the overall network performance.
• I am not sure why FC-Net is weakly supervised. From my understanding, fully labelled training pairs (pre- and post-operative point clouds) are needed for training the network.
• The notations in the formulas of the loss function could be unified to improve understanding. In formula (1), y is used to denote points in PF-post’, while in formula (2), the same is described by p’. In the explanation and formula of LPT loss, Ki and ki seem to be used interchangeably, which is a bit confusing.
• The quantitative results and Fig. 3 suggest that the proposed method is better than FEM-RLSE; however, it is stated that qualitative results were quite balanced. It would also be interesting to see cases in which the FEM-RLSE performed better than the FC-Net.
• A three-fold contribution is promised, but only two points seem to be listed.
• Figures 1 and 2 are very small and the text is difficult to read. They could be altered to more effectively use the white space around the figures to make them more easily readable.
• Why is the cropping of bony and facial models necessary? Is it to have more dense data in the ROI, or does the network otherwise also deform the unaffected parts?

[1] Yin, K., Huang, H., Cohen-Or, D., & Zhang, H. (2018). P2p-net: Bidirectional point displacement net for shape transform. ACM Transactions on Graphics (TOG), 37(4), 1-13.
[2] Yang, B., Yao, J., Wang, B., Hu, J., Pan, Y., Pan, T., … & Guo, X. (2020). P2MAT-NET: Learning medial axis transform from sparse point clouds. Computer Aided Geometric Design, 80, 101874.
[3] Cunha, H. S., da Costa Moraes, C. A., Dornelles, R. D. F. V., & da Rosa, E. L. S. (2020). Accuracy of three-dimensional virtual simulation of the soft tissues of the face in OrtogOnBlender for correction of class II dentofacial deformities: an uncontrolled experimental case-series study. Oral and Maxillofacial Surgery, 1-17.
[4] Knoops, P. G. M., Borghi, A., Breakey, R. W. F., Ong, J., Jeelani, N. U. O., Bruun, R., … & Padwa, B. L. (2019). Three-dimensional soft tissue prediction in orthognathic surgery: a clinical comparison of Dolphin, ProPlan CMF, and probabilistic finite element modelling. International Journal of Oral and Maxillofacial Surgery, 48(4), 511-518.
[5] Kim, D., Ho, D. C. Y., Mai, H., Zhang, X., Shen, S. G., Shen, S., … & Xia, J. J. (2017). A clinically validated prediction method for facial soft‐tissue changes following double‐jaw surgery. Medical Physics, 44(8), 4252-4261.
[6] Resnick, C. M., Dang, R. R., Glick, S. J., & Padwa, B. L. (2017). Accuracy of three-dimensional soft tissue prediction for Le Fort I osteotomy using Dolphin 3D software: a pilot study. International Journal of Oral and Maxillofacial Surgery, 46(3), 289-295.
Please state your overall opinion of the paper

borderline accept (6)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

While I think the results and evaluation are not entirely convincing, the problem formulation is interesting and clever.
What is the ranking of this paper in your review stack?

2
Number of papers in your stack

5
Reviewer confidence

Somewhat confident

Review #4

Please describe the contribution of the paper

This paper proposes a machine learning approach to predict post-operative facial appearance changes in CMF patients undergoing orthognathic surgery. The authors use pre-operative bony point sets with movement vectors and facial point sets as inputs to predict point-wise displacements, using an encoder and an autoencoder with self-attention based on feature-encoding and -decoding modules, respectively, before concatenating the outputs and feeding them into an MLP. The authors validate their results in 40 cases against the ground truth and against a previously reported FEM approach in a subset of cases, concluding that the proposed approach is as good as the FEM while taking less time to compute.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- Proposed approach applied to real clinical data
- The approach is innovative in contrast to the current state of the art (FEM)
- The adapted loss function to account for the preservation of intrinsic topology
- Evaluation with a previously reported approach based on FEM
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- 40 cases seem too few for training the proposed network, 32 cases effectively for training following cross validation
- There are no results from cross-validation versus the ground truth
- Experiments are missing to assess the effectiveness of using the self-attention module and the proposed LPT loss component of the loss function.
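The figure of 32 effective training cases is consistent with a five-fold split of 40 cases. The sketch below illustrates that arithmetic under the assumption of five equally sized folds; the fold count is inferred from the 40/32 numbers and the function name is hypothetical.

```python
def k_fold_splits(n_cases, n_folds=5):
    """Yield (train_indices, test_indices) for contiguous, near-equal folds."""
    indices = list(range(n_cases))
    base, extra = divmod(n_cases, n_folds)
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)  # spread any remainder over the first folds
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

splits = list(k_fold_splits(40, 5))
for train, test in splits:
    print(len(train), len(test))  # prints "32 8" for each of the five folds
```

In practice the case order would be shuffled once before splitting so that folds are not tied to acquisition order.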
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

Enough information is stated in the paper for reproducibility, apart from the actual data.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://miccai2021.org/en/REVIEWER-GUIDELINES.html
- It is unclear why an encoder is only used for bony point sets and movement and a full autoencoder used for facial point sets.
- A discussion is missing on how the authors think the feature-encoding and -decoding modules helped in the learning task.
- Justify why a non-rigid approach is used when registering the deformed bony model to its post-op equivalent, when the bony segments are moved by surgeons rigidly
- Clarify whether the input bony points and facial points have a 1-to-1 correspondence. The authors mention that both N and M are set to 4096 but it is unclear whether there is any correspondence or points are fed into the network in a consistent manner across patients.
- Details are missing related to the self-attention module introduced in the point cloud
- How did the authors concatenate 1x128 with Nx128 to produce a matrix of Nx256?
- It is difficult to assess from the results the relative importance of the loss components in learning the task. With the presented results, it is difficult to assess how beneficial the introduced L_{LPT} component is in the learning task, or how much worse the results are without it. What are its limitations? Why was k set to 8? There is nothing stopping the authors from using different values of k for L_d and L_{LPT}. Moreover, lambda and mu could have been learnt by the network as additional hyperparameters.
- Fig 3a is not clear and it is difficult to interpret. Either the authors use silhouettes in 2D or preferably use 3D faces showing the results of FEM and their method versus the ground truth.
- It is unclear how the authors randomly selected 40 clinical cases. Among how many? Why only 40?
- The paper might benefit from a diagram illustrating the landmarks annotated by surgeons. Also please include how many surgeons did this and whether there is any variability by doing that.
- The authors describe in Sec 3 that they crop a region of interest but that step is not shown in Fig. 1
- Include results from cross-validation, and how you measure accuracy of predictions versus ground truth
- It would have been much better to see the 40 cases performed using the FEM model. Despite that, it is better to show the actual t-test result. Also, it will be beneficial to do the t-test against the ground truth
- In Table 1 the authors report maximum error, but this error is far less than the errors seen in Fig. 3, which are on the order of 10 mm, unless no landmarks are placed in these regions.
- I don’t see any discussion about the limitations and future work of the proposed approach
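On the question above of how a 1x128 feature can be combined with an Nx128 feature to give Nx256: a common pattern in point-cloud networks, for example in PointNet-style segmentation heads, is to repeat the global vector once per point and concatenate along the channel axis. Whether FC-Net does this is an assumption; the sketch only shows the shape arithmetic, and the function name is hypothetical.

```python
def tile_and_concat(global_feature, point_features):
    """Append the same global feature vector to every per-point feature row."""
    return [list(row) + list(global_feature) for row in point_features]

n_points, channels = 4, 128  # the paper uses N = 4096; 4 keeps the toy example small
global_feature = [0.5] * channels                                   # shape 1 x 128
point_features = [[float(i)] * channels for i in range(n_points)]  # shape N x 128

fused = tile_and_concat(global_feature, point_features)
print(len(fused), len(fused[0]))  # prints "4 256"
```

Each output row then carries both local (per-point) and global context, which is why the channel count doubles while the number of points is unchanged.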
Please state your overall opinion of the paper

accept (8)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The problem is formulated from a data-driven perspective, yet the results are compared with an existing FEM model. The use of clinical data is also a strength.
What is the ranking of this paper in your review stack?

2
Number of papers in your stack

3
Reviewer confidence

Confident but not absolutely certain

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

There is consensus amongst reviewers that the paper should be accepted. They cite as main strengths the improvement in speed of the planning tool compared to FEM, the novelty of the approach, and an interesting topic that takes a data-driven perspective instead of current FEM techniques.

Reviewers 4 and 5 bring out several issues and possible improvements which would need to be addressed in the revised version of the paper, particularly in the evaluation of the method.
What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers).

3

Author Feedback

N/A


Deep Simulation of Facial Appearance Changes Following Craniomaxillofacial Bony Movements in Orthognathic Surgical Planning