Computer Methods in Biomechanics and Biomedical Engineering

I CAN’T BELIEVE IT’S NOT BONE!

github link

Goal

The goal of this project is to train a generative network to create volumetric spaces of synthetic bone. The vision is that these could be used to generate plausible and practical bone shapes for finite element modeling, including density-dependent material properties. The scope of this work encompasses recreating a surface point cloud that mimics a human metatarsal, specifically metatarsals 2-4. This scope reflects the limited availability of data, a limitation the project eventually hopes to address.
The finite element models developed from synthetic data will be analyzed to produce material analyses that will then be used in supervised machine learning methods. This supports our long-term goal of building a large (n > 10,000) dataset of finite element examples to train on. These examples will share geometries and material properties similar to those of real human models, with the aim that the resulting network learns functions that mimic finite element analysis of the bones. The product of these studies will enable structural analysis of in vivo human metatarsals, which may lead to a better understanding of bone strength, particularly for runners, who commonly experience metatarsal bone stress injuries.

Dataset

Three-dimensional computed tomography (CT) scans quantified the spatial density distribution of 228 in vivo metatarsals from current runners and ex vivo human cadaveric metatarsals. The surfaces of these bones were extracted via alphashape after segmentation with Mimics 26.0 (Materialise, UK). The resulting data were saved as point clouds and then rotated to align the x, y, and z axes with the top three principal components. To augment this dataset, each metatarsal was rotated 180° about its x and y axes, creating four training examples per scan.
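A minimal sketch of this alignment and augmentation step is shown below, assuming each point cloud is a NumPy array of shape (N, 3); the function names and the use of SVD to find the principal components are illustrative assumptions rather than the project's actual pipeline.

```python
# Sketch of the preprocessing described above: align a surface point cloud to its
# principal axes, then augment with 180-degree rotations about the x and y axes.
# Names and structure are illustrative, not the project's actual code.
import numpy as np


def align_to_principal_axes(points: np.ndarray) -> np.ndarray:
    """Rotate an (N, 3) point cloud so x, y, z follow its top three principal components."""
    centered = points - points.mean(axis=0)
    # Right singular vectors of the centered cloud are the principal directions.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt.T


def rotate_180(points: np.ndarray, axis: str) -> np.ndarray:
    """Rotate a point cloud 180 degrees about 'x' or 'y' by negating the other two axes."""
    flipped = points.copy()
    if axis == "x":
        flipped[:, 1:] *= -1.0        # negate y and z
    elif axis == "y":
        flipped[:, [0, 2]] *= -1.0    # negate x and z
    return flipped


def augment(points: np.ndarray) -> list:
    """Return four training examples per scan: the aligned cloud plus its 180-degree rotations."""
    aligned = align_to_principal_axes(points)
    x_rot = rotate_180(aligned, "x")
    y_rot = rotate_180(aligned, "y")
    return [aligned, x_rot, y_rot, rotate_180(x_rot, "y")]
```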

 

 

Training

Here, we focus on comparing network structures in early training by restricting the time allotted, which gives an objective measure of network efficiency. Further hyperparameter tuning and training may yield better results, such as the larger transformer model whose generated example is shown below. While modern generative networks are effective at creating point clouds from noise that represent everyday objects, those objects exhibit far more spatial variability, which aids learning. In these studies, bones vary less along each axis while demanding greater surface reconstruction accuracy.

Objective Function

Each network was trained to minimize the chamfer loss between the original and encoded/decoded data (Equation 1). Minimizing this loss reduces the distance from each point in one cloud to its closest counterpart in the other, and evaluating it in both directions penalizes functions that collapse the output into a single point.

d_{ch} = \max\left\{\sum_{i}^{L}\left[\sqrt{(n_{ix}-m_{x})^{2} + (n_{iy}-m_{y})^{2} + (n_{iz}-m_{z})^{2}}\right]\right\}
Equation 1. Chamfer loss is used to determine the overall spatial difference between two point clouds of the same sequence length (L = number of points in each cloud; n_i = one point in either cloud, [n_ix, n_iy, n_iz]; m = the closest point of the comparative cloud, [m_x, m_y, m_z]).
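A minimal NumPy sketch of Equation 1 follows, assuming the maximum is taken over the two directions of comparison (each cloud matched against the closest points of the other); a training implementation would use the framework's differentiable tensor operations instead, and the function name is illustrative.

```python
# Minimal sketch of the chamfer distance in Equation 1 for two clouds of equal length L.
import numpy as np


def chamfer_distance(cloud_n: np.ndarray, cloud_m: np.ndarray) -> float:
    """Chamfer distance between two (L, 3) point clouds."""
    # Pairwise Euclidean distances between every point in N and every point in M.
    diffs = cloud_n[:, None, :] - cloud_m[None, :, :]   # shape (L, L, 3)
    dists = np.sqrt((diffs ** 2).sum(axis=-1))          # shape (L, L)
    # Sum, for each point, the distance to the closest point of the comparative cloud.
    n_to_m = dists.min(axis=1).sum()   # each n_i matched to its nearest m
    m_to_n = dists.min(axis=0).sum()   # each m_i matched to its nearest n
    # Keep the larger of the two directional sums, as in Equation 1.
    return float(max(n_to_m, m_to_n))
```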

 

Evaluation

The Jensen-Shannon divergence between batched sets of real point clouds and their encoded/decoded counterparts was used to evaluate the networks between training cycles (Equations 2 and 3). While it was not used to optimize model parameters, it provides a view of the overall difference between the high-dimensional datasets across many examples. This metric was withheld from training to preserve its unbiased evaluation of the networks. High divergence values indicate a network's inability to extrapolate learned functions across many examples.

D(N \| M) = \sum_{x \in X} N(x) \cdot \log\left(\frac{N(x)}{M(x)}\right)
Equation 2. Kullback-Leibler (KL) divergence measures the difference between two probability distributions and is denoted by D(N||M).

 

JSD(N \| M) = \frac{1}{2}\left(D(N \| NM) + D(M \| NM)\right), \qquad NM = \frac{1}{2}(N + M)
Equation 3. Jensen-Shannon Divergence (JSD) utilizes the average KL divergence between each dataset and the mean of the datasets (NM).
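The sketch below shows one way Equations 2 and 3 could be applied between training cycles, assuming the batched point clouds are first binned into normalized occupancy histograms over a shared grid; the binning scheme, fixed bounds, smoothing constant, and function names are illustrative assumptions rather than the evaluation code used here.

```python
# Sketch of Equations 2 and 3 applied to discrete distributions built from batched
# point clouds (e.g., real vs. encoded/decoded). Binning and epsilon are illustrative.
import numpy as np


def kl_divergence(n: np.ndarray, m: np.ndarray, eps: float = 1e-12) -> float:
    """Equation 2: D(N || M) = sum_x N(x) * log(N(x) / M(x)) over a shared support."""
    n = n / n.sum()
    m = m / m.sum()
    return float(np.sum(n * np.log((n + eps) / (m + eps))))


def jensen_shannon_divergence(n: np.ndarray, m: np.ndarray) -> float:
    """Equation 3: average KL divergence of each distribution against their mean NM."""
    n = n / n.sum()
    m = m / m.sum()
    nm = 0.5 * (n + m)
    return 0.5 * (kl_divergence(n, nm) + kl_divergence(m, nm))


def clouds_to_distribution(clouds: np.ndarray, bins: int = 24,
                           bounds: float = 50.0) -> np.ndarray:
    """Flatten a (B, L, 3) batch of point clouds into a normalized occupancy histogram.

    The fixed bounds (an assumed +/- 50 mm box) keep the bin edges, and therefore the
    support X, identical across the distributions being compared.
    """
    points = clouds.reshape(-1, 3)
    hist, _ = np.histogramdd(points, bins=bins, range=[(-bounds, bounds)] * 3)
    return hist.flatten() / hist.sum()
```

For example, the divergence reported after a training cycle would be jensen_shannon_divergence(clouds_to_distribution(real_batch), clouds_to_distribution(decoded_batch)), where real_batch and decoded_batch are placeholder names for the batched clouds.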

Managing Network Collapse

Network collapse is a common issue caused by several training pitfalls that are not mutually exclusive. Inappropriate objective functions, large batch sizes, and superfluous augmentation can force networks into local minima of the loss landscape. Therefore, simplifying the objective function, minimizing data augmentation, and tuning hyperparameters such as batch size and learning rate reduce the likelihood of collapse. A simple check for detecting collapse during evaluation is sketched below.
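The sketch below flags a collapsed output by comparing the spatial spread of a decoded cloud against a real reference cloud; the spread measure, threshold, and names are assumptions for illustration, not part of the training code.

```python
# Illustrative collapse check: flag a training cycle when decoded clouds lose most of
# their spatial spread relative to the real clouds, i.e., shrink toward a single point.
import numpy as np


def spread(cloud: np.ndarray) -> float:
    """Mean per-axis standard deviation of an (L, 3) point cloud."""
    return float(cloud.std(axis=0).mean())


def collapsed(decoded: np.ndarray, reference: np.ndarray, ratio: float = 0.1) -> bool:
    """True if the decoded cloud's spread falls below a fraction of the reference spread."""
    return spread(decoded) < ratio * spread(reference)
```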