LEARNING-BASED LANDMARK ESTIMATION OF 3D BODY SCANS by AHMED BARUWA A THESIS Presented to the Department of Computer Science and the Division of Graduate Studies of the University of Oregon in partial fulfillment of the requirements for the degree of Master of Science December 2023

THESIS APPROVAL PAGE
Student: Ahmed Baruwa
Title: Learning-Based Landmark Estimation of 3D Body Scans
This thesis has been accepted and approved in partial fulfillment of the requirements for the Master of Science degree in the Department of Computer Science by:
Daniel Lowd, Chair
Humphrey Shi, Core Member
Susan Sokolowski, Core Member
Jacob Searcy, Core Member
and Krista Chronister, Vice Provost of Graduate Studies
Original approval signatures are on file with the University of Oregon Division of Graduate Studies.
Degree awarded December 2023

© 2023 Ahmed Baruwa

THESIS ABSTRACT
Ahmed Baruwa
Master of Science
Computer Science
December 2023
Title: Learning-Based Landmark Estimation of 3D Body Scans
The use of anatomical landmarks spans a diverse set of applications because they are essential for understanding the human body. Several research studies have examined the correlation between body shape variations and human performance. Anatomical landmarks are useful for taking anthropometric measures that can be used to characterize body geometries that relate to human performance. In this thesis, we compare parametric models of the human body developed from two machine learning methods - a convolutional neural network (CNN) and Lasso regression - to serve as tools for scalable anthropometric measurement. The models were trained on two publicly available labeled body scan datasets: the Civilian American and European Surface Anthropometry Resource (CAESAR) and the Shape Retrieval Contest (SHREC) dataset. The models were used to localize human body landmarks in several poses. This work provides a scalable approach for collecting anthropometric measures.

CURRICULUM VITAE
NAME OF AUTHOR: Ahmed Baruwa
GRADUATE AND UNDERGRADUATE SCHOOLS ATTENDED:
University of Oregon
Obafemi Awolowo University
DEGREES AWARDED:
Master of Science, Computer Science, 2023, University of Oregon
Bachelor of Science, Electronic and Electrical Engineering, 2019, Obafemi Awolowo University
AREAS OF SPECIAL INTEREST:
Data Science
PROFESSIONAL EXPERIENCE:
Data Analyst, Interswitch, 6 months
Research Engineering Intern, InstaDeep, 4 months
Data Scientist, KPMG, 10 months
GRANTS, AWARDS AND HONORS:
PUBLICATIONS:

ACKNOWLEDGEMENTS
First, I would like to express my deepest appreciation to Professor Daniel Lowd, my advisor, whose guidance was instrumental to the success of this project. I am also incredibly grateful to Professor Susan Sokolowski for her unwavering support throughout my program. Her passion, foresight, and commitment to excellence were truly inspiring. The invaluable support she provided during the critical stages of my program played a pivotal role in my achievements. Furthermore, I extend my sincere gratitude to Professor Jacob Searcy, from whom I gained extensive knowledge about the ever-growing field of data science, and who shared the SMPL models discussed in this project. His willingness to address my numerous inquiries with satisfactory answers was truly invaluable. I am indebted to my closest friends, Wemimo Ayannubi, Fatai Balogun, Seun Fadugba, Jon Rabourn, and Shama Sama, who stood by me steadfastly throughout my program.
Their steadfast support and encouragement were indispensable, and I am grateful to the University of Oregon for bringing us together. To my family, I owe an immeasurable amount of gratitude for their unwavering belief in me. Their late-night calls and well wishes were vital in keeping me motivated and determined. This work was supported by the Wu Tsai Human Performance Alliance and the Joe and Clara Tsai Foundation.

TABLE OF CONTENTS
Chapter Page
I. INTRODUCTION . . . . . . . . . . . . 12
Problem Statement . . . . . . . . . . . . 12
A Parametric Model of the Human Body . . . . . . . . . . . . 13
Significance of Our Work . . . . . . . . . . . . 13
II. LITERATURE REVIEW . . . . . . . . . . . . 14
III. METHODOLOGY . . . . . . . . . . . . 17
Skinned Multi-Person Linear Model (SMPL) . . . . . . . . . . . . 18
Atomistic CNN Model . . . . . . . . . . . . 18
Holistic CNN Model . . . . . . . . . . . . 19
Spatial Transformation with the T-Network . . . . . . . . . . . . 20
Least Absolute Shrinkage and Selection Operator (Lasso) Regression . . 21
IV. RESULTS AND DISCUSSION . . . . . . . . . . . . 22
Datasets . . . . . . . . . . . . 22
Civilian American and European Surface Anthropometry Resource . . . 22
Shape Retrieval Contest (SHREC) 2014 . . . . . . . . . . . . 23
Training Machine Learning Models to Fit Landmarks on 3D Point Clouds . . 25
Implementation Details . . . . . . . . . . . . 26
Holistic CNN versus Atomistic CNN Training Setting . . . . . . . . . . . . 26
The Effect of Dataset Sizes on Landmarking Quality . . . . . . . . . . . . 29
V. CONCLUSION . . . . . . . . . . . . 30
Future Work . . . . . . . . . . . . 30
APPENDICES
A. . . . . . . . . . . . . 32
B. . . . . . . . . . . . . 40
REFERENCES CITED . . . . . . . . . . . . 51

LIST OF FIGURES
Figure Page
1. The three postures in the CAESAR database . . . . . . . . . . . . 23
2. Average squared Euclidean distance between predicted landmarks and ground truth landmarks, comparing holistic CNN and atomistic CNN models on CAESAR data . . . . . . . . . . . . 28
3. Average Euclidean distance between predicted and ground truth landmarks for holistic CNN against atomistic CNN models on SHREC data . . . . . . . . . . . . 28

LIST OF TABLES
Table Page
1. Comparative analysis of measurement errors in Lasso regression and atomistic CNN models on the SHREC dataset. All measurements are in centimeters, representing Euclidean distances between predicted landmark coordinates and ground truth coordinates. . . . . . . . . . . . . 24
2. Comparative analysis of measurement errors in the holistic CNN model with and without the T-Network on the SHREC dataset. All measurements are in centimeters, representing Euclidean distances between predicted landmark coordinates and ground truth coordinates. . . . . . . . . . . . . 24
3. Atomistic CNN vs Surface-to-Surface Registration (STS) for SHREC . . . . . . . . . . . . 25
4. Euclidean distances between predicted and ground truth landmarks for Lasso regression and atomistic CNN models across 74 landmark locations on the CAESAR test set. All measurements are expressed in centimeters. . . . . . . . . . . . . 32
5. Euclidean distances between predicted and ground truth landmarks for holistic CNN with and without T-Network . . . . . . . . . . . . 36
6. Atomistic CNN models for SHREC using varied training set sizes . . . . . . . . . . . . 40
7. Lasso regression models for SHREC using varied training set sizes . . . . . . . . . . . . 41
8. Holistic CNN models for SHREC using varied training set sizes . . . . . . . . . . . . 41
9. Lasso regression models for CAESAR using varied training set sizes . . . . . . . . . . . . 42
10. Holistic CNN models for CAESAR using varied training set sizes . . . . . . . . . . . . 45
11. Atomistic CNN models for CAESAR using varied training set sizes . . . . . . . . . . . . 48

CHAPTER I
INTRODUCTION
Anatomical landmarks are essential for taking anthropometric measurements of the human body. Scholars in the field of human nutrition have proposed a relationship between human performance and 3D body shape metrics such as fat composition (Ng et al., 2016). Quantitative measures such as anatomical landmark locations provide an objective way of understanding human health and performance. Functional product designers require 3D body measurements to develop 2D product patterns or blueprints that fit the human body appropriately. However, many existing 3D body scans are not landmarked, and designers cannot use these scans to create products required for safety and performance. Accurate 3D measurements of the human body are necessary to facilitate the design of good products.

Problem Statement
Manual approaches to identifying anatomical landmarks, such as palpation and hand-picking software, have been used in the past. In this thesis, we investigate the use of machine learning methods to automate the process of landmarking 3D bodies. Automating this process would benefit both medical research and functional product design by making landmark data far faster to obtain.

A Parametric Model of the Human Body
With the advent of artificial intelligence and the amount of storage now available for recording data in numerous formats, it is possible to fit parametric models using iterative means to approximate variations in data.
We derive parametric models of the body using a large database of human 3D body scans. Through this research, we expect to make the process of taking measurements around the human body scalable, which in turn will improve our understanding of the shape of the human body. In this work, we consider two approaches to scalability - the first uses convolutional filters to identify locations on the body, and the second uses a linear approach to fitting models of the human body.

Significance of Our Work
In our research, we develop machine learning models that localize landmarks on the human body using raw point cloud features. Most approaches instead model the human body using hand-engineered features like kernel signatures [Aubry et al., 2011, Yang et al., 2018]. We train two different models - a CNN and a Lasso regression model - and compare the relative performance of an end-to-end neural network to that of a linear machine learning model on 3D scans of the human body. Using our methods, body landmarks can be identified rapidly on multiple point clouds at a time, a scalable alternative to manual procedures.

CHAPTER II
LITERATURE REVIEW
The body geometry of humans reflects a lot about them, and deductions can be made about the nature of an individual from their body shape. A Body Shape Index (ABSI), shown in equation 2.1, is a formula that is often used to predict the risk of premature mortality using three variables - waist circumference (WC), body mass index (BMI), and height:

ABSI = \frac{WC}{BMI^{2/3} \cdot Height^{1/2}}    (2.1)

As an illustration, an adult with WC = 0.90 m, BMI = 25 kg/m^2, and height = 1.75 m has ABSI = 0.90 / (25^{2/3} x 1.75^{1/2}) ≈ 0.080.

Grant et al. [2017] derived a relationship between the ABSI of subjects and mortality from all causes, including cardiovascular (CVD) conditions and cancer, using data collected across 4056 Australian adults. Thomas et al. [2013] introduced a new, height-independent metric, the Body Roundness Index (BRI), as an alternative to body mass index for identifying people at risk of elevated visceral adipose tissue (VAT) volume. The authors discovered this relationship through a study conducted on three quantitative resources: anthropometric measurements of human subjects found in the Third National Health and Nutrition Examination Survey (NHANES III), MRI-measured VAT data of subjects at the St. Luke's Roosevelt Hospital New York Nutrition Obesity Research Center (NORC), and MRI-measured VAT of subjects pooled from several studies conducted at Christian-Albrechts University in Kiel, Germany.

Kuijk et al. [2019] identified shape-dependent risk factors that can cause posterior cruciate ligament (PCL) injuries by carefully studying the knee shapes in Rosenberg radiographs of 94 patients with ruptured PCLs. Sitko et al. [2023] examined anthropometric variations among road cyclists with different performance levels. Additionally, an experiment was conducted to evaluate whether anthropometric measurements could be indicative of the physiological markers commonly used to categorize road cyclists by performance level. The researchers categorized 46 cyclists into groups based on their VO2 max levels - recreationally trained, trained, well-trained, and professional cyclists. Graded exercise tests were conducted, and comprehensive anthropometric assessments were completed as part of the study.
The length of the leg is often used as an index of human physical attractiveness, among other qualities like the nutritional status of infants, health status, and fecundity. Kiire [2016] conducted extensive research to investigate the relationship between leg-to-body ratio (LBR) and attractiveness in humans. In that study, 40 male and 40 female Japanese subjects were invited to rate, on a scale of 1 to 7, the attractiveness of 22 human subjects of mixed sexes; the experiments showed that the measured participants with LBRs closest to the mean LBR were rated most attractive by the human judges.

Machine learning (ML) has seen widespread success in diverse domains such as speech recognition [Baevski et al., 2020] and self-playing agents [Silver et al., 2016]. ML continues to show promise in applications that require automatic landmarking of the human body, and several studies have demonstrated its capabilities for the task. Pinte et al. [2021] used an ML model to localize electrode positions from Magnetic Resonance Imaging (MRI) scans by pre-training it on Ultrashort Echo time (UTE) sequences of MRI images. Hargreaves et al. [2021] performed forensic facial reconstruction on fossil bones using a generative deep learning algorithm trained on a limited amount of learning data. Grishchenko et al. [2022] created "BlazePose GHUM Holistic", a lightweight neural network pipeline for estimating 3D landmarks from monocular images. Giachetti et al. [2014] organized a point-localization contest for automatic landmarking on body scans; however, the contest methods relied on hand-engineered features and a small dataset. In this work, we used ML to automatically identify the 3D coordinates of landmarks in point cloud data from 5000 3D body scans in the 2002 Civilian American and European Surface Anthropometry Resource (CAESAR) database and the SHREC 2014 database. This was done by training a deep neural network on large databases of scans, where useful features were extracted from the data and mapped to landmark locations that can be referenced for anthropometric measurements.

CHAPTER III
METHODOLOGY
Point clouds are a rich, yet sparse, surface representation of non-rigid bodies. Each point cloud is an array of points in 3D space with several fields per point, such as the x, y, z coordinates and the r, g, b color values. Due to the sparsity of point clouds, these data points are often widely dispersed, leading to inherent challenges in accurately representing an object's surface and structure. As a result, careful attention must be devoted to the precise measurement of each attribute to ensure the fidelity of the resulting representation. Notably distinct from other surface representations, point clouds present a unique set of challenges and advantages. Their sparsity requires specialized techniques for efficient storage, processing, and analysis. Moreover, the characteristics of point clouds make them especially suitable for scenarios involving irregular or non-rigid structures, where conventional mesh-based representations may prove less effective. Point clouds have found diverse applications, ranging from 3D object reconstruction and augmented reality to autonomous navigation and environmental monitoring. This chapter discusses two machine learning approaches used to model the parametric relationship between the structural features of human point cloud data and geometric landmark locations using large hand-labeled point cloud datasets.
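To make this data layout concrete, the following minimal NumPy sketch shows one way a single scan can be converted to meters and padded to a fixed size for batching, in the spirit of the preprocessing described in the sections below; the function and the sentinel value are illustrative assumptions, not code from the thesis.

```python
import numpy as np

def pad_point_cloud(points_mm, max_points, pad_value=0.0):
    """Convert one scan from millimeters to meters and pad to a fixed size.

    points_mm: (n, 3) array of x, y, z coordinates recorded in millimeters.
    Returns a (max_points, 3) array plus a boolean mask marking which rows
    are real points rather than padding.
    """
    points_m = np.asarray(points_mm, dtype=np.float32) / 1000.0
    n = points_m.shape[0]
    padded = np.full((max_points, 3), pad_value, dtype=np.float32)
    padded[:n] = points_m
    mask = np.zeros(max_points, dtype=bool)
    mask[:n] = True
    return padded, mask

# A CAESAR-sized scan (~200K vertices) padded to a uniform 256,000 points.
cloud, mask = pad_point_cloud(np.random.rand(200_000, 3) * 1800.0, 256_000)
```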
We compared two machine learning models for fitting point clouds - a convolutional neural network (CNN) and a Lasso regressor. For the CNN, we compared two different training settings - the atomistic CNN model, which learns an individual landmark with a fixed set of parameters, and the holistic CNN model, which learns multiple landmarks with a fixed set of parameters. Our CNN models are adapted from PointNet [Charles et al., 2017], a 30-layer neural network with 2.8 million parameters, which was trained on the ModelNet40 dataset [Wu et al., 2014] for object classification and the ShapeNet dataset [Chang et al., 2015] for part segmentation. PointNet was demonstrated to be capable of solving three tasks - object classification, part segmentation, and semantic segmentation. It is a deep neural network that works efficiently with point cloud data by exploiting the permutation invariance property of point clouds through convolution and max-pooling. The CNN models used in this work are a slight modification of PointNet: we use the output scores of the network as landmark coordinates.

Skinned Multi-Person Linear Model (SMPL)
SMPL models [Bogo et al., 2016] are parameterized models that capture pose and shape variation in 3D bodies. With SMPL models it is possible to obtain a prior that represents a large set of data by minimizing the reconstruction error between the model and every point cloud object. We utilized SMPL to extract representations of the point cloud data, expressed in a low-dimensional space. Each SMPL point cloud fit contained 10,475 points, resulting in an input tensor of size (10,475 x 3) per sample for both the CAESAR and SHREC datasets.

Atomistic CNN Model
In the atomistic CNN training setting, we first preprocessed the dataset and then fitted a convolutional neural network one landmark at a time. During training, the input to the neural network was a batch of 3D body scan point clouds. One of the preprocessing steps was to normalize the point cloud coordinates to meters, as the CAESAR data had mixed metric scales. We padded the point clouds to a uniform size of 256,000 points for CAESAR and 50,000 for SHREC. The resulting tensor was fed to a masking layer, which informs the network of the variable point set sizes of the point clouds in the batch during training. The tensor was next passed through a stack of 1D convolution layers, where each convolution used 128 (1x1) filters. The convolution stack was followed by a global max-pooling layer to aggregate point features and reduce the tensor to a single dimension, effectively making the network invariant to permutations of the input point cloud. The pooled features were then passed through two fully-connected (FC) layers; the first FC layer had 512 channels and the second had three. In total, the atomistic CNN model has 101,123 parameters. All hidden layers were equipped with Leaky ReLU non-linearities. Training was done by optimizing the mean squared error (MSE) objective between predicted and ground truth landmarks using Adam at a learning rate of 1e-3.
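The thesis does not include the training code, so the following is a minimal PyTorch sketch of an architecture consistent with this description. The framework choice and the use of exactly three convolution layers are assumptions, although three 128-filter (1x1) convolutions plus the two FC layers do reproduce the stated total of 101,123 parameters.

```python
import torch
import torch.nn as nn

class AtomisticCNN(nn.Module):
    """Per-landmark regressor: shared (1x1) convolutions over points,
    masked global max-pooling, then two fully-connected layers."""

    def __init__(self):
        super().__init__()
        # Three 1x1 Conv1d layers with 128 filters each (an assumed depth
        # that matches the 101,123-parameter total stated above).
        self.point_mlp = nn.Sequential(
            nn.Conv1d(3, 128, kernel_size=1), nn.LeakyReLU(),
            nn.Conv1d(128, 128, kernel_size=1), nn.LeakyReLU(),
            nn.Conv1d(128, 128, kernel_size=1), nn.LeakyReLU(),
        )
        self.head = nn.Sequential(
            nn.Linear(128, 512), nn.LeakyReLU(),  # first FC layer, 512 channels
            nn.Linear(512, 3),                    # second FC layer: x, y, z
        )

    def forward(self, points, mask):
        # points: (batch, 3, n_points) padded point clouds, in meters.
        # mask:   (batch, n_points), True for real points, False for padding;
        # this plays the role of the masking layer described above.
        feats = self.point_mlp(points)                        # (B, 128, N)
        feats = feats.masked_fill(~mask.unsqueeze(1), float("-inf"))
        global_feat = feats.max(dim=2).values                 # (B, 128)
        return self.head(global_feat)                         # (B, 3)

model = AtomisticCNN()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()  # MSE between predicted and labeled landmarks
```

Masking the padded rows with negative infinity before the max-pool ensures the global feature depends only on real points, which is one straightforward way to realize the variable-size behavior described above.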
Holistic CNN Model
A unified model that approximates multiple landmark locations might be a fast and transferable alternative to models that can only learn one landmark location at a time. Here, the point cloud undergoes an input transformation and a feature transformation, each performed by a T-network, to align the point cloud in the input space and the embedding space respectively. The T-network comprises three convolution layers followed by three linear layers, thereby increasing the capacity of the model; it learns a transformation matrix that effectively aligns the point cloud. This architecture contains two stages of transformation - a 3x3 input transformation matrix and a 64x64 feature transformation matrix, the latter aligning the learned point features during training. Global max-pooling then reduces the feature tensor by one dimension, and the resulting global features are fed into two fully-connected layers that predict the landmark locations. The output dimension of the entire model is 3C, where C is the number of landmarks. The holistic CNN model has a total of 3,518,247 parameters.

Spatial Transformation with the T-Network
The T-network, derived from [Khan et al., 2022], is an innovative neural network architecture designed to improve the processing of 3D point cloud information through learned spatial transformations. The T-network with its spatial transformer module has demonstrated its effectiveness in various tasks, such as 3D object recognition, scene understanding, and point cloud segmentation [Charles et al., 2017]. By leveraging learned spatial transformations, the T-network enhances the processing of point cloud data, leading to improved results in tasks that require robust spatial reasoning. Its main advantage for point cloud data is that it can handle varying orientations, positions, and scales, equipping the model to landmark point clouds and recognize objects or structures despite different spatial arrangements.

The T-network was specifically featured in the holistic CNN architecture to capture global context and relationships between points in the point cloud. Pointwise classification, by contrast, focuses on classifying individual points within a point cloud independently; the spatial relationships between points are less critical, and the model can often rely on local features. Since the T-network's strength lies in capturing global relationships, it might introduce unnecessary complexity to pointwise landmarking without providing substantial benefits.

Least Absolute Shrinkage and Selection Operator (Lasso) Regression
We fitted a Lasso regression model on raw point cloud features. A crucial preprocessing step was replacing missing point cloud features using mean imputation. We fitted three regressors to represent the x, y, and z coordinate labels. While the CNN models were robust to the unordered nature of the point cloud data and could approximate the landmark locations to a decent level of accuracy, the Lasso regression models struggled: they are linear models with neither non-linear activation units nor permutation-invariant layers. However, because SMPL fits offer a fixed point ordering, the originally large point cloud data could be down-sampled by a factor of 25 while preserving the spatial information of the points.
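As a concrete illustration of this pipeline, here is a short scikit-learn sketch that mean-imputes missing features and fits one Lasso regressor per coordinate. The file names and the regularization strength are hypothetical placeholders, not the thesis's actual configuration.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import Lasso

# Hypothetical inputs: SMPL fits flattened to (n_scans, n_points * 3),
# and one landmark's ground-truth coordinates of shape (n_scans, 3).
X = np.load("smpl_vertices_flat.npy")   # placeholder file name
Y = np.load("landmark_xyz.npy")         # placeholder file name

# Replace missing point cloud features with the column mean.
X = SimpleImputer(strategy="mean").fit_transform(X)

# One Lasso regressor per coordinate label (x, y, z); alpha is assumed.
models = [Lasso(alpha=0.01).fit(X, Y[:, c]) for c in range(3)]

# Predicted landmark locations, shape (n_scans, 3).
predictions = np.column_stack([m.predict(X) for m in models])
```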
Tables 1 and 2 show the results of landmarking on the SHREC dataset; the Appendix shows the results for CAESAR.

CHAPTER IV
RESULTS AND DISCUSSION
In this chapter, we describe the observations made through the analyses of the methods discussed in the previous chapter.

Datasets
Civilian American and European Surface Anthropometry Resource (CAESAR)
The CAESAR project [Robinette and Daanen, 1999] is a large-scale survey carried out to facilitate the design of apparel and equipment. 3D scans were collected from 5000 individuals (male and female) between the ages of 18 and 65 in three countries - the United States of America, the Netherlands, and Italy - to study the common variations in the human body. By understanding common body shape variations in humans, product designers and engineers can develop discrete sizing for apparel, workstations, and vehicular manufacturing. Because the survey was taken with 3D cameras and the exposure of the different parts of the human body varies with posture, each subject's data was collected in three separate poses, depicted in figure 1: the Standing Posture (A-pose), in which the individual stands with the arms slightly abducted away from the body and the digits pointing downwards; the Seated Comfortable Working Posture (B-pose), in which the subjects place their arms on their thighs; and the Seated Coverage Posture (C-pose), in which the subject raises both arms, digits pointing upward, with the forearms in a horizontal plane forming a right angle at the elbow and also at the knee joint. Although this last pose is much less reproducible, it was deemed to create better surface coverage of the underarms [Brunsman et al., 1997].

FIGURE 1. The three postures in the CAESAR database

A total of 74 landmark positions were recorded per subject per pose in the database. The Cyberware WB4 and Vitronic 3D scanners were used to take measurements in North America and Europe respectively. Each 3D scan was represented as a water-tight mesh averaging 200K vertices. We selected CAESAR partly because it is the largest commercially available database of human 3D scans to date.

Shape Retrieval Contest (SHREC) 2014
The SHREC dataset was collected for a contest in shape retrieval and pattern recognition in which participants were asked to identify six landmarks on 3D body scans using modern geometry and pattern recognition methods [Giachetti et al., 2014]. The scans were acquired using a structured light 3D body scanner (Breuckmann BodyScan). The dataset is split into fifty 3D scans for training and fifty for testing. The participants were tasked to reproduce the coordinates of each of the six landmarks on the 3D body scan. The landmarks were manually annotated on the scans using the point-picking tool of the Meshlab software [Cignoni et al., 2008].

TABLE 1. Comparative analysis of measurement errors in Lasso regression and atomistic CNN models on the SHREC dataset. All measurements are in centimeters, representing Euclidean distances between predicted landmark coordinates and ground truth coordinates.
LANDMARK | Lasso: mean median 95% | Atomistic CNN: mean median 95%
ACROMIALE | 6.63 6.50 9.49 | 2.38 2.11 4.64
ILIOCRISTALE | 2.57 2.41 5.64 | 3.17 3.09 5.81
RADIALE | 1.21 0.97 2.69 | 3.61 2.97 7.32
STYLION | 5.32 5.18 7.50 | 2.85 2.58 5.32
TIBIALE LATERALE | 2.87 2.68 5.32 | 2.51 2.30 4.95
TROCHANTERION | 3.11 3.09 6.88 | 3.12 2.62 4.29

TABLE 2. Comparative analysis of measurement errors in the holistic CNN model with and without the T-Network on the SHREC dataset.
All measurements are in centimeters, representing Euclidean distances between predicted landmark coordinates and ground truth coordinates.
LANDMARK | Holistic CNN (w/ T-net): mean median 95% | Holistic CNN (w/o T-net): mean median 95%
ACROMIALE | 2.70 2.66 4.00 | 4.14 3.35 7.43
ILIOCRISTALE | 3.21 3.54 4.79 | 3.65 3.65 5.95
RADIALE | 5.21 5.80 6.05 | 6.25 6.21 9.09
STYLION | 5.01 5.01 7.69 | 5.23 4.47 10.10
TIBIALE LATERALE | 2.58 2.44 3.59 | 2.82 2.89 4.60
TROCHANTERION | 3.04 3.10 4.54 | 3.11 2.45 4.93

TABLE 3. Comparative analysis of measurement errors in the atomistic CNN model and Surface-to-Surface Registration (STS) on the SHREC dataset. All measurements are in centimeters, representing Euclidean distances between predicted landmark coordinates and ground truth coordinates.
LANDMARK | STS: mean median | Atomistic CNN: mean median
ACROMIALE | 1.24 1.16 | 2.38 2.11
ILIOCRISTALE | 2.29 2.00 | 3.17 3.09
RADIALE | 2.72 2.31 | 3.61 2.97
STYLION | 2.47 1.72 | 2.85 2.58
TIBIALE LATERALE | 1.56 1.55 | 2.51 2.30
TROCHANTERION | 2.58 2.52 | 3.12 2.62

Training Machine Learning Models to Fit Landmarks on 3D Point Clouds
First, we trained a convolutional neural network (CNN) to fit the landmarks. Next, we fitted a Lasso regression model on the datasets. We report performance in terms of the average Euclidean distances between the predicted and ground truth landmark positions on the SHREC dataset, shown in Tables 1 and 2. While the atomistic CNN model outperformed the Lasso regression model on four of the six SHREC landmarks, it performed worse on two particular landmarks, "Iliocristale" and "Radiale", as shown in Table 1. However, the reverse is the case for CAESAR: the Lasso regression model outperformed the atomistic CNN model on 50 out of 74 CAESAR landmarks. Our observation on the use of the T-Network in the CNN architecture is that it improves model performance by up to 53% on the SHREC dataset, as shown in Table 2. In Table 3, we compare our best-performing model on SHREC (the atomistic CNN) with surface-to-surface registration (STS) [Giachetti et al., 2014] and observe that the atomistic CNN errors are relatively close to those of STS on two landmarks, "Radiale" and "Trochanterion". The STS method outperformed the atomistic CNN by a margin of about 1.8 cm on the other landmarks.

A crucial preprocessing step was normalizing the measurements, converting those originally recorded in millimeters to meters, as 25% of the CAESAR measurements were recorded in millimeters. Tables 4 and 5 show the landmarking errors made by the various models on the CAESAR dataset, with the atomistic CNN having the best landmarking performance on the test set for both datasets. We observed that even though the holistic CNN model fitted on the SHREC dataset performed much better on all landmarks when the T-Network was used, this was not always the case on the CAESAR dataset; nevertheless, the holistic CNN model performed better on the majority of the CAESAR landmarks when a T-Network was used. We split the CAESAR dataset into training and test sets in a ratio of 80%/20%, and used a 67%/33% split on the SHREC dataset.

Implementation Details
All experiments were run on the University of Oregon's supercomputer, TALAPAS. Training and testing were done on a Tesla V100 machine. Fitting atomistic CNN models took about 2 hours per landmark on CAESAR and about 30 minutes on SHREC; training a holistic model took 12 hours on CAESAR and about a minute on SHREC. All line plots were made with Plotly Chart Studio (plotly.com). All 3D mesh figures were generated using PyVista.
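The error statistics reported in these tables (mean, median, and 95th percentile of the Euclidean distance) are straightforward to compute; below is a small NumPy sketch of that computation, with array shapes assumed for illustration.

```python
import numpy as np

def landmark_error_stats(pred, truth):
    """Per-landmark Euclidean error statistics, as reported in the tables.

    pred, truth: arrays of shape (n_scans, n_landmarks, 3), in centimeters.
    Returns mean, median, and 95th-percentile error per landmark.
    """
    dists = np.linalg.norm(pred - truth, axis=-1)  # (n_scans, n_landmarks)
    return (dists.mean(axis=0),
            np.median(dists, axis=0),
            np.percentile(dists, 95, axis=0))
```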
Holistic CNN versus Atomistic CNN Training Setting
The CNN models were trained to minimize the mean squared error (MSE) between the predicted landmarks and the manually labeled landmarks. We observed that training an (atomistic) CNN model for each landmark yielded estimations with smaller mean squared errors than a single (holistic) CNN model fitted to represent all landmark locations. This is evident in the scatter and bar plots shown in figures 2 and 3. Figure 2 is a scatter plot comparing the landmark estimation errors of a holistic CNN model against those of an atomistic CNN model on the CAESAR dataset. The majority of the points in the scatter plot lie above the "holistic CNN = atomistic CNN" line, which signifies that the holistic CNN model produced Euclidean distances significantly larger than those of the atomistic CNN model; this might be due to the unrelatedness of the coordinate distributions across the landmark positions within the point cloud data. Figure 3 is a bar plot of the two training settings on SHREC. Comparing the average Euclidean distances between predicted and ground truth landmarks on each dataset, the holistic CNN model makes larger prediction errors in each case. From the results achieved during experimentation, one can argue that training an atomistic model per landmark is a better fit in practice than a single model for all landmarks.

FIGURE 2. Average squared Euclidean distance between predicted landmarks and ground truth landmarks, comparing holistic CNN and atomistic CNN models on CAESAR data

FIGURE 3. Average Euclidean distance between predicted and ground truth landmarks for holistic CNN against atomistic CNN models on SHREC data

The Effect of Dataset Sizes on Landmarking Quality
We conducted an analysis to assess the influence of training set size by training models on both the CAESAR and SHREC datasets using only a fraction of the original training data. The results of this analysis are presented in Tables 6 to 11, with all results obtained from CAESAR included in Appendix B. For each model, we formed three training subsets containing 50%, 25%, and 10% of the training set, and then tested on the entire test set to evaluate the impact of different training set sizes on test performance. Our observations revealed that models trained on larger datasets consistently yielded lower test errors for all SHREC models. However, in the case of CAESAR, when we applied a Lasso regression model to these varying training set sizes, the performance remained relatively similar across all landmarks. For atomistic CNN models, landmarking errors increased as the training set size decreased, while holistic CNN models exhibited slight improvements as we reduced the training set sizes. This suggests that the atomistic CNN models might achieve even better performance if trained on larger datasets.

CHAPTER V
CONCLUSION
Through advancements in computational geometry and machine learning, several methods that estimate the positions of points in point cloud data have been devised. The goal of this thesis project was to develop and compare machine learning methods for landmark estimation on 3D point cloud data, towards health diagnostics and the functional design of products that perform with safety and efficiency.
We developed a scalable method for landmarking human body scans that can be improved upon to create a system that reliably takes measurements around 3D body meshes for commercial purposes. In this work, we created a tool that can predict landmark locations on the human body in multiple poses. Chapter III discusses the methods we used in training our neural network models, and we identified scenarios where training models in a particular fashion produces models that are less error-prone.

Future Work
Due to the graphical nature of 3D meshes, a tool that can automatically estimate point locations and also take surface measurements on human point cloud data could be built by exploring other methods like graph neural networks (GNNs) [Hamilton et al., 2017] and vision transformers [Dosovitskiy et al., 2021]. Ensembling multiple models into one unified model would hypothetically be an advancement over the methods described in this thesis. In future work, ML models will be used to take anthropometric measurements and to train new landmarks - ones different from those in the CAESAR dataset - to enable custom measurement capabilities related to human performance and functional product creation. This research also demonstrated that notoriously tedious anthropometric measurement procedures could be expedited to develop more relevant 2D blueprints, enabling the efficient commercialization of functional products that fit better, improving performance, comfort, and safety. Additionally, future work could explore the use of unsupervised machine learning algorithms to train models for localizing landmarks on the human body.

APPENDIX A
TABLE 4. Euclidean distances between predicted and ground truth landmarks for Lasso Regression and Atomistic CNN models across 74 landmark locations on the CAESAR test set. All measurements are expressed in centimeters.
Landmark Lasso Regression Atomistic CNN mean median 95% mean median 95% 10th Rib Midspine 3.73 3.39 7.30 3.05 1.91 6.13 Butt Block 2.88 2.15 3.20 2.14 3.22 4.95 Cervicale 3.27 3.12 6.18 4.49 4.49 5.51 Crotch 3.46 3.31 6.32 1.79 1.09 4.97 Lt. 10th Rib 4.40 4.30 8.08 1.67 1.58 2.61 Lt. Acromion 3.96 3.65 7.25 4.56 4.55 5.56 Lt. ASIS 4.79 4.44 8.00 2.46 2.34 3.56 Lt. Axilla, Ant 3.91 3.73 7.45 4.21 5.33 8.33 Lt. Axilla, Post. 4.02 3.78 7.51 3.86 5.40 7.14 Lt. Calcaneous, Post. 1.81 1.66 3.48 1.23 2.85 3.36 Lt. Clavicale 3.33 3.10 6.23 3.73 3.34 4.61 Lt. Dactylion 4.22 4.01 7.73 1.33 1.33 1.63 Lt. Digit II 2.30 2.03 4.57 4.41 4.38 5.37 Lt. Femoral Lateral Epicn 3.05 2.85 5.84 4.39 1.43 1.74 Lt. Femoral Medial Epicn 3.79 3.67 6.55 1.42 1.42 1.72 Lt. Gonion 3.44 3.27 6.44 6.24 6.23 7.60 Lt. Humeral Lateral Epicn 4.18 3.95 7.47 3.40 3.37 4.33 Lt. Humeral Medial Epicn 4.01 3.75 7.80 2.22 2.21 2.92 Lt. Iliocristale 4.13 3.85 7.50 1.38 1.30 2.05 Lt. Infraorbitale 3.60 3.30 6.87 8.92 8.92 10.74 Lt. Knee Crease 2.96 2.87 5.51 1.43 1.41 1.70 Lt. Lateral Malleolus 1.83 1.66 3.79 3.75 3.73 4.56 Lt. Medial Malleolus 5.47 5.54 6.84 3.63 3.62 4.43 Lt. Metacarpal-Phal. II 4.07 3.89 7.36 1.03 1.03 1.26 Lt. Metacarpal-Phal. V 4.13 3.93 7.48 1.91 1.91 1.12 Lt. Metatarsal-Phal. I 2.06 1.86 4.07 4.17 4.15 5.11 Lt. Metatarsal-Phal. V 1.91 1.74 3.88 4.25 4.22 5.19 Lt.
Olecranon 4.16 3.72 7.68 2.69 2.66 3.51 Lt. PSIS 4.25 4.05 7.68 6.00 5.72 9.69 Lt. Radial Styloid 4.27 4.11 7.80 7.83 7.80 9.62 Lt. Radiale 4.13 3.91 7.48 3.57 3.55 4.45 Lt. Sphyrion 5.81 5.80 7.31 3.75 3.73 4.55 Lt. Thelion/Bustpoint 4.13 3.87 7.56 2.23 3.35 4.06 Lt. Tragion 3.55 3.35 6.88 2.83 3.82 2.994 Lt. Trochanterion 4.05 3.79 7.27 2.78 2.70 3.951 Lt. Ulnar Styloid 4.25 3.98 8.00 2.77 2.76 4.94 Nuchale 3.71 3.53 7.02 3.69 2.68 3.85 Rt. 10th Rib 4.28 3.99 7.52 2.57 1.47 3.49 33 TABLE 4. Euclidean distances between predicted and ground truth landmark between Lasso Regression and Atomistic CNN models across 74 landmark locations on CAESAR test set. All measurements are expressed in centimeters. Landmark Lasso Regression Atomistic CNN mean median 95% mean median 95% Rt. ASIS 3.80 3.58 7.20 2.37 2.25 3.44 Rt. Acromion 3.82 3.57 7.08 4.32 4.28 5.25 Rt. Axilla, Ant 3.73 3.47 6.83 3.12 2.87 3.91 Rt. Axilla, Post. 3.90 3.62 7.42 2.55 3.30 4.07 Rt. Calcaneous, Post. 1.79 1.63 3.46 2.72 2.06 2.91 Rt. Clavicale 3.27 3.07 6.08 3.70 3.67 4.63 Rt. Dactylion 4.28 4.09 7.76 1.31 1.30 1.58 Rt. Digit II 2.38 2.22 4.41 4.38 4.36 5.35 Rt. Femoral Lateral Epicn 3.08 2.90 5.76 1.43 1.41 1.75 Rt. Femoral Medial Epicn 4.05 3.91 7.07 3.39 3.87 4.31 Rt. Gonion 3.45 3.30 6.51 6.19 6.21 7.59 Rt. Humeral Lateral Epicn 4.14 3.92 7.60 3.15 3.11 4.08 Rt. Humeral Medial Epicn 3.93 3.77 7.18 2.05 2.01 0.27 Rt. Iliocristale 4.00 3.79 7.31 1.32 1.25 1.98 Rt. Infraorbitale 3.62 3.32 6.76 8.94 8.94 10.74 Rt. Knee Crease 3.01 2.85 5.51 1.43 1.41 1.70 Rt. Lateral Malleolus 1.94 1.77 3.74 3.75 3.73 4.56 Rt. Medial Malleolus 2.82 2.70 4.83 3.61 3.59 4.43 Rt. Metacarpal-Phal. II 4.15 3.93 7.54 6.94 6.89 9.07 Rt. Metacarpal-Phal. V 4.19 3.94 7.62 4.07 4.90 5.05 34 TABLE 4. Euclidean distances between predicted and ground truth landmark between Lasso Regression and Atomistic CNN models across 74 landmark locations on CAESAR test set. All measurements are expressed in centimeters. Landmark Lasso Regression Atomistic CNN mean median 95% mean median 95% Rt. Metatarsal-Phal. I 2.19 2.02 4.21 4.17 4.16 5.12 Rt. Metatarsal-Phal. V 2.07 1.92 3.96 4.24 4.22 5.19 Rt. Olecranon 4.11 3.97 7.41 2.49 2.46 3.29 Rt. PSIS 4.01 3.80 7.09 6.10 5.73 9.91 Rt. Radial Styloid 4.20 3.90 7.68 7.47 7.42 9.11 Rt. Radiale 4.12 3.81 7.42 3.27 3.25 4.21 Rt. Sphyrion 3.46 3.33 5.30 3.73 3.73 4.58 Rt. Thelion/Bustpoint 3.94 3.62 7.22 3.15 3.34 5.22 Rt. Tragion 3.54 3.25 6.64 8.25 8.27 9.92 Rt. Trochanterion 4.08 3.87 7.49 2.74 2.65 3.82 Rt. Ulnar Styloid 4.22 3.91 7.53 7.62 7.59 9.29 Sellion 3.74 3.46 7.11 9.94 9.92 11.99 Substernale 3.93 3.76 7.33 1.96 1.81 3.05 Supramenton 3.62 3.34 6.66 7.10 7.05 8.78 Suprasternale 3.28 3.07 6.32 3.56 3.52 4.53 Waist, Preferred, Post. 4.16 3.87 8.09 4.33 4.01 5.13 35 TABLE 5. Euclidean distances between predicted and ground truth landmark the holistic CNN with and without T-Network across 74 landmark locations on CAESAR test set. All measurements are expressed in centimeters. Landmark Holistic CNN (w/o T-net) Holistic CNN (w/ T-net) mean median 95% mean median 95% 10th Rib Midspine 3.77 3.48 7.00 3.51 2.90 7.89 Butt Block 4.12 3.29 4.61 2.63 3.22 3.41 Cervicale 4.26 3.90 7.86 4.78 2.45 13.21 Crotch 3.08 2.85 6.04 2.70 1.97 8.14 Lt. 10th Rib 3.88 3.63 7.34 3.74 3.34 7.92 Lt. Acromion 3.63 3.35 6.73 3.18 2.74 7.06 Lt. ASIS 4.00 3.92 8.06 4.33 2.50 12.18 Lt. Axilla, Ant 4.03 3.70 8.01 3.36 2.48 8.50 Lt. Axilla, Post. 4.14 3.85 7.83 3.91 2.78 9.59 Lt. Calcaneous, Post. 1.60 1.51 2.97 7.18 2.69 19.55 Lt. 
Clavicale 4.19 3.95 8.18 3.95 2.29 10.75 Lt. Dactylion 5.06 4.65 9.85 4.96 3.78 11.79 Lt. Digit II 1.66 1.51 3.16 8.07 2.76 21.99 Lt. Femoral Lateral Epicn 2.53 2.45 4.58 3.62 2.70 9.25 Lt. Femoral Medial Epicn 2.52 2.42 4.94 3.58 2.77 8.87 Lt. Gonion 4.44 4.07 8.76 5.02 2.55 14.04 Lt. Humeral Lateral Epicn 3.82 3.55 7.36 3.60 2.60 8.30 Lt. Humeral Medial Epicn 3.85 3.65 7.24 3.61 2.76 7.83 Lt. Iliocristale 3.95 3.76 6.96 3.55 2.98 7.79 Lt. Infraorbitale 4.65 4.20 8.92 5.46 2.73 15.04 36 TABLE 5. Euclidean distances between predicted and ground truth landmark the holistic CNN with and without T-Network across 74 landmark locations on CAESAR test set. All measurements are expressed in centimeters. Landmark Holistic CNN (w/o T-net) Holistic CNN (w/ T-net) mean median 95% mean median 95% Lt. Knee Crease 2.48 2.35 4.52 4.07 2.76 10.83 Lt. Lateral Malleolus 1.50 1.37 2.76 7.50 2.73 20.69 Lt. Medial Malleolus 1.63 1.60 2.99 5.88 2.32 17.03 Lt. Metacarpal-Phal. II 4.27 3.86 8.46 3.55 2.83 8.18 Lt. Metacarpal-Phal. V 4.07 3.67 8.19 3.77 2.80 9.03 Lt. Metatarsal-Phal. I 1.65 1.59 3.04 7.37 2.61 20.44 Lt. Metatarsal-Phal. V 1.49 1.43 2.77 7.74 2.66 20.94 Lt. Olecranon 3.84 3.52 7.18 3.61 2.61 8.15 Lt. PSIS 3.90 3.63 7.18 3.42 2.90 7.87 Lt. Radial Styloid 3.92 3.46 8.12 3.59 2.78 8.66 Lt. Radiale 3.81 3.60 7.52 3.34 2.43 7.43 Lt. Sphyrion 1.64 1.62 3.08 6.09 2.35 17.19 Lt. Thelion/Bustpoint 4.09 3.76 7.68 3.59 2.73 8.66 Lt. Tragion 4.63 4.30 8.77 5.13 2.53 14.35 Lt. Trochanterion 3.80 3.50 7.06 3.30 2.76 7.44 Lt. Ulnar Styloid 3.81 3.45 7.56 3.50 2.68 8.31 Nuchale 4.61 4.18 8.96 5.58 2.96 14.92 Rt. 10th Rib 4.05 3.89 7.92 3.52 3.12 7.65 Rt. ASIS 3.61 3.31 6.77 3.17 2.67 7.25 Rt. Acromion 4.18 3.90 8.05 4.13 2.62 10.90 37 TABLE 5. Euclidean distances between predicted and ground truth landmark the holistic CNN with and without T-Network across 74 landmark locations on CAESAR test set. All measurements are expressed in centimeters. Landmark Holistic CNN (w/o T-net) Holistic CNN (w/ T-net) mean median 95% mean median 95% Rt. Axilla, Ant 4.06 3.66 7.89 3.67 2.71 8.95 Rt. Axilla, Post. 3.93 3.70 7.64 3.78 2.62 9.10 Rt. Calcaneous, Post. 1.62 1.47 3.20 7.31 2.81 20.38 Rt. Clavicale 4.17 3.88 8.02 3.50 2.13 9.55 Rt. Dactylion 4.93 4.58 9.23 5.17 3.74 13.02 Rt. Digit II 1.83 1.74 3.48 7.60 2.67 20.73 Rt. Femoral Lateral Epicn 2.65 2.49 4.89 4.06 2.85 10.70 Rt. Femoral Medial Epicn 2.41 2.29 4.61 3.54 2.77 9.62 Rt. Gonion 4.68 4.29 8.66 5.14 2.60 14.20 Rt. Humeral Lateral Epicn 3.69 3.41 6.64 3.50 2.59 7.81 Rt. Humeral Medial Epicn 3.72 3.54 3.27 2.85 6.96 Rt. Iliocristale 3.90 3.74 6.93 3.28 2.67 7.57 Rt. Infraorbitale 4.71 4.27 8.94 5.48 2.79 15.30 Rt. Knee Crease 2.46 2.32 4.53 3.93 2.61 10.75 Rt. Lateral Malleolus 1.70 1.55 3.06 6.41 2.47 17.35 Rt. Medial Malleolus 1.69 1.58 3.18 7.01 2.67 19.38 Rt. Metacarpal Phal. II 4.12 3.74 7.84 3.73 2.81 9.39 Rt. Metacarpal-Phal. V 4.04 3.66 8.03 3.48 2.82 8.16 Rt. Metatarsal-Phal. I 1.72 1.61 3.24 7.54 2.73 20.52 Rt. Metatarsal-Phal. V 1.77 1.67 3.42 8.16 2.88 22.04 38 TABLE 5. Euclidean distances between predicted and ground truth landmark the holistic CNN with and without T-Network across 74 landmark locations on CAESAR test set. All measurements are expressed in centimeters. Landmark Holistic CNN (w/o T-net) Holistic CNN (w/ T-net) mean median 95% mean median 95% Rt. Olecranon 3.80 3.47 7.01 3.65 2.84 8.22 Rt. PSIS 3.58 3.26 6.75 3.55 2.98 8.44 Rt. Radial Styloid 3.81 3.45 7.75 3.44 2.75 8.03 Rt. Radiale 3.76 3.48 6.70 3.32 2.54 7.33 Rt. 
Sphyrion 1.72 1.60 3.41 6.27 2.56 17.20 Rt. Thelion/Bustpoint 3.99 3.71 7.77 3.51 2.64 8.59 Rt. Tragion 4.74 4.46 8.83 5.62 2.64 15.65 Rt. Trochanterion 3.67 3.48 6.94 3.38 2.82 7.82 Rt. Ulnar Styloid 3.80 3.43 7.81 3.73 2.91 8.85 Sellion 4.74 4.38 8.93 5.51 2.91 15.14 Substernale 3.89 3.57 7.50 3.24 2.71 7.13 Supramenton 4.50 4.22 8.75 4.49 2.55 12.34 Suprasternale 4.19 3.98 8.08 3.56 2.21 9.66 Waist, Preferred, Post. 3.63 3.33 6.89 3.58 3.02 8.39 39 APPENDIX B ACROMIALE ILIOCRISTALE RADIALE Landmark 50% 25% 10% 50% 25% 10% 50% 25% 10% Mean 3.14 3.07 4.60 3.53 3.81 3.87 3.65 5.17 6.39 Median 3.12 3.01 4.48 3.33 3.29 3.71 3.25 4.66 5.60 95th Percentile 5.56 5.11 8.23 7.60 7.62 7.38 7.05 11.13 11.84 STYLION TIBIALE LATERALE TROCHANTERION Landmark 50% 25% 10% 50% 25% 10% 50% 25% 10% Mean 3.50 4.65 7.66 3.07 3.72 4.17 3.43 3.90 4.88 Median 2.77 4.43 7.07 2.99 3.41 3.82 3.28 3.61 4.65 95th Percentile 7.09 8.94 14.23 5.48 6.05 7.31 6.81 8.19 8.49 TABLE 6. Test errors from Atomistic CNN models trained on SHREC using varied training set sizes - 50%, 25% and 10% of the original training set. All measurements are expressed in centimeters, representing Euclidean distances between predicted landmark coordinates and ground truth coordinates. 40 ACROMIALE ILIOCRISTALE RADIALE Landmark 50% 25% 10% 50% 25% 10% 50% 25% 10% Mean 4.30 7.14 12.73 4.92 2.48 6.71 4.45 10.38 23.54 Median 4.26 6.89 12.95 4.93 2.27 6.17 4.36 10.48 23.10 95th Percentile 7.08 10.08 15.04 8.79 6.59 10.18 7.29 13.79 29.06 STYLION TIBIALE LATERALE TROCHANTERION Landmark 50% 25% 10% 50% 25% 10% 50% 25% 10% Mean 7.53 6.71 32.88 11.5 27.52 20.22 2.46 17.49 16.21 Median 7.49 6.71 32.97 11.26 27.52 20.05 1.88 17.37 16.23 95th Percentile 10.10 8.54 37.40 13.89 30.98 23.61 6.67 21.69 20.04 TABLE 7. Test errors from Lasso regression models trained on SHREC using varied training set sizes - 50%, 25% and 10% of the original training set. All measurements are expressed in centimeters, representing Euclidean distances between predicted landmark coordinates and ground truth coordinates. ACROMIALE ILIOCRISTALE RADIALE Landmark 50% 25% 10% 50% 25% 10% 50% 25% 10% Mean 6.77 6.03 12.73 1.14 2.48 6.71 10.52 10.38 23.54 Median 5.02 6.89 12.95 3.09 2.27 6.17 11.52 10.48 23.10 95th Percentile 9.47 10.08 15.04 3.39 6.59 10.18 12.59 13.79 29.06 STYLION TIBIALE LATERALE TROCHANTERION Landmark 50% 25% 10% 50% 25% 10% 50% 25% 10% Mean 7.83 16.71 32.88 5.44 27.52 20.22 9.93 17.49 16.21 Median 8.00 11.09 32.97 6.28 27.52 20.05 11.06 17.37 16.23 95th Percentile 10.80 18.54 37.40 9.44 30.98 23.61 19.93 21.69 20.04 TABLE 8. Test errors from holistic CNN models trained on SHREC using varied training set sizes - 50%, 25%, and 10% of the original training set. All measurements are expressed in centimeters, representing Euclidean distances between predicted landmark coordinates and ground truth coordinates. 41 42 TABLE 9. Table 9. Test errors from Lasso Regression models trained on CAESAR using varied training set sizes - 50%, 25% and 10% of the original training set. All measurements are expressed in centimeters, representing Euclidean distances between predicted landmark coordinates and ground truth coordinates. 
Lasso (50%) Lasso (25%) Lasso (10%) Landmark Names Mean Median 95th Percentile Mean Median 95th Percentile Mean Median 95th Percentile 10th Rib Midspine 3.73 3.42 7.42 3.71 3.41 7.48 3.70 3.41 7.53 Butt Block 4.20 3.89 8.13 4.20 3.89 8.13 4.10 3.81 7.96 Cervicale 3.27 3.11 6.18 3.26 3.05 6.10 3.34 3.19 6.27 Crotch 3.45 3.29 6.30 3.49 3.30 6.47 3.55 3.36 6.62 Lt. 10th Rib 4.39 4.32 8.10 4.40 4.28 7.98 4.35 4.24 7.86 Lt. ASIS 3.96 3.62 7.32 3.95 3.68 7.32 3.92 3.57 7.15 Lt. Acromion 4.71 4.34 8.19 4.77 4.53 7.99 4.66 4.49 7.67 Lt. Axilla, Ant 3.88 3.69 7.43 3.91 3.73 7.42 3.80 3.56 7.24 Lt. Axilla, Post. 4.02 3.79 7.52 4.05 3.79 7.58 3.90 3.62 7.04 Lt. Calcaneous, Post. 1.81 1.67 3.47 1.80 1.65 3.50 1.82 1.65 3.45 Lt. Clavicale 3.33 3.10 6.20 3.34 3.15 6.23 3.31 3.10 6.02 Lt. Dactylion 4.13 3.92 7.49 4.17 3.92 7.59 4.26 4.02 7.86 Lt. Digit II 2.30 2.03 4.56 2.30 2.05 4.54 2.31 2.04 4.55 Lt. Femoral Lateral Epicn 3.04 2.85 5.86 3.04 2.86 5.78 3.01 2.85 5.79 Lt. Femoral Medial Epicn 4.04 3.95 7.02 4.16 4.03 6.99 3.85 3.74 6.71 Lt. Gonion 3.43 3.22 6.47 3.50 3.37 6.50 3.40 3.21 6.37 Lt. Humeral Lateral Epicn 4.19 3.95 7.52 4.18 3.99 7.45 4.17 3.98 7.42 Lt. Humeral Medial Epicn 4.02 3.76 7.76 4.01 3.74 7.74 4.01 3.67 7.68 Lt. Iliocristale 4.14 3.86 7.44 4.13 3.86 7.41 4.07 3.78 7.48 Lt. Infraorbitale 3.61 3.33 6.88 3.58 3.35 6.67 3.54 3.30 6.68 Lt. Knee Crease 2.96 2.87 5.40 2.95 2.88 5.46 2.91 2.81 5.38 Lt. Lateral Malleolus 1.83 1.66 3.79 1.83 1.65 3.78 1.84 1.66 3.76 Lt. Medial Malleolus 5.16 5.21 6.52 4.74 4.74 6.02 5.48 5.54 6.84 Lt. Metacarpal-Phal. II 4.00 3.86 7.20 4.03 3.88 7.39 4.06 3.87 7.22 Lt. Metacarpal-Phal. V 4.07 3.88 7.48 4.12 3.91 7.63 4.04 3.87 7.20 43 TABLE 9. Test errors from Lasso Regression models trained on CAESAR using varied training set sizes - 50%, 25% and 10% of the original training set. All measurements are expressed in centimeters, representing Euclidean distances between predicted landmark coordinates and ground truth coordinates. Lasso (50%) Lasso (25%) Lasso (10%) Landmark Names Mean Median 95th Percentile Mean Median 95th Percentile Mean Median 95th Percentile Lt. Metatarsal-Phal. I 2.06 1.86 4.06 2.06 1.88 4.16 2.06 1.88 4.14 Lt. Metatarsal-Phal. V 1.91 1.74 3.88 1.91 1.75 3.74 1.91 1.73 3.76 Lt. Olecranon 4.17 3.78 7.69 4.17 3.77 7.59 4.14 3.76 7.57 Lt. PSIS 4.26 4.05 7.74 4.23 3.98 7.65 4.19 3.87 7.71 Lt. Radial Styloid 4.22 4.09 7.74 4.16 4.05 7.70 4.24 4.15 7.69 Lt. Radiale 4.14 3.88 7.54 4.14 3.95 7.50 4.12 3.93 7.48 Lt. Sphyrion 5.81 5.80 7.30 6.96 6.97 8.44 3.25 3.17 4.80 Lt. Thelion/Bustpoint 4.12 3.87 7.65 4.13 3.90 7.47 4.10 3.88 7.48 Lt. Tragion 3.54 3.34 6.93 3.55 3.34 6.75 3.53 3.36 6.85 Lt. Trochanterion 4.06 3.76 7.18 4.02 3.82 7.38 4.01 3.77 7.26 Lt. Ulnar Styloid 4.23 3.99 8.05 4.19 3.95 7.77 4.21 3.98 7.75 Nuchale 3.77 3.54 7.16 3.68 3.48 6.86 3.65 3.42 7.02 Rt. 10th Rib 4.28 4.01 7.51 4.28 4.01 7.47 4.21 3.95 7.37 Rt. ASIS 3.80 3.58 7.14 3.79 3.53 7.16 3.76 3.49 7.25 Rt. Acromion 3.84 3.60 7.10 3.81 3.53 6.98 3.80 3.59 6.94 Rt. Axilla, Ant 3.75 3.47 6.90 3.74 3.44 6.80 3.71 3.43 6.82 Rt. Axilla, Post. 3.95 3.66 7.52 4.07 3.75 7.81 3.93 3.88 7.29 Rt. Calcaneous, Post. 1.79 1.63 3.44 1.79 1.60 3.48 1.80 1.63 3.48 Rt. Clavicale 3.28 3.07 6.05 3.28 3.07 5.93 3.25 3.06 5.90 Rt. Dactylion 4.26 4.08 7.69 4.04 3.85 7.32 4.26 4.12 7.70 Rt. Digit II 2.38 2.22 4.40 2.37 2.23 4.44 2.37 2.21 4.40 Rt. Femoral Lateral Epicn 3.08 2.91 5.70 3.07 2.91 5.63 3.06 2.87 5.55 Rt. Femoral Medial Epicn 4.24 4.13 7.32 4.70 4.56 8.00 4.11 3.98 6.98 Rt. 
Gonion 3.46 3.32 6.48 3.45 3.28 6.45 3.43 3.24 6.57 44 TABLE 9. Test errors from Lasso Regression models trained on CAESAR using varied training set sizes - 50%, 25% and 10% of the original training set. All measurements are expressed in centimeters, representing Euclidean distances between predicted landmark coordinates and ground truth coordinates. Lasso (50%) Lasso (25%) Lasso (10%) Landmark Names Mean Median 95th Percentile Mean Median 95th Percentile Mean Median 95th Percentile Rt. Humeral Lateral Epicn 4.14 3.91 7.57 4.14 3.92 7.52 4.16 3.89 7.52 Rt. Humeral Medial Epicn 3.93 3.74 7.16 3.94 3.73 7.18 3.96 3.75 7.35 Rt. Iliocristale 4.00 3.85 7.33 4.00 3.76 7.30 3.94 3.72 7.33 Rt. Infraorbitale 3.62 3.33 6.83 3.57 3.28 6.65 3.52 3.25 6.56 Rt. Knee Crease 2.98 2.75 5.46 3.00 2.84 5.23 2.94 2.83 5.37 Rt. Lateral Malleolus 1.94 1.77 3.73 1.94 1.76 3.65 1.94 1.76 3.78 Rt. Medial Malleolus 2.61 2.50 4.55 3.42 3.35 5.36 3.36 3.28 5.27 Rt. Metacarpal Phal. II 4.18 3.94 7.55 3.83 3.63 6.90 3.86 3.73 6.98 Rt. Metacarpal-Phal. V 4.24 3.98 7.72 3.98 3.73 7.29 4.02 3.79 7.33 Rt. Metatarsal-Phal. I 2.20 2.02 4.19 2.19 2.04 4.20 2.19 2.02 4.21 Rt. Metatarsal-Phal. V 2.07 1.92 3.96 2.06 1.91 3.98 2.06 1.93 3.96 Rt. Olecranon 4.11 4.01 7.46 4.10 3.99 7.42 4.13 4.01 7.57 Rt. PSIS 4.02 3.85 7.12 3.99 3.75 7.06 3.96 3.68 7.04 Rt. Radial Styloid 4.33 3.95 7.87 3.95 3.62 7.20 4.02 3.76 7.25 Rt. Radiale 4.12 3.84 7.40 4.11 3.83 7.35 4.14 3.87 7.36 Rt. Sphyrion 3.99 3.87 5.75 3.63 3.52 5.52 2.71 2.61 4.51 Rt. Thelion/Bustpoint 3.95 3.61 7.13 3.92 3.62 7.20 3.91 3.55 7.36 Rt. Tragion 3.55 3.30 6.72 3.48 3.26 6.55 3.41 3.16 6.39 Rt. Trochanterion 4.08 3.85 7.47 4.06 3.89 7.42 4.04 3.83 7.42 Rt. Ulnar Styloid 4.35 4.02 7.90 4.04 3.78 7.15 4.13 3.81 7.43 Sellion 3.75 3.49 7.20 3.64 3.39 6.83 3.59 3.40 6.88 Substernale 3.94 3.75 7.38 3.97 3.79 7.23 3.92 3.71 7.13 Supramenton 3.62 3.38 6.72 3.62 3.35 6.61 3.60 3.38 6.71 Suprasternale 3.28 3.06 6.35 3.28 3.11 6.26 3.26 3.12 6.20 Waist, Preferred, Post. 4.17 3.87 8.10 4.13 3.82 8.00 4.12 3.82 7.97 45 TABLE 10. Test errors from holistic CNN models trained on CAESAR using varied training set sizes, 50% - 25% and 10% of the original training set. All measurements are expressed in centimeters, representing Euclidean distances between predicted landmark coordinates and ground truth coordinates. Holistic CNN (50%) Holistic CNN (25%) Holistic CNN (10%) Landmark Names Mean Median 95th Percentile Mean Median 95th Percentile Mean Median 95th Percentile 10th Rib Midspine 7.72 7.75 13.01 6.12 5.77 11.56 6.51 6.15 11.87 Butt Block 9.31 9.82 11.43 8.33 9.04 10.70 8.13 8.55 14.31 Cervicale 12.24 11.99 20.18 12.80 13.10 20.83 12.32 12.34 21.32 Crotch 6.93 6.66 12.11 6.47 6.07 11.83 6.39 5.94 11.75 Lt. 10th Rib 6.61 6.41 11.55 6.16 5.99 10.86 7.47 7.44 12.55 Lt. ASIS 6.59 6.39 10.46 6.70 6.54 10.96 7.34 6.97 12.47 Lt. Acromion 13.41 14.39 22.16 11.90 12.47 20.52 10.10 10.71 17.17 Lt. Axilla, Ant 8.98 8.87 15.60 9.97 9.87 16.84 9.54 9.46 16.26 Lt. Axilla, Post. 10.38 10.49 17.00 11.55 11.55 17.90 9.86 9.70 16.03 Lt. Calcaneous, Post. 24.99 26.64 30.42 21.52 21.79 27.82 18.42 20.54 24.42 Lt. Clavicale 13.09 13.27 20.64 9.00 8.61 16.53 8.92 8.34 18.58 Lt. Dactylion 16.21 16.15 21.74 13.51 13.17 19.63 13.28 13.58 20.28 Lt. Digit II 26.27 27.23 31.42 21.58 22.29 27.01 22.87 26.03 28.13 Lt. Femoral Lateral Epicn 15.57 15.75 19.74 11.57 11.55 15.55 11.10 11.26 16.39 Lt. Femoral Medial Epicn 14.72 15.64 20.13 11.32 11.46 16.75 12.91 13.92 19.23 Lt. 
Gonion 14.23 14.69 22.57 12.14 12.19 19.87 13.11 13.10 21.46 Lt. Humeral Lateral Epicn 9.34 9.31 15.01 10.89 10.78 15.45 8.37 8.11 14.15 Lt. Humeral Medial Epicn 8.29 8.17 13.51 9.19 8.89 14.41 8.94 8.79 14.98 Lt. Iliocristale 8.05 7.92 12.65 6.54 6.24 11.04 7.59 7.36 12.56 Lt. Infraorbitale 13.80 13.87 22.86 13.74 14.14 21.90 12.94 12.81 21.37 Lt. Knee Crease 13.61 13.93 18.60 12.13 12.06 16.72 12.60 13.23 19.14 Lt. Lateral Malleolus 24.99 26.50 30.04 22.48 22.75 29.58 18.29 20.29 23.98 Lt. Medial Malleolus 21.01 22.83 27.44 20.37 21.72 27.75 17.44 19.55 24.16 Lt. Metacarpal-Phal. II 12.64 12.67 17.64 12.87 12.89 17.79 12.69 13.11 19.36 Lt. Metacarpal-Phal. V 12.91 12.96 18.31 11.55 11.43 16.45 11.23 11.41 17.08 46 TABLE 10. Test errors from holistic CNN models trained on CAESAR using varied training set sizes - 50%, 25% and 10% of the original training set. All measurements are expressed in centimeters, representing Euclidean distances between predicted landmark coordinates and ground truth coordinates. Holistic CNN (50%) Holistic CNN (25%) Holistic CNN (10%) Landmark Names Mean Median 95th Percentile Mean Median 95th Percentile Mean Median 95th Percentile Lt. Metatarsal-Phal. I 26.49 27.87 31.37 24.38 25.18 30.50 20.41 23.11 25.26 Lt. Metatarsal-Phal. V 26.05 27.52 31.10 23.27 23.52 29.55 23.13 24.97 28.99 Lt. Olecranon 9.70 9.88 15.29 9.58 9.34 14.65 7.88 7.56 13.01 Lt. PSIS 6.31 6.10 10.72 6.02 5.63 11.07 7.24 6.69 13.33 Lt. Radial Styloid 10.06 9.99 14.51 11.08 11.06 16.37 9.44 9.40 14.83 Lt. Radiale 11.34 11.26 17.15 9.17 8.82 14.62 10.67 10.46 16.52 Lt. Sphyrion 21.43 22.60 28.17 22.27 23.02 29.98 17.66 20.16 24.83 Lt. Thelion/Bustpoint 8.57 8.52 14.75 7.48 7.26 13.13 8.98 8.91 14.67 Lt. Tragion 16.07 16.42 24.19 14.52 14.69 22.73 14.23 14.40 22.53 Lt. Trochanterion 7.00 6.94 11.08 6.80 6.57 11.58 8.02 7.89 13.47 Lt. Ulnar Styloid 12.05 12.10 16.42 9.94 9.89 14.69 9.91 10.02 16.16 Nuchale 16.01 16.13 25.61 14.64 15.10 22.98 12.68 13.01 20.35 Rt. 10th Rib 6.14 5.92 11.22 6.60 6.47 11.28 6.10 5.75 11.50 Rt. ASIS 6.36 6.00 11.10 5.69 5.39 10.21 5.60 4.96 11.20 Rt. Acromion 11.72 11.78 19.15 10.40 10.51 17.13 10.24 9.89 18.53 Rt. Axilla, Ant 10.07 10.33 16.47 8.05 7.66 13.54 8.68 8.03 16.59 Rt. Axilla, Post. 11.00 11.12 17.64 10.30 10.59 16.58 9.19 8.88 16.76 Rt. Calcaneous, Post. 24.78 26.26 30.44 20.43 20.96 26.39 19.40 21.61 24.39 Rt. Clavicale 11.19 11.13 18.83 10.27 10.24 17.47 10.31 9.83 18.03 Rt. Dactylion 14.63 14.59 20.38 13.57 13.34 19.64 13.08 13.40 20.16 Rt. Digit II 25.70 26.28 31.45 21.07 21.12 27.09 19.82 22.41 24.09 Rt. Femoral Lateral Epicn 12.37 12.56 16.47 14.04 13.85 20.18 11.17 11.55 16.30 Rt. Femoral Medial Epicn 12.50 13.02 17.35 12.01 12.42 17.90 11.91 12.60 17.77 Rt. Gonion 12.61 13.11 20.54 11.08 10.84 20.05 12.92 13.18 19.56 Rt. Humeral Lateral Epicn 9.12 9.02 14.31 10.13 9.95 15.35 9.41 8.90 16.18 47 TABLE 10. Test errors from holistic CNN models trained on CAESAR using varied training set sizes - 50%, 25% and 10% of the original training set. All measurements are expressed in centimeters, representing Euclidean distances between predicted landmark coordinates and ground truth coordinates. Holistic CNN (50%) Holistic CNN (25%) Holistic CNN (10%) Landmark Names Mean Median 95th Percentile Mean Median 95th Percentile Mean Median 95th Percentile Rt. Humeral Medial Epicn 8.75 8.53 13.86 7.96 7.67 12.40 6.92 5.96 14.75 Rt. Iliocristale 6.51 6.27 11.14 6.99 6.71 11.44 6.28 5.79 11.07 Rt. Infraorbitale 16.73 17.37 26.07 14.93 14.96 22.98 12.38 12.19 21.43 Rt. 
Rt. Knee Crease | 13.43 13.78 18.60 | 13.63 13.52 19.13 | 12.15 12.46 18.66
Rt. Lateral Malleolus | 22.84 24.02 28.36 | 23.09 23.70 28.89 | 16.92 18.65 22.56
Rt. Medial Malleolus | 22.20 23.82 27.76 | 20.53 21.69 27.11 | 16.25 18.10 21.12
Rt. Metacarpal-Phal. II | 12.76 12.76 17.88 | 11.66 11.50 17.27 | 11.13 11.10 17.22
Rt. Metacarpal-Phal. V | 11.69 11.68 16.84 | 10.80 10.78 15.60 | 10.34 10.28 16.63
Rt. Metatarsal-Phal. I | 24.89 26.05 29.84 | 24.97 25.62 32.54 | 20.23 22.35 26.63
Rt. Metatarsal-Phal. V | 23.65 25.27 28.81 | 22.74 22.89 29.69 | 17.93 19.69 24.11
Rt. Olecranon | 10.77 10.72 15.86 | 9.20 8.84 14.64 | 7.57 7.03 14.42
Rt. PSIS | 7.27 6.98 12.14 | 7.44 7.08 12.77 | 7.96 7.55 13.38
Rt. Radial Styloid | 10.51 10.35 15.22 | 9.24 9.23 14.51 | 7.99 7.80 13.34
Rt. Radiale | 9.13 8.95 13.88 | 8.29 8.14 12.91 | 10.22 10.18 15.09
Rt. Sphyrion | 23.16 24.87 29.62 | 20.10 21.20 26.33 | 18.77 21.74 24.76
Rt. Thelion/Bustpoint | 8.66 8.57 14.39 | 8.42 7.97 14.93 | 9.71 9.50 16.32
Rt. Tragion | 15.25 15.98 23.90 | 14.70 14.48 22.73 | 12.55 12.65 20.00
Rt. Trochanterion | 8.09 7.81 12.74 | 8.41 8.17 13.39 | 7.08 6.62 12.28
Rt. Ulnar Styloid | 11.69 11.86 16.59 | 10.26 10.18 15.33 | 8.71 8.70 14.13
Sellion | 17.63 18.18 27.23 | 12.91 12.76 22.07 | 12.35 12.09 22.38
Substernale | 7.38 7.22 13.24 | 6.23 5.70 11.90 | 7.12 6.80 12.70
Supramenton | 15.40 15.87 24.57 | 11.02 10.93 19.33 | 11.48 11.39 19.48
Suprasternale | 12.47 12.75 19.91 | 10.75 10.87 18.40 | 8.25 7.00 19.52
Waist, Preferred, Post. | 7.88 7.72 12.47 | 7.65 7.44 12.35 | 7.28 7.06 12.15

TABLE 11. Test errors from atomistic CNN models trained on CAESAR using varied training set sizes - 50%, 25% and 10% of the original training set. All measurements are expressed in centimeters, representing Euclidean distances between predicted landmark coordinates and ground truth coordinates.

Landmark Names | Atomistic CNN (50%): Mean Median 95th Percentile | Atomistic CNN (25%): Mean Median 95th Percentile | Atomistic CNN (10%): Mean Median 95th Percentile
10th Rib Midspine | 2.01 1.50 7.21 | 2.25 1.11 7.51 | 2.90 1.73 9.22
Butt Block | 2.93 3.64 6.04 | 2.26 3.09 9.63 | 5.32 4.01 8.31
Cervicale | 1.90 1.61 3.69 | 1.01 1.73 2.95 | 1.36 1.97 3.68
Crotch | 1.56 2.92 3.86 | 1.73 1.10 4.39 | 2.42 1.54 6.38
Lt. 10th Rib | 2.74 1.80 8.56 | 3.28 2.09 10.32 | 3.93 2.58 12.38
Lt. ASIS | 1.77 0.96 5.26 | 5.21 4.00 13.11 | 3.49 2.28 10.20
Lt. Acromion | 1.26 1.02 2.97 | 1.59 1.22 4.71 | 1.94 1.35 5.19
Lt. Axilla, Ant | 3.32 1.35 4.02 | 1.30 2.59 3.06 | 1.89 2.31 3.09
Lt. Axilla, Post. | 1.93 2.07 3.01 | 1.96 2.38 3.92 | 2.05 3.91 5.03
Lt. Calcaneous, Post. | 2.05 1.82 2.95 | 1.68 2.08 3.01 | 2.14 2.80 4.01
Lt. Clavicale | 1.24 0.94 3.30 | 1.09 0.86 2.80 | 0.89 0.61 2.52
Lt. Dactylion | 1.95 1.49 5.21 | 2.26 1.75 5.73 | 2.23 1.75 5.95
Lt. Digit II | 0.91 0.72 2.12 | 1.27 1.04 3.14 | 1.38 0.92 4.18
Lt. Femoral Lateral Epicn | 1.16 0.83 3.32 | 1.23 0.85 3.37 | 1.55 1.08 4.60
Lt. Femoral Medial Epicn | 1.51 1.21 3.61 | 1.19 0.83 3.61 | 2.66 1.96 7.38
Lt. Gonion | 1.13 0.88 2.96 | 1.10 0.79 3.16 | 0.96 0.69 2.52
Lt. Humeral Lateral Epicn | 0.99 0.68 2.87 | 1.43 1.78 4.21 | 1.53 1.14 4.33
Lt. Humeral Medial Epicn | 1.34 1.03 3.55 | 1.30 0.87 3.85 | 1.56 1.11 4.83
Lt. Iliocristale | 1.90 1.34 5.49 | 1.87 1.21 5.44 | 2.67 1.77 7.99
Lt. Infraorbitale | 0.82 0.63 1.96 | 0.57 0.35 1.65 | 0.73 0.45 2.44
Lt. Knee Crease | 1.38 1.15 3.55 | 0.79 0.53 2.55 | 3.17 2.70 7.01
Lt. Lateral Malleolus | 0.66 0.53 1.69 | 1.17 1.01 2.59 | 1.09 0.83 2.97
Lt. Medial Malleolus | 0.60 0.46 1.64 | 1.14 0.96 2.76 | 1.38 1.11 3.48
Lt. Metacarpal-Phal. II | 1.45 1.13 3.76 | 1.29 0.91 3.70 | 2.33 1.78 6.27
Lt. Metacarpal-Phal. V | 2.04 1.41 5.61 | 1.77 1.47 4.05 | 1.64 1.05 5.20
Lt. Metatarsal-Phal. I | 0.92 0.72 2.21 | 1.00 0.79 2.33 | 0.84 0.64 2.12
Lt. Metatarsal-Phal. V | 1.57 1.41 2.49 | 1.44 1.29 2.19 | 1.29 0.88 3.87
Lt. Olecranon | 1.54 1.17 4.33 | 1.01 0.68 3.09 | 1.79 1.34 5.10
Lt. PSIS | 3.06 1.67 10.32 | 2.86 1.63 9.07 | 4.60 2.71 14.85
Lt. Radial Styloid | 1.30 0.84 4.04 | 2.28 1.80 5.98 | 1.61 1.08 4.99
Lt. Radiale | 1.10 0.82 3.08 | 1.23 0.92 3.36 | 1.48 0.99 4.14
Lt. Sphyrion | 1.10 0.96 2.41 | 0.82 0.68 1.97 | 1.33 0.99 3.68
Lt. Thelion/Bustpoint | 2.05 3.31 4.79 | 1.96 2.65 2.90 | 2.03 3.10 3.53
Lt. Tragion | 0.79 0.62 1.90 | 0.53 0.35 1.43 | 0.96 0.73 2.54
Lt. Trochanterion | 3.24 2.25 10.02 | 2.26 1.49 7.32 | 3.38 2.02 11.00
Lt. Ulnar Styloid | 0.88 0.59 2.59 | 1.27 0.95 3.41 | 1.83 1.38 4.98
Nuchale | 1.66 1.16 4.79 | 1.41 0.92 4.31 | 2.34 1.69 6.53
Rt. 10th Rib | 2.95 1.88 9.46 | 3.25 1.82 11.46 | 5.22 3.86 14.39
Rt. ASIS | 1.72 1.15 5.09 | 2.14 1.24 6.27 | 2.92 1.93 8.63
Rt. Acromion | 0.78 0.54 2.34 | 1.29 1.00 3.20 | 2.71 2.11 7.38
Rt. Axilla, Ant | 1.69 2.38 2.91 | 1.72 1.04 2.81 | 1.38 1.92 2.14
Rt. Axilla, Post. | 1.82 3.01 4.94 | 1.09 2.34 3.09 | 2.49 3.01 3.39
Rt. Calcaneous, Post. | 2.71 2.64 4.85 | 1.55 2.99 3.52 | 3.24 3.43 4.39
Rt. Clavicale | 0.99 0.79 2.60 | 2.34 1.97 5.34 | 1.22 0.96 3.28
Rt. Dactylion | 2.76 1.76 6.28 | 3.71 2.45 7.39 | 3.74 2.41 8.63
Rt. Digit II | 1.88 1.63 3.94 | 1.26 1.07 2.66 | 1.10 0.78 3.07
Rt. Femoral Lateral Epicn | 1.07 0.68 3.37 | 1.41 0.86 4.39 | 2.23 1.66 6.26
Rt. Femoral Medial Epicn | 2.53 1.89 3.17 | 2.89 3.29 4.19 | 3.01 3.71 4.08
Rt. Gonion | 1.63 1.19 4.68 | 1.06 0.77 2.80 | 0.83 0.58 2.28
Rt. Humeral Lateral Epicn | 1.11 0.81 3.02 | 2.45 1.93 6.25 | 1.55 1.16 4.10
Rt. Humeral Medial Epicn | 1.35 1.02 3.53 | 1.69 1.12 5.17 | 1.81 1.38 5.02
Rt. Iliocristale | 2.06 1.51 5.76 | 2.34 1.65 6.06 | 2.45 1.64 7.37
Rt. Infraorbitale | 1.04 0.80 2.75 | 1.15 0.79 3.08 | 1.04 0.67 2.81
Rt. Knee Crease | 0.70 0.51 1.67 | 1.14 0.79 3.06 | 1.30 0.88 3.75
Rt. Lateral Malleolus | 0.65 0.49 1.70 | 0.83 0.68 2.13 | 1.37 1.08 3.45
Rt. Medial Malleolus | 0.73 0.59 1.65 | 0.72 0.51 1.86 | 1.16 0.89 3.06
Rt. Metacarpal-Phal. II | 1.64 1.22 4.43 | 1.26 0.90 3.60 | 2.03 1.51 5.92
Rt. Metacarpal-Phal. V | 1.28 0.99 3.44 | 1.34 0.98 3.88 | 1.94 1.38 5.61
Rt. Metatarsal-Phal. I | 0.65 0.54 1.66 | 1.74 1.44 3.90 | 1.09 0.84 2.62
Rt. Metatarsal-Phal. V | 0.66 0.50 1.72 | 0.76 0.51 2.16 | 1.33 1.08 3.14
Rt. Olecranon | 1.14 0.75 2.99 | 1.03 0.59 2.99 | 1.42 0.83 4.27
Rt. PSIS | 3.42 2.03 10.53 | 3.30 1.98 9.61 | 4.97 2.94 14.88
Rt. Radial Styloid | 1.09 0.76 3.35 | 1.83 1.52 4.42 | 2.17 1.57 5.82
Rt. Radiale | 0.70 0.44 2.19 | 1.49 1.16 3.64 | 1.16 0.79 3.20
Rt. Sphyrion | 0.92 0.81 1.86 | 1.17 1.00 2.43 | 1.95 1.58 4.96
Rt. Thelion/Bustpoint | 1.38 1.09 2.63 | 1.05 3.05 3.19 | 2.59 1.67 3.91
Rt. Tragion | 1.85 2.65 3.07 | 3.67 1.47 4.74 | 1.76 1.54 2.16
Rt. Trochanterion | 3.26 2.43 8.94 | 4.36 3.34 11.23 | 3.36 2.16 10.38
Rt. Ulnar Styloid | 2.21 1.60 6.37 | 1.06 1.70 3.25 | 2.20 1.44 6.40
Sellion | 2.47 2.05 6.25 | 1.78 1.53 2.07 | 1.23 0.88 3.41
Substernale | 1.36 0.94 3.95 | 1.51 0.92 4.84 | 1.98 1.33 6.27
Supramenton | 0.79 0.50 2.38 | 1.57 1.25 3.83 | 1.11 0.71 3.36
Suprasternale | 1.14 0.87 3.01 | 1.23 0.94 3.40 | 1.12 0.84 2.95
Waist, Preferred, Post. | 2.34 3.12 3.69 | 2.66 2.08 2.90 | 1.94 2.28 3.03
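Each row in Tables 9 through 11 summarizes the per-scan Euclidean error for one landmark. As a minimal illustrative sketch only - not the implementation used in this thesis, with the array names, shapes, and landmark list assumed for the example - the per-landmark Mean, Median, and 95th Percentile statistics could be computed as follows:

    # Illustrative sketch, not the thesis implementation. Assumes `pred` and
    # `gt` are (num_scans, num_landmarks, 3) arrays of predicted and ground
    # truth landmark coordinates in centimeters, and `names` is the list of
    # landmark names in column order.
    import numpy as np

    def landmark_error_stats(pred, gt, names):
        # Euclidean distance between predicted and ground truth coordinates,
        # per scan and per landmark: shape (num_scans, num_landmarks).
        dists = np.linalg.norm(pred - gt, axis=-1)
        return {
            name: (dists[:, j].mean(),              # Mean
                   np.median(dists[:, j]),          # Median
                   np.percentile(dists[:, j], 95))  # 95th Percentile
            for j, name in enumerate(names)
        }

Each returned tuple corresponds to one (Mean, Median, 95th Percentile) group in the tables above.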
REFERENCES CITED

Mathieu Aubry, Ulrich Schlickewei, and Daniel Cremers. The wave kernel signature: A quantum mechanical approach to shape analysis. In IEEE International Conference on Computer Vision Workshops, 2011.

Huanyu Yang, Kuangrong Hao, and Yongsheng Ding. Semantic segmentation of human model using heat kernel and geodesic distance. 2018. doi: 10.1155/2018/7974340. URL https://doi.org/10.1155/2018/7974340.

Janet F. Grant, Catherine R. Chittleborough, Zumin Shi, and Anne W. Taylor. The association between a body shape index and mortality: Results from an Australian cohort. 2017. URL https://doi.org/10.1371/journal.pone.0181244.

Diana M. Thomas, Carl Bredlau, Anja Bosy-Westphal, Manfred Mueller, Wei Shen, Dympna Gallagher, Yuna Maeda, Andrew McDougall, Courtney M. Peterson, Eric Ravussin, and Steven B. Heymsfield. Relationships between body roundness with body fat and visceral adipose tissue emerging from a new geometrical model. 2013.

KSR Van Kuijk, M Reijman, SMA Bierma-Zeinstra, JH Waarsing, and DE Meuffels. Posterior cruciate ligament injury is influenced by intercondylar shape and size of tibial eminence. pages 1058-1062, 2019. doi: 10.1302/0301-620X.101B9.BJJ-2018-1567.R1.

Sebastian Sitko, Rafel Cirer-Sastre, Nuria Garatachea, and Isaac López-Laval. Anthropometric characteristics of road cyclists of different performance levels. Applied Sciences, 13(1), 2023. ISSN 2076-3417. doi: 10.3390/app13010224. URL https://www.mdpi.com/2076-3417/13/1/224.

Satoru Kiire. Effect of leg-to-body ratio on body shape attractiveness. Archives of Sexual Behavior, 2016. doi: 10.1007/s10508-015-0635-9.

Alexei Baevski, Henry Zhou, Abdelrahman Mohamed, and Michael Auli. wav2vec 2.0: A framework for self-supervised learning of speech representations. 2020.

David Silver, Aja Huang, Christopher J. Maddison, Arthur Guez, Laurent Sifre, George van den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Veda Panneershelvam, Marc Lanctot, Sander Dieleman, Dominik Grewe, John Nham, Nal Kalchbrenner, Ilya Sutskever, Timothy Lillicrap, Madeleine Leach, Koray Kavukcuoglu, Thore Graepel, and Demis Hassabis. Mastering the game of Go with deep neural networks and tree search. Nature, 529:484-503, 2016. URL http://www.nature.com/nature/journal/v529/n7587/full/nature16961.html.

Caroline Pinte, Mathis Fleury, and Pierre Maurel. Deep learning-based localization of EEG electrodes within MRI acquisitions. Frontiers in Neurology, 12, 2021. doi: 10.3389/fneur.2021.644278. URL https://www.frontiersin.org/articles/10.3389/fneur.2021.644278.

Mitchell Hargreaves, David Ting, Stephen Bajan, Kamron Bhavnagri, Richard Bassed, and Xiaojun Chang. A generative deep learning approach for forensic facial reconstruction. pages 1-7, November 2021. doi: 10.1109/DICTA52665.2021.9647290.

Ivan Grishchenko, Valentin Bazarevsky, Andrei Zanfir, Eduard Gabriel Bazavan, Mihai Zanfir, Richard Yee, Karthik Raveendran, Matsvei Zhdanovich, Matthias Grundmann, and Cristian Sminchisescu. BlazePose GHUM Holistic: Real-time 3D human landmarks and pose estimation. 2022.
Andrea Giachetti, Emanuele Mazzi, Francesco Piscitelli, M Aono, A Ben Hamza, T Bonis, Peter Claes, A Godil, C Li, M Ovsjanikov, et al. SHREC'14 track: Automatic location of landmarks used in manual anthropometry. In Eurographics Workshop on 3D Object Retrieval, pages 93-100, 2014.

R. Q. Charles, H. Su, M. Kaichun, and L. J. Guibas. PointNet: Deep learning on point sets for 3D classification and segmentation. In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 77-85, 2017. doi: 10.1109/CVPR.2017.16.

Zhirong Wu, Shuran Song, Aditya Khosla, Xiaoou Tang, and Jianxiong Xiao. 3D ShapeNets for 2.5D object recognition and next-best-view prediction. CoRR, abs/1406.5670, 2014. URL http://arxiv.org/abs/1406.5670.

Angel X Chang, Thomas Funkhouser, Leonidas Guibas, Pat Hanrahan, Qixing Huang, Zimo Li, Silvio Savarese, Manolis Savva, Shuran Song, Hao Su, et al. ShapeNet: An information-rich 3D model repository. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1-9. IEEE, 2015.

Federica Bogo, Angjoo Kanazawa, Christoph Lassner, Peter Gehler, Javier Romero, and Michael J. Black. Keep It SMPL: Automatic estimation of 3D human pose and shape from a single image. 2016.

Tariq M. Khan, Antonio Robles-Kelly, and Syed S. Naqvi. T-Net: A resource-constrained tiny convolutional neural network for medical image segmentation. pages 1799-1808, 2022. doi: 10.1109/WACV51458.2022.00186.

Kathleen M. Robinette and Hein A. M. Daanen. The CAESAR project: A 3-D surface anthropometry survey. pages 380-386, 1999.

Matthew A. Brunsman, Hein A. M. Daanen, and Kathleen M. Robinette. Optimal postures and positioning for human body scanning. In Proceedings of the International Conference on Recent Advances in 3-D Digital Imaging and Modeling, pages 266-273, 1997.

Paolo Cignoni, Massimiliano Corsini, and Guido Ranzuglia. MeshLab: An open-source 3D mesh processing system. ERCIM News, 2008(73), 2008.

William L. Hamilton, Rex Ying, and Jure Leskovec. Inductive representation learning on large graphs. In Proceedings of the 31st International Conference on Neural Information Processing Systems, NIPS'17, pages 1025-1035, Red Hook, NY, USA, 2017. Curran Associates Inc. ISBN 9781510860964.

Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, and Neil Houlsby. An image is worth 16x16 words: Transformers for image recognition at scale. 2021.