
Liver shape analysis using statistical parametric maps at population scale | BMC Medical Imaging


Data

The UK Biobank [20] is a population-based study in which 500,000 participants aged 40 to 70 years were recruited for deep phenotypic profiling. In an ongoing imaging sub-study, 100,000 of the participants have been recruited to undergo an imaging protocol that includes MRI of the brain, the heart and the abdominal region. The abdominal scans include a neck-to-knee Dixon 3D acquisition that can be used to derive volumes of adipose tissue, skeletal muscle and abdominal organs. Full details of the UK Biobank abdominal acquisition protocol have been reported previously [21]. We processed and segmented the data using our automated methods [9]. In this study of liver morphology, we included 41,800 participants with Dixon MRI data acquired at the imaging visit between 2014 and 2020, with data comprising imaging, health-related diagnoses and biological measurements.

Fully anonymized participant data were obtained through UK Biobank Access Application number 44584. The UK Biobank has approval from the North West Multi-Centre Research Ethics Committee (REC reference: 11/NW/0382), and written informed consent was obtained from all participants prior to inclusion in the UK Biobank.

Phenotype definitions

Anthropometric measurements including age, body mass index (BMI), and waist and hip circumferences were taken at the UK Biobank imaging visit, and ethnicity was defined based on continental genetic ancestry (https://pan.ukbb.broadinstitute.org). The AST:ALT ratio, defined as the ratio of aspartate aminotransferase (AST) to alanine aminotransferase (ALT) and commonly used to indicate the presence of more advanced liver disease including fibrosis and cirrhosis [22, 23], was calculated from the biological samples taken at the initial assessment visit. The fibrosis-4 index (FIB-4), also designed to identify more advanced stages of liver disease, and fibrosis in particular, was calculated as previously described [24] using age, AST, ALT and platelet count taken from the initial assessment visit. Diagnoses of liver disease and T2D were obtained from UK Biobank hospital records and self-reported information (see Disease Categories in supporting information). Because relatively few scanned participants within the UK Biobank had been diagnosed with specific liver diseases, we used a broad umbrella definition of liver disease that included alcoholic liver disease, fibrosis, cirrhosis and chronic hepatitis.
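The two blood-based markers above can be sketched as follows. This is a minimal illustration that assumes the commonly used FIB-4 definition (age × AST divided by platelet count × √ALT, with AST and ALT in U/L and platelets in 10⁹/L); the exact formulation used in the study is the one given in [24], and the function and argument names are hypothetical.

```python
import math


def ast_alt_ratio(ast_u_per_l: float, alt_u_per_l: float) -> float:
    """Ratio of aspartate aminotransferase to alanine aminotransferase."""
    if alt_u_per_l <= 0:
        raise ValueError("ALT must be positive")
    return ast_u_per_l / alt_u_per_l


def fib4_index(age_years: float, ast_u_per_l: float, alt_u_per_l: float,
               platelets_1e9_per_l: float) -> float:
    """FIB-4 under the assumed standard definition:
    (age * AST) / (platelets * sqrt(ALT))."""
    if alt_u_per_l <= 0 or platelets_1e9_per_l <= 0:
        raise ValueError("ALT and platelet count must be positive")
    return (age_years * ast_u_per_l) / (platelets_1e9_per_l * math.sqrt(alt_u_per_l))
```

For a hypothetical 60-year-old participant with AST 40 U/L, ALT 25 U/L and a platelet count of 200 × 10⁹/L, `fib4_index(60, 40, 25, 200)` evaluates the expression (60 × 40) / (200 × 5).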

Quality control

We included liver segmentations from a total of 41,800 participants. For details of the segmentation process and quality control, refer to the supplementary data in [9]. Participants with missing clinical, anthropometric or biochemical data, those whose Dixon MRI datasets did not have full anatomical coverage, and those with organs of zero volume were excluded from the study. More specifically, we removed 8,297 datasets that were missing ethnicity, BMI, waist-to-hip ratio (WHR), AST, ALT, platelet count or liver image-derived phenotypes (IDPs). To detect potential extreme values in liver volume and to confirm full anatomical coverage of the organs, we visually examined datasets with values falling outside the chosen quantile thresholds (0.1% and 99.9%) and excluded eight outliers. We also visually inspected segmentations with extremely high 3D liver mesh-derived values, which resulted in the exclusion of 61 datasets with segmentation errors. Overall, from the initial 41,800 participants, 33,434 participants were included in the final analysis (20% of data excluded).
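The exclusion counts reported above can be checked with a short tally. The counts are taken from the text; the dictionary keys and function name are only illustrative labels.

```python
INITIAL_PARTICIPANTS = 41_800

# Exclusion counts as reported in the quality control description.
EXCLUSIONS = {
    "missing covariates or liver IDPs": 8_297,
    "extreme liver volume outliers": 8,
    "segmentation errors found on visual inspection": 61,
}


def remaining_after_exclusions(initial: int, exclusions: dict) -> int:
    """Number of participants left once every exclusion is applied."""
    return initial - sum(exclusions.values())


remaining = remaining_after_exclusions(INITIAL_PARTICIPANTS, EXCLUSIONS)
excluded_fraction = (INITIAL_PARTICIPANTS - remaining) / INITIAL_PARTICIPANTS
```

With these figures, `remaining` is 33,434 and `excluded_fraction` is just over 0.20, consistent with the totals stated in the text.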

Study design

Template definition

Deformation of an image to a standard organ template is a key part of MRI organ shape assessment. Given the potential variation in morphology, it is important to identify a suitable population sample size for constructing a template image [25]. To assess the impact of population size on template construction, we constructed three distinct templates using liver segmentations from gender-balanced European ancestry cohorts of 20, 100 and 200 participants with BMI < 25 kg/m² and low liver fat (< 5%). The characteristics of each template population are provided in Supplementary Table S1. To test the three templates, we selected 500 participants from the full cohort with European genetic ancestry, aged between 46 and 62 years, without any disease reported or diagnosed here [26] (Supplementary Table S2). We then registered the three liver templates to the 500-participant cohort and investigated the associations between the 3D mesh-derived phenotype and the anthropometric covariates across the three templates.

Association between mesh-derived phenotypes, IDPs and disease

To assess the associations between the 3D mesh-derived phenotype, the anthropometric covariates and liver IDPs (volume, fat, iron), we first analysed the liver MRI data from the entire UK Biobank imaging cohort. The cohort of 33,434 participants was 97.6% European, 48.7% male and aged between 44 and 82 years old (Supplementary Table S3). To determine the potential association between disease and liver shape, we first selected diseases that are known from previous studies to impact liver health, and are associated with changes in liver fat accumulation or volume [9]. These included 449 participants with liver disease (207 F/242 M; 48–81 years old; BMI 18.6–43.8 kg/m²) and 1,780 participants with T2D (67% males; 46–82 years old; BMI 18.3–50.1 kg/m²) (Supplementary Table S4).

Prediction of disease outcomes

To determine whether the 3D mesh-derived phenotype was a better predictor of disease outcomes than the conventional measurement of liver volume, we identified 182 participants with liver disease (45% males; 45–78 years old; BMI 16.5–46.1 kg/m²) and 144 participants with T2D (61% males; 45–80 years old; BMI 19.9–47.9 kg/m²) who were diagnosed after the baseline imaging visit (see supporting information). We then identified a control cohort without any reported conditions and designed a case-control study for each disease population. The control cohort was chosen by matching one individual to every case by age (± 1 year), gender and BMI (± 2 kg/m²) using the R package ccoptimalmatch [27]. This one-to-one matching yielded a case-control cohort of 364 participants for liver disease (182 cases and 182 controls) and 288 participants for T2D (144 cases and 144 controls).
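The matching criteria can be illustrated with a simple greedy one-to-one matcher. This is only a sketch of the eligibility rule (same gender, age within ± 1 year, BMI within ± 2 kg/m²); it is not the optimal-matching algorithm implemented in ccoptimalmatch, and the `Participant` class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Participant:
    pid: str
    age: float
    gender: str
    bmi: float


def is_eligible(case: Participant, control: Participant,
                age_tol: float = 1.0, bmi_tol: float = 2.0) -> bool:
    """Matching rule from the text: same gender, age +/- 1 year, BMI +/- 2 kg/m^2."""
    return (case.gender == control.gender
            and abs(case.age - control.age) <= age_tol
            and abs(case.bmi - control.bmi) <= bmi_tol)


def greedy_match(cases, controls):
    """Assign each case its closest eligible unused control (greedy, not optimal)."""
    available = list(controls)
    pairs = {}
    for case in cases:
        candidates = [c for c in available if is_eligible(case, c)]
        if not candidates:
            continue  # this case stays unmatched
        best = min(candidates,
                   key=lambda c: (abs(c.age - case.age), abs(c.bmi - case.bmi)))
        pairs[case.pid] = best.pid
        available.remove(best)  # each control is used at most once
    return pairs
```

A greedy pass can leave some cases unmatched that an optimal assignment would pair, which is one reason a dedicated matching package is preferable in practice.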

Image registration and mesh construction

The process for template construction of the liver has been previously described [28]. Here, we constructed three distinct templates using liver segmentations from 20, 100 and 200 subject-specific volumes in order to evaluate the impact of cohort size on template construction. This also allowed us to test whether cohort size influenced the statistical associations in our mesh-based analysis. We constructed surface meshes from each template using the marching cubes algorithm and smoothed them using a Laplacian filter [29]. The template construction was performed using ANTs software (https://picsl.upenn.edu/software/ants) with mutual information as the similarity metric and the B-spline non-rigid transformation. Briefly, template construction is performed in two stages: affine registration to account for translation, rotation, scaling and shearing, followed by non-rigid registration to account for local deformation using the symmetric image normalisation (SyN) method with mutual information as the similarity metric [30, 31]. The analysis was performed using the “antsMultivariateTemplateConstruction2.sh” script provided with ANTs, with the following default parameters: -i (iteration limit) = 4, -g (gradient step size) = 0.25, -k (number of modalities) = 1, -w (modality weight) = 1. The remaining parameters were customised depending on the machine used, the image dimension and the metrics applied: -d (image dimension) = 3, -j (number of CPU cores) = 10, -c (control for parallel computation) = 2, -q (max iterations for each pairwise registration) = 100 × 70 × 50 × 10, -n (N4BiasFieldCorrection of moving image) = 0, -r (do rigid body registration of inputs to the initial template) = 1, -m (similarity metric) = MI and -t (transformation model) = BSplineSyN.
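The parameter list above can be collected into a single command line. The sketch below only assembles the argument list and does not run ANTs; the flag values are those reported in the text, whereas the output prefix passed to `-o`, the input file names and the `100x70x50x10` spelling of the iteration schedule are assumptions made for illustration.

```python
import shlex


def build_template_command(input_images, output_prefix="liver_template_"):
    """Assemble the template-construction call from the reported parameters."""
    args = [
        "antsMultivariateTemplateConstruction2.sh",
        "-d", "3",             # image dimension
        "-i", "4",             # iteration limit
        "-g", "0.25",          # gradient step size
        "-k", "1",             # number of modalities
        "-w", "1",             # modality weight
        "-j", "10",            # number of CPU cores
        "-c", "2",             # control for parallel computation
        "-q", "100x70x50x10",  # max iterations per pairwise registration (assumed spelling)
        "-n", "0",             # bias field correction of moving image off
        "-r", "1",             # rigid body registration of inputs to the initial template
        "-m", "MI",            # similarity metric
        "-t", "BSplineSyN",    # transformation model
        "-o", output_prefix,   # output prefix (assumption, not stated in the text)
    ]
    return args + list(input_images)


def as_shell_string(command):
    """Render the argument list as a copy-pasteable shell line."""
    return shlex.join(command)
```

For example, `as_shell_string(build_template_command(["sub01_liver.nii.gz", "sub02_liver.nii.gz"]))` gives a single shell line that can be inspected before being passed to a job scheduler.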

Surface meshes were first constructed from each subject’s segmentations using the marching cubes algorithm and smoothed using a Laplacian filter. Template-to-subject registration was then performed in stages. First, rigid registration was applied to remove the position and orientation differences between the subject-specific surfaces and the template surfaces, and an affine transformation with nearest neighbour interpolation was computed between the template and subject segmentations. The resulting affine transformations were used to warp the template to the subject’s space. The template segmentation was then mapped onto each subject segmentation by computing a non-rigid transformation modelled by a free-form deformation, based on B-splines, with label consistency as the similarity metric between the subject and template liver segmentations [32]. To enable subject comparison with vertex-to-vertex correspondence, the template mesh was then warped to each subject mesh using the deformation fields obtained from the non-rigid registration. Hence, all surface meshes are parameterised with the same number of vertices (approximately 18,000). This ensures that each vertex maintains approximate anatomical accuracy and consistency across all subjects, while preserving the size and shape information for subsequent analyses [29].

To determine the regional outward or inward adaptations in the liver surface relative to an average liver shape, we measured the surface-to-surface (S2S) distance, a 3D mesh-derived phenotype, for each subject. This was achieved by computing the signed distance between each vertex in the template mesh and the corresponding vertex in the subject’s mesh. Positive distances indicate outward expansion of the subject’s vertices relative to the template vertices, and negative distances indicate inward shrinkage. All the steps for the template-to-subject registration were performed using the Image Registration Toolkit (IRTK) (https://biomedia.doc.ic.ac.uk/software/irtk). After the manual quality control process described above, which involved identifying extremely high S2S values, all values fell within the range of -48.3 to 70.5 mm. This check was intended to ensure that organ sizes were within an expected range, and the result suggests that there were no significant segmentation errors, such as the inclusion of surrounding tissues in the liver segmentations.
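A per-vertex signed distance of this kind can be sketched as below. The text does not state how the sign is assigned, so this example assumes that outward-pointing normals are available at each template vertex and that the sign is taken from the projection of the vertex displacement onto that normal; the function name and inputs are hypothetical and do not reproduce the IRTK implementation.

```python
import math


def signed_s2s(template_vertices, subject_vertices, template_normals):
    """Signed distance between corresponding vertices.

    Magnitude: Euclidean distance between the template vertex and the
    corresponding subject vertex. Sign (assumption): positive when the
    displacement points along the outward template normal, negative otherwise.
    """
    if not (len(template_vertices) == len(subject_vertices) == len(template_normals)):
        raise ValueError("vertex and normal lists must have the same length")
    distances = []
    for t, s, n in zip(template_vertices, subject_vertices, template_normals):
        displacement = [si - ti for si, ti in zip(s, t)]
        magnitude = math.sqrt(sum(d * d for d in displacement))
        projection = sum(d * ni for d, ni in zip(displacement, n))
        distances.append(magnitude if projection >= 0 else -magnitude)
    return distances
```

Because every subject mesh shares the template's vertex ordering, the returned list has one value per template vertex and can be stacked across subjects into a subjects × vertices matrix for the regression analysis.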

Mass univariate regression

Associations between the S2S values and anthropometric variables were modelled using a linear regression framework. To enhance the detection of spatially contiguous signals and discriminate them from noise, we used threshold-free cluster enhancement (TFCE) [33]. TFCE not only provides improved sensitivity and stability compared with other cluster-based techniques but also identifies local maxima in the resulting significance map, which is not possible with other enhancement and thresholding techniques [14, 33]. Permutation testing was then performed on the TFCE maps, and the derived TFCE p-values were corrected to control the false discovery rate (FDR), as previously described [28]. Specifically, we performed mass univariate regression (MUR) analysis using the R package mutools3D [34] and adjusted for multiple comparisons by applying the FDR procedure [35] to all the TFCE p-values derived at each vertex using 1,000 permutations. The estimated regression coefficients \(\widehat{\beta }\) for each of the relevant covariates and their related TFCE-derived p-values were then displayed at each vertex of the mesh on the whole 3D liver anatomy, providing spatially distributed associations. Regions of the liver exhibiting significant associations (p-values < 0.05) between variables were identified, and the estimated regression coefficients \(\widehat{\beta }\) for each relevant covariate within those regions were reported. The MUR model for deriving associations between clinical parameters and a 3D phenotype is outlined in Supplementary Fig. S1.
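The last two steps of this pipeline, a permutation p-value per vertex followed by FDR adjustment across vertices, can be sketched as follows. The example assumes the Benjamini-Hochberg step-up procedure, since the text refers only to "the FDR procedure [35]", and a standard add-one permutation estimate; the internal implementation in mutools3D may differ, and the function names are hypothetical.

```python
def permutation_p_value(observed, null_statistics):
    """Share of permuted statistics at least as large as the observed one,
    with the usual add-one correction so the p-value is never zero."""
    exceed = sum(1 for value in null_statistics if value >= observed)
    return (1 + exceed) / (1 + len(null_statistics))


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def significant_vertices(p_values, alpha=0.05):
    """Indices of vertices whose adjusted p-value falls below alpha."""
    return [i for i, p in enumerate(benjamini_hochberg(p_values)) if p < alpha]
```

In the setting described above, `null_statistics` would hold the TFCE score of one vertex under each of the 1,000 permutations, and `p_values` would hold one permutation p-value per mesh vertex.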

To determine which factors influence the design and performance of the liver template, we used a regression model to address: (1) how many participants are required to construct a representative liver template, (2) whether the template population size affects the associations between the S2S distances and the anthropometric covariates, (3) which factors have an impact on regional S2S distances and (4) how changes in S2S distances are linked to liver disease and T2D.

We constructed three models adjusting for additional covariates. Model 1 was adjusted for age, gender, ethnicity, body mass index (BMI), waist-to-hip ratio (WHR), liver fat (referred to as proton density fat fraction (PDFF)) and liver iron concentration, with correction to control the FDR. To investigate the morphological changes related to liver function, Model 2 included all the covariates from Model 1 plus AST:ALT, FIB-4 index and disease conditions. We further included interaction terms between age and disease status and between liver fat and disease. To test whether there is a circadian effect on liver morphology, Model 3 included all the covariates from Model 2 plus the time of day of the MRI scan, discretised into hours of the day.

Predictive model

To determine whether S2S distance improves the prediction of disease outcomes prospectively, we used a logistic regression model. This model allowed us to investigate the associations between liver volume as well as the S2S values from the baseline imaging visit and the occurrence of disease outcomes in two distinct case-control cohorts: one comprising individuals with liver disease and the other with T2D.

Because the number of S2S values was large relative to the small population groups, we first performed sparse principal component analysis (SPCA) using the R package sparsepca [36] and extracted principal component scores representing the shape features of the S2S distances for each case-control group of participants diagnosed after the baseline imaging visit. For each individual, we used the principal component scores corresponding to the modes that together summarised 90% of the cumulative variation in each group. We then performed the analysis with two models. In the first model (the volume model), the disease outcome was regressed on age, gender, ethnicity, BMI, WHR, AST:ALT, FIB-4 index, liver volume, PDFF and iron concentration. In the second model (the S2S model), we included all the covariates from the volume model and added the principal component scores of the S2S distances for each disease group.
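The rule for choosing how many modes to retain can be written as a short helper. This sketch assumes the per-component explained variances are already available, ordered from largest to smallest, for example from the SPCA output; the function name is hypothetical.

```python
def n_components_for_variance(explained_variance, threshold=0.90):
    """Smallest number of leading components whose cumulative share of the
    total variance reaches the threshold."""
    total = sum(explained_variance)
    if total <= 0:
        raise ValueError("total variance must be positive")
    cumulative = 0.0
    for k, variance in enumerate(explained_variance, start=1):
        cumulative += variance
        if cumulative / total >= threshold:
            return k
    return len(explained_variance)
```

The scores of the first `k` components returned by this rule would then be appended to the volume-model covariates to form the S2S model.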

Predictive modelling was performed using the R package caret [37]. Model training was conducted with leave-one-out cross-validation for each group. Model performance was evaluated using several metrics, including the Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) curve, the F1 score, accuracy, and sensitivity and specificity. Additionally, we employed DeLong’s test to compare the AUCs of the ROC curves from the S2S and liver volume models [38].
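The evaluation metrics listed above can be computed from held-out predictions as sketched below. The example assumes that leave-one-out cross-validation has already produced one predicted probability per participant, each from a model fitted without that participant, and that a probability cut-off of 0.5 is used for the thresholded metrics; it is an illustration rather than a reproduction of caret's output, and the function names are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a case scores higher than a control
    (ties count as one half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes are required to compute the AUC")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def _ratio(numerator, denominator):
    """Safe division that returns 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(labels, scores, cutoff=0.5):
    """Accuracy, sensitivity, specificity and F1 at an assumed cut-off."""
    predicted = [1 if s >= cutoff else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, predicted) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, predicted) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, predicted) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predicted) if y == 1 and p == 0)
    precision = _ratio(tp, tp + fp)
    sensitivity = _ratio(tp, tp + fn)
    return {
        "accuracy": _ratio(tp + tn, len(labels)),
        "sensitivity": sensitivity,
        "specificity": _ratio(tn, tn + fp),
        "f1": _ratio(2 * precision * sensitivity, precision + sensitivity),
    }
```

Computing these metrics separately for the volume model and the S2S model on the same held-out participants gives the paired scores that a test such as DeLong's compares.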


