Abstract
Background
Compositional data comprise the parts of a ‘whole’ (or ‘total’), which sum to that ‘whole’. The ‘whole’ may vary between units of analysis, or it may be fixed (constant). For example, total energy intake (a variable total) is the sum of intake from all foods or macronutrients. Total time in a day (a fixed total) is the sum of time spent engaging in various activities. There are several approaches to analysing compositional data, such as the isocaloric or isotemporal model, ratio variables, and compositional data analysis (CoDA). Although the performance of these approaches has been compared previously, the comparisons have used only real data. Since the true relationships are unknown in real data, it is difficult to compare how well each model estimates a known effect. We use data simulations of different parametric relationships to explore and demonstrate the performance of each approach under various possible conditions.
Methods
We simulated physical activity time-use and dietary data as examples of compositional data with fixed and variable totals, respectively, using different parametric relationships between the compositional components and the outcome (fasting plasma glucose): linear, log2, and isometric log-ratios. We evaluated the performance of a range of generalised linear and additive models, as well as CoDA, in estimating the effect of a 1-unit reallocation and of a larger reallocation (10 units for physical activity, 100 units for dietary data) under each parametric scenario. We simulated 10,000 datasets with 1,000 observations in each.
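As an illustration of the isometric log-ratio scale mentioned above, the following minimal sketch computes pivot-coordinate log-ratios for one composition. It is not the authors' simulation code: the function name `ilr_pivot`, the choice of pivot coordinates as the log-ratio basis, and the four-part day of activity minutes are all assumptions made for illustration.

```python
import math


def ilr_pivot(parts):
    """Isometric log-ratio (pivot) coordinates of a strictly positive composition.

    Coordinate i contrasts part i with the geometric mean of the parts after it.
    The function name and the pivot basis are illustrative assumptions.
    """
    d = len(parts)
    if d < 2 or any(p <= 0 for p in parts):
        raise ValueError("need at least two strictly positive parts")
    logs = [math.log(p) for p in parts]
    coords = []
    for i in range(d - 1):
        rest = logs[i + 1:]
        k = len(rest)  # number of parts remaining after part i
        mean_log_rest = sum(rest) / k  # log of their geometric mean
        coords.append(math.sqrt(k / (k + 1)) * (logs[i] - mean_log_rest))
    return coords


# Hypothetical day in minutes: sleep, sedentary, light activity, MVPA (sums to 1440).
day = [480.0, 600.0, 300.0, 60.0]
z = ilr_pivot(day)

# Rescaling every part by the same constant leaves the coordinates unchanged:
# log-ratios carry only relative information.
z_scaled = ilr_pivot([p / 1440.0 for p in day])

# Moving 10 minutes from sedentary time to MVPA changes the coordinates.
z_reallocated = ilr_pivot([480.0, 590.0, 300.0, 70.0])
```

A composition with D parts yields D - 1 coordinates, which can then be entered as covariates in a regression model in place of the raw parts.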
Results
The performance of each approach to analysing compositional data depends on how closely its parameterisation matches the true data generating process. Overall, we demonstrated that the consequences of using an incorrect parameterisation (e.g. using CoDA when the true relationship is linear) are more severe for larger reallocations (e.g. 10-min or 100-kcal) than for 1-unit reallocations. The implications of choosing an unsuitable approach may be starker in compositional data with variable totals. For example, while models with ratio variables are mathematically equivalent to linear models in compositional data with fixed totals, their estimates may be radically different for variable totals.
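The point about ratio variables can be checked numerically. The sketch below is a single-predictor simplification with made-up data, not the models evaluated in the study: the helper names, the coefficients, and the ranges for the totals are all assumptions. With a fixed total, dividing a component by the total only rescales it, so the fitted values match those of the model on the raw amount; with a variable total they do not.

```python
import random


def ols_fit(x, y):
    """Return (intercept, slope) of a simple least-squares line y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fitted_values(x, y):
    """Fitted values from regressing y on x."""
    intercept, slope = ols_fit(x, y)
    return [intercept + slope * xi for xi in x]


rng = random.Random(0)
n = 500

# Hypothetical component amount and outcome (assumed linear relationship plus noise).
amount = [rng.uniform(10.0, 120.0) for _ in range(n)]
outcome = [5.5 - 0.004 * a + rng.gauss(0.0, 0.3) for a in amount]

# Fixed total (e.g. 1440 minutes per day) versus a total that varies between units.
fixed_total = [1440.0] * n
variable_total = [rng.uniform(1500.0, 3500.0) for _ in range(n)]

ratio_fixed = [a / t for a, t in zip(amount, fixed_total)]
ratio_variable = [a / t for a, t in zip(amount, variable_total)]

fit_amount = fitted_values(amount, outcome)
fit_ratio_fixed = fitted_values(ratio_fixed, outcome)
fit_ratio_variable = fitted_values(ratio_variable, outcome)

# Largest disagreement in fitted values between the raw-amount and ratio models.
max_diff_fixed = max(abs(a - b) for a, b in zip(fit_amount, fit_ratio_fixed))
max_diff_variable = max(abs(a - b) for a, b in zip(fit_amount, fit_ratio_variable))
```

Here `max_diff_fixed` differs from zero only by floating-point rounding, because the ratio is the raw amount divided by a constant, whereas `max_diff_variable` is clearly non-zero because the ratio then also carries information about the total.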
Conclusions
Compositional data with fixed and variable totals behave differently. All existing approaches to analysing such data have utility, but the approach needs to be selected carefully. Investigators should explore the shape of the relationships between the compositional components and the outcome and choose the approach that best matches it.