Analyzing Data on the Microbiome using Linear Models based on Approximate Singular Value Decompositions

Document Type

Presentation Abstract

Presentation Date

9-11-2017

Abstract

Data from a microbiome study are often analyzed using approaches developed by ecologists: a matrix of pairwise distances between observations is calculated, followed by ordination (a graphical representation of each observation as a point in a low-dimensional space, using either principal components or multidimensional scaling based on the distance matrix). Ordination is frequently successful in separating meaningful groups of observations (e.g., cases and controls). Distance-based analyses such as PERMANOVA can be used to test whether explanatory variables (such as case-control status, batch, or sample pH) have a significant effect on the distance matrix. However, the connection between data on individual species (or operational taxonomic units, OTUs) and the information in the distance matrix is lost, so there is no way to know which species contribute to the patterns seen in ordination of the high-dimensional data gathered in a microbiome study. To provide a single analysis path that includes distance-based ordination, global hypothesis tests of any effect of the microbiome, and hypothesis tests of the effects of individual OTUs, we present a novel approach we call the linear decomposition model (LDM). Using simulations, we show that the LDM can have higher power than solely distance-based methods, while avoiding some technical difficulties that plague existing methods when a non-Euclidean distance is used. Finally, we show how the effects of confounding covariates can be accounted for by a novel 'peeling' approach. (Joint work with Yijuan Hu, Department of Biostatistics and Bioinformatics, Emory University)
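
As an illustration of the conventional distance-based ordination the abstract describes (not of the LDM itself, whose details are not given here), the following is a minimal sketch of classical multidimensional scaling applied to a pairwise distance matrix. The simulated OTU counts, the choice of Bray-Curtis dissimilarity, and the function names `bray_curtis` and `pcoa` are all assumptions made for the example.

```python
import numpy as np

def bray_curtis(counts):
    """Pairwise Bray-Curtis dissimilarity between rows (samples)."""
    n = counts.shape[0]
    d = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            num = np.abs(counts[i] - counts[j]).sum()
            den = (counts[i] + counts[j]).sum()
            d[i, j] = d[j, i] = num / den
    return d

def pcoa(dist, k=2):
    """Classical multidimensional scaling of a distance matrix.

    Returns the first k ordination axes and all eigenvalues
    (sorted in descending order) of the Gower-centered matrix.
    """
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gower = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gower)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Clip at zero before the square root: a non-Euclidean distance
    # can produce negative eigenvalues in general.
    coords = eigvecs[:, :k] * np.sqrt(np.clip(eigvals[:k], 0.0, None))
    return coords, eigvals

# Hypothetical data: 10 controls and 10 cases, 6 OTUs each,
# with different mean abundances in the first four OTUs.
rng = np.random.default_rng(0)
controls = rng.poisson(lam=[20, 20, 5, 5, 1, 1], size=(10, 6))
cases = rng.poisson(lam=[5, 5, 20, 20, 1, 1], size=(10, 6))
counts = np.vstack([controls, cases]).astype(float)

dist = bray_curtis(counts)
coords, eigvals = pcoa(dist)  # coords: one 2-D point per sample
```

Plotting the two columns of `coords` gives the ordination picture. Note that the per-OTU counts no longer appear anywhere in `coords`, which is the loss of connection between individual OTUs and the distance matrix that the abstract points out.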

Additional Details

Monday, September 11, 2017 at 3:00 p.m. in Math 103
Refreshments at 4:00 p.m. in Math Lounge 109
