Date: | Monday, February 17th, 2020 |
Room: Tea and Refreshments with Faculty and Speaker | Weniger Hall, Room 245 (Statistics Conference Room) |
Time: Tea and Refreshments with Faculty and Speaker | 3:00 pm to 3:45 pm |
Room: Seminar | Weniger Hall, Room 149 |
Time: Seminar | 4:00 pm to 4:50 pm |
Cost: | Free and open to the public |
Truncated latent Gaussian copula model for zero-inflated data
A great number of multivariate statistical methods, such as principal component analysis, discriminant analysis, canonical correlation analysis, and the graphical lasso, to name a few, require an estimate of the covariance or correlation matrix of the variables as one of the inputs. It is typical to use the Pearson sample correlation matrix, which works well at capturing dependencies between normally distributed variables. In this work, we consider the problem of estimating dependencies between zero-inflated measurements, which arise in miRNA data, microbiome data, physical activity data, etc. We propose a truncated latent Gaussian copula to model data with excess zeros, which allows us to derive a rank-based estimator of the latent correlation matrix without estimating the marginal transformation functions. We prove the consistency of the corresponding estimator and demonstrate its use for the analysis of associations between gene expression and microRNA data of breast cancer patients, and for inferring the conditional independence graph in quantitative gut microbiome data.
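For readers unfamiliar with rank-based estimation, the following is a minimal sketch of the general idea in the simpler, fully continuous Gaussian copula setting. It is not the truncated estimator presented in the talk: the abstract does not state the bridge function for the zero-inflated case, so none is attempted here. The sketch relies on two standard facts: Kendall's tau is unchanged by strictly increasing transformations of each variable, and for a continuous Gaussian copula the latent correlation equals sin(pi/2 * tau). All function names and the simulated data are illustrative assumptions.

```python
import math
import random


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant pairs) / total pairs."""
    n = len(x)
    total = 0
    for i in range(n):
        for j in range(i + 1, n):
            total += sign(x[i] - x[j]) * sign(y[i] - y[j])
    return total / (n * (n - 1) / 2)


def latent_correlation_from_tau(tau):
    """Bridge for the CONTINUOUS Gaussian copula only (assumption:
    no truncation, no ties); the truncated model needs a different map."""
    return math.sin(math.pi / 2 * tau)


def simulate_latent_pairs(n, rho, seed=0):
    """Draw n pairs from a bivariate standard normal with correlation rho."""
    rng = random.Random(seed)
    z1, z2 = [], []
    for _ in range(n):
        a = rng.gauss(0, 1)
        b = rho * a + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
        z1.append(a)
        z2.append(b)
    return z1, z2


if __name__ == "__main__":
    z1, z2 = simulate_latent_pairs(n=500, rho=0.6, seed=1)
    # Observed data: unknown strictly increasing transforms of the latent
    # variables (hypothetical choices, used only for illustration).
    x1 = [math.exp(v) for v in z1]
    x2 = [v ** 3 for v in z2]
    tau = kendall_tau(x1, x2)
    print("Kendall's tau on observed data:", tau)
    print("Estimated latent correlation:", latent_correlation_from_tau(tau))
```

Because only the ranks of the observations enter the estimate, the marginal transformations never need to be estimated. Handling an excess of exact zeros requires the truncated model described in the abstract, which this sketch does not cover.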
For more information about Dr. Irina Gaynanova click here.