February 11th, 2019
Kelley Engineering Center (KEC) Room 1001
This seminar is free and open to the public.
A Novel Matrix Decomposition with Applications to Inference for Multivariate Means in Adaptive Data Analysis.
We derive a novel matrix decomposition that has applications in adaptive data analysis involving multivariate means. Modern data analysis is often an iterative process in which the analyst queries the data one or more times before performing statistical inference. This invalidates traditional statistical inference methods and has motivated the development of statistical methods that are appropriate in an adaptive data analysis setting. Our work investigates Hotelling's T^2 test by deriving novel representations of the statistic, deriving an exact decomposition of a Gaussian matrix, and deriving the distributions of the decomposition's component parts. We apply these results to develop a valid method for inference on the means of clustered data. In this setting, clustering violates the assumptions of Hotelling's T^2 test. However, our novel matrix decomposition allows one to derive the exact distribution of the statistic conditional on the clustering. To illustrate the effectiveness of this approach, we analyze The Cancer Genome Atlas (TCGA) Breast Cancer Subtype dataset.
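For readers unfamiliar with the test named in the abstract, the following is a minimal sketch of the classical two-sample Hotelling's T^2 statistic, which assumes the two groups are fixed before looking at the data. It is not the speaker's decomposition or conditional method; the function name `hotelling_t2_two_sample` and its return layout are illustrative choices, and NumPy is assumed to be available.

```python
import numpy as np


def hotelling_t2_two_sample(x, y):
    """Classical two-sample Hotelling's T^2 (hypothetical helper, not from the talk).

    x, y: arrays of shape (n1, p) and (n2, p), one observation per row.
    Returns (t2, f_stat, df1, df2); under the usual Gaussian, equal-covariance,
    pre-specified-groups assumptions, f_stat follows an F(df1, df2) distribution.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n1, p = x.shape
    n2 = y.shape[0]

    # Difference of the two sample mean vectors.
    diff = x.mean(axis=0) - y.mean(axis=0)

    # Pooled sample covariance matrix (atleast_2d handles the p = 1 case).
    s_pooled = (
        (n1 - 1) * np.cov(x, rowvar=False) + (n2 - 1) * np.cov(y, rowvar=False)
    ) / (n1 + n2 - 2)
    s_pooled = np.atleast_2d(s_pooled)

    # T^2 = (n1 * n2 / (n1 + n2)) * diff' S^{-1} diff, via a linear solve.
    t2 = (n1 * n2) / (n1 + n2) * float(diff @ np.linalg.solve(s_pooled, diff))

    # Rescale T^2 to an F statistic with (p, n1 + n2 - p - 1) degrees of freedom.
    df1, df2 = p, n1 + n2 - p - 1
    f_stat = df2 / (p * (n1 + n2 - 2)) * t2
    return t2, f_stat, df1, df2
```

If the two groups are instead produced by clustering the same data, the F reference distribution above no longer applies, which is the problem the conditional approach described in the abstract is meant to address.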