
Covariance Estimation in the Presence of Diverse Types of Data


【Abstract】A primary goal of multivariate analysis is covariance estimation. Estimating the covariance matrix enables us to study the associations among random variables and provides standard error estimates for constructing confidence regions. Traditional multivariate methods mostly assume a homogeneous normal population, that is, y1,...,yn ∼ i.i.d. multivariate normal. However, multivariate data usually contain non-normal measurements of diverse types, including continuous, ordinal, and non-ordered categorical. In this dissertation, we discuss theories and methods for estimating the covariance matrix in the presence of diverse types of data, with two main deviations from the normal situation: (1) the marginal distributions of the multivariate data are not normal; (2) the population is heterogeneous due to some explanatory variables x.

In the first situation, we discuss the idea of copula models for estimating the association parameters of multivariate data. We concentrate mainly on the rank likelihood method proposed by Hoff (2007) and investigate its asymptotic properties. We compare the asymptotic results with those of other rank-based estimators for the bivariate Gaussian copula model.

In the second situation, we propose a covariance regression model for the heterogeneous population, which describes the covariance matrix of continuous variables as a function of other variables, such as categorical variables. The proposed model is parsimonious and can be considered a natural analogue of linear regression for the mean. We present a geometric interpretation of the model, together with both maximum likelihood and Bayesian methods for parameter estimation. We demonstrate the application of the model using a very simple example with two response variables and one continuous explanatory variable. We then apply the covariance regression model to a large health dataset with four continuous response variables and four categorical variables. Finally, we discuss in detail several practical issues that arise when fitting the covariance regression model, such as model selection, interpreting the coefficients, presenting the fitted results, and model misspecification.
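
As an illustration of what a rank-based estimator for the bivariate Gaussian copula can look like, the sketch below inverts Kendall's tau using the standard identity tau = (2/π)·arcsin(r). This is an assumption made for illustration only: it is not the rank likelihood method of Hoff (2007), and the abstract does not say which rank-based estimators the dissertation compares. The function names (`kendall_tau`, `copula_corr_from_tau`, `simulate`) and the simulated margins are hypothetical.

```python
import math
import random


def kendall_tau(x, y):
    """Kendall's tau for samples without ties (simple O(n^2) pair count)."""
    n = len(x)
    score = 0  # concordant pairs minus discordant pairs
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            if s > 0:
                score += 1
            elif s < 0:
                score -= 1
    return score / (n * (n - 1) / 2)


def copula_corr_from_tau(tau):
    """Invert tau = (2/pi) * arcsin(r), valid for the bivariate Gaussian copula."""
    return math.sin(math.pi * tau / 2)


def simulate(n, r, seed=0):
    """Draw data with a Gaussian copula (correlation r) and non-normal margins."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        latent_x = z1
        latent_y = r * z1 + math.sqrt(1 - r * r) * z2
        # Increasing transforms change the margins but leave the ranks intact.
        xs.append(math.exp(latent_x))
        ys.append(latent_y ** 3)
    return xs, ys


xs, ys = simulate(n=500, r=0.6)
r_hat = copula_corr_from_tau(kendall_tau(xs, ys))
print(f"estimated copula correlation: {r_hat:.3f}")
```

Because the estimate depends on the data only through their ranks, it is unaffected by the non-normal margins, which is the property that motivates rank-based approaches in the first situation described above.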


【Author】Niu, Xiaoyue

【Affiliation】University of Washington

【Year】2010

【Pages】96 p.

【Original format】PDF

【Language】English

