The design of optimal incomplete multivariate normal samples
Criteria for the design of multivariate data collection procedures are developed. The design objective is to determine the minimum-cost allocation of resources for gathering data while satisfying specified precision requirements for point estimation. Of primary concern is the cost-reducing potential of making incomplete sets of observations. The multivariate normal distribution is assumed, and estimation is based on the principle of maximum likelihood. Appropriate statistics for estimating elements of the population mean and covariance matrix are developed. The asymptotic properties of these statistics, for a class of problems described as nested samples, are used in determining the optimum allocation of resources. Exact moments of the point estimators are derived for the special case of one complete and one incomplete set of observations and are compared with the asymptotic expressions. The resource allocation problem is formulated as a non-linear programming problem. The solution procedure is illustrated by simulating the incomplete-data design and analysis in estimating the coefficients of a six-variable multinormal regression model. The precision of the estimates and the cost of collecting the data are compared with those of several complete-data analyses.
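To make the special case of one complete and one incomplete set of observations concrete, the following is a minimal sketch for the bivariate setting only, not the procedure developed in this work. It assumes n complete pairs (x, y) plus m additional observations on x alone, and uses the standard factorization of the normal likelihood into the marginal of x and the conditional of y given x. The function name `monotone_mle` and its argument names are hypothetical, chosen for illustration.

```python
import numpy as np

def monotone_mle(xy_complete, x_extra):
    """Maximum likelihood estimates of a bivariate normal mean and covariance
    from a two-level nested sample (illustrative sketch).

    xy_complete: (n, 2) array of complete (x, y) pairs.
    x_extra:     (m,) array of additional observations on x only.
    """
    xy = np.asarray(xy_complete, dtype=float)
    x_all = np.concatenate([xy[:, 0], np.asarray(x_extra, dtype=float)])

    # Marginal of x: use every observation on x.
    # The ML variance divides by N, which is NumPy's default (ddof=0).
    mu_x = x_all.mean()
    s_xx = x_all.var()

    # Conditional of y given x: least squares on the complete pairs only.
    xc, yc = xy[:, 0], xy[:, 1]
    dx = xc - xc.mean()
    dy = yc - yc.mean()
    beta = (dx @ dy) / (dx @ dx)
    resid_var = np.mean((dy - beta * dx) ** 2)

    # Map the marginal and conditional estimates back to the joint parameters.
    mu_y = yc.mean() + beta * (mu_x - xc.mean())
    s_xy = beta * s_xx
    s_yy = resid_var + beta ** 2 * s_xx

    mean = np.array([mu_x, mu_y])
    cov = np.array([[s_xx, s_xy], [s_xy, s_yy]])
    return mean, cov
```

When `x_extra` is empty, the result reduces to the usual complete-data maximum likelihood estimates, which gives a simple sanity check. The extra observations on x sharpen the estimate of the mean of y only through the regression adjustment `beta * (mu_x - xc.mean())`, which is where the cost-saving potential of incomplete observations comes from in this simple case.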