A study of the use of a work sample criterion in test validation research






A work sample test was developed for use as a job performance criterion in a pre-employment test validation study. The work sample test consisted of four tasks representative of the job tasks of secretarial employees in a large organization, and was administered to 94 White, Black, and Hispanic employees. Their performance was scored according to a detailed, objective system, focusing primarily on work speed and the quality of the final products. Interviews were conducted with the supervisors of these employees, in which the supervisors were asked to predict how well their subordinates would perform on the tasks in terms of specific scoring dimensions. Four major issues were addressed in this research.

First, the quality of the work sample as a measure of performance was examined in terms of scoring reliability, internal consistency, and content validity. These data indicated that the work sample was highly relevant to job content, had very high scoring reliability, and was characterized by acceptable consistency and discriminant validity of measurement across tasks. The integrity of the scoring dimensions was supported by a factor analysis of work sample scores.

The second issue was the value and predictability of the work sample as a criterion of job performance. It was found that performance on the work sample was predicted very well and in a logical pattern by cognitive aptitude tests and a typing skills test. Also, the data showed that there was a gain of practical significance in the prediction of the work sample criterion as compared to the prediction of supervisory ratings of the same job skills. These data reflected further on the structure of the work sample and on the value of the work sample as a criterion.

Third, the construct validity of the work sample and of the supervisory predictions of work sample performance was analyzed by examining the intercorrelations between the two assessments of performance. It was concluded that the relatively modest relationship that was observed was the product of the inability of supervisors to make reliable and discriminating ratings of specific dimensions of performance.

Finally, the pattern of results in the Black and White subgroups was examined. Blacks, in general, scored lower than Whites on the work sample test, but these differences were matched by differences on the supervisory predictions of performance and by aptitude test differences. However, the work sample performance of Blacks was predicted much less accurately than that of Whites. The most plausible explanation of this finding was a higher incidence of rating errors by supervisors in rating Black performance. Total work sample performance was predicted quite similarly for Blacks and Whites, but it appeared that aptitude was manifested somewhat differently in White and Black subgroup performance on specific dimensions of the work sample tasks.

The evidence in this research was consistent with the general finding that work samples are worthwhile criteria in test validation research. It was demonstrated that it is possible to develop a complex, realistic simulation of job tasks with acceptable standardization and reliability. Because performance on the work sample closely corresponded to performance on job tasks, the likelihood of demonstrating validity for aptitude measures was maximized, and the performance of each examinee was objectively and fairly evaluated.