Reliability assessment in narrative observations of human behavior






The direct observation of behavior in natural settings offers the researcher a unique and promising data collection procedure. Before the limits of an observational data analysis system can be estimated, the reliability of the observing and data reduction process must be evaluated. This is a multidimensional problem, but most observational research has been evaluated with unidimensional measures of reliability. The objectives of this study were: (a) to develop an ongoing, multidimensional set of assessment techniques that would help alleviate important methodological problems and make the measurement of interobserver agreement possible, and (b) to evaluate the quality of the data collected in a large-scale observational study. The research was conducted within the context of a longitudinal study of patient behavior in a rehabilitation hospital. Trained observers dictated continuous, time-cued narrative descriptions of patient behavior into hand-carried tape recorders. The descriptive records were then transcribed and coded. During one 13-week data collection period (Phase 1), one 1.5-hour observational segment was selected each week, and during that segment two observers observed the target patient simultaneously. The protocols generated from these pairings were then independently coded by a team of coders and used to assess intercoder agreement (i.e., different coders, one observer) and intersystem agreement (i.e., different coders, different observers). In Phase 2, the stability of the system was assessed. High levels of intercoder and intersystem agreement were found, and intercoder agreement was consistently above intersystem agreement. The stability phase of the study indicated some system drift over time, but the change was not greater than the intersystem differences computed on the same data.
The results suggest that high reliabilities can be achieved with narrative observation procedures, but that perceptual limits in the observational process influence agreement. These findings suggest some limits on the kinds of settings in which this observational methodology can be used.
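As an illustration only, the sketch below shows one common way agreement between two coders could be quantified: simple percent agreement and Cohen's kappa, which corrects for chance agreement. The abstract does not state which agreement statistics the study used, so these measures, the function names, and the behavior codes are all assumptions made for the example. It also assumes the two code sequences are already aligned unit by unit.

```python
from collections import Counter


def percent_agreement(codes_a, codes_b):
    """Proportion of units to which both coders assigned the same code."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("code sequences must be non-empty and equal in length")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


def cohens_kappa(codes_a, codes_b):
    """Chance-corrected agreement between two coders (Cohen's kappa)."""
    n = len(codes_a)
    observed = percent_agreement(codes_a, codes_b)
    counts_a, counts_b = Counter(codes_a), Counter(codes_b)
    # Expected agreement if each coder assigned codes independently
    # at their own marginal rates.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical code throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical codes for six aligned units of one observer's protocol
# (intercoder case: different coders, one observer).
coder_1 = ["eat", "talk", "rest", "talk", "eat", "rest"]
coder_2 = ["eat", "talk", "rest", "rest", "eat", "rest"]

print(round(percent_agreement(coder_1, coder_2), 3))
print(round(cohens_kappa(coder_1, coder_2), 3))
```

Under the same assumptions, an intersystem comparison would pass in codes derived from two different observers' protocols of the same segment, provided the time-cued units can be aligned.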



Human behavior, Reliability, Assessments