An evaluation of two National Science Foundation Academic Year Institutes for Earth Science Teachers
Abstract
Problem. The purpose of the present investigation was twofold: (1) to measure the effectiveness of two National Science Foundation Academic Year Institutes for Earth Science Teachers in improving teacher content competency in Earth Science, and (2) to determine the validity of a theoretical evaluation model devised to measure the changes in earth science content knowledge effected by the institute programs. Content was viewed as consisting of both factual knowledge and the mental processes employed in acquiring and using this knowledge. The two objectives were closely related, since both attempted to answer the question: Do National Science Foundation Academic Year Institutes for Earth Science Teachers serve as an effective means of improving teachers' subject matter knowledge of Earth Science?

Population. Thirty-three NSF Academic Year Institute participants from two 1969-70 institutes for Earth Science teachers constituted the population for the study. One institute (University of Houston) had 18 participants and the other (University of Florida) had 15. The two institutes were compared on the means of participants' age, sex, academic preparation, aptitude, and teaching experience, and the two groups were determined to be comparable.

Procedures. Pre-tests of the Earth Science Achievement Test (ESAT), the Watson-Glaser Critical Thinking Appraisal (WGCTA) Form YM, and the Wisconsin Inventory of Science Processes (WISP) were administered to the subjects in October 1969. Post-tests of the same instruments were administered in May 1970. A three-factor mixed-design analysis of variance with repeated measures on two factors (Lindquist Type 7 Design) was used to analyze the data. The design used three factors (institutes, tests, and pre-post) to make one between-subjects comparison and two within-subjects comparisons.
Raw scores were converted to standard scores to achieve equivalent scaling across the three instruments. Appropriate tests for normality of distribution and homogeneity of variance were used to satisfy the assumptions of the research design. The Scheffé method of determining significant differences between means was applied to every source of variation that showed a significant F ratio in the analysis of variance summary table. All tests of significance were made at the .05 level.

Findings and Conclusions. Seven null hypotheses were tested using the results of the analysis of variance and the subsequent Scheffé comparisons of means.

H₀1: There will be no significant difference in mean score (determined over all three evaluative instruments, both pre-tests and post-tests) between participants in Institute A and participants in Institute B. The hypothesis was rejected at the .05 level of significance. The mean score of participants in Institute A was significantly higher than the mean score of participants in Institute B.

H₀2: There will be no significant difference in mean score (determined over both institutes on all three evaluative instruments) between pre-tests and post-tests for participants. The hypothesis was rejected at the .01 level of significance. It was concluded that significant gains were achieved by the end of the course of study.

H₀3: There will be no significant difference in mean score (determined over pre-tests and post-tests for both institutes) among the three evaluative instruments. The hypothesis was accepted at the .05 level of significance. This was anticipated because raw scores had to be reduced to standard scores in order to assess changes and interactions.

H₀4: There will be no significant interaction between institutes and tests; that is, no significant differences in mean scores on the three evaluative instruments will be present for the two institutes when pre-test and post-test scores are combined.
The hypothesis was accepted at the .05 level of significance. It was concluded that the participants in the two institutes displayed similar scoring patterns on the three evaluative instruments.

H₀5: There will be no significant interaction between institutes and time of testing; that is, no significant differences in mean scores for the two institutes will be present at either time of testing when the three evaluative instruments are combined. The hypothesis was rejected at the .05 level of significance. Further analysis with the Scheffé technique led to the conclusion that Institute B gained significantly more than Institute A during the term of the program.

H₀6: There will be no significant interaction between the three evaluative instruments and time of testing; that is, there will be no significant differences in mean scores for the three evaluative instruments at the times of pre-testing and post-testing for participants in both institutes. The hypothesis was rejected at the .05 level of significance. Further analysis with the Scheffé technique led to the conclusion that the gain on the ESAT was significantly greater than the gain on either of the other two instruments (WGCTA and WISP).

H₀7: There will be no significant interaction between mean scores for either of the institutes on any of the three evaluative instruments at pre-test or post-test time. The hypothesis was accepted at the .05 level of significance. It was concluded that no evidence exists that the two institutes differed in their pattern of change in scores from pre-tests to post-tests on the three evaluative instruments.

Significant differences at the .05 level were found for each of four sources of variation (between institutes, within pre-post, and within the interactions of institutes × pre-post and tests × pre-post). Interpretation of the results of the analysis of data indicates that the two NSF Academic Year Institutes for Earth Science Teachers are an effective means of improving teacher content competency in Earth Science.
This conclusion rests on the investigator's prior definition of effectiveness as significant positive gains between pre-test and post-test AYI participant group means on the selected instruments used to measure content competencies. The evaluation model as proposed thus appears to have been a valid means of assessing the effectiveness of the two institutes in improving the content competency of the participants during their course of study at the respective institutions.
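
The conversion of raw scores to standard scores described under Procedures can be illustrated with a minimal sketch. It assumes that "standard scores" means z-scores computed per instrument from the pooled pre-test and post-test scores; the abstract does not state which standardization was used. The function name and all scores below are hypothetical, chosen only for illustration, and are not data from the study.

```python
from statistics import mean, pstdev


def standardize_pooled(pre, post):
    """Convert pre- and post-test raw scores on one instrument to z-scores.

    Assumption: the mean and standard deviation come from the pooled
    pre + post scores, so both occasions share one scale per instrument.
    """
    pooled = list(pre) + list(post)
    center = mean(pooled)
    spread = pstdev(pooled)  # population SD of the pooled scores
    if spread == 0:
        raise ValueError("all scores are identical; z-scores are undefined")
    z_pre = [(x - center) / spread for x in pre]
    z_post = [(x - center) / spread for x in post]
    return z_pre, z_post


# Hypothetical raw scores for four participants on two instruments.
instruments = {
    "ESAT": ([30, 34, 28, 40], [45, 50, 41, 52]),
    "WGCTA": ([70, 75, 68, 80], [72, 78, 70, 81]),
}

for name, (pre, post) in instruments.items():
    z_pre, z_post = standardize_pooled(pre, post)
    overall = mean(z_pre + z_post)       # zero up to rounding, by construction
    gain = mean(z_post) - mean(z_pre)    # pre-to-post gain in SD units
    print(f"{name}: overall mean = {overall:.2f}, standardized gain = {gain:.2f}")
```

Under this assumed standardization, every instrument has the same overall mean once pre-test and post-test scores are combined, which is consistent with the remark under H₀3 that no difference among instruments was expected. Gains can still differ between instruments, which is what the tests × pre-post interaction examines.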