XI. THE EFFECT OF PRACTICE ON THE DISTRIBUTION OF JUDGMENTS

Do ordinary differential judgments (warmer, lighter) approximate more and more nearly, with increase of practice, the normal percentile curve? When a comparison-weight rises step by step in relation to a given standard, will the resultant series of 'heavier' judgments, rising also step by step between 0 per cent. and 100 per cent., tend to approach more and more nearly, as practice accumulates, the form of the cumulative Φ(γ)-function? This is our problem; and we hope to show herein that (1) when degree of normality (P) is high to begin with, it does not appreciably change as practice proceeds; (2) when P on the contrary begins low, it improves notably as practice continues.
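The shape in question may be illustrated by a minimal sketch, which assumes the usual cumulative normal form with γ = h(x − s), where s is the standard and h a measure of precision. The function name, the series of weights, the value of h, and the placing of the 50 per cent. point exactly at the standard are all hypothetical choices made for illustration; none of them is taken from the data discussed here.

```python
import math

def phi_gamma(x, standard, h):
    """Expected proportion of 'heavier' judgments for comparison weight x.

    Assumes the cumulative normal form with gamma = h * (x - standard),
    so that the proportion is 0.5 when x equals the standard.
    """
    gamma = h * (x - standard)
    return 0.5 * (1.0 + math.erf(gamma))

# Hypothetical series: one standard and seven comparison weights (grams).
standard = 100.0
comparisons = [84, 88, 92, 96, 100, 104, 108]
h = 0.12  # hypothetical precision, in reciprocal grams

expected = [phi_gamma(x, standard, h) for x in comparisons]
for x, p in zip(comparisons, expected):
    print(f"{x:>4} g  expected 'heavier': {100 * p:5.1f} per cent.")
```

An observed series of 'heavier' percentages is then 'normal' to the degree that it follows a curve of this kind for some suitable choice of its constants.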

1 The first ten of these Studies have already appeared in Psychological Studies from the University of Illinois (Psychol. Monog., 1926, 35, no. 163), 56–137.

The matter has gained some attention of late (rightly, in our view) from students of psychometric theory; it reaches almost every phase of quantitative psychology and thus goes far beyond the confines of psychophysics and differential sensitivity; so that we incline to Urban's declaration that if practice does indeed prove to bring an observer's judging into better accord with the Φ(γ)-hypothesis, the fact will have "far-reaching practical and theoretical consequences" (12, 496). Despite the reach and import of the question, it has so far been treated in a rather casual way; it has not been seriously and systematically taken in hand; so that we have had a good deal of debate with little in the way of evidence or assured conclusion. We here propose therefore to submit a few facts and inferences in the endeavor first, properly to analyze and define the problem and then to attempt an answer. While the evidence is not broad enough to provide a universal solution, it does reveal, as we hope to prove, some clear and concordant tendencies. The present argument falls under two heads:

(1) In the case of a Φ(γ)-function, what is the true measure of agreement or accord between theory and observation? (2) Does the accord improve with practice (successive samplings from a temporal sequence)?

As for (1), there is no accepted procedure to date for finding how well theory conforms to observation, no consensus among investigators on the true way to measure how widely Urban's curves depart from the empirical values to which they are fitted. One method has been developed by Thomson (9) and used by Rich (8); another (the traditional procedure) was originally adopted by Urban (10), later used by Hoisington (7) and Fernberger (6) and accepted by Boring (1). The two procedures, of course, differ; but the superiority of Thomson's can readily be shown. By the 'best-fitting' curve of a given kind we mean the most probable curve of this type or formula that can be laid through the body of data in question; that curve in turn is 'most probable,' by the accepted theory of least squares, whose Σd² is smaller than for any other function of the same type that can be used to graduate these data. Now the distinctive feature of the Urban-hypothesis, which alone differentiates it from Müller's psychometric formula, is that Urban multiplies each d² by 1/pq. When p = q = 1/2 this factor amounts to 4.0, but when p = .01 or .99 the reciprocal of pq rises to 101.0, or more than 25 times as large. With Müller, then, one d counts just as much as another, all have the same importance; every frequency contributes equally to the course of the fitted curve; whereas Urban penalizes a d near 0 per cent. and 100 per cent. as being far more serious than in the region of 50 per cent.; these extreme values influence the course of the smooth curve some 10 to 25 times as much as do the medial percentages. It follows that Urban's Φ(γ) fits the tails far more neatly than the middle range of its distribution, a procedure which is both demanded and warranted by the greater stability of judgment near the extremes. Since all these curves are so fitted as to minimize Σ(d²/pq), this sum is obviously the only true index or measure of agreement between theory and observation; of n empirical distributions, the one whose graduated curve shows the lowest Σ(d²/pq) is by definition the 'best fitting' of the lot. Inasmuch as the raw Σd² has no stated relation of any kind to this weighted sum, which alone is used in graduating the observed figures, it would be strange indeed to make use of the former when comparing goodness of fit. Σd² is commonly used to measure goodness of fit merely because functions are commonly so fitted as to minimize Σd²; but when Σd² has no part whatsoever in fixing the course of the fitted curve, it has likewise no place in measuring how well the resultant curve fits. Inasmuch as Urban has never defended the adoption of Σd², we need stay with the matter no longer.
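The contrast between the two indices may be made concrete by a minimal sketch. It takes the weight as 1/pq, as stated above, and assumes that p and q are the fitted (theoretical) proportions; the function names and the two sets of 'observed' proportions are hypothetical and are not data from any of the studies cited.

```python
def urban_weight(p):
    """Weight 1/(p*q) attached to a squared deviation at proportion p."""
    q = 1.0 - p
    return 1.0 / (p * q)

def raw_sum_of_squares(observed, fitted):
    """Unweighted index: sum of d^2, with d = observed - fitted."""
    return sum((o - f) ** 2 for o, f in zip(observed, fitted))

def weighted_sum_of_squares(observed, fitted):
    """Weighted index: sum of d^2 / (p*q).

    Assumption: p is taken as the fitted proportion, not the observed one.
    """
    return sum((o - f) ** 2 * urban_weight(f) for o, f in zip(observed, fitted))

# The two weights quoted in the text: 4.0 at p = .50, about 101 at p = .01.
w_mid = urban_weight(0.50)
w_tail = urban_weight(0.01)
print(w_mid, round(w_tail, 2), round(w_tail / w_mid, 2))

# Hypothetical fitted proportions and two hypothetical observed series,
# each with a single deviation of .03, once in a tail and once at the middle.
fitted = [0.02, 0.10, 0.30, 0.50, 0.70, 0.90, 0.98]
miss_in_tail = [0.05, 0.10, 0.30, 0.50, 0.70, 0.90, 0.98]
miss_in_middle = [0.02, 0.10, 0.30, 0.53, 0.70, 0.90, 0.98]

for name, obs in [("tail", miss_in_tail), ("middle", miss_in_middle)]:
    print(name, raw_sum_of_squares(obs, fitted), weighted_sum_of_squares(obs, fitted))
```

In this sketch the raw sum rates the two series as equally well fitted, while the weighted sum counts the deviation in the tail as much the more serious; the two indices may therefore rank the same distributions differently.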

(2) Does goodness of fit, when properly measured, improve with practice (samplings taken from a temporal sequence)?

Here as in question (1) opinion is conflicting and tentative. Thomson (3, 90) discovered that, in the case of Urban's seven observers, the best approximation to Φ(γ) appeared with the second (II, Urban himself), who was presumably more practiced than any of the others. Hoisington also finds that fit improves with practice: "we approach the theoretical function as we increase the number of observations. This relationship suggests that the Φ(γ)-hypothesis is the correct hypothesis for these conditions, and that deviations from it are due in actual cases not to errors of theory but to errors of observation" (7, 595). Urban, being impressed with these two findings and believing that confirmation of them would have far-reaching practical and theoretical consequences, asks other investigators to put the same question to any data they may have (12, 496); whereupon Fernberger computes Σd² for some earlier figures of his own (28 sets of 50 each for two observers) and concludes that, while the decrease in Σd² is far from regular, the general tendency toward improvement of fit is evident (6, 500). Rich, testing the same data by Thomson's method, finds that "No general conclusion as to the effect of practice seems possible from the data at hand" (8, 620). Boring, finally, is impatient of the whole problem and contends that seeking for any general psychometric function [e.g., Φ(γ)] is like "following a wandering fire," the likelihood of finding a generalized formula being so meager as to make the topic unsuitable even for investigation (1, 770). Despite these conflicting and hesitant conclusions, we believe that the problem, when better analyzed and treated with additional evidence, will put us in the way of its own solution.

The value of P (Pearson's symbol for goodness of fit, which may range from .00 to 1.00) is of course fixed by the concurrence of many factors (e.g., observer, type of material, categories of response); but the two which alone concern us here are: (a) how P varies with practice (the temporal position of a given sample); (b) how P correlates with h (the measure of precision, whose magnitude varies inversely to the limen). Let us consider the two cases in turn.

(a) If we are to speak usefully about the matter, we need to distinguish two forms of 'practice'; first, general facility in the technique of observation, and secondly, specialized skill in some particular function (as, discriminating weights, greys or temperatures). A 'practiced' observer in the general meaning is conversant with the methodology of scientific observing, knows by dint of adequate training how to form and use criteria, is careful to keep conditions within and without the organism as constant as may be, and the like; he is then a trained O, though he may never have lifted a weight nor compared two thermal impressions. When he begins to observe in a particular field (lifted weights), is there any reason to suppose that his responses will of necessity distribute more and more normally as time passes? Not at all; if he uses the same care in judging at first as he does later, his distributions will not appreciably change in type. In psychometry, so far as known, normality demands nothing but a situation in stable equilibrium; it has no necessary dependence upon temporal order; a beginner who is careful about criteria and attitude may accord just as well with normality in the first stages of his work as in the last, even though his sensitivity or limen shows meanwhile great improvement. If on the contrary O at first is naïve, ignorant of good methodology and indifferent to the demands of his work, then indeed will his judgments be erratic, because they issue from instable and ill-defined criteria; his frequencies may then be expected to depart widely from their theoretical values. As he proceeds, he will tend not only to lower his limen but to define and stabilize the criteria as well; but the reader will note that these two forms of 'practice,' though they may of course proceed concomitantly, are by no means identical nor do they always correlate. This being true, the question whether practice leads to greater accord with normality can be answered only when we know what kind of 'practice' is meant; debate is irrelevant until the situation is properly defined. In short, an observer whose criteria and attitude improve pari passu with his increase in proficiency will distribute his frequencies more and more normally as judging proceeds; while an O who is careful and consistent from the start will show little if any change, even though his precision be markedly increasing all the while.

We now come to question (b): How does P correlate with the precision, h? Now h (the traditional measure of precision) is, in our opinion, the true index of differential sensitivity by the Urban procedure (see article IX of this series: 5). If we define the limen as the probable error of the Φ(γ)-curve which is fitted to a given set of frequencies, then h varies inversely to the limen by the simple relation

p.e. = .4769/h.

Here h may range from zero to infinity; when h is zero the limen (p.e.) is infinite and when h becomes infinite the limen is zero. Graphically, this means (Fig. 1) that the ogive begins with a
