
Each of these pairs represents a relationship, the entire series reading: A deviation in A of -7 from the central tendency of A brought with it a deviation in B of 5 from the central tendency of B; a deviation in A of 5 brought in one case a deviation in B of -5, in a second case one of 3, and in a third case one of -1, etc.

Consider now two measures, each expressing an important fact concerning this series of 30 individual relationships. The first is

$$\frac{\Sigma(AB)}{\sqrt{\Sigma A^2}\,\sqrt{\Sigma B^2}} = .634.$$

The second is the median of the 30 B/A ratios = .65. The former is of course the Pearson Coefficient of correlation for A-B; the latter is the Median or Mid Ratio B/A.
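As a rough illustration of the two measures, the following Python sketch computes both from a list of paired deviations. The ten pairs are hypothetical stand-ins, since the full series of 30 is not reproduced here, so the results will not match .634 and .65. The function names, and the choice to skip any pair whose A deviation is zero, are assumptions made for the sketch.

```python
from math import sqrt
from statistics import median

# Hypothetical deviations from the central tendencies (NOT the author's 30 pairs).
A = [-7, -5, -3, -2, -1, 1, 2, 3, 5, 7]
B = [-5, -6, -1, -3, 1, 2, 1, 4, 3, 6]

def pearson_from_deviations(a, b):
    """Sum(AB) / (sqrt(Sum A^2) * sqrt(Sum B^2)) for deviation scores."""
    sum_ab = sum(x * y for x, y in zip(a, b))
    return sum_ab / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

def median_ratio(a, b):
    """Median of the individual B/A ratios; pairs with A == 0 are skipped (assumption)."""
    return median(y / x for x, y in zip(a, b) if x != 0)

r = pearson_from_deviations(A, B)
mid_ratio = median_ratio(A, B)
print(round(r, 3), round(mid_ratio, 3))
```

Swapping the arguments of `median_ratio` gives the median of the A/B ratios, which is a separate quantity.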

What the former measures can not be stated except in terms not yet given by the individual relationships themselves. Professor Pearson's own statements for instance are in terms of certain facts of a correlation diagram such as Fig. 1, not in terms of the individual relationships.

It is clear that in the case of Fig. 1, which represents our 30 relationships graphically, the slope of the straight line LL1 through O, so drawn that the sum of the deviations of the individual dots from it is zero (measuring deviations in the direction of the B line, calling deviations above the line in the left-hand half of the surface and below the line in the right-hand half of the surface +, and calling deviations below the line in the left-hand half and above the line in the right-hand half −), is a measure of an important fact about the series of relationships.1

1 In this case the slope is roughly 73 per cent. of 45°, the slope which would be found were correlation perfect. The slope for the A's taken as dependent on the B's is roughly 64 per cent. of 45°.

The Pearson Coefficient does not, however, measure the slope of just such a line as we have supposed to be drawn in Fig. 1 and described in the last paragraph. Its line is not so calculated as to make the deviations from it toward closer correlation equal to the deviations from it toward less correlation, but is so calculated as to make the sum of the squares of the deviations from it least.
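The contrast between the two lines through O can be sketched in Python. The sketch assumes the balancing condition amounts to sum(sign(A) * (B - m*A)) = 0, which gives m = sum(sign(A) * B) / sum(|A|), and it compares this with the least-squares slope through O on the raw deviations. The data and function names are hypothetical, and dots with A = 0 are taken to contribute nothing to the balance.

```python
# Hypothetical deviations from the central tendencies (NOT the author's 30 pairs).
A = [-7, -5, -3, -2, -1, 1, 2, 3, 5, 7]
B = [-5, -6, -1, -3, 1, 2, 1, 4, 3, 6]

def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)

def slope_balanced(a, b):
    """Slope m of B = m*A through O for which the signed deviations sum to zero.

    A deviation (B - m*A) changes sign with the half of the surface it lies in,
    so the condition is sum(sign(A) * (B - m*A)) == 0.
    """
    return sum(sign(x) * y for x, y in zip(a, b)) / sum(abs(x) for x in a)

def slope_least_squares(a, b):
    """Slope m of B = m*A through O that minimizes sum((B - m*A)**2)."""
    return sum(x * y for x, y in zip(a, b)) / sum(x * x for x in a)

m_balanced = slope_balanced(A, B)
m_squares = slope_least_squares(A, B)
print(m_balanced, m_squares)
```

In the first slope each dot enters in proportion to |A|, in the second in proportion to A squared, which is one way to see the heavier weight that squaring gives to the remote deviations.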

This of course weights the extreme deviations much more than those near the center of the surface, for the same change in the slope of the line alters the sum of the squares of the deviations from the line near the center of the surface far less than that of the remote deviations. This is a possibly questionable feature of the Pearson Coefficient.

Moreover it is calculated as the slope of this line of so-called 'regression' as found when the two traits are reduced to equivalence of variability and double entries are made in the correlation table, i. e., B's as related to A's and A's as related to B's, the two sets of entries being so superposed that the intersection of the means in the one case coincides with the intersection of the means in the other case.
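The double-entry construction can be checked numerically. The Python sketch below assumes variability is measured by the root-mean-square deviation; it reduces both series to equal variability, enters every pair twice (B on A and A on B), and fits the least-squares line through the common origin. The data and names are hypothetical.

```python
from math import sqrt

# Hypothetical deviations from the central tendencies (NOT the author's 30 pairs).
A = [-7, -5, -3, -2, -1, 1, 2, 3, 5, 7]
B = [-5, -6, -1, -3, 1, 2, 1, 4, 3, 6]

def rms(values):
    """Root-mean-square deviation, used here as the measure of variability."""
    return sqrt(sum(v * v for v in values) / len(values))

def origin_slope(xs, ys):
    """Least-squares slope of the line through the origin, ys on xs."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# Reduce both traits to equivalence of variability.
a = [x / rms(A) for x in A]
b = [y / rms(B) for y in B]

# Double entry: every pair appears once as (a, b) and once as (b, a).
double_x = a + b
double_y = b + a
double_entry_slope = origin_slope(double_x, double_y)

# The coefficient computed directly from the product formula.
pearson = sum(x * y for x, y in zip(A, B)) / (
    sqrt(sum(x * x for x in A)) * sqrt(sum(y * y for y in B))
)
print(double_entry_slope, pearson)
```

The two printed values agree up to rounding error. Once the variabilities are equal, each single regression fitted this way has the same slope; the doubled table only makes the symmetry between the two traits explicit.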

Professor Pearson gives many readers the impression that his coefficient of correlation is calculated as the slope of the straight line through O to fit the points in the correlation diagram that represent the means of the arrays1 (the two related series being reduced to an equivalence in variability and entered doubly), but in fact it is the slope of the line from which the sum of the squares of the deviations of all the dots each representing one relationship is least, not the slope of the line from which the sum of the squares of the deviations of the dots representing each the mean of one array is least. It is in our illustration a line to fit the dots of Fig. 3, not those of Fig. 2. That is, an array of 100 cases is (quite properly) given greater weight than one of 2 cases.

1 See, for instance, 'Grammar of Science,' 2d edition, 1900, p. 393 and p. 396.

Consider now the Pearson Coefficient from another point of view. Let us for the present restrict relationships to those between two series of the same form of distribution, and also define perfect correlation as a relationship such that any deviation of A from its central tendency will imply a deviation of B from B's central tendency which shall be the same fraction of B's variability that the deviation of A is of A's variability. That is,

$$\frac{B}{\text{variability of } B} = \frac{A}{\text{variability of } A}$$

for each pair of deviations. With the two series reduced to equal variability, perfect correlation would find each deviation of A accompanied by an identical deviation of B. The sum of the AB products would be equal to the sum of the $A^2$, or to the sum of the $B^2$, or to $\sqrt{\Sigma A^2}\,\sqrt{\Sigma B^2}$. In the case of two series of the same form of distribution and of equal variability the Pearson Coefficient formula then measures the proportion which the sum of the series $A_1B_1$, $A_2B_2$, etc., is of what it would be with perfect correlation as defined.

It can be shown that without reducing B or A to equivalence in variability perfect correlation as defined would give for the sum of the AB products $\sqrt{\Sigma A^2}\,\sqrt{\Sigma B^2}$, provided the form of distribution of A is the same as that of B.
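The steps can be written out briefly. The sketch assumes that perfect correlation as defined amounts to $B_i = kA_i$ for every pair, with $k > 0$ the ratio of B's variability to A's; the symbol $k$ and the subscript $i$ are notation introduced here for illustration, not the author's.

```latex
\begin{align*}
\Sigma(AB) &= \Sigma(A_i \cdot kA_i) = k\,\Sigma A^2, \\
\Sigma B^2 &= \Sigma(kA_i)^2 = k^2\,\Sigma A^2, \\
\sqrt{\Sigma A^2}\,\sqrt{\Sigma B^2} &= \sqrt{\Sigma A^2}\cdot k\sqrt{\Sigma A^2} = k\,\Sigma A^2 = \Sigma(AB).
\end{align*}
```

Under that assumption the ratio of the sum of the AB products to $\sqrt{\Sigma A^2}\,\sqrt{\Sigma B^2}$ is 1 whatever the value of $k$.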

The Pearson Coefficient measures, then, in cases where the form of distribution of the two facts to be related is the same, the proportion which the sum of the AB products is of what it would be were correlation perfect.

There is no ambiguity as to what is measured by the median of the B/A ratios. Whatever the distributions may be or the ratios, the median means always a definite thing: the ratio B/A which is exceeded in magnitude by as many of the ratios as it exceeds. We have only to note that the median of the B/A's and the median of the A/B's are two different things and that if we are interested in representing in one number both what a given A deviation implies with respect to B and what a given B deviation implies with respect to A, we must use both the B/A and the A/B median.

Certain other measures deserve mention. The directly calculated average of all the individual relationships B/A or A/B is a perfectly comprehensible measure but rather a useless one. The Modal Ratio B/A or A/B is also a perfectly clear conception and, in cases where it can be easily and accurately determined, a very valuable one.

The per cent. of direct or the per cent. of inverse relationships is equally comprehensible, and is an important function of the closeness of relationship.
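A count of this kind is simple to sketch in Python. The version below treats a pair as direct when the two deviations have like signs and inverse when they have unlike signs, and it leaves out any pair containing a zero deviation; that treatment of zeros, the data and the function name are all assumptions.

```python
# Hypothetical deviations from the central tendencies (NOT the author's 30 pairs).
A = [-7, -5, -3, -2, -1, 1, 2, 3, 5, 7]
B = [-5, -6, -1, -3, 1, 2, 1, 4, 3, 6]

def percent_direct_inverse(a, b):
    """Return (per cent. of like-signed pairs, per cent. of unlike-signed pairs)."""
    direct = sum(1 for x, y in zip(a, b) if x * y > 0)
    inverse = sum(1 for x, y in zip(a, b) if x * y < 0)
    counted = direct + inverse  # pairs with a zero deviation are left out (assumption)
    return 100 * direct / counted, 100 * inverse / counted

pct_direct, pct_inverse = percent_direct_inverse(A, B)
print(pct_direct, pct_inverse)
```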

[Table: only the following row of frequencies is recoverable.]

1 1 1 2 3 5 6 9 12 16 20 26 31 37 43 50 54 59 62 63 63 62 59 54 50 43 37 31 26 20 15 12 9 6 5 3 2 1 1 1

When the individual values of A and B are not measured as amounts of deviation from their central tendencies, but only as so many A1's known to be less than Z and so many A2's greater than Z, and as so many B1's less than W and so many B2's greater than W, the per cents. of A1B1 pairs, A1B2 pairs, A2B1 pairs and A2B2 pairs give important information.
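A fourfold tabulation of this kind might be sketched in Python as follows. The measurements and the cut points `z` and `w` are hypothetical, and a value falling exactly on a cut point is placed in the upper class, a case the text does not address.

```python
from collections import Counter

def fourfold_percents(a_vals, b_vals, z, w):
    """Per cent. of pairs in each of the classes A1B1, A1B2, A2B1, A2B2.

    A1 means A below z and A2 means A at or above z (ties go up: assumption);
    B1 and B2 are defined likewise with the cut point w.
    """
    counts = Counter()
    for a, b in zip(a_vals, b_vals):
        key = ("A1" if a < z else "A2") + ("B1" if b < w else "B2")
        counts[key] += 1
    total = sum(counts.values())
    return {k: 100 * counts[k] / total for k in ("A1B1", "A1B2", "A2B1", "A2B2")}

# Hypothetical raw measurements and cut points.
a_vals = [3, 8, 5, 12, 9, 1, 7, 10]
b_vals = [2, 9, 7, 11, 4, 3, 8, 6]
table = fourfold_percents(a_vals, b_vals, z=6, w=5)
print(table)
```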

The number and amount of the divergences of the ranks of the second members from the ranks of their related first members also give important information.
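One possible reading of "number and amount" is sketched below in Python: the number of pairs whose two ranks differ, and the sum of the absolute rank differences. Both summaries are an interpretation, not a formula given in the text; the data are hypothetical and tied values are assumed not to occur.

```python
def ranks(values):
    """Rank each value, 1 = smallest. Assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result

def rank_divergences(first, second):
    """Return (number of pairs whose ranks differ, sum of absolute rank differences)."""
    diffs = [rb - ra for ra, rb in zip(ranks(first), ranks(second))]
    return sum(1 for d in diffs if d != 0), sum(abs(d) for d in diffs)

# Hypothetical raw measurements.
first = [3, 8, 5, 12, 9, 1, 7, 10]
second = [2, 9, 7, 11, 4, 3, 8, 6]
number, amount = rank_divergences(first, second)
print(number, amount)
```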

[FIG. 4.]

If the two related facts are of the so-called normal distribution and the relationship is uniform for all amounts of A and each array is also a normal distribution, the Median Ratio, the Modal Ratio and the Pearson Coefficient will, if the two series are reduced to equivalence in variability, coincide and will equal cosine πU.1 This is the case of so-called normal correlation approximated in many organic and hereditary anatomical relationships. It is of course only one of many possible types of relationship. The extent to which it prevails in mental and social relationships is not known. Its prevalence in the case of anatomical facts has probably been overestimated.
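The relation between the coefficient and the proportion of unlike-signed pairs can be tried on simulated data. The Python sketch below reads U as a proportion (not a percentage) and takes the cosine of π times U, the form in which the relation holds for a normal surface; that reading of the formula is an assumption. The correlation of .5, the sample size and the seed are arbitrary choices made for the sketch.

```python
import random
from math import cos, pi, sqrt

def unlike_signed_fraction(xs, ys):
    """Proportion of pairs whose two deviations have unlike signs (zeros skipped)."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x != 0 and y != 0]
    return sum(1 for x, y in pairs if x * y < 0) / len(pairs)

random.seed(0)   # fixed seed so the run is repeatable
rho = 0.5        # correlation built into the simulated pairs (arbitrary choice)
n = 20000

xs, ys = [], []
for _ in range(n):
    x = random.gauss(0, 1)
    e = random.gauss(0, 1)
    xs.append(x)
    # y has unit variance and correlation rho with x.
    ys.append(rho * x + sqrt(1 - rho ** 2) * e)

U = unlike_signed_fraction(xs, ys)
estimate = cos(pi * U)
print(U, estimate)
```

With a sample of this size the estimate should fall close to the built-in value of .5, and U close to one third.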

Table IX. gives the facts of the relationship between two series both of the same form of distribution, almost exactly the so-called normal, and of the same variability, the relationship being devised artificially so that the average of each array of y is .5 × the corresponding value of x. This regression of y on x is shown graphically in Fig. 4, which gives the average of each array of the y's. The regression of x on y is shown graphically in Fig. 5, which gives the average of each array of the x's. The Pearson Coefficient for this case is .53. The Median Ratio is much higher (.60 for the y/x

1 U equalling the per cent. of unlike-signed pairs.
