Thirteen Ways to Look at the Correlation Coefficient
In 1885, Sir Francis Galton first defined the term "regression" and completed the theory of bivariate correlation. A decade later, Karl Pearson developed the index that we still use to measure correlation, Pearson's r. Our article is written in recognition of the 100th anniversary of Galton's first discussion of regression and correlation.
We begin with a brief history. Then we present 13 different formulas, each of which represents a different computational and conceptual definition of r. Each formula suggests a different way of thinking about this index, from algebraic, geometric, and trigonometric settings. We show that Pearson's r (or simple functions of r) may variously be thought of as a special type of mean, a special type of variance, the ratio of two means, the ratio of two variances, the slope of a line, the cosine of an angle, and the tangent to an ellipse, and may be looked at from several other interesting perspectives.
1. CORRELATION AS A FUNCTION OF RAW SCORES AND MEANS
2. CORRELATION AS STANDARDIZED COVARIANCE
3. CORRELATION AS STANDARDIZED SLOPE OF THE REGRESSION LINE
4. CORRELATION AS THE GEOMETRIC MEAN OF THE TWO REGRESSION SLOPES
5. CORRELATION AS THE SQUARE ROOT OF THE RATIO OF TWO VARIANCES (PROPORTION OF VARIABILITY ACCOUNTED FOR)
6. CORRELATION AS THE MEAN CROSS-PRODUCT OF STANDARDIZED VARIABLES
7. CORRELATION AS A FUNCTION OF THE ANGLE BETWEEN THE TWO STANDARDIZED REGRESSION LINES
8. CORRELATION AS A FUNCTION OF THE ANGLE BETWEEN THE TWO VARIABLE VECTORS
If the variable vectors are based on centered variables, then the correlation has a straightforward relationship to the angle α between the variable vectors (Rodgers 1982): r = cos(α). (8.1)
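The relationship in (8.1) can be checked numerically. The following is a minimal sketch, not taken from the article: the function names and the five-point data set are illustrative assumptions. It computes r once from raw scores and once as the cosine of the angle between the two centered variable vectors, and the two values should agree up to floating-point error.

```python
import math

def pearson_r_raw(x, y):
    """Pearson's r from raw-score sums (hypothetical helper for illustration)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_xx = sum(a * a for a in x)
    sum_yy = sum(b * b for b in y)
    numerator = sum_xy - sum_x * sum_y / n
    denominator = math.sqrt((sum_xx - sum_x ** 2 / n) * (sum_yy - sum_y ** 2 / n))
    return numerator / denominator

def cosine_between_centered(x, y):
    """Cosine of the angle between the centered variable vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cx = [a - mean_x for a in x]  # centered x vector
    cy = [b - mean_y for b in y]  # centered y vector
    dot = sum(a * b for a, b in zip(cx, cy))
    norm_x = math.sqrt(sum(a * a for a in cx))
    norm_y = math.sqrt(sum(b * b for b in cy))
    return dot / (norm_x * norm_y)

# Made-up example data (an assumption, not from the article).
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

r = pearson_r_raw(x, y)
cos_angle = cosine_between_centered(x, y)
angle_degrees = math.degrees(math.acos(cos_angle))

print(f"r from raw scores:      {r:.4f}")
print(f"cosine of vector angle: {cos_angle:.4f}")
print(f"angle in degrees:       {angle_degrees:.2f}")
```

For this data both routes give r = 0.8, which corresponds to an angle of about 36.87 degrees between the centered vectors. Centering matters here: the cosine between the uncentered vectors generally differs from r.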
9. CORRELATION AS A RESCALED VARIANCE OF THE DIFFERENCE BETWEEN STANDARDIZED SCORES