Correlation coefficient

In many cases, the independent variable $ x$ and the dependent variable $ y$ are both continuous (taking on real values). This enables another important measure called the Pearson correlation coefficient (or Pearson's r). This estimates the amount of linear dependency between the two variables. For each subject $ i$, the treatment (or level) $ x[i]$ is applied and the response is $ y[i]$. Note that in this case, there are no groups (or every subject is a unique group). Also, any treatment could potentially be applied to any subject; the index $ i$ only denotes the particular subject.

The r-value is calculated as the estimated covariance between $ x$ and $ y$, treated as random variables, normalized so that it always lies in $ [-1, 1]$:

$\displaystyle r = \frac{\displaystyle\sum_{i=1}^n (x[i] - \hat{\mu}_x) (y[i] - \hat{\mu}_y)}{\sqrt{\displaystyle\sum_{i=1}^n (x[i] - \hat{\mu}_x)^2} \; \sqrt{\displaystyle\sum_{i=1}^n (y[i] - \hat{\mu}_y)^2}} ,$ (12.7)

in which $ \hat{\mu}_x$ and $ \hat{\mu}_y$ are the averages of $ x[i]$ and $ y[i]$, respectively, over the set of all subjects. Up to a factor of $ n$, which cancels with the same factor implicit in the numerator, the denominator is the product of the estimated standard deviations: $ \hat{\sigma}_x \hat{\sigma}_y$.
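The computation in (12.7) can be sketched directly; here is a minimal Python version (the function name `pearson_r` is ours, not from the text):

```python
import math

def pearson_r(x, y):
    # Pearson correlation coefficient, computed term by term from Eq. (12.7).
    n = len(x)
    mu_x = sum(x) / n  # average of the x[i]
    mu_y = sum(y) / n  # average of the y[i]
    num = sum((x[i] - mu_x) * (y[i] - mu_y) for i in range(n))
    den = (math.sqrt(sum((xi - mu_x) ** 2 for xi in x))
           * math.sqrt(sum((yi - mu_y) ** 2 for yi in y)))
    return num / den

# A perfect increasing linear relationship yields r = 1; flipping the
# sign of the slope yields r = -1.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0 * xi + 5.0 for xi in x]
r = pearson_r(x, y)
print(r)  # → 1.0 (up to floating-point rounding)
```

Note that shifting or rescaling either variable leaves $ r$ unchanged, since the means are subtracted and the denominator divides out the scale.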

The possible r-values range between $ -1$ and $ 1$. Three qualitatively different outcomes can occur:

$ r > 0$: The variables are positively correlated; $ y[i]$ tends to increase as $ x[i]$ increases, with $ r = 1$ in the case of a perfect increasing linear relationship.

$ r = 0$: The variables are linearly uncorrelated.

$ r < 0$: The variables are negatively correlated; $ y[i]$ tends to decrease as $ x[i]$ increases, with $ r = -1$ in the case of a perfect decreasing linear relationship.

In practice, it is highly unlikely to obtain exactly $ r = 0$ from experimental data; therefore, the absolute value $ \vert r\vert$ gives an important indication of the likelihood that $ y$ depends on $ x$. The estimate would match the null hypothesis ($ r = 0$) exactly only as the number of subjects tends to infinity.
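The point above can be illustrated numerically: even when $ x$ and $ y$ are generated independently, a finite sample almost never produces exactly $ r = 0$. A small Python sketch (the helper `pearson_r` simply implements (12.7) and is not part of the text):

```python
import math
import random

def pearson_r(x, y):
    # Pearson correlation coefficient from Eq. (12.7).
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    num = sum((x[i] - mu_x) * (y[i] - mu_y) for i in range(n))
    den = (math.sqrt(sum((xi - mu_x) ** 2 for xi in x))
           * math.sqrt(sum((yi - mu_y) ** 2 for yi in y)))
    return num / den

random.seed(0)          # fixed seed so the run is reproducible
n = 20                  # a small number of subjects
x = [random.gauss(0.0, 1.0) for _ in range(n)]
y = [random.gauss(0.0, 1.0) for _ in range(n)]  # independent of x
r = pearson_r(x, y)
print(r)  # nonzero in spite of true independence
```

Increasing `n` drives such spurious values of $ r$ toward $0$, which matches the statement that the estimate agrees with the null hypothesis only in the limit of infinitely many subjects.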

Steven M LaValle 2016-12-31