There is a non-parametric test for an association (not necessarily linear) between two variables, called Spearman's Rank Correlation Test, which can be used when the assumptions of the (parametric) correlation test are not satisfied.
The only requirements of this non-parametric test are that the data are paired and result from a simple random sample, and that the data can be ranked (if they are not ranks already).
Essentially, all this test does is find ranks $x_i$ and $y_i$ for each pair of $X_i$ and $Y_i$ values and then run Pearson's correlation test on these ranks.
Recall that

$$r = \frac{s_{xy}}{s_x s_y} = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2}\,\sqrt{\sum_i (y_i - \bar{y})^2}}$$

We denote this value as $r_S$ when it is computed from ranks to avoid confusion.
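The idea of running Pearson's formula on ranks can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the helper names `ranks`, `pearson_r`, and `spearman_r` are hypothetical, and the ranking helper assumes there are no ties.

```python
import math

def ranks(values):
    """Rank values from 1 (smallest) to n. Assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

def pearson_r(x, y):
    """Pearson's r computed directly from the sums in the formula above."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    s_xx = sum((a - x_bar) ** 2 for a in x)
    s_yy = sum((b - y_bar) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)

def spearman_r(x, y):
    """r_S: Pearson's r applied to the ranks of x and y (no ties assumed)."""
    return pearson_r(ranks(x), ranks(y))

# A monotone but non-linear relationship (y = x^3, made-up data):
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
print(spearman_r(x, y))  # essentially 1, since the ranks agree perfectly
print(pearson_r(x, y))   # less than 1, since the raw relationship is not linear
```

Because only the ranks enter the calculation, any strictly increasing relationship gives $r_S = 1$, which is why the test detects associations that need not be linear.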
Procedurally, one ranks each sample separately. Then, for each pair, one finds the difference of ranks $d_i = x_i - y_i$.
The test statistic $r_S$, when there are no rank ties, can be simplified to

$$r_S = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$$

To see this, first note that as there are no ties, the $x_i$'s and $y_i$'s both consist of the integers from $1$ to $n$, inclusive.
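Consequently, $\sum_i (x_i - \bar{x})^2 = \sum_i (y_i - \bar{y})^2$, so the two square roots in the denominator multiply to a single sum and we can rewrite $r_S$ as

$$r_S = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}$$

Ultimately, the denominator is just a function of $n$. Using $\bar{x} = \frac{n+1}{2}$:

$$\begin{aligned}
\sum_{i=1}^n (x_i - \bar{x})^2 &= \sum_{i=1}^n x_i^2 - 2\sum_{i=1}^n x_i \bar{x} + \sum_{i=1}^n \bar{x}^2 \\
&= \left[\sum_{i=1}^n x_i^2\right] - 2n\bar{x}\left[\frac{\sum_{i=1}^n x_i}{n}\right] + n\bar{x}^2 \\
&= \left[\sum_{i=1}^n i^2\right] - 2n\bar{x}^2 + n\bar{x}^2 \\
&= \left[\sum_{i=1}^n i^2\right] - n\bar{x}^2 \\
&= \frac{n(n+1)(2n+1)}{6} - n\left(\frac{n+1}{2}\right)^2 \\
&= n(n+1)\left(\frac{2n+1}{6} - \frac{n+1}{4}\right) \\
&= n(n+1)\left(\frac{8n+4}{24} - \frac{6n+6}{24}\right) \\
&= n(n+1)\left(\frac{2n-2}{24}\right) \\
&= \frac{n(n+1)(n-1)}{12} \\
&= \frac{n(n^2-1)}{12}
\end{aligned}$$

As for the numerator...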
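$$\begin{aligned}
\sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y}) &= \sum_{i=1}^n x_i (y_i - \bar{y}) - \sum_{i=1}^n \bar{x}(y_i - \bar{y}) \\
&= \sum_{i=1}^n x_i y_i - \bar{y}\sum_{i=1}^n x_i - \bar{x}\sum_{i=1}^n y_i + n\bar{x}\bar{y} \\
&= \left[\sum_{i=1}^n x_i y_i\right] - n\bar{x}\bar{y} \\
&= \left[\sum_{i=1}^n x_i y_i\right] - n\left(\frac{n+1}{2}\right)^2 \\
&= \left[\sum_{i=1}^n x_i y_i\right] - \frac{n(n+1)(2n+1)}{6} + \frac{n(n^2-1)}{12} \\
&= \left[\sum_{i=1}^n x_i y_i\right] - \sum_{i=1}^n x_i^2 + \frac{n(n^2-1)}{12} \\
&= \frac{2\sum_{i=1}^n x_i y_i}{2} - \frac{\sum_{i=1}^n (x_i^2 + y_i^2)}{2} + \frac{n(n^2-1)}{12} \\
&= \frac{n(n^2-1)}{12} - \frac{\sum_{i=1}^n (x_i^2 - 2x_i y_i + y_i^2)}{2} \\
&= \frac{n(n^2-1)}{12} - \frac{\sum_{i=1}^n (x_i - y_i)^2}{2} \\
&= \frac{n(n^2-1)}{12} - \frac{\sum_{i=1}^n d_i^2}{2}
\end{aligned}$$

Finally, dividing both numerator and denominator by $n(n^2-1)/12$, we can simplify things to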
$$r_S = \frac{\dfrac{n(n^2-1)}{12} - \dfrac{\sum_{i=1}^n d_i^2}{2}}{\dfrac{n(n^2-1)}{12}} = 1 - \frac{6\sum d_i^2}{n(n^2-1)}$$

Critical values can be found in the table below:
Example
Suppose one wishes to use a non-parametric test to test the claim that there is a correlation between one's age and the number of parties they attend in a two-month period, given the following data:
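$$\begin{array}{l|rrrrrrr}
\text{Age} & 16 & 24 & 18 & 17 & 23 & 27 & 32 \\
\text{Parties} & 3 & 2 & 5 & 4 & 0 & 6 & 1
\end{array}$$

First we rank the $x$'s and $y$'s separately: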
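$$\begin{array}{l|rrrrrrr}
\text{Rank of age } (x_i) & 1 & 5 & 3 & 2 & 4 & 6 & 7 \\
\text{Age} & 16 & 24 & 18 & 17 & 23 & 27 & 32 \\
\text{Parties} & 3 & 2 & 5 & 4 & 0 & 6 & 1 \\
\text{Rank of parties } (y_i) & 4 & 3 & 6 & 5 & 1 & 7 & 2
\end{array}$$

Then, for each pair, we find the difference of the ranks and its square.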
$$\begin{array}{l|rrrrrrr}
d & -3 & 2 & -3 & -3 & 3 & -1 & 5 \\
d^2 & 9 & 4 & 9 & 9 & 9 & 1 & 25
\end{array}$$

Now we can calculate the test statistic:
$$r_S = 1 - \frac{6\sum d_i^2}{n(n^2-1)} = 1 - \frac{(6)(66)}{(7)(49-1)} \approx -0.1786$$

Since this test statistic is smaller in absolute value than the corresponding critical value at $\alpha = 0.05$ given in the table above (i.e., C.V. $= 0.786$), we fail to reject the null hypothesis and conclude that there is not sufficient evidence of a correlation.
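As a check on the arithmetic, the worked example can be reproduced with a short Python sketch. The function names `ranks` and `spearman_no_ties` are hypothetical and chosen only for illustration. The shortcut formula is valid only when there are no tied ranks, and the critical value 0.786 is simply copied from the example rather than computed.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n. Assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

def spearman_no_ties(x, y):
    """Shortcut formula r_S = 1 - 6 * sum(d^2) / (n (n^2 - 1)); no ties assumed."""
    n = len(x)
    d_squared = [(rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y))]
    return 1 - 6 * sum(d_squared) / (n * (n ** 2 - 1))

ages = [16, 24, 18, 17, 23, 27, 32]
parties = [3, 2, 5, 4, 0, 6, 1]

r_s = spearman_no_ties(ages, parties)
critical_value = 0.786  # taken from the example (n = 7, alpha = 0.05)

print(round(r_s, 4))              # -0.1786
print(abs(r_s) < critical_value)  # True, so we fail to reject the null hypothesis
```

If SciPy is available, the coefficient returned by `scipy.stats.spearmanr(ages, parties)` should agree with this value, since that function also computes the correlation of the ranks.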