R: Check assumptions, then use the function
prop.test(x, n, alternative, conf.level, correct)
To explain the parameters:
x is a vector of the number of successes seen in the two categories.
n is a vector of the two sample sizes.
alternative is a string that specifies the alternative hypothesis (i.e., "two.sided", "less", or "greater", for $p_1 \neq p_2$, $p_1 \lt p_2$, and $p_1 \gt p_2$, respectively).
conf.level is the confidence level used for the reported confidence interval (0.95 by default); it equals one minus the significance level for the test.
correct is a logical value (i.e., TRUE or FALSE) that indicates whether a "Yates Continuity Correction" should be used. There is a large body of research that suggests this correction is too strict. To perform an uncorrected $z$-test for two proportions (which pools the proportions), specify correct = FALSE to override the default.
As an example of its use, suppose we have two samples of 500 individuals. Everyone in the first sample has lung cancer, while everyone in the second sample is healthy. There are 490 smokers in the first group, but only 400 in the second.
Perform the test in R with:
results = prop.test(x = c(490, 400), n = c(500, 500))
results
which results in:
	2-sample test for equality of proportions with continuity correction

data:  c(490, 400) out of c(500, 500)
X-squared = 80.909, df = 1, p-value < 2.2e-16
alternative hypothesis: two.sided
95 percent confidence interval:
 0.1408536 0.2191464
sample estimates:
prop 1 prop 2 
  0.98   0.80 
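As a rough check on where a statistic of this size comes from, the uncorrected pooled $z$ statistic for this example can be worked by hand. The symbols $\hat{p}_1$, $\hat{p}_2$ (the sample proportions) and $\hat{p}$ (the pooled proportion) are notation introduced here for illustration only.

$$\hat{p} = \frac{490 + 400}{500 + 500} = 0.89, \qquad \hat{p}_1 - \hat{p}_2 = 0.98 - 0.80 = 0.18$$

$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\,(1-\hat{p})\left(\frac{1}{500} + \frac{1}{500}\right)}} = \frac{0.18}{\sqrt{0.89 \cdot 0.11 \cdot 0.004}} \approx 9.10$$

Squaring gives $z^2 \approx 82.74$, a little larger than the continuity-corrected X-squared value of 80.909 reported in the output, which is consistent with the correction making the test slightly more conservative.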
Note that, after running the above, you can access the $p$-value of the test with results$p.value, and the related confidence interval with results$conf.int.
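The correct = FALSE option described earlier can be tried on the same data. The following is a minimal sketch rather than a definitive recipe: it reruns the test without the continuity correction and compares its statistic with a hand-computed pooled $z$ statistic. The variable names (uncorrected, p_hat, p_pool, se_pool, z) are chosen here for illustration and do not come from the original example.

```r
# Same data as the example: smokers out of 500 in each group
x = c(490, 400)
n = c(500, 500)

# Rerun the test without the Yates continuity correction
uncorrected = prop.test(x = x, n = n, correct = FALSE)

# Hand computation of the pooled two-proportion z statistic
p_hat = x / n                      # sample proportions (0.98 and 0.80)
p_pool = sum(x) / sum(n)           # pooled proportion under the null
se_pool = sqrt(p_pool * (1 - p_pool) * sum(1 / n))
z = (p_hat[1] - p_hat[2]) / se_pool

# The squared z statistic should agree with the uncorrected X-squared value
z^2
unname(uncorrected$statistic)

# The components of the result are accessed the same way as before
uncorrected$p.value
uncorrected$conf.int
```

Comparing z^2 with the reported statistic is a quick way to confirm that the uncorrected test and the pooled $z$-test are the same procedure for two-sided alternatives.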