Descriptive Statistics in R and Excel

Measures of Center

In R, finding means and medians can be found with similarly named functions, as seen below.

> data = c(1,2,2,20,4,5,7,3)
> mean(data)
[1] 5.5
> median(data)
[1] 3.5

There is no built-in function for identifying the mode, although one can write a custom function to this end. Importantly, avoid naming this custom function "mode" (although "Mode" with a capital "M" is ok, as R is case-sensitive). This is due to the fact that there is another function named "mode" that is used for something else in R. An example of such a function is given below, along with an example of its application:

> data = c(1,2,2,20,4,5,7,3)
> Mode = function(x) {}
> Mode = function(x) {
+   ux = unique(x)
+   ux[which.max(tabulate(match(x,ux)))]
+ }
> Mode(data)
[1] 2

In Excel, the mean, median, and mode of a given range $R$ can be quickly calculated with the following functions, respectively: $$AVERAGE(R), \quad MEDIAN(R), \quad \textrm{and} \quad MODE(R)$$

Measures of Spread

In R, one can calculate the variance, standard deviation, and interquartile range (IQR) with functions var(), sd(), and IQR(), as seen in the examples below:

> data = c(1,2,2,20,4,5,7,3)
> var(data)
[1] 38
> sd(data)
[1] 6.164414
> IQR(data)
[1] 3.5

Of course, if one needed the population variance or population standard deviation, one could multiply the variance by the appropriate constant:

> data = c(1,2,2,20,4,5,7,3)
> n = length(data)
> n
[1] 8
> pop.var = var(data) * (n-1) / n
> pop.var
[1] 33.25
> pop.sd = sqrt(pop.var)
> pop.sd
[1] 5.766281

The range of values in a data set can also be easily found with the min(), max(), and range(), as shown below:

> data = c(1,2,2,20,4,5,7,3)
> min(data)
[1] 1
> max(data)
[1] 20
> range(data)
[1]  1 20

In Excel, similar statistics related to the spread of data in a given range $R$ are given by the following functions or expressions:

StatisticFunction
Population VarianceVAR.P(R)
Sample VarianceVAR.S(R)
Population Standard DeviationSTDEV.P(R)
Sample Standard DeviationSTDEV.S(R)
Inter-quartile Range (IQR)QUARTILE(R,3) - QUARTILE(R,1)
MaximumMAX(R)
MinimumMIN(R)