R Functions Related to Simulation

To generate a vector of $n$ random values drawn from a uniform distribution on the interval from min to max (by default, from $0$ to $1$), one can use the runif() function.

runif()

Usage
runif(n)
runif(n, min, max)

Example

To generate 10 random values between $0$ and $1$, and then 5 random values between $2$ and $7$, one could use the following:

> runif(10)
 [1] 0.7072679 0.2488529 0.7572154 0.3351405 0.2017931 0.5901582
 [7] 0.4627859 0.8125028 0.3704643 0.7976154

> runif(5,min=2,max=7)
[1] 2.308490 3.929546 2.929231 2.294929 4.137228
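
Because these values are random, rerunning the commands above will print different numbers each time. When a reproducible result is wanted, one option is to call set.seed() first. The following is a minimal sketch; the seed value 42 and the variable names u and v are arbitrary choices made for illustration.

```r
# Fix the seed so the same "random" values are produced on every run.
# The seed value 42 is an arbitrary choice for illustration.
set.seed(42)
u <- runif(5, min = 2, max = 7)

# Every generated value lies between min and max.
all(u >= 2 & u <= 7)

# Resetting the seed to the same value reproduces the same draws.
set.seed(42)
v <- runif(5, min = 2, max = 7)
identical(u, v)
```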

Often in statistics we consider drawing elements from some larger population. R provides the sample() function for this purpose.

sample()

Description
sample takes a sample of the specified size from the elements of x, either with or without replacement.

Usage
sample(x, size = n, replace = FALSE, prob = NULL)

where

$x$ is a vector of one or more elements from which to choose.
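$n$ is the number of items to choose (a non-negative integer; when sampling without replacement it cannot exceed the number of elements of $x$).
$replace$ (optional) indicates whether or not sampling should be done with replacement; the default is FALSE.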
$prob$ (optional) is a vector of probability weights for obtaining the elements of the vector being sampled.
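
Two details of these arguments are easy to overlook, and both are described in R's own help page for sample(). First, when $x$ is a single number that is at least $1$, sample() draws from 1:x rather than from a one-element vector. Second, the weights in $prob$ need not sum to one. The sketch below illustrates both; the seed and the variable names draw and marbles are illustrative assumptions.

```r
set.seed(1)  # arbitrary seed, used only for reproducibility

# A single number x >= 1 is treated as the range 1:x,
# so this draws 3 distinct values from 1, 2, ..., 10.
draw <- sample(10, size = 3)
all(draw %in% 1:10)

# Weights are rescaled internally, so c(3, 7) behaves like c(0.3, 0.7).
marbles <- sample(c("red", "blue"), size = 5, replace = TRUE, prob = c(3, 7))
marbles
```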

Examples

Suppose a bag is filled with 3 red marbles and 7 blue marbles. To simulate a drawing of 4 marbles, without replacement, from the bag, we could do the following:

> sample(c(rep("red",3), rep("blue",7)), size=4, replace=FALSE)
[1] "red"  "blue" "blue" "red"

To simulate the sum of two rolled dice, we could do the following:

> sum(sample(1:6, size=2, replace=TRUE))
[1] 7

To simulate 10 coin flips, we could do the following:

> sample(c("H","T"), 10, replace = TRUE)
 [1] "T" "T" "T" "H" "T" "H" "T" "T" "H" "H"

To simulate a random permutation of the letters ABCDE, we can make the sample size equal to the size of the vector we are sampling and sample without replacement:

> sample(c("A","B","C","D","E"), size = 5, replace = FALSE)
[1] "A" "E" "C" "B" "D"
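
Since size defaults to the length of the vector and replace defaults to FALSE, the same kind of permutation can be requested more briefly. A short sketch, with an arbitrary seed and the illustrative names x and perm:

```r
set.seed(7)  # arbitrary seed, used only for reproducibility

x <- c("A", "B", "C", "D", "E")

# With no size given, sample() returns all elements of x in random order.
perm <- sample(x)
perm

# A permutation uses each letter exactly once, so sorting recovers x.
identical(sort(perm), x)
```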

Suppose the probability of a boy being born is $0.513$, while the probability of a girl is $0.487$. We could simulate 10 births with

> births = sample(c("boy","girl"), 10, replace=TRUE, prob=c(0.513,0.487))
> births
 [1] "girl" "boy"  "girl" "girl" "girl" "girl" "boy"  "boy"  "boy"  "girl"

Now suppose we want to see what happens in $100{,}000$ births. Displaying the resulting vector will not be helpful, but combining the == operator with the sum() function can be. The following demonstrates the idea on the 10 births above (recall that TRUE, when treated as a numerical value, is equal to $1$, while FALSE is equal to $0$).

> births
 [1] "girl" "boy"  "girl" "girl" "girl" "girl" "boy"  "boy"  "boy"  "girl"
 
> births == "boy"
 [1] FALSE  TRUE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE FALSE
 
> sum(births == "boy")
[1] 4

Applying this strategy to a sample of size $100{,}000$, we have

> manybirths = sample(c("boy","girl"), 100000, replace=TRUE, prob=c(0.513,0.487))

> sum(manybirths == "boy")
[1] 51205

> sum(manybirths == "girl")
[1] 48795
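
The same TRUE/FALSE trick also gives proportions: because TRUE counts as $1$ and FALSE as $0$, taking mean() of a comparison returns the fraction of TRUE values. The sketch below regenerates a sample so that it runs on its own; the seed and the name prop_boys are illustrative assumptions, and the counts it produces will differ from those shown above.

```r
set.seed(2024)  # arbitrary seed, used only for reproducibility

manybirths <- sample(c("boy", "girl"), 100000, replace = TRUE,
                     prob = c(0.513, 0.487))

# The mean of a logical vector is the proportion of TRUE values.
prop_boys <- mean(manybirths == "boy")
prop_boys

# Distance between the simulated proportion and the assumed probability.
abs(prop_boys - 0.513)
```

With this many births, the simulated proportion should land close to $0.513$.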

Alternatively, we could appeal to the table() function, which tallies every distinct value in a single call. Applied to the same manybirths vector as above:

> table(manybirths)
manybirths
  boy  girl 
51205 48795
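
If proportions are wanted instead of counts, one option is to pass the table to prop.table(). The sketch below is self-contained, so it draws its own sample; the seed and the name counts are illustrative assumptions.

```r
set.seed(2024)  # arbitrary seed, used only for reproducibility

manybirths <- sample(c("boy", "girl"), 100000, replace = TRUE,
                     prob = c(0.513, 0.487))

# Counts of each distinct value.
counts <- table(manybirths)
counts

# The same table rescaled so that its entries sum to 1.
prop.table(counts)
```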

At other times, one needs to simulate many occurrences of the same random phenomenon -- perhaps one built around runif(), or sample(), or one of the other random distribution functions that we will learn about later. In these cases, the replicate() function is usually the right tool.

replicate()

Description
replicate repeatedly evaluates an expression (usually one that does something "randomly", such as a call to sample()) and collects the results.

Usage
replicate(n, expr)

where

$n$ is the number of times to evaluate the expression
$expr$ is the expression to be evaluated
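
The shape of the result depends on what the expression returns. When each evaluation returns a single number, replicate() gives back a vector; when each evaluation returns a vector of the same length, the results are placed side by side as the columns of a matrix. A brief sketch of the second case, using an arbitrary seed and the illustrative name rolls:

```r
set.seed(3)  # arbitrary seed, used only for reproducibility

# Each evaluation returns 2 values (two dice), repeated 5 times.
rolls <- replicate(n = 5, sample(1:6, size = 2, replace = TRUE))

# One column per repetition: 2 rows and 5 columns.
dim(rolls)

# Summing each column gives the total of each simulated pair of dice.
colSums(rolls)
```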

Examples

To simulate the sum of two rolled dice 20 times, we could do the following:

> replicate(n=20, sum(sample(1:6, size=2, replace=TRUE)))
 [1]  8  9  6  9  8  7 11  5  7  5 11  7  7 12  8 12  6 10  6 10
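
With many more repetitions, replicate() can be combined with the table() and mean() ideas from earlier to estimate how often each sum occurs. The sketch below assumes 10,000 repetitions and an arbitrary seed; the name sums is illustrative.

```r
set.seed(11)  # arbitrary seed, used only for reproducibility

# Simulate the sum of two dice 10,000 times.
sums <- replicate(n = 10000, sum(sample(1:6, size = 2, replace = TRUE)))

# Estimated probability of each possible sum.
table(sums) / length(sums)

# The average should be near 7, the expected sum of two fair dice.
mean(sums)

# The proportion of sevens should be near 1/6.
mean(sums == 7)
```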