Often in statistics we consider drawing elements from some larger population. R provides some powerful tools to simulate this circumstance. Two very important functions to this end are `sample()` and `replicate()`.

`sample()`

__Description__

`sample()` takes a sample of the specified size from the elements of `x`, either with or without replacement.

__Usage__

`sample(x, size = n, replace = FALSE, prob = NULL)`

where

- `x` is a vector of one or more elements from which to choose.
- `n` is the number of items to choose (a non-negative integer).
- `replace` (optional) indicates whether or not sampling should be done with replacement. The default, `FALSE`, samples without replacement.
- `prob` (optional) is a vector of probability weights for obtaining the elements of the vector being sampled. The default, `NULL`, makes every element equally likely.

__Examples__

Suppose a bag is filled with 3 red marbles and 7 blue marbles. To simulate a drawing of 4 marbles, without replacement, from the bag, we could do the following:

> sample(c(rep("red",3), rep("blue",7)), size=4, replace=FALSE)
[1] "red" "blue" "blue" "red"
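Two practical points follow from drawing without replacement from a bag of 10 marbles. The sketch below is only illustrative: the variable names and the seed value are assumptions, not part of the original example. It uses `set.seed()` to make a random draw repeatable, and it shows that asking for more marbles than the bag holds is an error unless `replace = TRUE`.

```r
# The bag from the example above: 3 red and 7 blue marbles (10 in total).
bag <- c(rep("red", 3), rep("blue", 7))

# Fixing the seed (an arbitrary value here) makes a draw repeatable.
set.seed(1)
first_draw <- sample(bag, size = 4, replace = FALSE)
set.seed(1)
second_draw <- sample(bag, size = 4, replace = FALSE)
identical(first_draw, second_draw)  # TRUE

# Without replacement we cannot draw more marbles than the bag holds,
# so asking for 11 signals an error, which tryCatch() turns into a flag here.
result <- tryCatch(sample(bag, size = 11, replace = FALSE),
                   error = function(e) "error")

# With replacement the same request succeeds.
with_replacement <- sample(bag, size = 11, replace = TRUE)
length(with_replacement)  # 11
```

Resetting the seed before each call is what makes the two draws identical; without it, successive calls would generally differ.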

To simulate the sum of two rolled dice, we could do the following:

> sum(sample(1:6, size=2, replace=TRUE))
[1] 7

To simulate 10 coin flips, we could do the following:

> sample(c("H","T"), 10, replace = TRUE)
[1] "T" "T" "T" "H" "T" "H" "T" "T" "H" "H"

To simulate a random permutation of the letters ABCDE, we can make the sample size equal to the size of the vector we are sampling and sample without replacement:

> sample(c("A","B","C","D","E"), size = 5, replace = FALSE)
[1] "A" "E" "C" "B" "D"
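As a side note, the permutation above can be written more briefly, because `size` defaults to the length of `x` and `replace` defaults to `FALSE`. The following is a small sketch with illustrative variable names. It also shows a related behavior worth knowing: when `x` is a single number of at least 1, `sample()` draws from `1:x` rather than from the one-element vector.

```r
letters5 <- c("A", "B", "C", "D", "E")

# With only x supplied, sample() returns a random permutation of x.
perm <- sample(letters5)

# Every letter appears exactly once, only the order changes.
setequal(perm, letters5)  # TRUE

# A single number n >= 1 is treated as 1:n, so this permutes 1, 2, 3, 4, 5.
number_perm <- sample(5)
sort(number_perm)  # 1 2 3 4 5
```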

Suppose the probability of a boy being born is $0.513$, while the probability of a girl is $0.487$. We could simulate 10 births with

> births = sample(c("boy","girl"), 10, replace=TRUE, prob=c(0.513,0.487))
> births
[1] "girl" "boy" "girl" "girl" "girl" "girl" "boy" "boy" "boy" "girl"

Now suppose we want to see what happens in $100,000$ births. Showing the resulting vector will not be helpful, but a creative use of the `==` operator and the `sum()` function can be, as the following demonstrates (recall that `TRUE`, when considered as a numerical value, is equal to $1$, while `FALSE` is equal to $0$).

> births
[1] "girl" "boy" "girl" "girl" "girl" "girl" "boy" "boy" "boy" "girl"
> births == "boy"
[1] FALSE TRUE FALSE FALSE FALSE FALSE TRUE TRUE TRUE FALSE
> sum(births == "boy")
[1] 4

Applying this strategy to a sample of size $100,000$, we have

> manybirths = sample(c("boy","girl"), 100000, replace=TRUE, prob=c(0.513,0.487))
> sum(manybirths == "boy")
[1] 51205
> sum(manybirths == "girl")
[1] 48795

Alternatively, we could appeal to the `table()` function, which simplifies the process:

> manybirths = sample(c("boy","girl"), 100000, replace=TRUE, prob=c(0.513,0.487))
> table(manybirths)
manybirths
  boy  girl
51205 48795
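Counts are often easier to interpret as proportions. Since `TRUE` counts as $1$ and `FALSE` as $0$, taking `mean()` of a comparison gives the fraction of matches, and `prop.table()` converts a table of counts into proportions. The sketch below assumes an arbitrary seed so that it can be rerun with the same result; the seed and variable names are illustrative only.

```r
# Arbitrary seed, used only so the simulation is repeatable.
set.seed(2024)

manybirths <- sample(c("boy", "girl"), 100000, replace = TRUE,
                     prob = c(0.513, 0.487))

# The mean of a logical vector is the proportion of TRUE values.
prop_boy <- mean(manybirths == "boy")

# prop.table() divides each count by the total, giving proportions that sum to 1.
props <- prop.table(table(manybirths))

prop_boy
props
```

With this many births, the simulated proportion of boys should land close to the specified probability of $0.513$.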

`replicate()`

__Description__

`replicate()` is a function that allows us to repeatedly evaluate an expression (which usually involves something being done "randomly", such as drawing a sample with `sample()`).

__Usage__

`replicate(n, expr)`

where

- `n` is the number of times to evaluate the expression.
- `expr` is the expression to be evaluated.
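The shape of what `replicate()` returns depends on what `expr` produces. The sketch below (variable names are illustrative) relies on the default simplification behavior: an expression that yields a single number gives a plain vector of length `n`, while an expression that yields a vector of length 2 gives a matrix with one column per replication.

```r
# Each evaluation returns one number, so the result is a vector of length 3.
sums <- replicate(n = 3, sum(sample(1:6, size = 2, replace = TRUE)))
length(sums)  # 3

# Each evaluation returns two numbers, so the result is a 2 x 3 matrix:
# row i holds the i-th die, column j holds the j-th replication.
rolls <- replicate(n = 3, sample(1:6, size = 2, replace = TRUE))
dim(rolls)  # 2 3
```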

__Examples__

To simulate 20 times the sum of two rolled dice, we could do the following:

> replicate(n=20, sum(sample(1:6, size=2, replace=TRUE)))
[1] 8 9 6 9 8 7 11 5 7 5 11 7 7 12 8 12 6 10 6 10
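The same idea scales to many repetitions, which lets us estimate probabilities by simulation. The following sketch, with an arbitrary seed and illustrative variable names, repeats the two-dice roll 10,000 times, tabulates the sums, and compares the simulated proportion of sevens with the exact probability $6/36$.

```r
# Arbitrary seed, used only so the simulation is repeatable.
set.seed(42)

# 10,000 simulated sums of two dice.
sums <- replicate(n = 10000, sum(sample(1:6, size = 2, replace = TRUE)))

# How often each sum from 2 to 12 occurred.
table(sums)

# Estimated probability of rolling a 7, next to the exact value.
estimate <- mean(sums == 7)
exact <- 6 / 36
c(estimate = estimate, exact = exact)
```

Increasing `n` generally brings the estimate closer to the exact value, at the cost of a longer run time.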