## Exercises - Binomial Probabilities

1. To find the probability of rolling exactly two 3's in 5 rolls of a single die, what is wrong with using the multiplication rule to find the probability of getting two 3's followed by three non-3's, which is given by $$\frac{1}{6} \cdot \frac{1}{6} \cdot \frac{5}{6} \cdot \frac{5}{6} \cdot \frac{5}{6} = \left(\frac{1}{6}\right)^2 \left(\frac{5}{6}\right)^3$$?

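This product is the probability of only one particular arrangement: 3's on the first two rolls and non-3's on the last three. The two 3's can fall on any two of the five rolls, and there are ${}_5C_2 = 10$ ways to choose those two positions, each with this same probability, so the product must be multiplied by ${}_5C_2$.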
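Accounting for all ${}_5C_2 = 10$ placements of the two 3's, the binomial formula $P(x) = ({}_{n}C_{x})p^x q^{n-x}$ with $n = 5$, $p = 1/6$, and $x = 2$ gives the following value (a worked check, not part of the original solution):

$$P(2) = ({}_5C_2)\left(\frac{1}{6}\right)^2 \left(\frac{5}{6}\right)^3 = 10 \cdot \frac{125}{7776} = \frac{1250}{7776} \doteq 0.1608$$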

2. Determine if the following situations suggest a random variable with a binomial distribution:

1. The number of questions correct if one randomly guesses on a quiz of 20 multiple choice questions where each question has 4 possible answers
2. The number of people with blue eyes in a group of 10 people drawn from a room of 30 people without replacement.
3. The number of bird chirps one can hear in a day if the average number of chirps per hour is 15.
4. The number of heads seen in 30 flips of a coin
5. The number of each of 3 species of flowers present in a collection of 100 flowers.
6. The number of rolls of two dice that result in a prime total if the dice are rolled 50 times.
7. The number of subjects, out of 400 taking Atorvastatin, who reported experiencing a headache on the same day they first took the drug.

Answers:

1. Yes, this suggests a random variable with a binomial distribution.

2. No. The "trials" (the people drawn from the room and checked for blue eyes) are not independent. Each draw changes the makeup of the people remaining: after a blue-eyed person is drawn, the probability that the next person drawn has blue eyes decreases, and after a person without blue eyes is drawn, it increases.

3. No. There is no upper limit on how many chirps one could hear. In a binomial situation, the maximum number of successes is limited by the fixed number of trials.

4. Yes, this suggests a random variable with a binomial distribution

5. No. There are more than two outcomes. If one species could be the "success" and another be "failure", what would the third species be?

6. Yes, this suggests a random variable with a binomial distribution

7. Yes, this suggests a random variable with a binomial distribution
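For situation 6, the success probability $p$ is not given directly, but it can be found by listing the 36 equally likely outcomes for two dice. The R sketch below is one way to do that; it assumes two fair six-sided dice, and the variable names (`totals`, `primes`, `p.prime`) are only illustrative.

```r
# All 36 equally likely totals for two fair six-sided dice (assumed fair)
totals = outer(1:6, 1:6, "+")

# The prime totals that two dice can produce
primes = c(2, 3, 5, 7, 11)

# Probability that one roll of the pair gives a prime total
p.prime = mean(totals %in% primes)
p.prime        # 15/36

# With 50 independent rolls, the count of prime totals is binomial
# with n = 50 and p = p.prime; its mean is n*p
50 * p.prime
```

Because $p$ is the same on every roll and the rolls are independent, the count of prime totals meets the binomial requirements.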

3. For each random variable below that follows a binomial distribution corresponding to the given number of trials $n$, and probability of success $p$, find the probability of seeing $x$ successes.

1. $n = 12$,   $p = 3/4$,   $x = 10$
2. $n = 9$,   $p = 0.35$,   $x = 2$
3. $n = 20$,   $p = 0.15$,   $x = 4$
4. $n = 15$,   $p = 1/3$,   $x = 13$

In each, use $P(x) = ({}_{n}C_{x})p^x q^{n-x}$, where $q = 1 - p$, yielding (approximately):

1. 0.2323
2. 0.2162
3. 0.1821
4. 0.00002927

R:
dbinom(10,size=12,prob=3/4)    # part 1
dbinom(2,size=9,prob=0.35)     # part 2
dbinom(4,size=20,prob=0.15)    # part 3
dbinom(13,size=15,prob=1/3)    # part 4

Excel:
=BINOM.DIST(10,12,3/4,FALSE)
=BINOM.DIST(2,9,0.35,FALSE)
=BINOM.DIST(4,20,0.15,FALSE)
=BINOM.DIST(13,15,1/3,FALSE)
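To see what the technology is computing, the first part ($n = 12$, $p = 3/4$, $x = 10$) can be worked by hand with the same formula, using $q = 1 - p = 1/4$ (a worked check, not part of the original solution):

$$P(10) = ({}_{12}C_{10})\left(\frac{3}{4}\right)^{10}\left(\frac{1}{4}\right)^{2} = 66 \cdot \frac{3^{10}}{4^{12}} = \frac{3897234}{16777216} \doteq 0.2323$$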


4. Is it unusual to see fewer than 3 heads in 12 flips of a coin? Why?

Yes, it happens only about 2% of the time. (Note: $P(0)+P(1)+P(2) \doteq 0.01929$, when $n=12, p= 0.5$)

R:
pbinom(2,size=12,prob=0.5)   # <-- this is P(0)+P(1)+P(2)

Excel:
=BINOM.DIST(2,12,0.5,TRUE)
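Because $p = q = 0.5$, every sequence of 12 flips has probability $1/2^{12}$, so this cumulative probability can also be checked by counting sequences (a worked check, not part of the original solution):

$$P(0)+P(1)+P(2) = \frac{{}_{12}C_0 + {}_{12}C_1 + {}_{12}C_2}{2^{12}} = \frac{1 + 12 + 66}{4096} = \frac{79}{4096} \doteq 0.01929$$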


5. About 8% of males are colorblind. A researcher needs three colorblind men for an experiment and begins checking potential subjects. What is the probability that she finds three or more colorblind men in the first nine she examines?

Binomial, with $n = 9$ and $p = 0.08$. By the complement rule, $P(3) + P(4) + \cdots + P(9) = 1 - (P(0) + P(1) + P(2)) \doteq 0.0298$.

R:
1-pbinom(2,size=9,prob=0.08)

Excel:
=1-BINOM.DIST(2,9,0.08,TRUE)


6. Assume that 13% of people are left-handed. If we select 5 people at random, find the probability of each outcome below:

1. The first lefty is the fifth person chosen
2. There are exactly 3 lefties in the group
3. There are some lefties among the 5 people
4. There are no more than 3 lefties in the group

Answers:

1. Four non-lefties followed by a lefty: $(0.87)^4(0.13) \doteq 0.0745$
2. $P(3) = ({}_5C_3)(0.13)^3(0.87)^2 \doteq 0.0166$
3. Complement. $1-P(0) = 1 - (0.87)^5 \doteq 0.5016$
4. Complement. $1-P(4)-P(5) = 1 - ({}_5C_4)(0.13)^4(0.87)^1 - (0.13)^5 \doteq 0.9987$
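These four values can also be checked with technology. The R sketch below uses the built-in `dgeom`, `dbinom`, and `pbinom` functions; the variable names (`part1` through `part4`) are only illustrative. Note that `dgeom` counts the failures before the first success, so "first lefty is the fifth person" corresponds to 4 failures.

```r
p = 0.13   # probability a randomly selected person is left-handed
n = 5      # number of people selected

# 1. First lefty is the fifth person: 4 non-lefties, then a lefty
part1 = dgeom(4, prob = p)

# 2. Exactly 3 lefties among the 5
part2 = dbinom(3, size = n, prob = p)

# 3. At least one lefty (complement of no lefties)
part3 = 1 - dbinom(0, size = n, prob = p)

# 4. No more than 3 lefties, i.e. P(0) + P(1) + P(2) + P(3)
part4 = pbinom(3, size = n, prob = p)

round(c(part1, part2, part3, part4), 4)   # 0.0745 0.0166 0.5016 0.9987
```

Part 1 is not a binomial probability, since it fixes the position of the first success rather than counting successes, which is why it has no ${}_nC_x$ factor.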

7. A biologist examines frogs for a genetic trait he suspects might be due to contaminated water. Normally the trait examined presents itself on average in 1 out of 8 frogs in the wild. If the frequency of this trait has not changed and he captures 12 frogs to examine, what is the probability he finds this trait in more than 4 frogs? Given this, if he sees more than 4 frogs with the trait in question, what reasonable conclusion should he likely make?

Complement. $1-(P(0) + P(1) + P(2) + P(3) + P(4))$, with $n=12$ and $p = 1/8$. Approximately 0.0113.

R:
1-pbinom(4,size=12,prob=1/8)

Excel:
=1-BINOM.DIST(4,12,1/8,TRUE)

TI-83/84:
1-binomcdf(12,1/8,4)


Seeing such an unusual result under an assumption that the frequency of the trait has not changed, it would be reasonable to conclude that the frequency of the trait has likely changed.

8. Find the mean and standard deviation of a random variable following a binomial distribution corresponding to 50 trials each with a probability of success equal to 0.2.

$n = 50, p = 0.2, q = 0.8$, so mean is $\mu = np = 10$, while standard deviation is $\sigma = \sqrt{npq} \doteq 2.8284$

9. An exam consisting of 75 true/false questions is taken in the last five minutes of the exam period by a student who accidentally slept through most of the exam period. Given the small amount of time available to complete the exam, the student randomly guesses on all 75 questions. Find the mean and standard deviation for the number of questions correctly answered by such a student. Assuming a minimum of 45 correct answers is necessary to pass the exam, what is this student's probability of passing?

$n = 75, p = 0.5, q = 0.5$, so mean is $\mu = np = 37.5$, while standard deviation is $\sigma = \sqrt{npq} = \sqrt{18.75} \doteq 4.3301$.

Probability of passing given by:

$P(X \ge 45) = 1 - (P(0) + P(1) + P(2) + \cdots + P(44)) \doteq 0.0527$

R:
1-pbinom(44,size=75,prob=0.5)

Excel:
=1-BINOM.DIST(44,75,0.5,TRUE)

TI-83/84:
1-binomcdf(75,0.5,44)


10. Construct a table giving the probability distribution for the number of 3's seen in 7 rolls of a standard die. Find the mean and standard deviation of this probability distribution using the formulas: $$\mu = \Sigma \left[X \cdot P(X)\right] \quad \quad \textrm{ and } \quad \quad \sigma = \sqrt{\Sigma \left[X^2 \cdot P(X)\right] - \mu^2}$$ Then compare these with the results of $\mu = np$ and $\sigma = \sqrt{npq}$.

In both cases, $\mu \doteq 1.1667$ and $\sigma \doteq 0.9860$.

R:
X.outcomes = 0:7
X.probabilities = dbinom(X.outcomes,size=7,prob=1/6)
X.exp = sum(X.outcomes * X.probabilities)
X.var = sum((X.outcomes^2) * X.probabilities) - X.exp^2
X.sd = sqrt(X.var)

X.exp
X.sd

mu = 7*(1/6)
sigma = sqrt(7*(1/6)*(5/6))

mu      # observe this outputs the same value as X.exp = E(X) = 1.1667
sigma   # observe this outputs the same value as X.sd = SD(X) = 0.9860
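The exercise also asks for the probability distribution as a table. One possible way to display it in R is sketched below; the data frame name `prob.table` and its column names are illustrative choices, not part of the original solution.

```r
# Possible numbers of 3's in 7 rolls, and the probability of each
x = 0:7
prob.table = data.frame(x = x, P = dbinom(x, size = 7, prob = 1/6))

# Show the table with probabilities rounded to 4 decimal places
print(round(prob.table, 4))

# The probabilities of a complete distribution should sum to 1
sum(prob.table$P)
```

The `P` column is the same vector as `X.probabilities` in the code above, so the mean and standard deviation computed there are those of this table.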


11. In a trial of Chantix, a drug to help people stop smoking, 821 subjects were treated with 1 mg doses of the drug, and 30 of them experienced nausea within 24 hours of taking it. Of 725 subjects not taking Chantix (but given a placebo), 9 experienced nausea in this same time frame. Under the assumption that Chantix does not cause nausea and that $9\,/\,725$ is the probability that a randomly selected person experiences nausea within a 24 hour period, what is the probability that, out of 821 people, the observed number (i.e., 30) or more experience nausea in a given 24 hour period? Does this seem unusual? What reasonable conclusion might one make about Chantix?

Doing this by hand is cumbersome, so let us use technology instead:

R:
1-pbinom(29,size=821,prob=9/725)

Excel:
=1-BINOM.DIST(29,821,9/725,TRUE)


The probability of seeing 30 or more people out of 821 experiencing nausea is approximately $2.915 \times 10^{-7}$, which is an exceedingly small probability. This just shouldn't be observed very often -- and yet we have seen it as a result of our first test of 821 subjects. The reasonable conclusion is that our assumption that Chantix doesn't cause nausea is incorrect. There seems to be evidence in the form of this sample that it can indeed cause nausea.
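One way to put the size of this discrepancy in context is to apply the $\mu = np$ and $\sigma = \sqrt{npq}$ formulas with $n = 821$, $p = 9/725$, and $q = 716/725$ (a supplementary check, not part of the original solution):

$$\mu = np = 821 \cdot \frac{9}{725} \doteq 10.19 \quad \quad \textrm{ and } \quad \quad \sigma = \sqrt{npq} = \sqrt{821 \cdot \frac{9}{725} \cdot \frac{716}{725}} \doteq 3.17$$

Under the assumption that Chantix does not cause nausea, about 10 of the 821 subjects would be expected to experience nausea, so the observed 30 lies more than 6 standard deviations above the mean.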

12. Suppose 20 deaths that fall within one week of Thanksgiving (before or after) are randomly selected. Assume that there is nothing special about Thanksgiving and that patients cannot delay death until after a holiday, so that deaths should be spread uniformly over these two weeks. Find the probability that more than 14 of the deaths occur in the week after Thanksgiving. What conclusion can one likely draw upon seeing 15 of the 20 selected deaths occur in the week after Thanksgiving?

Binomial (not Poisson); "success" = a death after Thanksgiving; $n=20$; $p=0.5$; $P(X>14) \doteq 0.0207$.

R:
1-pbinom(14,size=20,prob=0.5)

Excel:
=1-BINOM.DIST(14,20,0.5,TRUE)


One can conclude that there is likely something special about Thanksgiving. It may be that patients have some ability to delay death until after the holiday. On a less optimistic note, it may also be the case that a lower level of care on or near the holiday accelerates the onset of death.
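Since $p = q = 0.5$ here, the tail probability behind this conclusion can also be checked by counting, because each of the $2^{20}$ possible before/after sequences is equally likely (a worked check, not part of the original solution):

$$P(X > 14) = \frac{{}_{20}C_{15} + {}_{20}C_{16} + {}_{20}C_{17} + {}_{20}C_{18} + {}_{20}C_{19} + {}_{20}C_{20}}{2^{20}} = \frac{15504 + 4845 + 1140 + 190 + 20 + 1}{1048576} = \frac{21700}{1048576} \doteq 0.0207$$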