Tech Tips: Normal Distributions

Calculating Heights of a Normal Curve

Recall that a normal distribution with mean $\mu$ and standard deviation $\sigma$ is one characterized by the function: $$f(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{\frac{-(x-\mu)^2}{2\sigma^2}}$$ To find this value (i.e., the height of $f$ at $x$),

R: use the R function
```
dnorm(x, mean=μ, sd=σ)
```
As an example, the height at $x=13$ of the normal curve with mean 10 and standard deviation 7 is given by
```
> dnorm(13,10,7)
[1] 0.05199096
```
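As a quick sanity check, the value returned by `dnorm()` can be compared with the density formula evaluated by hand. The sketch below does this for the same example; the variable names (`mu`, `sigma`, `x`, `height_formula`, `height_dnorm`) are chosen here purely for illustration.

```r
# Parameters from the example above
mu <- 10
sigma <- 7
x <- 13

# Height computed directly from the density formula
height_formula <- 1 / (sigma * sqrt(2 * pi)) * exp(-(x - mu)^2 / (2 * sigma^2))

# Height computed with the built-in function
height_dnorm <- dnorm(x, mean = mu, sd = sigma)

# TRUE when the two agree up to floating-point tolerance
isTRUE(all.equal(height_formula, height_dnorm))
```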
Excel: use the function
```
NORM.DIST(x, μ, σ, FALSE)
```
The last argument for this function, when $FALSE$, indicates that the height of $f(x)$, not a cumulative probability, should be returned.

Calculating Areas Under a Normal Curve

To find the probability that a normally distributed random variable with mean $\mu$ and standard deviation $\sigma$ results in a value less than $x$ (i.e., the area under the normal curve to the left of $x$),

R: use the function
```
pnorm(x, mean=μ, sd=σ)
```
Let us consider a couple of examples:

Suppose the manufacturer of a certain type of snack knows that the total weight of the snack packets they sell is normally distributed with a mean of $80.2$ grams and a standard deviation of $1.1$ grams. What is the probability that a randomly selected packet weighs less than $78$ grams?
```
> pnorm(78,80.2,1.1)
[1] 0.02275013
```
Under the same assumptions, what is the probability that a selected packet has a weight within 2 standard deviations of the mean weight?
```
> pnorm(82.4,80.2,1.1) - pnorm(78,80.2,1.1)
[1] 0.9544997
```
Notice, in the last example, we find the area under the normal curve between $x=a$ and $x=b$ by finding a difference of two left-tailed areas.
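The difference-of-areas idea can be wrapped in a small helper, and a right-tailed area can be found in a similar spirit. The following is only a sketch: `area_between` is a hypothetical helper name introduced here, not a built-in R function, while `lower.tail` is a standard argument of `pnorm()`.

```r
# Hypothetical helper: P(a < X < b) for a normal X with the given mean and sd
area_between <- function(a, b, mu, sigma) {
  pnorm(b, mean = mu, sd = sigma) - pnorm(a, mean = mu, sd = sigma)
}

# Probability a packet weight is within 2 standard deviations of the mean
within_two_sd <- area_between(78, 82.4, mu = 80.2, sigma = 1.1)

# Right-tailed area: probability a packet weighs more than 82.4 grams
upper_tail <- pnorm(82.4, mean = 80.2, sd = 1.1, lower.tail = FALSE)

# The same right-tailed area, written as a complement of a left-tailed area
upper_tail_complement <- 1 - pnorm(82.4, mean = 80.2, sd = 1.1)
```

By symmetry of the normal curve, the right-tailed area beyond $82.4$ grams matches the left-tailed area below $78$ grams found earlier.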
Excel: use the function
```
NORM.DIST(x, μ, σ, TRUE)
```
The last argument for this function, when $TRUE$, indicates that the function should return the cumulative probability, which equals the area under the associated normal curve to the left of $x$ (as opposed to the height of the function).

Inverse Normal Functions

Suppose one wishes to find the $x$ value for which a normally distributed random variable with mean $\mu$ and standard deviation $\sigma$ will produce an outcome less than $x$ with some given probability of $p$. Equivalently, one seeks the $x$ value where there is an area of $p$ to the left of $x$ and under the related normal curve.

R: use the function
```
qnorm(p, mean=μ, sd=σ)
```
Consider the following two examples:

A math contest author knows that the length of time taken by students to complete a certain section of math questions is normally distributed with a mean of 17 minutes and a standard deviation of 4.5 minutes. Within how many minutes should $90\%$ of students have completed this section?
```
> qnorm(0.90,17,4.5)
[1] 22.76698
```
Under the same assumptions, how long should the middle $50\%$ of students take to complete this section?
```
> qnorm(0.75,17,4.5)
[1] 20.0352
> qnorm(0.25,17,4.5)
[1] 13.9648
# ...so roughly between 14 and 20 minutes
```
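Since `qnorm()` undoes `pnorm()`, the cutoffs above can be checked by feeding them back into `pnorm()`. The sketch below also computes both cutoffs in one vectorized call and again from standard normal quantiles via $x = \mu + \sigma z$; the variable names are illustrative choices.

```r
# Parameters from the contest example
mu <- 17
sigma <- 4.5

# Both cutoffs for the middle 50% in one vectorized call
middle_half <- qnorm(c(0.25, 0.75), mean = mu, sd = sigma)

# Feeding the cutoffs back into pnorm() should recover 0.25 and 0.75
recovered_p <- pnorm(middle_half, mean = mu, sd = sigma)

# Equivalent calculation from standard normal quantiles: x = mu + sigma * z
from_z <- mu + sigma * qnorm(c(0.25, 0.75))
```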
Excel: use the function
```
NORM.INV(p, μ, σ)
```

Simulating Random Variables Following a Normal Distribution

To generate random values following a normal distribution with mean $\mu$ and standard deviation $\sigma$,

R: use the function
```
rnorm(n, mean=μ, sd=σ)
```
which generates $n$ such values.

As an example, suppose the error in a particular model follows a normal distribution with mean $\mu = 5$ and standard deviation $\sigma = 1$. To simulate the errors seen in 15 applications of the model in question, one can use:
```
> rnorm(15, mean=5, sd=1)
 [1] 5.625824 6.438588 6.642417 3.537867 4.853430
 [6] 4.405501 5.684504 6.041753 4.434506 5.550997
[11] 5.095519 5.188711 5.077366 5.102386 4.247265
```
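Because the values are random, each call to `rnorm()` generally returns different numbers. If repeatable output is wanted, one option is to fix the seed of the random number generator first with `set.seed()`. A minimal sketch follows; the seed value 42 and the variable names are arbitrary choices made here.

```r
# Fix the seed so the same draws are produced on every run
# (the seed value 42 is an arbitrary choice)
set.seed(42)
errors <- rnorm(15, mean = 5, sd = 1)

# With a larger sample, the sample mean and standard deviation
# should land near the parameters 5 and 1
set.seed(42)
many_errors <- rnorm(10000, mean = 5, sd = 1)
sample_mean <- mean(many_errors)
sample_sd <- sd(many_errors)
```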
Excel: use the formula
```
NORM.INV(RAND(), μ, σ) 
```
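The Excel formula works by passing a uniform random number between 0 and 1 through the inverse normal function, so that the result follows the desired normal distribution. The same idea can be sketched in R by combining `runif()` with `qnorm()`; this is shown only to illustrate the technique (in practice `rnorm()` is the more direct choice), and the seed, sample size, and variable names are assumptions made for the example.

```r
# Arbitrary seed so the sketch is repeatable
set.seed(1)
n <- 10000

# Uniform random numbers strictly between 0 and 1, playing the role of RAND()
u <- runif(n)

# Pass them through the inverse normal function, playing the role of NORM.INV
simulated <- qnorm(u, mean = 5, sd = 1)

# The sample mean and standard deviation should be near 5 and 1
mean(simulated)
sd(simulated)
```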