Recall that a normal distribution with mean $\mu$ and standard deviation $\sigma$ is one characterized by the function: $$f(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{\frac{-(x-\mu)^2}{2\sigma^2}}$$ To find this value (i.e., the height of $f$ at $x$),
R: use the R function
dnorm(x, mean=μ, sd=σ)
As an example, the height at $x=13$ of the normal curve with mean 10 and standard deviation 7 is given by
> dnorm(13,10,7)
[1] 0.05199096
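As a quick sanity check, the same height can be computed directly from the density formula and compared with the value from dnorm. The following is a minimal sketch; the variable names (mu, sigma, x, by_formula, by_dnorm) are illustrative choices, not part of any library.

```r
# Illustrative values taken from the example above
mu <- 10
sigma <- 7
x <- 13

# Height of the curve computed directly from the density formula
by_formula <- 1 / (sigma * sqrt(2 * pi)) * exp(-(x - mu)^2 / (2 * sigma^2))

# Height of the curve as reported by dnorm()
by_dnorm <- dnorm(x, mean = mu, sd = sigma)

# Prints TRUE when the two values agree up to floating-point tolerance
print(isTRUE(all.equal(by_formula, by_dnorm)))
```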
Excel: use the function
NORM.DIST(x, μ, σ, FALSE)
The last argument for this function, when FALSE, indicates that the height of $f(x)$, not a cumulative probability, should be returned.
To find the probability that a normally distributed random variable with mean $\mu$ and standard deviation $\sigma$ results in a value less than $x$ (i.e., the area under the normal curve to the left of $x$),
R: use the function
pnorm(x, mean=μ, sd=σ)
Let us consider a couple of examples:
Suppose the manufacturer of a certain type of snack knows that the total weight of each snack packet they sell is normally distributed with a mean of $80.2$ grams and a standard deviation of $1.1$ grams. What is the probability that a randomly selected packet weighs less than $78$ grams?
> pnorm(78,80.2,1.1)
[1] 0.02275013
Under the same assumptions, what is the probability that a selected packet has a weight within 2 standard deviations of the mean weight?
> pnorm(82.4,80.2,1.1) - pnorm(78,80.2,1.1)
[1] 0.9544997
Notice, in the last example, we find the area under the normal curve between $x=a$ and $x=b$ by finding a difference of two left-tailed areas.
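The difference-of-areas idea can be cross-checked in a few ways. The sketch below assumes the snack-packet values from the example; the variable names are illustrative. It compares the difference of two pnorm values with the same calculation on the standard normal curve and with a numerical integration of dnorm. It also shows the lower.tail argument of pnorm, which returns a right-tailed area when set to FALSE.

```r
# Illustrative values from the snack-packet example
mu <- 80.2
sigma <- 1.1
a <- mu - 2 * sigma   # lower bound, 2 standard deviations below the mean
b <- mu + 2 * sigma   # upper bound, 2 standard deviations above the mean

# Area between a and b as a difference of two left-tailed areas
area_diff <- pnorm(b, mean = mu, sd = sigma) - pnorm(a, mean = mu, sd = sigma)

# Same area using z-scores on the standard normal curve
area_std <- pnorm(2) - pnorm(-2)

# Same area by numerically integrating the density between a and b
area_num <- integrate(dnorm, lower = a, upper = b, mean = mu, sd = sigma)$value

# Right-tailed area (probability of a weight greater than b)
upper_tail <- pnorm(b, mean = mu, sd = sigma, lower.tail = FALSE)

print(c(area_diff, area_std, area_num, upper_tail))
```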
Excel: use the function
NORM.DIST(x, μ, σ, TRUE)
The last argument for this function, when TRUE, indicates that the function should return the cumulative probability, which equals the area under the associated normal curve to the left of $x$ (as opposed to the height of the function).
Suppose one wishes to find the $x$ value for which a normally distributed random variable with mean $\mu$ and standard deviation $\sigma$ will produce an outcome less than $x$ with some given probability of $p$. Equivalently, one seeks the $x$ value where there is an area of $p$ to the left of $x$ and under the related normal curve.
R: use the function
qnorm(p, mean=μ, sd=σ)
Consider the following two examples:
A math contest author knows that the length of time taken by students to complete a certain section of math questions is normally distributed with a mean of 17 minutes and a standard deviation of 4.5 minutes. Within how many minutes should $90\%$ of students have completed this section?
> qnorm(0.90,17,4.5)
[1] 22.76698
Under the same assumptions, how long should the middle $50\%$ of students take to complete this section?
> qnorm(0.75,17,4.5)
[1] 20.0352
> qnorm(0.25,17,4.5)
[1] 13.9648
# ...so roughly between 14 and 20 minutes
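Since qnorm undoes pnorm, feeding a qnorm result back into pnorm should recover the original probability. The sketch below uses the contest values from the example, with illustrative variable names. It also shows that the same cutoff can be obtained by rescaling a standard normal quantile, and that qnorm accepts a vector of probabilities, so both ends of the middle $50\%$ can be found in one call.

```r
# Illustrative values from the math contest example
mu <- 17
sigma <- 4.5

# Time by which 90% of students are expected to finish
x90 <- qnorm(0.90, mean = mu, sd = sigma)

# Round trip: pnorm() applied to the cutoff recovers the probability 0.90
p_back <- pnorm(x90, mean = mu, sd = sigma)

# Same cutoff from a standard normal quantile: mean plus z times sd
x90_from_z <- mu + sigma * qnorm(0.90)

# Both ends of the middle 50% in a single vectorized call
middle50 <- qnorm(c(0.25, 0.75), mean = mu, sd = sigma)

print(c(x90, p_back, x90_from_z))
print(middle50)
```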
Excel: use the function
NORM.INV(p, μ, σ)
To generate random values following a normal distribution with mean $\mu$ and standard deviation $\sigma$,
R: use the function
rnorm(n, mean=μ, sd=σ)
which generates $n$ such values.
As an example, suppose the error in a particular model follows a normal distribution with mean $\mu = 5$ and standard deviation $\sigma = 1$. To simulate the errors seen in 15 applications of the model in question, one can use:
> rnorm(15, mean=5, sd=1)
 [1] 5.625824 6.438588 6.642417 3.537867 4.853430
 [6] 4.405501 5.684504 6.041753 4.434506 5.550997
[11] 5.095519 5.188711 5.077366 5.102386 4.247265
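Because rnorm draws random values, running the command again will generally produce different numbers. When a simulation needs to be repeatable, one common approach is to fix the random seed with set.seed before drawing. The following is a minimal sketch; the seed value 42 and the variable names are arbitrary illustrative choices.

```r
# Fix the seed so the draws can be reproduced (42 is an arbitrary choice)
set.seed(42)
errors_a <- rnorm(15, mean = 5, sd = 1)

# Resetting the same seed and drawing again gives the same 15 values
set.seed(42)
errors_b <- rnorm(15, mean = 5, sd = 1)

# Prints TRUE because both draws started from the same seed
print(identical(errors_a, errors_b))
```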
Excel: use the formula
NORM.INV(RAND(), μ, σ)
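The Excel formula works by passing a uniform random number between 0 and 1 through the inverse of the cumulative normal function. The same idea can be sketched in R by combining runif and qnorm. This is only an illustration of the technique (in R, rnorm is the direct route); the sample size, seed, and variable names are assumptions made for this example.

```r
# Arbitrary seed and sample size, chosen only for illustration
set.seed(1)
n <- 10000

# Uniform random values strictly between 0 and 1, playing the role of RAND()
u <- runif(n)

# Inverse cumulative normal applied to each value, playing the role of NORM.INV
draws <- qnorm(u, mean = 5, sd = 1)

# The sample mean and standard deviation should be close to 5 and 1
print(c(mean(draws), sd(draws)))
```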