To clearly and precisely define what one means by an "event" in discussions related to probability, we'll need some additional vocabulary first:

A **probability experiment** is a chance process that leads to well-defined results
called outcomes. As an example, rolling a die would be a probability experiment.

A **simple event** is the outcome of a single trial of a probability experiment. In the case of
the previous example, the simple event would be the number showing on the die.

The **sample space** for a probability experiment is the set of all possible simple events with which it is associated. Thus, for the roll of a standard die, we would have a sample space of $S = \{1,2,3,4,5,6\}$.

Finally, an **event** is a subset of the sample space. Every simple event is certainly an event,
but events can be compound in nature as well. For example, one can consider the event of
rolling an even number. The related subset of $S$ is then $E = \{2,4,6\}$.

The **relative frequency of an event** (also called the **empirical probability**) is
the number of times an event $E$ occurs divided by the number of trials conducted relative
to a particular probability experiment. As an example, suppose
we roll a die 5 times, yielding rolls 2, 6, 3, 6, 4. The relative frequency in this
experiment of rolling an even value is thus
$$P(E) = \frac{\textrm{# of 2's, 4's, and 6's rolled}}{\textrm{# of rolls}} = \frac{4}{5} = 0.80$$
where $P(E)$ denotes the empirical probability of event $E$.
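This calculation is easy to carry out directly. The following Python sketch uses the five rolls from the example above (the variable names are ours, chosen for illustration):

```python
# Empirical probability of rolling an even value, using the five rolls above
rolls = [2, 6, 3, 6, 4]
even_count = sum(1 for r in rolls if r % 2 == 0)  # count the 2's, 4's, and 6's
p_empirical = even_count / len(rolls)
print(p_empirical)  # 0.8
```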

In calculating the **classical probability of an event**, one assumes that all simple
events related to a probability experiment are equally likely, and thus
$$P(E) = \frac{\textrm{number of elements in $E$}}{\textrm{number of elements of $S$}}$$
Returning to our example of die rolls, the classical probability of rolling an even number
is thus $$P(E) = \frac{\textrm{number of elements in $\{2,4,6\}$}}{\textrm{number of elements in $\{1,2,3,4,5,6\}$}} = \frac{3}{6} = 0.50$$
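Since classical probability is just a ratio of set sizes, it translates naturally into code. A minimal Python sketch of the die example (the set names mirror the notation above):

```python
# Classical probability: P(E) = |E| / |S|, assuming equally likely outcomes
S = {1, 2, 3, 4, 5, 6}            # sample space for one die roll
E = {x for x in S if x % 2 == 0}  # event: roll an even number
p_classical = len(E) / len(S)
print(p_classical)  # 0.5
```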

As a natural consequence of how probabilities are defined, it must be true that $$0 \le P(E) \le 1$$ Not surprisingly, if the event $E$ can never occur, then $P(E) = 0$. On the flip side, if an event $E$ is certain to occur, then $P(E) = 1$.

As some simple event must occur, the sum of the probabilities for the simple events associated with a probability experiment must be $1$. Equivalently, if $x$ is allowed to range over all simple events associated with a probability experiment, then $$\sum P(x) = 1$$

The Law of Large Numbers indicates that as the number of trials increases, the relative frequency of an event will approach its true probability. Thus, the more times a die is rolled, the closer we can expect the relative frequency of even rolls to get to $0.50$. Importantly, this does NOT say that if a particular event has so far occurred less often than its classical probability would suggest, we are in any way "due" for that event to occur. Being "due" suggests the probability of the event has somehow increased for these later trials, which is in direct conflict with our assumption that separate trials of a probability experiment are independent of one another.
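One can watch the Law of Large Numbers at work by simulating die rolls. A short Python sketch (the function name and the fixed seed are our choices, made so the run is reproducible):

```python
import random

random.seed(1)  # fixed seed so repeated runs give the same rolls

def relative_frequency_of_even(n_rolls):
    """Roll a fair die n_rolls times; return the relative frequency of evens."""
    evens = sum(1 for _ in range(n_rolls) if random.randint(1, 6) % 2 == 0)
    return evens / n_rolls

# As the number of trials grows, the relative frequency settles near 0.50
for n in (10, 100, 10_000, 1_000_000):
    print(n, relative_frequency_of_even(n))
```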

The complement of an event $E$ relative to some sample space $S$ is the subset of all elements in $S$ that are not in $E$, and is denoted $\overline{E}$. Consequently, $$P(\overline{E}) = 1 - P(E)$$ As an example, the probability of NOT rolling a 6 in a single die roll is given by $$P(\overline{6}) = 1 - P(6) = 1 - \frac{1}{6} = \frac{5}{6}$$

Note, the probability above could also be described as the probability of a roll that is "less than 6". Given the sample space of $\{1,2,3,4,5,6\}$, one could compute this probability as $$P(1)+P(2)+P(3)+P(4)+P(5) = \frac{1}{6} + \frac{1}{6} + \frac{1}{6} + \frac{1}{6} + \frac{1}{6} = \frac{5}{6}$$ but this is certainly more cumbersome. Complements can often shorten calculations of probabilities. One should pay particular attention to this when finding probabilities of events described using the words "at least", "at most", "more than", "less than", or "exactly".
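Both routes to this probability, the complement rule and the direct sum, can be checked with exact arithmetic. A sketch using Python's `fractions` module (which avoids any floating-point rounding):

```python
from fractions import Fraction

p_six = Fraction(1, 6)
p_not_six = 1 - p_six  # complement rule: P(not 6) = 1 - P(6)

# The more cumbersome route: P(1) + P(2) + P(3) + P(4) + P(5)
p_less_than_six = sum(Fraction(1, 6) for _ in range(5))

print(p_not_six, p_less_than_six)  # 5/6 5/6
```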

Much of the study of statistics involves looking at what is observed in a given situation and deciding
whether, under certain assumptions, that observation is unlikely. If the observations are "unlikely enough", one
can feel fairly confident in rejecting the assumptions initially made. To set up a (somewhat arbitrary)
standard for what we mean by "unlikely enough", let us call an event **unusual** if its probability
is less than or equal to $0.05$.

For any events $A$ and $B$, $$P(A \textrm{ or } B) = P(A) + P(B) - P(A \textrm{ and } B)$$ Note that this means if events $A$ and $B$ are disjoint (i.e., mutually exclusive, in that they share no common simple events), then $P(A \textrm{ and } B) = 0$ and thus, $$P(A \textrm{ or } B) = P(A) + P(B)$$
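The addition rule can be verified directly by treating events as sets. A sketch for a single die roll, where the events $A$ (even roll) and $B$ (roll at least 5) are our illustrative choices:

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # event: even roll
B = {5, 6}      # event: roll at least 5

def p(event):
    """Classical probability of an event, as an exact fraction."""
    return Fraction(len(event), len(S))

lhs = p(A | B)                   # P(A or B), computed directly from the union
rhs = p(A) + p(B) - p(A & B)     # the addition rule
print(lhs, rhs)  # 2/3 2/3
```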

A **conditional probability** is a probability of some event $B$ given that some other
event $A$ has occurred, and is denoted $P(B|A)$. When we know $A$ has occurred, our sample space
essentially shrinks from what it previously was to just $A$ itself. Thus, for any events $A$ and $B$,
$$P(B | A) = \frac{P(A \textrm{ and } B)}{P(A)}$$
This leads directly to the multiplication rule,
$$P(A \textrm{ and } B) = P(A) \cdot P(B|A)$$
Events $A$ and $B$ are said to be **independent** when the outcome of one does not affect
the probability that the other occurs. Consequently, when $A$ and $B$ are independent, we have $P(B|A) = P(B)$.

Thus, in the case when $A$ and $B$ are independent, the multiplication rule reduces down to $$P(A \textrm{ and } B) = P(A) \cdot P(B)$$
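Both the conditional probability formula and the multiplication rule can be checked by brute force over a sample space. A sketch using two die rolls, where $A$ (first roll even) and $B$ (second roll is a 6) are our illustrative choices of independent events:

```python
from fractions import Fraction
from itertools import product

# Two die rolls; the sample space is all ordered pairs of outcomes
S = list(product(range(1, 7), repeat=2))
A = [(x, y) for (x, y) in S if x % 2 == 0]   # first roll even
B = [(x, y) for (x, y) in S if y == 6]       # second roll is a 6
A_and_B = [pair for pair in A if pair in B]

def p(event):
    return Fraction(len(event), len(S))

p_B_given_A = p(A_and_B) / p(A)        # P(B | A) = P(A and B) / P(A)
print(p_B_given_A, p(B))               # equal, so A and B are independent
print(p(A_and_B), p(A) * p(B))         # multiplication rule for independent events
```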

When selecting members from some population, one can do so with or without replacement.

For example, suppose one intends to test some selection from a group of batteries in order to classify them as either "good" or "defective". Suppose also that this group consists of 10 batteries, and 4 of them are defective. Note that the probability of drawing one of the 4 defective batteries from the group of 10 is $4/10 = 0.40$. Suppose two batteries are selected.

One way in which this might happen is for the first one to be selected and tested and then returned to the group. Then a second battery is selected and tested in the same way. In this way, the probability of drawing a defective battery is identical for both selections.

However, one might not want to allow for the possibility that the same battery is tested twice. Consequently, one could set the first battery aside and draw the second battery from the remaining nine. However, note that this changes the probabilities involved. If the first battery drawn was defective, the conditional probability the second battery will be found to be defective is now $3/9 = 1/3$. If the first battery drawn was not defective, the conditional probability the second battery will be found to be defective is instead $4/9$.
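The multiplication rule ties these conditional probabilities together. For instance, the probability that both batteries drawn (without replacement) are defective can be sketched as follows, using the numbers from the example above:

```python
from fractions import Fraction

# 10 batteries, 4 defective; two are drawn without replacement
p_first_def = Fraction(4, 10)
p_second_def_given_first_def = Fraction(3, 9)   # one defective already removed
p_second_def_given_first_good = Fraction(4, 9)  # a good battery was removed

# Multiplication rule: P(both defective) = P(1st def) * P(2nd def | 1st def)
p_both_def = p_first_def * p_second_def_given_first_def
print(p_both_def)  # 2/15
```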

Different techniques must then be used in these two different circumstances. The former often leads to simpler calculations, but the latter is often more appropriate. Fortunately, the numerical difference between the two probabilities involved is very small when the population is large relative to the **sample size** (i.e., the number of elements drawn from the population).

As a guideline, if the sample size is no more than 5% of the population size, the selections may be treated as independent, if desired.
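To see why this guideline is reasonable, one can compare the probability of drawing two defective batteries with and without replacement as the population grows. The sketch below keeps 40% of the batteries defective as in the example above (the larger population sizes are hypothetical, chosen for illustration):

```python
from fractions import Fraction

# P(both defective), with vs. without replacement, as the population grows;
# 40% of batteries are defective in each case
for n in (10, 100, 1000):
    d = 4 * n // 10                                        # number of defectives
    with_repl = Fraction(d, n) ** 2                        # independent draws
    without_repl = Fraction(d, n) * Fraction(d - 1, n - 1) # dependent draws
    print(n, float(with_repl), float(without_repl))
```

As the population grows, the two probabilities converge, so treating the draws as independent introduces only a tiny error.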