Basic Probability Theory

A probability model consists of three components:

  1. A set Ω of elementary outcomes called the sample space.
  2. A set E of possible events (subsets of Ω).
  3. A probability function P that assigns probabilities (real numbers between 0 and 1) to events in E.

The axioms of probability say that:

  1. P(A) ≥ 0 (for any event A).
  2. P(Ω) = 1
  3. If A and B are disjoint events, then P(A ∪ B) = P(A) + P(B).
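
As a minimal sketch, the three axioms can be checked on a small finite model. The fair six-sided die, the uniform probability function and the names omega, prob, even and low_odd are illustrative assumptions, not part of the definitions above.

```python
from fractions import Fraction

# Sample space: the six faces of a fair die (an illustrative assumption).
omega = frozenset(range(1, 7))

def prob(event):
    """Uniform probability |event| / |omega| for any subset of omega."""
    event = frozenset(event)
    if not event <= omega:
        raise ValueError("an event must be a subset of the sample space")
    return Fraction(len(event), len(omega))

even = frozenset({2, 4, 6})
low_odd = frozenset({1, 3})   # disjoint from `even`

axiom_1 = prob(even) >= 0                                      # P(A) >= 0
axiom_2 = prob(omega) == 1                                     # P(Ω) = 1
axiom_3 = prob(even | low_odd) == prob(even) + prob(low_odd)   # additivity

print(axiom_1, axiom_2, axiom_3)  # True True True
```

Using Fraction rather than floats keeps the equality checks exact.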

Some important theorems say that:

  1. P(Ω - A) = 1 - P(A)
  2. P(∅) = 0
  3. P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
  4. P(A ∩ B) ≤ P(A ∪ B) ≤ P(A) + P(B)
  5. If A ⊆ B, then P(B - A) = P(B) - P(A)
  6. If (A1, ..., An) is a partition of Ω, then P(B) = P(A1 ∩ B) + ... + P(An ∩ B) (law of total probability).
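
The theorems can be verified numerically on a concrete model. The sketch below again assumes a fair die with uniform probabilities; the events a, b, c and the three-block partition are hypothetical choices made only for illustration.

```python
from fractions import Fraction

omega = frozenset(range(1, 7))   # fair die, assumed for illustration

def prob(event):
    """Uniform probability of an event (a subset of omega)."""
    return Fraction(len(frozenset(event) & omega), len(omega))

a = frozenset({1, 2, 3})
b = frozenset({2, 4, 6})
c = frozenset({2, 4})            # c is a subset of b
partition = [frozenset({1, 2}), frozenset({3, 4}), frozenset({5, 6})]

checks = {
    "complement": prob(omega - a) == 1 - prob(a),
    "empty set": prob(frozenset()) == 0,
    "union": prob(a | b) == prob(a) + prob(b) - prob(a & b),
    "bounds": prob(a & b) <= prob(a | b) <= prob(a) + prob(b),
    "difference": prob(b - c) == prob(b) - prob(c),
    "total probability": prob(b) == sum(prob(part & b) for part in partition),
}

for name, ok in checks.items():
    print(name, ok)
```

Each entry in checks corresponds to one of the six theorems, in order.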

The conditional probability of A given B is defined as follows (provided P(B) > 0):

  P(A|B) = P(A ∩ B) / P(B)

Two important theorems involving conditional probabilities are the following:

  1. P(A ∩ B) = P(A) P(B|A) = P(B) P(A|B) (chain rule).
  2. P(A|B) = P(A) P(B|A) / P(B) (Bayes' theorem).
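
In practice P(B) is often not given directly and is obtained from the law of total probability over the partition (A, Ω - A). The following sketch combines the chain rule and Bayes' theorem in that way; the function name bayes_posterior and the input values are hypothetical.

```python
from fractions import Fraction

def bayes_posterior(prior_a, b_given_a, b_given_not_a):
    """Return P(A|B) from P(A), P(B|A) and P(B|Ω - A)."""
    joint_a_b = prior_a * b_given_a                  # chain rule: P(A ∩ B)
    joint_not_a_b = (1 - prior_a) * b_given_not_a    # P((Ω - A) ∩ B)
    evidence = joint_a_b + joint_not_a_b             # total probability: P(B)
    if evidence == 0:
        raise ValueError("P(B) = 0, so P(A|B) is undefined")
    return joint_a_b / evidence                      # Bayes' theorem

# Hypothetical inputs: P(A) = 3/10, P(B|A) = 1/2, P(B|Ω - A) = 1/5.
posterior = bayes_posterior(Fraction(3, 10), Fraction(1, 2), Fraction(1, 5))
print(posterior)  # 15/29
```

Here P(B) = 3/20 + 7/50 = 29/100, so P(A|B) = (3/20) / (29/100) = 15/29.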

Two events A and B are said to be independent if the following three conditions hold (the conditions are equivalent whenever P(A) > 0 and P(B) > 0):

  1. P(A ∩ B) = P(A) P(B)
  2. P(A) = P(A|B)
  3. P(B) = P(B|A)
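
Independence can be tested directly with the product condition, which avoids dividing by a probability that might be zero. The sketch below assumes two fair dice with uniform probabilities; the events first_even, sum_is_7 and sum_is_8 are illustrative choices.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 ordered outcomes of two fair dice (an assumption).
omega = frozenset(product(range(1, 7), repeat=2))

def prob(event):
    """Uniform probability of an event (a subset of omega)."""
    return Fraction(len(event & omega), len(omega))

def independent(a, b):
    """Check the product condition P(A ∩ B) = P(A) P(B)."""
    return prob(a & b) == prob(a) * prob(b)

first_even = frozenset(o for o in omega if o[0] % 2 == 0)
sum_is_7 = frozenset(o for o in omega if sum(o) == 7)
sum_is_8 = frozenset(o for o in omega if sum(o) == 8)

print(independent(first_even, sum_is_7))  # True
print(independent(first_even, sum_is_8))  # False
```

For the first pair, P(A ∩ B) = 3/36 = 1/12 equals P(A) P(B) = 1/2 · 1/6. For the second pair, P(A ∩ B) = 1/12 differs from 1/2 · 5/36 = 5/72.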

Slides for lecture 2

Exercises

  1. Define suitable sample spaces for the following random experiments:
    1. Form a string consisting of three words using the words "my", "little" and "dog", using each word just once.
    2. Form a string consisting of three words using the words "my", "little" and "dog", allowing repetitions.
    3. Read the first ten words of a book and count the number of nouns.
    4. Read the first ten words of a book and see if you find at least one verb.
    5. Make a three word utterance.
    Solution
  2. What is the event space corresponding to the sample space in exercise 1.4?
    Solution
  3. Suppose we know that a word chosen at random from a text typed by person X has a character missing with probability 0.1, has a character inserted with probability 0.2, and contains both kinds of error with probability 0.05. What are the probabilities that a word chosen at random
    1. has a character missing but no character inserted,
    2. has a character inserted but no character missing,
    3. contains at least one of the two errors,
    4. contains exactly one of the two errors?
    Solution
  4. Show that
    1. the probability that exactly one of the events A and B occurs is
      P(A) + P(B) - 2 P(A ∩ B),
    2. if A is a subset of B and P(B) > 0, then
      P(A|B) = P(A) / P(B).
    Solution
  5. Suppose we want to detect run-ons in typed text (i.e. two words which have accidentally been written as one word without space, e.g. "thedog" for "the dog") and suppose we know that this type of error occurs with probability 0.01 (i.e. one word in a hundred is a run-on). Suppose furthermore that we have developed a method which has 80% accuracy for run-ons (i.e. it correctly identifies them as run-ons with probability 0.8) and which has 95% accuracy for real words (i.e. it incorrectly identifies a real word as a run-on with probability 0.05).
    1. If the method identifies an arbitrary text word as a run-on, what is the probability that it is in fact a run-on?
    2. What is the overall error probability for the method (i.e. how often does the method lead to the wrong decision)?
    3. Show that the simple-minded method of treating every word as a real word (i.e. not a run-on) yields a lower error probability.
    Solution
  6. Binary digits (0, 1) are transmitted through a communication system. The messages sent are such that the proportion of 0s is 0.7 and the proportion of 1s is 0.3. The system is noisy, which has as a consequence that a transmitted 0 will be received as a 0 with probability 0.8 (and as a 1 with probability 0.2), while a transmitted 1 will be received as a 1 with probability 0.9 (and as a 0 with probability 0.1).
    1. If a 1 is received, what is the conditional probability that a 1 was transmitted?
    2. If a 0 is received, what is the conditional probability that a 0 was transmitted?
    Solution