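For discrete random variables, the conditional probability mass function of $Y$ given $X = x$ can be written according to its definition as:

$$p_{Y|X}(y \mid x) \triangleq P(Y = y \mid X = x) = \frac{P(\{X = x\} \cap \{Y = y\})}{P(X = x)}.$$

Due to the occurrence of $P(X = x)$ in the denominator, this is defined only for non-zero (hence strictly positive) $P(X = x)$.

The relation with the probability distribution of $X$ given $Y$ is:

$$P(Y = y \mid X = x)\, P(X = x) = P(\{X = x\} \cap \{Y = y\}) = P(X = x \mid Y = y)\, P(Y = y).$$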
Example
Consider the roll of a fair die and let $X = 1$ if the number is even (i.e., 2, 4, or 6) and $X = 0$ otherwise. Furthermore, let $Y = 1$ if the number is prime (i.e., 2, 3, or 5) and $Y = 0$ otherwise.
| Roll | 1 | 2 | 3 | 4 | 5 | 6 |
|------|---|---|---|---|---|---|
| X    | 0 | 1 | 0 | 1 | 0 | 1 |
| Y    | 0 | 1 | 1 | 0 | 1 | 0 |
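Then the unconditional probability that $X = 1$ is 3/6 = 1/2 (since three of the six possible rolls are even), whereas the probability that $X = 1$ conditional on $Y = 1$ is 1/3 (since of the three prime rolls 2, 3, and 5, only one is even).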
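The two probabilities in this example can be checked by direct enumeration. The following is a minimal sketch using exact fractions; the helper names (`x_of`, `y_of`, `marginal_x`, `marginal_y`, `cond_x_given_y`) are illustrative choices, not part of any standard library.

```python
from fractions import Fraction

# Outcomes of a fair six-sided die, each with probability 1/6.
rolls = range(1, 7)
p_roll = Fraction(1, 6)

def x_of(n):
    """X = 1 if the roll is even, else 0."""
    return 1 if n % 2 == 0 else 0

def y_of(n):
    """Y = 1 if the roll is prime (2, 3 or 5), else 0."""
    return 1 if n in (2, 3, 5) else 0

# Joint pmf P(X = x, Y = y), built by summing over the rolls.
joint = {}
for n in rolls:
    key = (x_of(n), y_of(n))
    joint[key] = joint.get(key, Fraction(0)) + p_roll

def marginal_x(x):
    """P(X = x), obtained by summing the joint pmf over y."""
    return sum(p for (xi, _), p in joint.items() if xi == x)

def marginal_y(y):
    """P(Y = y), obtained by summing the joint pmf over x."""
    return sum(p for (_, yi), p in joint.items() if yi == y)

def cond_x_given_y(x, y):
    """P(X = x | Y = y); defined only when P(Y = y) > 0."""
    p_y = marginal_y(y)
    if p_y == 0:
        raise ValueError("conditioning event has probability zero")
    return joint.get((x, y), Fraction(0)) / p_y

print(marginal_x(1))         # 1/2
print(cond_x_given_y(1, 1))  # 1/3
```

Exact rational arithmetic is used here so that the results match the hand computation without floating-point rounding.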
Conditional continuous distributions
Similarly for continuous random variables, the conditional probability density function of $Y$ given the occurrence of the value $x$ of $X$ can be written as

$$f_{Y|X}(y \mid x) = \frac{f_{X,Y}(x, y)}{f_X(x)},$$

where $f_{X,Y}(x, y)$ gives the joint density of $X$ and $Y$, while $f_X(x)$ gives the marginal density for $X$. Also in this case it is necessary that $f_X(x) > 0$.

The relation with the probability distribution of $X$ given $Y$ is given by:

$$f_{Y|X}(y \mid x)\, f_X(x) = f_{X,Y}(x, y) = f_{X|Y}(x \mid y)\, f_Y(y).$$

The concept of the conditional distribution of a continuous random variable is not as intuitive as it might seem: Borel's paradox shows that conditional probability density functions need not be invariant under coordinate transformations.
Example
The graph shows a bivariate normal joint density for random variables $X$ and $Y$. To see the distribution of $Y$ conditional on a fixed value $X = x_0$, one can first visualize the line $X = x_0$ in the $X,Y$ plane, and then visualize the plane containing that line and perpendicular to the $X,Y$ plane. The intersection of that plane with the joint normal density, once rescaled to give unit area under the intersection, is the relevant conditional density of $Y$.
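The slice-and-rescale construction can be sketched numerically. The parameters below (`MU_X`, `MU_Y`, `SIGMA_X`, `SIGMA_Y`, `RHO`) and the conditioning value `x0` are hypothetical and are not taken from the graph. The sketch compares the rescaled slice with the standard closed form for the bivariate normal, in which $Y$ given $X = x_0$ is normal with mean $\mu_Y + \rho \frac{\sigma_Y}{\sigma_X}(x_0 - \mu_X)$ and variance $\sigma_Y^2 (1 - \rho^2)$.

```python
import math

# Hypothetical parameters of a bivariate normal (not taken from the graph).
MU_X, MU_Y = 0.0, 1.0
SIGMA_X, SIGMA_Y = 2.0, 1.5
RHO = 0.6

def joint_density(x, y):
    """Bivariate normal density f_{X,Y}(x, y)."""
    zx = (x - MU_X) / SIGMA_X
    zy = (y - MU_Y) / SIGMA_Y
    q = (zx * zx - 2.0 * RHO * zx * zy + zy * zy) / (1.0 - RHO * RHO)
    norm = 2.0 * math.pi * SIGMA_X * SIGMA_Y * math.sqrt(1.0 - RHO * RHO)
    return math.exp(-0.5 * q) / norm

def conditional_density_numeric(y, x0, lo=-20.0, hi=20.0, steps=4000):
    """Slice the joint density at X = x0 and rescale to unit area.

    The area under the slice approximates the marginal f_X(x0); it is
    computed with a midpoint Riemann sum over [lo, hi].
    """
    h = (hi - lo) / steps
    area = sum(joint_density(x0, lo + (i + 0.5) * h) for i in range(steps)) * h
    return joint_density(x0, y) / area

def conditional_density_closed_form(y, x0):
    """Closed form: Y | X = x0 is normal with the mean and variance below."""
    mean = MU_Y + RHO * SIGMA_Y / SIGMA_X * (x0 - MU_X)
    var = SIGMA_Y ** 2 * (1.0 - RHO ** 2)
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

x0 = 1.0
for y in (0.0, 1.0, 2.5):
    print(y, conditional_density_numeric(y, x0), conditional_density_closed_form(y, x0))
```

The integration range and step count are arbitrary choices that happen to be ample for these parameters; a different joint density would need its own range.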
Relation to independence
Random variables $X$, $Y$ are independent if and only if the conditional distribution of $Y$ given $X$ is, for all possible realizations of $X$, equal to the unconditional distribution of $Y$. For discrete random variables this means $P(Y = y \mid X = x) = P(Y = y)$ for all possible $y$ and $x$ with $P(X = x) > 0$. For continuous random variables $X$ and $Y$, having a joint density function, it means $f_{Y|X}(y \mid x) = f_Y(y)$ for all possible $y$ and $x$ with $f_X(x) > 0$.
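For a finite joint probability mass function, the discrete criterion can be tested directly. The sketch below assumes the joint distribution is given as a dictionary keyed by `(x, y)` pairs; the function name `is_independent` and the second variable $Z$ (equal to 1 when the roll is at most 2) are hypothetical additions for illustration.

```python
from fractions import Fraction
from itertools import product

def is_independent(joint):
    """Check P(Y = y | X = x) == P(Y = y) for all x with P(X = x) > 0.

    `joint` maps (x, y) pairs to probabilities P(X = x, Y = y).
    """
    xs = {x for x, _ in joint}
    ys = {y for _, y in joint}
    p_x = {x: sum(joint.get((x, y), 0) for y in ys) for x in xs}
    p_y = {y: sum(joint.get((x, y), 0) for x in xs) for y in ys}
    for x, y in product(xs, ys):
        if p_x[x] == 0:
            continue  # the conditional distribution is undefined here
        if joint.get((x, y), 0) / p_x[x] != p_y[y]:
            return False
    return True

sixth = Fraction(1, 6)
# Fair die: X = 1 if the roll is even, Y = 1 if the roll is prime.
even_prime = {(0, 0): sixth, (0, 1): 2 * sixth, (1, 0): 2 * sixth, (1, 1): sixth}
# Hypothetical second pair: X as above, Z = 1 if the roll is at most 2.
even_low = {(0, 0): 2 * sixth, (0, 1): sixth, (1, 0): 2 * sixth, (1, 1): sixth}

print(is_independent(even_prime))  # False
print(is_independent(even_low))    # True
```

Exact fractions make the equality test reliable; with floating-point probabilities a tolerance would be needed instead.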
Properties
Seen as a function of $y$ for given $x$, $P(Y = y \mid X = x)$ is a probability mass function and so the sum over all $y$ is 1. Seen as a function of $x$ for given $y$, it is a likelihood function, so that the sum over all $x$ need not be 1. Additionally, a marginal of a joint distribution can be expressed as the expectation of the corresponding conditional distribution. For instance, $p_X(x) = E_Y[p_{X|Y}(x \mid Y)]$.
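These three properties can be illustrated on a small joint distribution. The sketch below uses a fair die with $X = 1$ if the roll is even and a hypothetical variable $W = 1$ if the roll is six, chosen so that the sum over the conditioning variable visibly differs from 1.

```python
from fractions import Fraction

sixth = Fraction(1, 6)
# Joint pmf keyed by (x, w): X = 1 if the roll is even,
# W = 1 if the roll is six (a hypothetical variable for illustration).
joint = {(0, 0): 3 * sixth, (0, 1): Fraction(0), (1, 0): 2 * sixth, (1, 1): sixth}
values = (0, 1)

p_x = {x: sum(joint[(x, w)] for w in values) for x in values}
p_w = {w: sum(joint[(x, w)] for x in values) for w in values}

def p_w_given_x(w, x):
    """P(W = w | X = x)."""
    return joint[(x, w)] / p_x[x]

def p_x_given_w(x, w):
    """P(X = x | W = w)."""
    return joint[(x, w)] / p_w[w]

# As a function of w for fixed x: a pmf, so it sums to 1.
sum_over_w = sum(p_w_given_x(w, 1) for w in values)

# As a function of x for fixed w: a likelihood, so the sum need not be 1.
sum_over_x = sum(p_w_given_x(1, x) for x in values)

# Marginal as an expectation: p_X(x) = E_W[ p_{X|W}(x | W) ].
marginal_via_expectation = sum(p_x_given_w(1, w) * p_w[w] for w in values)

print(sum_over_w)                # 1
print(sum_over_x)                # 1/3
print(marginal_via_expectation)  # 1/2
```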
Measure-theoretic formulation
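Let $(\Omega, \mathcal{F}, P)$ be a probability space, $\mathcal{G} \subseteq \mathcal{F}$ a $\sigma$-field in $\mathcal{F}$, and $X : \Omega \to \mathbb{R}$ a real-valued random variable (measurable with respect to the Borel $\sigma$-field $\mathcal{R}^1$ on $\mathbb{R}$). Given $A \in \mathcal{F}$, the Radon-Nikodym theorem implies that there is a $\mathcal{G}$-measurable integrable random variable $P(A \mid \mathcal{G}) : \Omega \to \mathbb{R}$ such that

$$\int_G P(A \mid \mathcal{G})(\omega)\, dP(\omega) = P(A \cap G)$$

for every $G \in \mathcal{G}$, and such a random variable is uniquely defined up to sets of probability zero. Further, it can then be shown that there exists a function $\mu : \mathcal{R}^1 \times \Omega \to \mathbb{R}$ such that $\mu(\cdot, \omega)$ is a probability measure on $\mathcal{R}^1$ for each $\omega \in \Omega$ and $P(X \in B \mid \mathcal{G}) = \mu(B, \cdot)$ almost surely for every $B \in \mathcal{R}^1$. For any $\omega \in \Omega$, the function $\mu(\cdot, \omega) : \mathcal{R}^1 \to \mathbb{R}$ is called a conditional probability distribution of $X$ given $\mathcal{G}$. In this case,

$$E[X \mid \mathcal{G}] = \int_{-\infty}^{\infty} x\, \mu(dx, \cdot)$$

almost surely.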
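For any event $A \in \mathcal{F}$, define the indicator function

$$\mathbf{1}_A(\omega) = \begin{cases} 1 & \text{if } \omega \in A, \\ 0 & \text{if } \omega \notin A, \end{cases}$$

which is a random variable. Note that the expectation of this random variable is equal to the probability of $A$ itself:

$$E(\mathbf{1}_A) = P(A).$$

Then the conditional probability given a $\sigma$-field $\mathcal{G} \subseteq \mathcal{F}$ is a function $P(\cdot \mid \mathcal{G}) : \mathcal{F} \times \Omega \to [0, 1]$ such that $P(A \mid \mathcal{G})$ is the conditional expectation of the indicator function for $A$:

$$P(A \mid \mathcal{G}) = E(\mathbf{1}_A \mid \mathcal{G}).$$

In other words, $P(A \mid \mathcal{G})$ is a $\mathcal{G}$-measurable function satisfying

$$\int_G P(A \mid \mathcal{G})(\omega)\, dP(\omega) = P(A \cap G) \quad \text{for all } A \in \mathcal{F},\ G \in \mathcal{G}.$$

A conditional probability is regular if $P(\cdot \mid \mathcal{G})(\omega)$ is also a probability measure for all $\omega \in \Omega$. An expectation of a random variable with respect to a regular conditional probability is equal to its conditional expectation.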
For the trivial sigma algebra $\mathcal{G} = \{\emptyset, \Omega\}$ the conditional probability is a constant function, $P(A \mid \{\emptyset, \Omega\}) \equiv P(A)$.
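On a finite sample space the measure-theoretic definition becomes concrete. The sketch below assumes the $\sigma$-field is generated by a finite partition of $\Omega$ into blocks of positive probability, in which case the conditional probability is constant on each block $B$ with value $P(A \cap B) / P(B)$. The names `prob`, `conditional_probability`, and `integral` are illustrative, and the fair die is an assumed example.

```python
from fractions import Fraction

# Finite sample space: a fair die, with P the uniform measure.
omega = frozenset(range(1, 7))

def prob(event):
    """P(event) under the uniform measure on omega."""
    return Fraction(len(event & omega), len(omega))

def conditional_probability(event, partition):
    """Return a dict w -> P(A | G)(w) for G generated by `partition`.

    On each block B (assumed to have P(B) > 0) the function takes the
    constant value P(A and B) / P(B), which makes it G-measurable.
    """
    values = {}
    for block in partition:
        value = prob(event & block) / prob(block)
        for w in block:
            values[w] = value
    return values

def integral(f, subset):
    """Integral of f over `subset` with respect to P (a finite sum)."""
    return sum(f[w] * prob(frozenset({w})) for w in subset)

A = frozenset({2, 3, 5})                                  # "roll is prime"
partition = [frozenset({1, 3, 5}), frozenset({2, 4, 6})]  # odd / even
cond = conditional_probability(A, partition)

# Defining property: the integral over G equals P(A and G).
# Checked here on the generating blocks, on omega, and on the empty set.
for G in partition + [omega, frozenset()]:
    assert integral(cond, G) == prob(A & G)

# Trivial sigma-algebra: the generating partition is just [omega],
# so the conditional probability is the constant P(A).
trivial = conditional_probability(A, [omega])
print(set(trivial.values()))  # {Fraction(1, 2)}
```

With the odd/even partition, the function takes the value 2/3 on odd rolls and 1/3 on even rolls, which matches the elementary conditional probabilities of rolling a prime given each parity.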