A Gibbs sampler is a sequential method for producing samples from the joint distribution of several variables by sampling from the simpler conditional distributions of each variable. That is, to sample from a vector-valued random variable $X$, we can begin with an initial value $X^{(0)}$ and, in the $i$th step of the algorithm, sample the coordinate $x_j^{(i)}$ from the conditional distribution

$$p(x_j \vert x_1^{(i)}, \ldots, x_{j-1}^{(i)}, x_{j+1}^{(i-1)}, \ldots, x_n^{(i-1)}).$$
A (correlated) sample from the joint distribution of $X$ is thus obtained.
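To make the iteration concrete, here is a minimal sketch of a Gibbs sampler for a standard bivariate normal with correlation `rho`; this specific target, and all names in the code, are illustrative assumptions, not taken from the text. Each coordinate's conditional distribution is known in closed form, which is exactly the situation the algorithm exploits.

```python
import numpy as np

def gibbs_bivariate_normal(rho, n_iter=5000, seed=0):
    """Gibbs sampler for a standard bivariate normal with correlation rho.

    Each sweep draws each coordinate from its conditional distribution:
    x | y ~ N(rho * y, 1 - rho^2)  and  y | x ~ N(rho * x, 1 - rho^2).
    """
    rng = np.random.default_rng(seed)
    x, y = 0.0, 0.0                # initial value X^(0)
    sd = np.sqrt(1.0 - rho**2)     # conditional standard deviation
    samples = np.empty((n_iter, 2))
    for i in range(n_iter):
        # x^(i) drawn conditional on y^(i-1)
        x = rng.normal(rho * y, sd)
        # y^(i) drawn conditional on the fresh x^(i)
        y = rng.normal(rho * x, sd)
        samples[i] = x, y
    return samples
```

Note that consecutive draws are correlated, as the text says; the chain as a whole still targets the joint distribution.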

This iterative procedure can take a long time, though, especially when there are many variables or when sampling from even the conditional distributions is computationally expensive. Parallelization can help, but naively sampling as many coordinates as possible in parallel usually won't work: in general, the conditional distribution of each coordinate may depend on every other coordinate. However, when two or more coordinates are independent given the rest (think hierarchical models), they can be sampled in parallel.
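The hierarchical case can be sketched as follows: in a model $z \to (x_1, \ldots, x_n)$ where the leaves are independent given the root, a full Gibbs step over all the leaves is one embarrassingly parallel draw. The Gaussian leaf model and function name below are assumptions for illustration only.

```python
import numpy as np

def sample_leaves_given_root(z, n_leaves, rng):
    """One Gibbs step over the leaves of a hierarchical model z -> (x_1, ..., x_n).

    Given the root z, the leaves are conditionally independent, so the whole
    block can be drawn in a single batched call (or split across workers).
    Here each leaf is assumed to follow x_k | z ~ N(z, 1), independently.
    """
    return rng.normal(loc=z, scale=1.0, size=n_leaves)
```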

We can represent the dependence structure of the joint distribution by a Markov random field (MRF), that is, an undirected graph with the property that any two subsets of variables are conditionally independent given a separating subset. Now consider a (vertex) graph coloring, which is a labeling of the vertices with colors such that no two adjacent vertices receive the same color. The definition of an MRF then implies that all variables represented by vertices of one color are conditionally independent given the variables represented by the other colors, since the other colors separate them in the graph.
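A valid coloring can be found greedily: visit each vertex and give it the smallest color not already used by a neighbor. This is a standard heuristic (it may use more colors than the minimum, which is fine here, since any valid coloring yields conditionally independent color classes); the function name and adjacency representation below are my own.

```python
def greedy_coloring(adj):
    """Greedy vertex coloring.

    adj maps each vertex to a list of its neighbors. Each vertex is assigned
    the smallest nonnegative integer color not taken by an already-colored
    neighbor, so no two adjacent vertices share a color.
    """
    colors = {}
    for v in adj:
        taken = {colors[u] for u in adj[v] if u in colors}
        c = 0
        while c in taken:
            c += 1
        colors[v] = c
    return colors
```

For example, a 4-cycle `{0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}` is 2-colored, splitting the vertices into two independent classes.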

Thus we can revise the standard Gibbs sampler to, instead of sampling each coordinate in turn from $p(x_j \vert x_1^{(i)}, \ldots, x_{j-1}^{(i)}, x_{j+1}^{(i-1)}, \ldots, x_n^{(i-1)})$, sample an entire color class at once from

$$p\left(x_{c_j,1}, \ldots, x_{c_j,n_j} \,\vert\, \{x_k : k \notin c_j\}\right),$$
where $c_j$ is the $j$th color and $n_j$ is the number of variables with color $c_j$. Then, since all of these variables are conditionally independent, each coordinate can be sampled in parallel.
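This "chromatic" scheme can be sketched on an Ising model over a torus grid, a standard MRF whose graph is 2-colorable in checkerboard fashion; the model choice, function name, and vectorized (rather than multi-worker) parallelism are assumptions for illustration. All sites of one color have neighbors only of the other color, so a whole color class is updated in one conditionally independent block.

```python
import numpy as np

def chromatic_gibbs_ising(n=16, beta=0.4, n_sweeps=100, seed=0):
    """Chromatic Gibbs sampler for an Ising model on an n x n torus.

    The grid graph is 2-colorable (checkerboard), so all sites of one color
    are conditionally independent given the other color. Each color class is
    updated in a single vectorized draw; each site could equally run on its
    own worker.
    """
    rng = np.random.default_rng(seed)
    spins = rng.choice([-1, 1], size=(n, n))
    ii, jj = np.indices((n, n))
    for _ in range(n_sweeps):
        for color in (0, 1):
            mask = (ii + jj) % 2 == color
            # sum of the four neighbors on the torus (all the opposite color)
            nbr = (np.roll(spins, 1, 0) + np.roll(spins, -1, 0)
                   + np.roll(spins, 1, 1) + np.roll(spins, -1, 1))
            # Ising conditional: P(s = +1 | neighbors) = sigmoid(2 * beta * nbr)
            p_up = 1.0 / (1.0 + np.exp(-2.0 * beta * nbr))
            draw = np.where(rng.random((n, n)) < p_up, 1, -1)
            # update only the sites of the current color
            spins = np.where(mask, draw, spins)
    return spins
```

Because the neighbor sums are recomputed after each color class is updated, this is exactly the two-step sweep described above: sample one color class given the other, then swap roles.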