Adaptive randomization: motivation and issues


Randomized controlled trials (or randomized comparative trials) are the gold standard for scientific inference in medicine. However, not all RCTs have balanced randomization ratios, i.e. patients may not be equally likely to be randomized to each treatment. Motivations for unbalancing the randomization ratio may include giving a promising treatment rather than the control to the majority of the trial participants, which is often considered ethically desirable.

However, if there are two treatments and neither is known to be more promising than the other, adaptive randomization (AR) techniques are sometimes used to unbalance the randomization ratio as information about the treatments accumulates. One intuitive Bayesian approach to AR is Thompson sampling, which randomizes patients to each treatment with probability equal to the posterior probability that the treatment maximizes the expected benefit. A modification of Thompson sampling is to shrink the randomization probabilities toward 1/2 (or 1/k for k treatments).
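To make the mechanics concrete, here is a minimal sketch of two-arm Thompson sampling with optional shrinkage. Several details are assumptions made for illustration and are not taken from the text above: binary outcomes, independent Beta(1, 1) priors, a Monte Carlo estimate of the posterior probability, and a power-type shrinkage p^c / (p^c + (1 - p)^c) controlled by a hypothetical tuning parameter `c`. The function names are likewise made up for this example.

```python
import random


def prob_a_better(succ_a, fail_a, succ_b, fail_b, draws=2000, rng=random):
    """Monte Carlo estimate of P(theta_A > theta_B | data).

    Assumes independent Beta(1, 1) priors on each arm's success rate,
    so each posterior is Beta(1 + successes, 1 + failures).
    """
    wins = 0
    for _ in range(draws):
        theta_a = rng.betavariate(1 + succ_a, 1 + fail_a)
        theta_b = rng.betavariate(1 + succ_b, 1 + fail_b)
        wins += theta_a > theta_b
    return wins / draws


def shrink(p, c):
    """Pull p toward 1/2 (assumed power form).

    c = 1 leaves p unchanged (plain Thompson sampling);
    c = 0 gives exactly 1/2 (equal randomization).
    """
    return p**c / (p**c + (1 - p) ** c)


def simulate_trial(true_a, true_b, n_patients, c=1.0, draws=2000, seed=0):
    """Randomize patients one at a time, updating after every outcome."""
    rng = random.Random(seed)
    counts = {"A": [0, 0], "B": [0, 0]}  # [successes, failures] per arm
    for _ in range(n_patients):
        p_a = prob_a_better(*counts["A"], *counts["B"], draws=draws, rng=rng)
        arm = "A" if rng.random() < shrink(p_a, c) else "B"
        true_rate = true_a if arm == "A" else true_b
        success = rng.random() < true_rate
        counts[arm][0 if success else 1] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical success rates, chosen only to exercise the code.
    print(simulate_trial(true_a=0.5, true_b=0.3, n_patients=100, c=0.5))
```

Running `simulate_trial` over many seeds and comparing the per-arm totals is one way to see how variable the allocation is for different values of `c`.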

An alternative to AR is the group sequential (GS) trial, which has balanced randomization and stops the trial early at one of the pre-specified interim points if it is clear that one treatment is superior. Group sequential trials don’t try to randomize more patients to the superior treatment, but rather end randomization as soon as possible so that future patients may be given the superior treatment.
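For comparison, a group sequential design can be sketched along the same lines. Everything specific here is an assumption for illustration: binary outcomes, strict 1:1 allocation in equal-sized groups, a pooled two-proportion z statistic, and a constant (Pocock-style) stopping boundary. The default of 2.361 is the commonly tabulated Pocock constant for four equally spaced looks at a two-sided 0.05 level; treat it as a placeholder rather than a calibrated design choice.

```python
import math
import random


def z_statistic(succ_a, n_a, succ_b, n_b):
    """Pooled two-proportion z statistic; 0.0 if the variance is degenerate."""
    pooled = (succ_a + succ_b) / (n_a + n_b)
    var = pooled * (1 - pooled) * (1 / n_a + 1 / n_b)
    if var == 0:
        return 0.0
    return (succ_a / n_a - succ_b / n_b) / math.sqrt(var)


def group_sequential_trial(true_a, true_b, per_arm_per_look=25, looks=4,
                           boundary=2.361, seed=0):
    """Enroll equal groups on both arms and test at each interim look."""
    rng = random.Random(seed)
    succ_a = succ_b = n = 0
    z = 0.0
    for look in range(1, looks + 1):
        # Balanced allocation: the same number of new patients on each arm.
        for _ in range(per_arm_per_look):
            succ_a += rng.random() < true_a
            succ_b += rng.random() < true_b
        n += per_arm_per_look
        z = z_statistic(succ_a, n, succ_b, n)
        if abs(z) >= boundary:
            # Stop early: one treatment looks clearly superior.
            return {"stopped_at_look": look, "n_per_arm": n, "z": z}
    return {"stopped_at_look": None, "n_per_arm": n, "z": z}


if __name__ == "__main__":
    # Hypothetical success rates, chosen only to exercise the code.
    print(group_sequential_trial(true_a=0.5, true_b=0.3))
```

Unlike the AR sketch, the per-arm sample sizes are always equal here; only the total number of patients enrolled before stopping varies.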

A recent paper by Thall, Fox, and Wathen pits AR (both shrunken and unshrunken) against GS, with some disappointing results for AR. Their simulations yield the following results.

  1. AR causes more variability in the sample sizes of each arm than equal randomization (including GS).
  2. AR can have an unacceptably high probability of assigning significantly more patients to the inferior treatment.
  3. AR substantially biases the treatment effect estimator.
  4. AR can lead to very high type I error rates.
  5. AR has much less power to detect treatment effects than GS for a given type I error rate.

More details, including a discussion of parameter drift (a situation in which the prognosis changes over time but the comparative treatment effect does not, exacerbating points 3 and 4), are given in Thall et al., which is quite an accessible and interesting read.