Factorial Experiments: Frequently Asked Questions

The answers to many of your questions about factorial experiments can be found in

Collins, L. M. (2018). Optimization of behavioral, biobehavioral, and biomedical interventions: The multiphase optimization strategy (MOST). New York: Springer.

This is explained on our Introduction to Factorial Experiments web page and in Chapter 3 of Collins (2018). Very briefly, you may be thinking of a factorial experiment as a many-armed RCT. It isn't. The logical underpinnings of the factorial experiment are different from those of the RCT, and therefore the approach to powering the two designs is different. In an RCT, all else being equal, power is driven by the per-arm sample size. By contrast, a factorial experiment can have very small per-condition sample sizes as long as the overall sample size per level of each factor is sufficiently large, because each main effect is estimated using the data from every subject in the experiment, not just the subjects in a single condition.
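
To make the bookkeeping concrete, here is a minimal sketch in Python for a hypothetical 2⁵ factorial with N = 320; the numbers are ours for illustration, not from Collins (2018):

```python
# Sample-size bookkeeping for a hypothetical 2**5 factorial with N = 320.
n_factors = 5
n_conditions = 2 ** n_factors      # 32 experimental conditions
N = 320

per_condition = N / n_conditions   # only 10 subjects per condition
per_level = N / 2                  # but 160 subjects at each level of every factor

print(per_condition, per_level)    # 10.0 160.0
# Every main effect is a comparison of 160 subjects to 160 subjects,
# because each subject is at some level of every factor and therefore
# contributes to the estimate of every main effect.
```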

Reference

Collins, L. M. (2018). Optimization of behavioral, biobehavioral, and biomedical interventions: The multiphase optimization strategy (MOST). New York: Springer.

This is done essentially the same way as in an RCT, except that instead of assigning people to two or three conditions, you need to assign them to many more conditions. One way to approach random assignment in a factorial experiment is to produce many permutations of the integers from 1 to C, where C is the number of conditions, and use these numbers to assign subjects to conditions in a permuted block fashion. For example, suppose you were conducting a 2³ factorial experiment, which of course has 8 experimental conditions, and you want to assign 160 subjects. You could use software to sample the integers 1 through 8 without replacement, and repeat this 20 times. You now have 20 lists of the numbers 1 through 8, each in random order; in other words, 160 random numbers. You can then use these numbers to assign conditions to the 160 subjects in order of enrollment. This approach ensures that the experimental conditions fill up approximately evenly.
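
Here is a minimal sketch of that procedure in Python (the seed and variable names are our own, added for reproducibility; they are not part of the procedure itself):

```python
import random

# Permuted-block assignment for a 2**3 factorial:
# 8 conditions, 160 subjects, so 20 blocks of 8.
n_conditions = 8
n_blocks = 20

random.seed(2018)  # fix the seed so the assignment list is reproducible

assignments = []
for _ in range(n_blocks):
    block = list(range(1, n_conditions + 1))  # the integers 1 through 8
    random.shuffle(block)                     # one random permutation
    assignments.extend(block)

# assignments[i] is the condition (1-8) for the (i+1)th subject enrolled.
# Each condition appears exactly once in every block of 8, so the
# conditions fill up approximately evenly as subjects accrue.
```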


There are other approaches to random assignment in factorial optimization trials. We recommend the excellent article by Gallis et al. (2019).

Reference

Gallis, J. A., Bennett, G. G., Steinberg, D. M., Askew, S., & Turner, E. L. (2019). Randomization procedures for multicomponent behavioral intervention factorial trials in the multiphase optimization strategy framework: Challenges and recommendations. Translational Behavioral Medicine, 9(6), 1047-1056.

Yes, in a limited way. First, make sure you understand the answer to the question “How do I randomly assign subjects in a factorial experiment?”, which describes how to create lists of random numbers for random assignment purposes. You can conduct stratified assignment simply by maintaining a separate list of random numbers for each stratum. For example, if you want to stratify by gender, you can maintain one list for males and one list for females. Note that stratification has the effect of ensuring that if the subject sample is, say, 40% male, each experimental condition will contain approximately 40% males.
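
A minimal sketch of this idea in Python, extending the previous example (the function name and seeds are hypothetical, chosen only for illustration):

```python
import random

# Stratified permuted-block assignment: one independent assignment
# list per stratum, built with the same block logic as before.
def make_assignment_list(n_conditions, n_blocks, seed):
    rng = random.Random(seed)
    out = []
    for _ in range(n_blocks):
        block = list(range(1, n_conditions + 1))
        rng.shuffle(block)
        out.extend(block)
    return out

# Draw from the matching list as each subject enrolls, so every
# condition ends up with roughly the same gender mix as the sample.
lists = {
    "male": make_assignment_list(8, 20, seed=1),
    "female": make_assignment_list(8, 20, seed=2),
}
next_index = {"male": 0, "female": 0}

def assign(stratum):
    i = next_index[stratum]
    next_index[stratum] += 1
    return lists[stratum][i]

print(assign("female"), assign("male"), assign("female"))
```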

Yes! This is possible even when cluster randomization is necessary. For more on this, see Dziak, Nahum-Shani, and Collins (2012) and Nahum-Shani, Dziak, and Collins (2018).

References

Dziak, J. J., Nahum-Shani, I., & Collins, L. M. (2012). Multilevel factorial experiments for developing behavioral interventions: Power, sample size, and resource considerations. Psychological Methods, 17(2), 153.

Nahum-Shani, I., Dziak, J. J., & Collins, L. M. (2018). Multilevel factorial designs with experiment-induced clustering. Psychological Methods, 23(3), 458.

Everything we say about factorial experiments on this website is based on using effect (-1,1) coding. However, many behavioral scientists have been trained to use dummy (0,1) coding. Dummy coding has its place, particularly in one-way ANOVA and in non-experimental situations, but for factorial optimization trials in MOST it is essential to use effect coding. It may not seem as if the choice between (-1,1) and (0,1) would be a very big deal, but it turns out that it is. Oddly, this difference seems to be discussed only rarely, and not very explicitly.

Whether you use effect codes or dummy codes to perform an ANOVA within a regression framework (say, PROC GLM in SAS), you will be interpreting the b-weights associated with the vectors of codes. When effect codes are used, these b-weights correspond to the textbook definitions of main effects and interactions. However, when dummy codes are used, the b-weights do not necessarily correspond to these textbook definitions. (Note the implication here: If you are doing hypothesis testing based on dummy codes, you may not be testing hypotheses about main effects and interactions.) In fact, under most circumstances, dummy-coded effects should not be referred to as main effects and interactions. We prefer to maintain a clear distinction by calling dummy-coded effects first-order effects, second-order effects, etc. (For simplicity, from now on we will refer to second-order effects, third-order effects, etc. as higher-order effects.)

People trained in using dummy coding often make two assertions about factorial ANOVA. The first is that it is impossible to interpret main effects if there are any substantial interactions. The second is that there is always less statistical power for tests of interactions than for tests of main effects, with power decreasing as the number of factors involved in the interaction increases (e.g., less power for three-way interactions than for two-way interactions). These assertions may be attributable to a failure to distinguish between first-order effects and main effects, and between higher-order effects and interactions. It is true that when dummy coding is used, it is impossible to interpret first-order effects if there are any substantial higher-order effects. When dummy coding is used, the first-order effects and higher-order effects can be highly correlated, and you can see how this would make interpretation of a first-order effect difficult if the higher-order effects were substantial.
 
However, effect-coded main effects and interactions are not usually the same as dummy-coded first-order effects and higher-order effects. When there are equal ns across the experimental conditions, all of the effect-coded main effects and interactions are uncorrelated. This means that a main effect is not necessarily contaminated or rendered uninterpretable by interactions, although interactions must always be taken into account thoughtfully when interpreting main effects. In addition, when effect coding is used and effect size is expressed as the regression coefficient, the power associated with any two effects of the same size is identical, whether they are main effects or interactions. Thus with effect coding there is not necessarily less power available for interactions than for main effects; this depends on the effect sizes. Of course, very little is known about interactions in behavioral science, so we don’t know whether they are likely to have effect sizes comparable to main effects, or whether the effect sizes for interactions are likely to be smaller. If the effect sizes for the interactions are smaller than those for the main effects, the power associated with the interactions will be correspondingly smaller.
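
The uncorrelatedness claim is easy to verify numerically. Here is a minimal sketch in Python for a hypothetical 2×2 design with equal cell sizes (the code is ours, not from Collins & Kugler, 2018):

```python
import numpy as np
from itertools import product

# Design matrix for a 2x2 factorial with equal n per condition.
# Effect coding uses (-1, 1); dummy coding uses (0, 1).
n_per_cell = 10
rows = []
for a, b in product((-1, 1), repeat=2):
    for _ in range(n_per_cell):
        rows.append((a, b, a * b))      # main effects A, B and the A:B interaction
X_effect = np.array(rows, dtype=float)

# Recode the same design with dummy codes: map (-1, 1) -> (0, 1).
X_dummy = (X_effect[:, :2] + 1) / 2
X_dummy = np.column_stack([X_dummy, X_dummy[:, 0] * X_dummy[:, 1]])

print(np.corrcoef(X_effect, rowvar=False).round(2))
# effect codes: an identity matrix; main effects and interaction are uncorrelated
print(np.corrcoef(X_dummy, rowvar=False).round(2))
# dummy codes: the product term correlates about .58 with each first-order term
```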

To learn more, read the Kugler et al. chapter in Collins and Kugler (2018).

Reference

See chapter 6 of

Collins, L. M., & Kugler, K. C. (Eds.). (2018). Optimization of behavioral, biobehavioral, and biomedical interventions: Advanced topics. New York: Springer.

You might want to consider a fractional factorial experiment if a complete factorial design is conceptually suitable for your research questions and one or more of the following is true (a brief construction sketch follows the list):

  • there is an upper limit on the number of experimental conditions you can implement, and this upper limit is 8 or greater;
  • you want to take advantage of the efficiency factorial designs offer in terms of use of subjects, and also want to economize on experimental condition overhead costs; and/or
  • you have to use cluster randomization and don’t have enough clusters to populate a complete factorial design. (More about this can be found in Dziak, Nahum-Shani, & Collins, 2012.)
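
As promised above, here is a minimal sketch in Python of how a fraction cuts the number of conditions. It builds the standard half fraction of a 2⁴ design; this is a hypothetical example, and choosing an appropriate fraction for a real study is covered in chapter 5 of Collins & Kugler (2018):

```python
from itertools import product

# A 2**(4-1) fractional factorial: 4 factors in 8 conditions
# instead of 16, using the defining relation D = A*B*C.
design = []
for a, b, c in product((-1, 1), repeat=3):   # full factorial in A, B, C
    d = a * b * c                            # level of D is set by A, B, C
    design.append((a, b, c, d))

for run in design:
    print(run)   # the 8 runs; effect-coded levels for factors A through D
```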

References

Dziak, J. J., Nahum-Shani, I., & Collins, L. M. (2012). Multilevel factorial experiments for developing behavioral interventions: Power, sample size, and resource considerations. Psychological Methods, 17(2), 153.

See chapter 5 of

Collins, L. M., & Kugler, K. C. (Eds.). (2018). Optimization of behavioral, biobehavioral, and biomedical interventions: Advanced topics. New York: Springer.