The RCT and the factorial design are very different designs intended for different purposes. Both can be efficient when properly applied, but they are efficient for different research questions. Because the logical underpinnings of the two types of designs are so different, it is understandable that people whose design background is primarily in RCTs might have some misconceptions about factorial experiments. It would be a shame if these misconceptions kept scientists from recognizing the advantages of factorial experiments for certain kinds of research questions! To try to prevent this, here we review some potential misconceptions and offer some suggestions for additional reading.

Overall message: If you see a 32-condition factorial experiment, try not to think of it as a 32-arm RCT, and keep an open mind about power.

Read an informal introduction to factorial experiments aimed at those with a background in the RCT.

You may also be interested in the FAQ about Factorial Experiments section of this web site. A brief but citable treatment of some of this material can be found in Collins, Dziak, Kugler, & Trail (2014), and a more in-depth and also citable explanation can be found in Chapters 3 and 4 of Collins (2018).

**MISCONCEPTION 1: A factorial experiment is essentially an RCT with a lot of experimental conditions, and therefore is extremely difficult to power.**

**REALITY: The RCT and the factorial experiment have very different logical underpinnings.** In an RCT the primary objective is direct comparison of a small number of experimental conditions, whereas in a factorial experiment, the primary objective is estimation of main effects and interactions. These estimates are obtained by combining experimental conditions in a principled way by means of factorial analysis of variance (ANOVA). In fact, individual experimental conditions of a factorial experiment are NEVER directly compared in a factorial ANOVA (which is very counterintuitive for those trained in RCTs).

This difference in the underlying logic extends to how RCTs and factorial experiments are powered. An RCT with a small number of subjects per experimental condition is unlikely to have sufficient statistical power. In contrast, a factorial experiment with a small number of subjects per condition may have excellent statistical power. Why? Power for estimation of main effects and interactions in factorial ANOVA is based on comparison of __combinations__ of experimental conditions, not direct comparison of individual conditions. So the number of subjects in each individual condition does not matter; what matters for power is the total sample size across all experimental conditions.
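To make the pooling concrete, here is a minimal simulation sketch of an effect-coded 2^3 factorial with only 10 subjects per condition. The effect sizes, cell size, and random seed are made-up illustrative values, not from any study discussed here:

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

# Effect-coded 2^3 factorial: 8 conditions, n = 10 per condition, N = 80.
conditions = np.array(list(itertools.product([-1, 1], repeat=3)))
X = np.repeat(conditions, 10, axis=0)
a, b, c = X.T

# Simulated outcome with made-up main effects of 1.0, 0.6, and 0.4
# (expressed as mean differences between the +1 and -1 levels).
y = 0.5 * a + 0.3 * b + 0.2 * c + rng.normal(0, 1, len(a))

# The main effect of factor A pools ALL 80 subjects: the 40 with A = +1
# (spread over 4 conditions) versus the 40 with A = -1.
main_effect_a = y[a == 1].mean() - y[a == -1].mean()
print(f"subjects per level of A: {(a == 1).sum()}")        # 40, not 10
print(f"estimated main effect of A: {main_effect_a:.2f}")  # true value: 1.0
```

Even though each condition contains only 10 subjects, each level of each factor contains 40, and it is this per-level sample size that drives power.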

*References*

Collins, L. M., Dziak, J. J., & Li, R. (2009). Design of experiments with multiple independent variables: A resource management perspective on complete and reduced factorial designs. *Psychological Methods, 14,* 202-224. PMCID: PMC2796056

Collins, L.M. (2018). *Optimization of behavioral, biobehavioral, and biomedical interventions: The multiphase optimization strategy (MOST).* New York: Springer. (Chapters 3 and 4.)

**MISCONCEPTION 2: Factorial experimental designs require larger numbers of subjects than available alternative designs.**

**REALITY: When used to address suitable research questions, balanced factorial experimental designs often require many fewer subjects than alternative designs**. For a brief explanation, see Collins et al. (2014); for a more extensive explanation, see Collins, Dziak, and Li (2009) and Collins (2018).

For example, Collins et al. (2011) wanted to use a factorial experiment to examine six components under consideration for inclusion in a clinic-based smoking cessation intervention. They found that whereas conducting individual experiments on each of the components would have required over 3,000 subjects, with a factorial design they would have sufficient power with about 500 subjects. In other words, conducting a factorial experiment rather than six individual experiments meant that they needed about 2,500 fewer subjects.
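The arithmetic behind savings of this kind can be sketched with the standard normal-approximation sample-size formula for a two-arm comparison. The effect size, alpha, and power below are illustrative assumptions, not the values used by Collins et al. (2011):

```python
from math import ceil
from statistics import NormalDist

# Illustrative assumptions (NOT the values from Collins et al., 2011):
# standardized effect size d = 0.3 per component, two-sided alpha = .05,
# target power = .80, normal-approximation sample-size formula.
d, alpha, power = 0.3, 0.05, 0.80
z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)

n_per_arm = ceil(2 * z**2 / d**2)   # subjects per arm of a two-arm RCT
per_experiment = 2 * n_per_arm      # one RCT per component
six_experiments = 6 * per_experiment

# A 2^6 factorial tests each of the six main effects with the SAME total N,
# because every subject sits at one level or the other of every factor.
factorial_total = per_experiment

print(f"six separate RCTs: {six_experiments} subjects")   # 2100
print(f"one 2^6 factorial: {factorial_total} subjects")   # 350
```

Under these assumptions the factorial needs no more subjects than a single two-arm experiment, because every subject contributes to every factor's comparison.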

*References*

Collins, L. M., Baker, T. B., Mermelstein, R. J., Piper, M. E., Jorenby, D. E., Smith, S. S., Schlam, T. R., Cook, J. W., & Fiore, M. C. (2011). The multiphase optimization strategy for engineering effective tobacco use interventions. *Annals of Behavioral Medicine, 41,* 208-226. PMCID: PMC3053423

Collins, L. M., Dziak, J. J., & Li, R. (2009). Design of experiments with multiple independent variables: A resource management perspective on complete and reduced factorial designs. *Psychological Methods, 14,* 202-224. PMCID: PMC2796056

Collins, L. M., Dziak, J. J., Kugler, K. C., & Trail, J. B. (2014). Factorial experiments: Efficient tools for evaluation of intervention components. *American Journal of Preventive Medicine, 47,* 498-504.

Collins, L.M. (2018). *Optimization of behavioral, biobehavioral, and biomedical interventions: The multiphase optimization strategy (MOST).* New York: Springer.

**MISCONCEPTION 3: If you want to add a factor to a balanced factorial experiment, you will have to increase the number of subjects dramatically to maintain power.**

**REALITY: If the factor to be added has an expected effect size no smaller than that of the factor with the smallest effect size that is already in the experiment, power will be about the same without any increase in the number of subjects.**

If the factor to be added has a smaller anticipated effect size than those on which the power analysis was previously based, it will be necessary to increase the sample size accordingly to maintain power. However, unless the anticipated effect size of the new factor is considerably smaller, the required increase will be modest. For more about this, see Collins et al. (2014), Collins, Dziak, and Li (2009), and Collins (2018).

The power of a factorial experiment depends on the overall sample size per level of each factor, not the number of experimental conditions or the number of subjects in each condition (except to the extent that these impact overall per-level sample size). Scientists whose backgrounds are primarily in designs like the RCT often find this counterintuitive.
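A quick arithmetic sketch (with an illustrative total sample size) of what happens to the per-level sample size when a fifth factor is added to a 2^4 experiment without adding subjects:

```python
# Illustrative total sample size, held fixed while a factor is added.
N = 512
summary = {}
for k in (4, 5):                       # 2^4 -> 2^5 factorial
    summary[k] = {
        "conditions": 2 ** k,
        "per_condition": N // 2 ** k,
        "per_level": N // 2,           # half of ALL subjects at each level
    }
    print(f"2^{k}: {summary[k]}")
```

The number of conditions doubles and the per-condition count halves, but the per-level count, which is what drives power for main effects, is unchanged.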

*References*

Collins, L. M., Dziak, J. J., Kugler, K. C., & Trail, J. B. (2014). Factorial experiments: Efficient tools for evaluation of intervention components. *American Journal of Preventive Medicine, 47,* 498-504.

Collins, L. M., Dziak, J. J., & Li, R. (2009). Design of experiments with multiple independent variables: A resource management perspective on complete and reduced factorial designs. *Psychological Methods, 14,* 202-224. PMCID: PMC2796056

Collins, L.M. (2018). *Optimization of behavioral, biobehavioral, and biomedical interventions: The multiphase optimization strategy (MOST).* New York: Springer.

**MISCONCEPTION 4: The only reason to conduct a factorial experiment is to test for interactions between factors.**

**REALITY: Even if it were somehow known with certainty that there were no interactions between factors, a factorial experiment might still be attractive if it required fewer research subjects than the alternatives being considered.**

In fact, in some ways not expecting any interactions is an ideal scenario for the use of factorial designs, because it provides a great justification for the use of extremely efficient fractional factorial designs. (A brief introduction to fractional factorial designs can be found in Collins, Dziak, & Li, 2009; and Chapter 5 of Collins, 2018.)
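As a sketch of how a fractional factorial achieves this efficiency, here is the standard 2^(6-3) construction (generators D = AB, E = AC, F = BC, a common textbook choice used here purely for illustration), which accommodates six factors in eight runs:

```python
import itertools
import numpy as np

# Base design: full 2^3 in factors A, B, C (8 runs).
abc = np.array(list(itertools.product([-1, 1], repeat=3)))
a, b, c = abc.T

# Generators D = AB, E = AC, F = BC give a 2^(6-3) resolution III design:
# six factors in 8 runs instead of the 2^6 = 64 of the full factorial.
design = np.column_stack([a, b, c, a * b, a * c, b * c])

print(design.shape)           # (8, 6)
# The six main-effect columns are mutually orthogonal, so all six main
# effects are estimable; the price is that they are aliased with
# (assumed-negligible) interactions.
print(design.T @ design)      # 8 * identity
```

The orthogonality of the main-effect columns is exactly why the no-interactions assumption makes such a design so efficient: each main effect is still estimated cleanly from all eight runs.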

*Reference*

Collins, L. M., Dziak, J. J., & Li, R. (2009). Design of experiments with multiple independent variables: A resource management perspective on complete and reduced factorial designs. *Psychological Methods, 14,* 202-224. PMCID: PMC2796056

**MISCONCEPTION 5: There is always less statistical power for interactions than for main effects in a factorial ANOVA. Power decreases as the order of the interaction increases.**

**REALITY: When effect coding is used in a 2^k design, statistical power is the same for all regression coefficients of the same size, whether they correspond to main effects or interactions, and irrespective of the order of the interaction.**

Note that the regression coefficient is not the only way to express the effect size of an interaction. This is explained in Chapter 4 of Collins (2018).

The effect sizes for interactions may be smaller than those for the main effects in a given study, and the effect sizes for higher-order interactions may be smaller than those for lower-order interactions. (This is consistent with the sparsity, or Pareto, principle in engineering.) If that is the case, then the power of course will be lower for the smaller effect sizes. But the lower power is due to the smaller effect size, not to anything inherent about interactions or the use of a factorial design.
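The equal-power result follows from the orthogonality of the effect-coded model matrix in a balanced 2^k design, which a few lines of linear algebra can verify (k = 3 and n = 5 per condition are arbitrary illustrative choices):

```python
import itertools
import numpy as np

# Effect-coded 2^3 full factorial, n = 5 per condition (illustrative sizes).
cells = np.array(list(itertools.product([-1, 1], repeat=3)))
a, b, c = np.repeat(cells, 5, axis=0).T
N = len(a)                                       # 40

# Full model matrix: intercept, 3 main effects, 3 two-way interactions,
# and the three-way interaction.
X = np.column_stack([np.ones(N), a, b, c, a*b, a*c, b*c, a*b*c])

# X'X = N * I, so Var(beta_hat) = sigma^2 / N for EVERY coefficient:
# main effects and interactions of any order have identical standard errors.
print(np.allclose(X.T @ X, N * np.eye(8)))       # True
```

Because every coefficient has the same standard error, equal-sized coefficients are detected with equal power, whatever the order of the effect.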

*Reference*

Collins, L.M. (2018). *Optimization of behavioral, biobehavioral, and biomedical interventions: The multiphase optimization strategy (MOST).* New York: Springer.

**MISCONCEPTION 6: Any interaction between factors necessarily makes interpretation of main effects impossible.**

**REALITY: Although it is always important to consider interactions thoughtfully when interpreting main effects, when effect coding is used in a balanced factorial experiment the main effects are interpretable whether or not there are interactions.**

We recommend use of effect (-1,1) coding for component selection experiments in MOST. When effect coding is used and there are equal *n*s per condition, main effects and interactions are uncorrelated. This makes main effects more readily interpretable.

When dummy (0,1) coding is used, many of the effects being tested are highly correlated. This can lead to interpretational difficulties.

Dummy coding and effect coding produce estimates of different effects, and thus the ANOVA results must be interpreted differently. For more information, please see the Kugler et al. chapter in Collins & Kugler (2018).
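The contrast between the two coding schemes can be seen numerically in even the smallest balanced design, a 2×2. No outcome data are needed; this is purely a property of the design matrix:

```python
import itertools
import numpy as np

# 2x2 design under the two coding schemes.
effect = np.array(list(itertools.product([-1, 1], repeat=2)))  # (-1, 1)
dummy = (effect + 1) // 2                                      # (0, 1)

corrs = {}
for name, x in (("effect", effect), ("dummy", dummy)):
    a, b = x.T
    corrs[name] = np.corrcoef(a, a * b)[0, 1]   # main effect vs. interaction
    print(f"{name} coding: corr(A, A*B) = {corrs[name]:.2f}")
```

With effect coding the interaction column is orthogonal to the main-effect columns (correlation 0), whereas with dummy coding they are substantially correlated (about .58 here), which is the source of the interpretational difficulties noted above.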

*Reference*

Collins, L.M., & Kugler, K.C. (2018). *Optimization of behavioral, biobehavioral, and biomedical interventions: Advanced topics.* New York: Springer.

Last updated: May 7, 2020