Introduction to MRTs
What is the purpose of an MRT?
The purpose of an MRT is to provide data that can be used to construct a multi-component intervention. The MRT helps researchers answer questions including whether to include a time-varying component as part of an intervention package, and in which contexts delivering a component is most effective. For examples of the kinds of questions an MRT can be used to answer, see the HeartSteps example below. Importantly, MRTs are not confirmatory studies designed to evaluate an intervention; rather, they focus on selecting and optimizing components to be delivered as part of an intervention package.
What are the elements of an MRT?
- Intervention components: Anything about the mobile health intervention package that can be separated out for experimentation. Examples include reminders, motivational messages, reinforcement schedules, social support linkages, cognitive messages, type of avatar, delivery mechanism and so on.
- Intervention options: Different levels of an intervention component. These levels can include multiple active options, or an option to deliver nothing. An MRT often involves investigating multiple components.
- Distal outcome: The distal outcome in an MRT is the long-term clinical outcome, something you would want as an outcome in a confirmatory clinical trial. A researcher may also define a distal outcome that is measurable at the end of an MRT study.
- Proximal outcome: The effect that an intervention component is intended to have in the near term. The proximal outcome is often a shorter-term measurable quantity of a distal outcome, or a potential mediator of a distal outcome. Each intervention component may target a different proximal outcome.
- Decision points: Pre-determined times at which it might be useful to deliver an intervention component. Decision points can differ by intervention component.
- Observations of context: These are variables of scientific interest observed at the time of the current decision point as well as summaries of variables observed prior to the decision point. They can include self-report measures, information captured with wearable sensors (location, weather, movement), information captured using device sensors (e.g., wireless scales participants use to weigh themselves), or recordings of the amount and type of interaction with a mobile application.
- Availability conditions: Restrictions, based on current context, on when the mobile device might deliver an intervention option to an individual. Individuals would not be considered available in contexts where it is unsafe or inappropriate to deliver an intervention. For example, we do not want to send audible notifications in contexts in which an individual is operating a motor vehicle. Availability conditions can also be driven by concerns about overburdening individuals; for example, an individual might be considered unavailable if they have recently received a reminder or message.
- Randomization probabilities: The intervention options are randomized with a pre-specified probability at each decision point at which an individual is available.
What is the role of the distal outcome in an MRT?
The distal outcome in an MRT usually corresponds to a long-term, clinical outcome such as time to relapse or average level of symptoms. One doesn’t have to measure the clinical outcome in every MRT. However, the distal outcome guides the choice of proximal outcomes to be targeted by the intervention components. The goal is that by impacting the proximal outcomes, the intervention components impact the distal outcome.
What is the relationship between the proximal and distal outcomes in an MRT?
In an MRT, intervention components target proximal outcomes that are usually either a short-term version of a distal outcome (e.g., number of steps taken the current day when the distal outcome is the average daily steps over a longer period of time) or hypothesized mediators of the distal outcome (for someone attempting to quit smoking, a relaxation exercise to reduce the proximal outcome of stress over the next 2 hours will hopefully impact the distal outcome of time to relapse).
Why are there repeated randomizations in an MRT?
Another way to ask this question is “What can you learn from the repeated randomizations that are part of an MRT?” The primary rationale for randomization is that it enhances balance in the distribution of unobserved factors across groups assigned to different treatments. This enhances the ability to assess causal effects; that is, randomization reduces alternative explanations for why the group assigned one treatment has improved outcomes as compared to a group assigned an alternate treatment. The repeated randomizations in an MRT enhance balance in the distribution of unobserved factors between participants/decision points assigned to different intervention options. Thus MRTs can provide data to help answer questions including whether delivering an intervention component has the desired effect on the targeted outcome, and whether this effect varies with time, prior dose and the current context of the individual.
What types of intervention components might be investigated via an MRT?
The repeated randomizations in an MRT are appropriate for investigating the effects of time-varying intervention components for which the proximal effect might vary by time or current context of the individual. For example, instead of sending a reminder instructing individuals to self-monitor what they eat every day, it might be more effective and less burdensome to remind a person to self-monitor only when they haven’t recently self-monitored. Another example would be the delivery of a relaxation exercise via a mobile phone. A researcher might want to know in what contexts delivering the relaxation exercise is most effective, for example, whether it is more effective when delivered at times when the individual is stressed.
What types of intervention components would not be investigated with an MRT?
Some intervention components considered for inclusion in an intervention package will not require further investigation. For example, it may not be worth trial resources to investigate a component because it is known to be effective in comparison to other components and the component is not burdensome. Similarly, a component that requires negligible resources and is not burdensome to individuals might be included without being investigated.
What types of intervention components would only be randomized at baseline and not repeatedly?
Some intervention components should not be altered once provided. This might be for either scientific or ethical reasons. In this case these components would only be randomized at baseline. An example of such a component would be a health coach avatar, where it might not make sense to take away a health coach avatar once it is provided to an individual.
What is the role of the observations of an individual’s current context? In particular, what is the role of these observations in an MRT?
A first use of the current context is to inform the design of an intervention option. For example, the language in an activity message might be tailored to the participant’s current location and weather; this would be done to increase the chance that the message is useful for the participant in that context (e.g. location, weather). Thus one role of observations of an individual’s context is to tailor the content of an intervention component message. In many research settings we don’t have access to a large number of participants for our MRT. It is difficult, with small sample sizes, to detect small differences such as whether the contextually tailored activity message should be tailored to both current weather and current location versus only tailored to current weather. Thus the contextual tailoring of messages/suggestions is frequently informed by current behavioral theory, clinical experience and prior studies.
A second use of the context is to learn whether some intervention components are more effective in some contexts, i.e., moderation. An MRT can be used to provide empirical data with respect to whether a contextual variable moderates the effectiveness of delivering an intervention component. For example, we may find that delivering a contextually tailored activity suggestion is more effective at encouraging activity than no suggestion in a context in which the weather is good. On the other hand, if the current weather is bad, it may make no difference whether we deliver a contextually tailored message or not. In this example the intervention component is the tailored activity message component and there are two intervention options: deliver versus do not deliver. Consider another intervention component: planning of physical activity for tomorrow. This component might have three options: unstructured planning, structured planning, and no planning. Here we might use an MRT to learn whether the context, such as the participant’s mood at the time of the planning, moderates the effect of the unstructured versus the structured planning in terms of the next day’s physical activity.
How are MRTs related to N-of-1 trials?
There are three key differences between MRTs and N-of-1 trials. The first is their inferential goals. MRTs are designed to provide data to test marginal causal effects. Marginal causal effects are effects that are averaged over the population (e.g., all individuals who are in recovery support), over a subset of the population (e.g., all young adults in recovery support) or over a subset of the population in a particular context (e.g., young adults in the morning on school days). The associated primary analyses, like most primary analyses in clinical trials, involve minimal assumptions. N-of-1 trials, on the other hand, are most often conducted to provide data to ascertain the most effective treatment for a particular individual. Here nuanced assumptions based on behavioral theory are used to conduct the primary analyses.
The second difference has to do with the types of interventions the trials were developed to optimize. MRTs are designed to help decide which of multiple intervention components should be included in a multi-component intervention, whereas N-of-1 trials were developed for settings in which scientists wish to compare the effect of one treatment to that of another (treatment package A versus treatment package B). Thus the repeated trials within an individual are usually scheduled at time points sufficiently far apart so that the assumption of no carry-over effects is valid. For example, if an individual is provided treatment A, it is then taken away, and the individual is later provided treatment B, the assumption is that previous exposure to treatment A does not affect the response to treatment B. If such a delayed effect might occur, the associated data analyses adjust for the carry-over effect. This makes eminent sense if the goal, as stated above, is to decide whether it is better to provide this individual with treatment A or treatment B.
Third, the treatments considered in N-of-1 trials are usually of a type for which the individual-level treatment effect is unlikely to vary across time within the individual. That is, over the total duration of the N-of-1 trial, the individual’s treatment responsivity should be unlikely to vary over time and across different contexts. In contrast, many intervention components considered in MRTs are likely to have time-varying effects.
What is the role of carry-over effects in an MRT?
Carry-over effects of intervention components may present as moderation effects. That is, the dose of prior intervention might, due to burden/habituation, reduce the effect of an intervention component at a future decision point. A carry-over effect may also simply lead to poorer future proximal outcomes at later decision points. For example, individuals may experience burden due to the intervention and thus delete the mobile application.
Can you provide an example of an MRT to illustrate how one works?
Example: HeartSteps version 1 MRT.
Overview: Physical activity is known to decrease the risk of several health complications, yet only one in five adults in the U.S. meet the guidelines for the number of minutes of physical activity recommended per week. Individuals can still experience health benefits if the required minutes are spread out across several days, and broken into more frequent but smaller amounts of time. The goal of HeartSteps is to develop an intervention to increase overall levels of physical activity in sedentary adults by supporting opportunistic physical activity, in which brief periods of movement or exercise are incorporated into individuals’ daily routines. HeartSteps Version 1 (v1) was a six-week MRT in which the intervention development team aimed to investigate whether contextually tailored activity suggestions, as well as support for planning how to be active, increased participants’ overall physical activity. Below we describe one of the intervention components: the contextually tailored activity suggestion component. The figure below provides a schematic in which each component is labeled.
- Intervention component: Contextually tailored activity suggestion. Push notifications sent to participants’ smartphones providing a suggestion for how to be active in the current moment, with each notification tailored to the participant’s current location, weather conditions, time of day, and day of the week.
- Intervention options: The intervention options were: (A) a suggestion of a walking activity that took 2-5 minutes to complete, (B) a suggestion of an anti-sedentary activity (brief movements) that took 1-2 minutes to complete, or (C) no suggestion.
- Distal outcome: The distal outcome is the total step count during the 42-day study.
- Proximal outcome: Total number of steps taken in the 30 minutes following a decision point.
- Decision Points: There were 5 individual-specific decision points every day: before morning commute, at lunch time, mid-afternoon, after evening commute, and after dinner.
- Observations of context: Location, weather, time of day, day of the week (weekday vs. weekend), prior day’s step count, prior 30-minute step count, variation in prior 30-minute step count over past 7 days, movement, usefulness of prompt, self-reports of physical activity from prior evening.
- Availability: Participants were unavailable when sensors on the phone indicated that they might be operating a vehicle or were currently physically active. Participants were also unavailable if they had turned off the activity notifications.
- Randomization probabilities: Participants who are available at a decision point are randomized with a 0.3 probability to receive (A) a contextually tailored walking activity, a 0.3 probability of receiving (B) an anti-sedentary activity, and a 0.4 probability of receiving (C) no suggestion.
Klasnja, P., Hekler, E. B., Shiffman, S., Boruvka, A., Almirall, D., Tewari, A., & Murphy, S. A. (2015). Microrandomized trials: An experimental design for developing just-in-time adaptive interventions. Health Psychology, 34(S), 1220.
Klasnja, P., Smith, S., Seewald, N. J., Lee, A., Hall, K., Luers, B., Hekler, E. B., & Murphy, S. A. (in press). Efficacy of contextually-tailored suggestions for physical activity: A micro-randomized optimization trial of HeartSteps. Annals of Behavioral Medicine.
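The randomization step in the HeartSteps example can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the study's actual software: the probabilities 0.3, 0.3 and 0.4 come from the description above, while the function name `randomize`, the option labels, and the availability pattern for the example day are hypothetical.

```python
import random

# Randomization probabilities for the activity suggestion component,
# as stated in the HeartSteps v1 description. The labels are hypothetical.
OPTION_PROBS = {"walking": 0.3, "anti_sedentary": 0.3, "none": 0.4}


def randomize(available, rng):
    """Return the option assigned at one decision point.

    Randomization happens only when the participant is available;
    otherwise nothing is delivered and None is returned.
    """
    if not available:
        return None
    options = list(OPTION_PROBS)
    weights = list(OPTION_PROBS.values())
    return rng.choices(options, weights=weights, k=1)[0]


rng = random.Random(0)  # seeded only so the sketch is reproducible
# Hypothetical availability at the five daily decision points.
day = [True, True, False, True, True]
assignments = [randomize(a, rng) for a in day]
print(assignments)
```

Returning `None` at unavailable decision points keeps "not randomized" distinct from being randomized to the "none" option, a distinction that matters when the data are analyzed.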
How can MRTs answer scientific questions about the delivery of contextually tailored activity suggestion?
The HeartSteps v1 MRT focused on whether the interruption caused by delivering these suggestions was worthwhile, that is, whether the suggestions had the intended effect on the proximal outcome. In addition, mHealth components that are delivered multiple times as individuals go about their daily lives can be burdensome, so it was necessary to understand whether the effectiveness of the activity suggestions dissipated over time. The MRT was designed to address questions including:
- On average across participants, does pushing the contextually tailored activity suggestion increase physical activity in the 30 minutes after the suggestion is delivered, compared to no suggestion?
- If so, does the effect of the contextually tailored activity suggestion deteriorate with time (day in study)?
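As a rough illustration of how MRT data relate to these two questions, the sketch below compares mean 30-minute step counts between available decision points with and without a suggestion, overall and by week in study. It is a simplified sketch, not the trial's primary analysis method: it assumes a constant randomization probability, pools decision points across participants, and gives no standard errors. The record keys (`day`, `available`, `suggested`, `steps_30min`) and the toy data are hypothetical.

```python
from statistics import mean


def proximal_effect(records):
    """Difference in mean proximal outcome: suggestion vs. no suggestion.

    Only available decision points are used, because randomization
    takes place only when the participant is available.
    """
    treated = [r["steps_30min"] for r in records
               if r["available"] and r["suggested"]]
    control = [r["steps_30min"] for r in records
               if r["available"] and not r["suggested"]]
    if not treated or not control:
        raise ValueError("need treated and control decision points")
    return mean(treated) - mean(control)


def week_of(record):
    """Week in study (1-based) for a record whose 'day' starts at 1."""
    return (record["day"] - 1) // 7 + 1


def effect_by_week(records):
    """Estimate the proximal effect separately for each week in study."""
    weeks = sorted({week_of(r) for r in records})
    return {w: proximal_effect([r for r in records if week_of(r) == w])
            for w in weeks}


# Toy data, made up purely for illustration.
toy_records = [
    {"day": 1, "available": True, "suggested": True, "steps_30min": 300},
    {"day": 1, "available": True, "suggested": False, "steps_30min": 200},
    {"day": 2, "available": False, "suggested": False, "steps_30min": 900},
    {"day": 8, "available": True, "suggested": True, "steps_30min": 250},
    {"day": 8, "available": True, "suggested": False, "steps_30min": 200},
]

print(proximal_effect(toy_records))
print(effect_by_week(toy_records))
```

A shrinking estimate from one week to the next would be consistent with the effect deteriorating over time, although a real analysis would need to quantify uncertainty before drawing that conclusion.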
The mobile application that is being used in an MRT can include intervention components that are not being randomized. Why do this, and what are the implications?
Some components are not randomized in an MRT because previous scientific evidence has already demonstrated their effectiveness or efficiency, and/or because the cost or participant burden of including them as part of the intervention is negligible. If some components in an mHealth intervention are not randomized and thus not experimented on as part of an MRT, the resulting data cannot provide evidence regarding whether different options of these components (e.g. on/off, high/low) impact the effectiveness of randomized components. If there are scientific questions regarding whether the inclusion of non-randomized components impacts the effectiveness of the randomized components, then further study is needed to address these questions.
What are some guidelines for choosing the decision points?
Decision points are selected so they occur at times when it makes sense to provide treatment. When defining decision points, a researcher should consider the following questions:
- Are there times when a particular treatment is more or less likely to affect the proximal outcome? Take for example the contextually tailored activity suggestion component described above in the HeartSteps MRT. Previous data indicated five time periods within a day when there was high within-person variability in step count. These five times were selected as decision points for the activity suggestions, as the suggestions are more likely to increase participants’ step counts at these times. Another example is, if the treatment is a reminder to take a once per day medication, then the decision point might occur once per day at a time when a participant indicates they usually take the medication.
- What are the contextual factors that impact the effectiveness of an intervention component, and how quickly are they changing? The frequency of decision points can also be related to the timescale at which scientists think there will be meaningful changes in factors that are relevant to deciding if and what treatment should be delivered. For example, in a smoking cessation study, Sense2Stop, researchers wanted to understand the benefits of delivering a reminder to practice a relaxation exercise when a person is classified as stressed. In this case stress was the relevant factor. In this study stress classifications, based on sensor data, are made each minute. Accordingly, the decision points for the relaxation exercise component are every minute, in order to ensure opportunities for delivering treatment during times of stress. Note that just because the decision points are every minute does not mean that individuals receive an intervention every minute. In fact, at many or most decision points, no intervention will be provided; that is, at every minute the probability that determines whether a relaxation exercise is delivered will be set to a very low value.
Sense2Stop: Mobile Sensor Data to Knowledge. (2014). Retrieved from http://clinicaltrials.gov/ct2 (Identification No. NCT03184389)
What are some guidelines for choosing randomization probabilities?
- Participant burden: Choice of randomization probabilities is primarily driven by considerations of participant burden, so that participants will not receive a dose/number of treatments that causes them to disengage or habituate to the intervention content. A researcher would start by defining the average number of times they want participants to receive a particular intervention component. For the activity suggestion component in HeartSteps, researchers originally decided that participants should receive an average of two activity suggestions per day (i.e., a randomization probability of 2/5 across the five daily decision points).
- Availability: If there are availability considerations for an intervention component, e.g. times when it will not be appropriate to deliver intervention content, this also must be considered when defining the randomization probabilities. Pilot studies for the HeartSteps MRT demonstrated that participants would be available for approximately 80% of the decision times. Therefore, the randomization probability was increased to 3/5 so as to ensure that participants would receive approximately two messages per day.
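The arithmetic behind these two considerations can be sketched as follows. The model is a simplifying assumption rather than something stated by the study team: the expected number of deliveries per day is taken to be the number of decision points times the availability rate times the randomization probability, with availability treated as the same at every decision point. The function names are hypothetical.

```python
def expected_deliveries_per_day(decision_points, availability, probability):
    """Expected deliveries per day under the simple product model."""
    return decision_points * availability * probability


def probability_for_target(target_per_day, decision_points, availability):
    """Randomization probability needed to hit a target daily average."""
    p = target_per_day / (decision_points * availability)
    if not 0.0 <= p <= 1.0:
        raise ValueError("target is not reachable with these settings")
    return p


# Original plan: five decision points, always available, probability 2/5.
print(round(expected_deliveries_per_day(5, 1.0, 2 / 5), 2))

# After the pilot: about 80% availability, probability raised to 3/5.
print(round(expected_deliveries_per_day(5, 0.8, 3 / 5), 2))

# Probability that would give exactly two per day at 80% availability.
print(round(probability_for_target(2, 5, 0.8), 2))
```

Under this model, a probability of 3/5 with 80% availability gives about 2.4 suggestions per day, which is in line with the "approximately two" stated above. A probability of 0.5 would give exactly two in expectation.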
How does one decide the length of time over which to observe the proximal outcome?
The dominant consideration is the “signal-to-noise ratio.” For each particular intervention component, a researcher needs to determine how long after delivering that component it is necessary to wait in order for a person to respond (to be able to detect the “signal”, its impact on the proximal outcome). If this time interval is too short, then the measure of the proximal outcome will not capture the effects of the intervention component. If this interval is too long, then the measure of the proximal outcome may include too much noise due to other things happening in the individual’s life. Determining “just the right duration” over which the proximal outcome should be measured can be based on prior data and domain expertise. For example, in HeartSteps the activity suggestions were tailored based on current location and weather, and the proximal outcome was measured in terms of step count. A five-minute duration for observing step count following a decision point would be too short, as the individual doesn’t have enough time to respond. However, a 60-minute duration for observing step count following a decision point was thought to be too long, as the individual’s context (location, weather) may change significantly over an hour. Therefore, the research team selected 30 minutes as the duration over which the proximal outcome was to be measured.
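To make the window choice concrete, the sketch below computes a step-count proximal outcome for a given window length after a decision point. It is illustrative only: the per-minute log format of `(timestamp, steps)` pairs, the function name `steps_in_window`, and the toy data are assumptions, not a description of how HeartSteps processed its sensor data.

```python
from datetime import datetime, timedelta


def steps_in_window(step_log, decision_time, window_minutes=30):
    """Total steps recorded in the window following a decision point.

    step_log is an iterable of (timestamp, steps) pairs. The window
    includes decision_time and excludes the end of the window.
    """
    end = decision_time + timedelta(minutes=window_minutes)
    return sum(steps for ts, steps in step_log if decision_time <= ts < end)


# Toy log: a constant 10 steps per minute for 90 minutes (made-up data).
decision_time = datetime(2024, 1, 1, 12, 0)
step_log = [(decision_time + timedelta(minutes=m), 10) for m in range(90)]

for window in (5, 30, 60):
    print(window, steps_in_window(step_log, decision_time, window))
```

Keeping the window length as a parameter makes it straightforward to recompute the proximal outcome under alternative durations when prior data are available to compare them.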