Mixed-Effect Models

Modified on Sun, 11 Feb, 2024 at 8:57 AM

What is a mixed effect model?

A mixed effects model, also known as a mixed model, is a statistical model that incorporates both fixed effects and random effects. These models are particularly useful for analyzing data that have hierarchical or grouped structure, where observations can be nested within higher-level units (e.g., students within schools, repeated measures within individuals).

Fixed effects are frequently reported in medical literature due to their direct interpretation in estimating the average effect of independent variables on an outcome across the study population. However, the use of random effects or mixed models is essential in the presence of hierarchical data structures or when the research aims to explore variability within and between clusters or over time. The trend towards more complex and nuanced analyses in medical research has led to an increased use of mixed models that leverage the strengths of both fixed and random effects to address sophisticated research questions.

For which analyses are mixed models available on EasyMedStat?

They are available for all linear and logistic regression models, where the predicted (dependent) variable is a Yes-no, List or Numeric variable.

They are not yet available for Cox models, where the predicted variable is an Event variable.

When should I use a mixed effect model?

Mixed effect models and more generally multivariable models are complex statistical analyses. They should be carried out by experienced users with skills in statistics. If you have any doubt regarding the use of a statistical test, ask a statistician!

For reference, here are two common situations in which random effects are used in the medical literature:

A longitudinal study tracking the progression of a chronic disease, such as diabetes or multiple sclerosis, over several years. In such a study, patients are measured on various outcomes (e.g., disease severity, quality of life) at multiple time points to assess changes over time and the effectiveness of treatments.
A multicenter clinical trial evaluating the efficacy of a new medication across several hospitals or clinics. Such trials often aim to generalize findings across diverse patient populations and healthcare settings, requiring the analysis to account for the variability between centers.

In both examples, observations from the same patient or the same center tend to resemble each other more than observations from different ones. Mixed effect models account for this within-group correlation, which gives more reliable and generalizable estimates of the effects of interest.
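To make the idea of within-group correlation concrete, here is a minimal Python sketch that simulates a multicenter dataset and estimates how much of the outcome variability is due to differences between centers (the intraclass correlation). Everything in it is an assumption made for illustration: the function names, the number of centers, and the standard deviations are hypothetical, and it does not reproduce how EasyMedStat computes anything.

```python
import random
from statistics import mean


def simulate_centers(n_centers=20, n_per_center=50, center_sd=5.0,
                     residual_sd=10.0, seed=42):
    """Return {center_id: [outcome, ...]}; all values are simulated."""
    rng = random.Random(seed)
    data = {}
    for center in range(n_centers):
        # One shift per center, shared by every patient of that center.
        center_shift = rng.gauss(0.0, center_sd)
        data[center] = [120.0 + center_shift + rng.gauss(0.0, residual_sd)
                        for _ in range(n_per_center)]
    return data


def intraclass_correlation(data):
    """One-way ANOVA estimate of the ICC (balanced design assumed)."""
    groups = list(data.values())
    k = len(groups)        # number of centers
    n = len(groups[0])     # patients per center
    grand_mean = mean([v for g in groups for v in g])
    group_means = [mean(g) for g in groups]
    ms_between = n * sum((m - grand_mean) ** 2 for m in group_means) / (k - 1)
    ms_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means)
                    for v in g) / (k * (n - 1))
    return (ms_between - ms_within) / (ms_between + (n - 1) * ms_within)


data = simulate_centers()
icc = intraclass_correlation(data)
print(f"Estimated intraclass correlation: {icc:.2f}")
```

A value clearly above zero means that patients from the same center are correlated, which is the situation where a random effect for the center is worth considering.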

Random intercept, random slope or both?

In mixed effect models, the choice between using a random intercept, a random slope, or both depends on the structure of the data and the specific research questions. Here's an overview of when to use each and examples from medical literature:

Random Intercept

When to Use:

A random intercept is used when there is natural grouping in the data, and you expect that there is variability in the outcome variable's baseline level across these groups, but you assume the effect of the predictor variables on the outcome is consistent across groups.
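For a linear model with a single predictor, this assumption can be sketched as the equation below. The notation is chosen here for illustration and is not taken from EasyMedStat: y_ij is the outcome of observation i in group j, x_ij is the predictor, u_j is the group-specific shift of the baseline, and ε_ij is the residual error.

```latex
y_{ij} = \beta_0 + u_j + \beta_1 x_{ij} + \varepsilon_{ij},
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2),
\qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

The slope β1 carries no group index: every group shares the same effect of the predictor, and only the baseline β0 + u_j differs from one group to another.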

Example:

In a study assessing the impact of a new drug on blood pressure levels across multiple clinics, a random intercept could be used to account for baseline differences in blood pressure levels among clinics. Each clinic might have different baseline characteristics (e.g., demographics, average health status), but the effect of the drug on lowering blood pressure is assumed to be consistent across these clinics.

Random Slope

When to Use:

A random slope is appropriate when you expect the effect of a predictor variable on the outcome to vary across groups, either in strength or in direction.
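For a linear model with a single predictor, a random slope alone can be sketched as the equation below. The notation is chosen here for illustration: y_ij is the outcome of observation i in group j, x_ij is the predictor, v_j is the group-specific deviation from the average slope β1, and ε_ij is the residual error.

```latex
y_{ij} = \beta_0 + (\beta_1 + v_j)\, x_{ij} + \varepsilon_{ij},
\qquad v_j \sim \mathcal{N}(0, \sigma_v^2),
\qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

Here the baseline β0 is common to all groups, while the effect of the predictor in group j is β1 + v_j.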

Example:

In a longitudinal study examining the effectiveness of a physical therapy regimen on recovery from knee surgery, a random slope could be used for time, indicating that the rate of recovery (effect of time on recovery) varies among patients. This accounts for the fact that some patients may recover faster or respond better to the therapy over time compared to others.

Both Random Intercept and Random Slope

When to Use:

Using both random intercepts and slopes is appropriate when you have grouped data and expect variability both in the baseline level of the outcome across these groups and in the effect of one or more predictors on the outcome.
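For a linear model with a single predictor, combining both random effects can be sketched as the equation below. The notation is chosen here for illustration: u_j is the group-specific shift of the baseline, v_j is the group-specific deviation of the slope, and the two are allowed to be correlated through the covariance σ_uv.

```latex
y_{ij} = (\beta_0 + u_j) + (\beta_1 + v_j)\, x_{ij} + \varepsilon_{ij},
\qquad
\begin{pmatrix} u_j \\ v_j \end{pmatrix}
\sim \mathcal{N}\!\left(
\begin{pmatrix} 0 \\ 0 \end{pmatrix},
\begin{pmatrix} \sigma_u^2 & \sigma_{uv} \\ \sigma_{uv} & \sigma_v^2 \end{pmatrix}
\right),
\qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

Setting v_j to zero gives back a random intercept model, and setting u_j to zero gives back a random slope model.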

Example:

In a multicenter clinical trial testing a new medication for managing diabetes, where patients' glycemic control is measured at several time points, using both random intercepts and random slopes would be ideal. Here, random intercepts would account for baseline differences in glycemic control across centers (due to differences in patient populations, center protocols, etc.), while random slopes for time would allow the rate of change in glycemic control (due to the medication) to vary across centers as well. This approach recognizes that centers not only start from different baselines, but that their patients may also respond differently to the medication over time.
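The following Python sketch simulates data of this kind, where each group (a center or a patient, depending on the design) has its own baseline and its own rate of change. It then fits a separate straight line per group simply to show that both quantities vary. This is not a mixed model fit, only an illustration of the data structure such a model is designed for. The outcome name `hba1c`, the time points, and all numeric values are hypothetical.

```python
import random
from statistics import mean, pstdev


def simulate_groups(n_groups=30, days=(0, 30, 60, 90), seed=1):
    """Simulate repeated measurements with group-specific baseline and slope."""
    rng = random.Random(seed)
    rows = []
    for group in range(1, n_groups + 1):
        intercept = 8.5 + rng.gauss(0.0, 0.8)     # random intercept: own baseline
        slope = -0.010 + rng.gauss(0.0, 0.004)    # random slope: own change per day
        for day in days:
            value = intercept + slope * day + rng.gauss(0.0, 0.2)
            rows.append({"group": group, "day": day, "hba1c": value})
    return rows


def per_group_line(rows, group):
    """Ordinary least-squares line through one group's measurements."""
    xs = [r["day"] for r in rows if r["group"] == group]
    ys = [r["hba1c"] for r in rows if r["group"] == group]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - slope * x_bar, slope


rows = simulate_groups()
lines = [per_group_line(rows, g) for g in range(1, 31)]
intercept_spread = pstdev([intercept for intercept, _ in lines])
slope_spread = pstdev([slope for _, slope in lines])
print(f"Spread of baselines across groups: {intercept_spread:.2f}")
print(f"Spread of slopes across groups:    {slope_spread:.4f}")
```

If both spreads are clearly larger than what measurement noise alone would produce, a model with both a random intercept and a random slope for time is a reasonable candidate.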

How should my data be organized?

Usually, your data is organized with one line for each patient, as shown below:

To analyze time as a fixed effect and the patient as a random effect, you should instead organize your data with one line for each measurement, as in the example below:

Please note that a new column, "Day", has appeared. It identifies the measurement each line belongs to. For example, for Patient 1, Pain on Day 1 is 6. For Patient 2, Pain on Day 3 is 1.

All other columns are kept, as they will be used as fixed effects in the model.
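If your data is currently stored with one line per patient, the reorganization can be sketched in a few lines of Python. The column names (`Pain_Day1`, `Age`, ...) and most of the values below are hypothetical; only the two values quoted above (Patient 1, Day 1, Pain 6 and Patient 2, Day 3, Pain 1) come from the example.

```python
# One line per patient (hypothetical layout and values).
wide_rows = [
    {"Patient": 1, "Age": 54, "Pain_Day1": 6, "Pain_Day2": 4, "Pain_Day3": 3},
    {"Patient": 2, "Age": 61, "Pain_Day1": 5, "Pain_Day2": 2, "Pain_Day3": 1},
]


def wide_to_long(rows, id_columns, prefix="Pain_Day"):
    """Turn one line per patient into one line per measurement."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column.startswith(prefix):
                # Keep the patient-level columns on every measurement line.
                new_row = {name: row[name] for name in id_columns}
                new_row["Day"] = int(column[len(prefix):])
                new_row["Pain"] = value
                long_rows.append(new_row)
    return long_rows


long_rows = wide_to_long(wide_rows, id_columns=["Patient", "Age"])
for long_row in long_rows:
    print(long_row)
```

Each patient now appears on three lines, one per day, and the patient-level column (here `Age`) is repeated on each of them.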