8.2 Non-Equivalent Groups Designs

Learning objectives.

  • Describe the different types of nonequivalent groups quasi-experimental designs.
  • Identify some of the threats to internal validity associated with each of these designs. 

Recall that when participants in a between-subjects experiment are randomly assigned to conditions, the resulting groups are likely to be quite similar. In fact, researchers consider them to be equivalent. When participants are not randomly assigned to conditions, however, the resulting groups are likely to be dissimilar in some ways. For this reason, researchers consider them to be nonequivalent. A nonequivalent groups design, then, is a between-subjects design in which participants have not been randomly assigned to conditions. There are several types of nonequivalent groups designs we will consider.

Posttest Only Nonequivalent Groups Design

The first nonequivalent groups design we will consider is the posttest only nonequivalent groups design. In this design, participants in one group are exposed to a treatment, a nonequivalent group is not exposed to the treatment, and then the two groups are compared. Imagine, for example, a researcher who wants to evaluate a new method of teaching fractions to third graders. One way would be to conduct a study with a treatment group consisting of one class of third-grade students and a control group consisting of another class of third-grade students. This design would be a nonequivalent groups design because the students are not randomly assigned to classes by the researcher, which means there could be important differences between them. For example, the parents of higher achieving or more motivated students might have been more likely to request that their children be assigned to Ms. Williams’s class. Or the principal might have assigned the “troublemakers” to Mr. Jones’s class because he is a stronger disciplinarian. Of course, the teachers’ styles and even the classroom environments might be very different and might cause different levels of achievement or motivation among the students. If at the end of the study there was a difference in the two classes’ knowledge of fractions, it might have been caused by the difference between the teaching methods, but it might also have been caused by any of these confounding variables.

Of course, researchers using a posttest only nonequivalent groups design can take steps to ensure that their groups are as similar as possible. In the present example, the researcher could try to select two classes at the same school, where the students in the two classes have similar scores on a standardized math test and the teachers are the same sex, are close in age, and have similar teaching styles. Taking such steps would increase the internal validity of the study because it would eliminate some of the most important confounding variables. But without true random assignment of the students to conditions, there remains the possibility of other important confounding variables that the researcher was not able to control.
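The similarity check described above can be made concrete. The following is a minimal sketch under stated assumptions: the scores are invented for illustration, and the function name standardized_mean_difference is a hypothetical helper rather than anything prescribed by the text. It expresses the gap between two classes on a prior standardized math test in pooled standard deviation units.

```python
from math import sqrt
from statistics import mean, stdev


def standardized_mean_difference(group_a, group_b):
    """Difference in means divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * stdev(group_a) ** 2 + (n_b - 1) * stdev(group_b) ** 2
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_variance)


# Hypothetical standardized math scores for the two third-grade classes.
class_williams = [72, 75, 78, 80, 85]
class_jones = [70, 74, 77, 81, 83]

smd = standardized_mean_difference(class_williams, class_jones)
print(f"Standardized mean difference: {smd:.3f}")
```

A value near zero suggests the classes were similar on this one measured variable. It says nothing about confounding variables that were not measured.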

Pretest-Posttest Nonequivalent Groups Design

Another way to improve upon the posttest only nonequivalent groups design is to add a pretest. In the pretest-posttest nonequivalent groups design, there is a treatment group that is given a pretest, receives a treatment, and then is given a posttest. At the same time, there is a nonequivalent control group that is given a pretest, does not receive the treatment, and then is given a posttest. The question, then, is not simply whether participants who receive the treatment improve, but whether they improve more than participants who do not receive the treatment.

Imagine, for example, that students in one school are given a pretest on their attitudes toward drugs, then are exposed to an anti-drug program, and finally, are given a posttest. Students in a similar school are given the pretest, not exposed to an anti-drug program, and finally, are given a posttest. Again, if students in the treatment condition become more negative toward drugs, this change in attitude could be an effect of the treatment, but it could also be a matter of history or maturation. If it really is an effect of the treatment, then students in the treatment condition should become more negative than students in the control condition. But if it is a matter of history (e.g., news of a celebrity drug overdose) or maturation (e.g., improved reasoning), then students in the two conditions would be likely to show similar amounts of change. This type of design does not completely eliminate the possibility of confounding variables, however. Something could occur at one of the schools but not the other (e.g., a student drug overdose), so students at the first school would be affected by it while students at the other school would not.

Returning to the example of evaluating a new method of teaching fractions to third graders, this study could be improved by adding a pretest of students’ knowledge of fractions. The changes in scores from pretest to posttest would then be evaluated and compared across conditions to determine whether one group demonstrated a bigger improvement in knowledge of fractions than the other. Of course, the teachers’ styles and even the classroom environments might still be very different and might cause different levels of achievement or motivation among the students that are independent of the teaching intervention. Once again, differential history also represents a potential threat to internal validity. If asbestos is found in one of the schools, causing it to be shut down for a month, then this interruption in teaching could produce a difference across groups on posttest scores.
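The comparison of pretest-to-posttest changes across conditions can be sketched as follows. This is an illustration only: the scores are invented, and mean_gain is a hypothetical helper name. The quantity of interest is the difference between the two groups' average gains, not either group's gain on its own.

```python
from statistics import mean


def mean_gain(pretest, posttest):
    """Average posttest-minus-pretest change for one group."""
    return mean(post - pre for pre, post in zip(pretest, posttest))


# Hypothetical fraction-knowledge scores (same students, before and after).
treatment_pre = [10, 12, 11, 13]
treatment_post = [16, 17, 15, 18]
control_pre = [11, 12, 10, 13]
control_post = [13, 14, 12, 15]

treatment_gain = mean_gain(treatment_pre, treatment_post)
control_gain = mean_gain(control_pre, control_post)

# How much more the treatment group improved than the control group.
difference_in_gains = treatment_gain - control_gain
print(treatment_gain, control_gain, difference_in_gains)
```

Here both groups improve, which is what history or maturation alone would predict, but the treatment group improves more. As the text notes, a difference in gains can still reflect differential history rather than the treatment.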

If participants in this kind of design are randomly assigned to conditions, it becomes a true between-groups experiment rather than a quasi-experiment. In fact, it is the kind of experiment that Eysenck called for—and that has now been conducted many times—to demonstrate the effectiveness of psychotherapy.

Interrupted Time-Series Design with Nonequivalent Groups

One way to improve upon the interrupted time-series design is to add a control group. The interrupted time-series design with nonequivalent groups involves taking a set of measurements at intervals over a period of time both before and after an intervention of interest in two or more nonequivalent groups. Once again consider the manufacturing company that measures its workers’ productivity each week for a year before and after reducing work shifts from 10 hours to 8 hours. This design could be improved by locating another manufacturing company that does not plan to change its shift length and using it as a nonequivalent control group. If productivity increased rather quickly after the shortening of the work shifts in the treatment group but remained consistent in the control group, then this would provide better evidence for the effectiveness of the treatment.

Similarly, in the example of examining the effects of taking attendance on student absences in a research methods course, the design could be improved by using students in another section of the research methods course as a control group. If a consistently high number of absences was found in the treatment group before the intervention, followed by a sustained drop in absences after the treatment, while the nonequivalent control group showed consistently high absences across the semester, then this would provide superior evidence for the effectiveness of the treatment in reducing absences.
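The pattern described in the attendance example can be illustrated with a small sketch. The weekly absence counts below are invented, and level_change is a hypothetical helper. It only compares the average level before and after the intervention point, which is a deliberate simplification: a real interrupted time-series analysis would also consider trends and autocorrelation.

```python
from statistics import mean


def level_change(series, intervention_index):
    """Mean of the observations after the intervention minus the mean before."""
    before = series[:intervention_index]
    after = series[intervention_index:]
    return mean(after) - mean(before)


# Hypothetical weekly absences; attendance-taking starts at week index 4
# in the treatment section only.
treatment_absences = [8, 9, 8, 9, 3, 2, 3, 2]
control_absences = [8, 8, 9, 9, 8, 9, 9, 8]

treatment_change = level_change(treatment_absences, 4)
control_change = level_change(control_absences, 4)
print(treatment_change, control_change)
```

A sustained drop in the treatment section alongside a flat control section is the pattern the text describes as stronger evidence for the treatment.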

Pretest-Posttest Design With Switching Replication

Some of these nonequivalent control group designs can be further improved by adding a switching replication. In a pretest-posttest design with switching replication, nonequivalent groups are administered a pretest of the dependent variable. One group then receives a treatment while a nonequivalent control group does not, and the dependent variable is assessed again. Next, the treatment is added to the control group, and finally the dependent variable is assessed one last time.

As a concrete example, let’s say we wanted to introduce an exercise intervention for the treatment of depression. We recruit one group of patients experiencing depression and a nonequivalent control group of students experiencing depression. We first measure depression levels in both groups, and then we introduce the exercise intervention to the patients experiencing depression, but we hold off on introducing the treatment to the students. We then measure depression levels in both groups. If the treatment is effective we should see a reduction in the depression levels of the patients (who received the treatment) but not in the students (who have not yet received the treatment). Finally, while the group of patients continues to engage in the treatment, we would introduce the treatment to the students with depression. Now and only now should we see the students’ levels of depression decrease.

One of the strengths of this design is that it includes a built-in replication. In the example given, we would get evidence for the efficacy of the treatment in two different samples (patients and students). Another strength of this design is that it provides more control over history effects. It becomes rather unlikely that some outside event would perfectly coincide with the introduction of the treatment in the first group and with the delayed introduction of the treatment in the second group. For instance, if a change in the weather occurred when we first introduced the treatment to the patients, and this explained their reductions in depression the second time that depression was measured, then we would see depression levels decrease in both groups. Similarly, the switching replication helps to control for maturation and instrumentation. Both groups would be expected to show the same rates of spontaneous remission of depression, and if the instrument for assessing depression happened to change at some point in the study, the change would be consistent across both groups. Of course, demand characteristics, placebo effects, and experimenter expectancy effects can still be problems, but they can be controlled for using some of the methods described in Chapter 5.

Switching Replication with Treatment Removal Design

In a basic pretest-posttest design with switching replication, the first group receives a treatment and the second group receives the same treatment a little bit later on (while the initial group continues to receive the treatment). In contrast, in a switching replication with treatment removal design, the treatment is removed from the first group when it is added to the second group. Once again, let’s assume we first measure the depression levels of patients with depression and students with depression. Then we introduce the exercise intervention to only the patients. After they have been exposed to the exercise intervention for a week, we assess depression levels again in both groups. If the intervention is effective, then we should see depression levels decrease in the patient group but not the student group (because the students haven’t received the treatment yet). Next, we would remove the treatment from the group of patients with depression by telling them to stop exercising. At the same time, we would tell the student group to start exercising. After a week of the students exercising and the patients not exercising, we would reassess depression levels. Now, if the intervention is effective, we should see that the depression levels have decreased in the student group but that they have increased in the patient group (because they are no longer exercising).
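The predictions of the two switching replication designs can be written out explicitly. The sketch below is an illustration, not a procedure from the text: predicted_directions is a hypothetical helper, and it assumes that a group is untreated at the first assessment and that depression holds steady whenever treatment status does not change (a simplification, since continued treatment could produce further improvement).

```python
def predicted_directions(schedule):
    """Predict the change in depression across each interval between assessments.

    schedule[i] is True if the group is exercising during interval i.
    The group is assumed to be untreated before the first assessment.
    """
    directions = []
    previously_treated = False
    for treated in schedule:
        if treated and not previously_treated:
            directions.append("decrease")   # treatment introduced
        elif previously_treated and not treated:
            directions.append("increase")   # treatment removed
        else:
            directions.append("no change")  # status unchanged (assumption)
        previously_treated = treated
    return directions


# Basic switching replication: patients keep exercising, students start late.
basic_patients = predicted_directions([True, True])
basic_students = predicted_directions([False, True])

# Switching replication with treatment removal: patients stop when students start.
removal_patients = predicted_directions([True, False])
removal_students = predicted_directions([False, True])

print(basic_patients, basic_students, removal_patients, removal_students)
```

Comparing observed changes against these predicted directions for both groups is what makes a coincidental outside event an unlikely explanation.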

Demonstrating a treatment effect in two groups staggered over time and demonstrating the reversal of the treatment effect after the treatment has been removed can provide strong evidence for the efficacy of the treatment. In addition to providing evidence for the replicability of the findings, this design can also provide evidence for whether the treatment continues to show effects after it has been withdrawn.

Key Takeaways

  • Quasi-experimental research involves the manipulation of an independent variable without the random assignment of participants to conditions or counterbalancing of orders of conditions.
  • There are three types of quasi-experimental designs that are within-subjects in nature. These are the one-group posttest only design, the one-group pretest-posttest design, and the interrupted time-series design.
  • There are five types of quasi-experimental designs that are between-subjects in nature. These are the posttest only design with nonequivalent groups, the pretest-posttest design with nonequivalent groups, the interrupted time-series design with nonequivalent groups, the pretest-posttest design with switching replication, and the switching replication with treatment removal design.
  • Quasi-experimental research eliminates the directionality problem because it involves the manipulation of the independent variable. However, it does not eliminate the problem of confounding variables, because it does not involve random assignment to conditions or counterbalancing. For these reasons, quasi-experimental research is generally higher in internal validity than non-experimental studies but lower than true experiments.
  • Of all of the quasi-experimental designs, those that include a switching replication are highest in internal validity.
  • Practice: Imagine that two professors decide to test the effect of giving daily quizzes on student performance in a statistics course. They decide that Professor A will give quizzes but Professor B will not. They will then compare the performance of students in their two sections on a common final exam. List five other variables that might differ between the two sections that could affect the results.


Random versus nonrandom assignment in controlled experiments: do you get the same answer?

Affiliation.

  • 1 Department of Psychology, University of Memphis, Tennessee 38152, USA.
  • PMID: 8991316

Psychotherapy meta-analyses commonly combine results from controlled experiments that use random and nonrandom assignment without examining whether the 2 methods give the same answer. Results from this article call this practice into question. With the use of outcome studies of marital and family therapy, 64 experiments using random assignment yielded consistently higher mean posttest effects and less variable posttest effects than 36 studies using nonrandom assignment. This difference was reduced by about half by taking into account various covariates, especially pretest effect size levels and various characteristics of control groups. The importance of this finding depends on (a) whether one is discussing meta-analysis or primary experiments, (b) how precise an answer is desired, and (c) whether some adjustment to the data from studies using nonrandom assignment is possible. It is concluded that studies using nonrandom assignment may produce acceptable approximations to results from randomized experiments under some circumstances but that reliance on results from randomized experiments as the gold standard is still well founded.

Publication types

  • Comparative Study
  • Family Therapy
  • Marital Therapy
  • Meta-Analysis as Topic*
  • Random Allocation*

IResearchNet

Nonexperimental Designs

The most frequently used experimental design type for research in industrial and organizational psychology and a number of allied fields is the nonexperiment. This design type differs from that of both the randomized experiment and the quasi-experiment in several important respects. Prior to describing the nonexperimental design type, we note that the article on experimental designs in this section considers basic issues associated with (a) the validity of inferences stemming from empirical research and (b) the settings within which research takes place. Thus, the same set of issues is not addressed in this entry.

Attributes of Nonexperimental Designs

Nonexperimental designs differ from both quasi-experimental designs and randomized experimental designs in several important respects. Overall, these differences lead research using nonexperimental designs to be far weaker than that using alternative designs, in terms of internal validity and several other criteria.

Measurement of Assumed Causes

In nonexperimental research, variables that are assumed causes are measured, as opposed to being manipulated. For example, a researcher interested in testing the relation between organizational commitment (an assumed cause) and worker productivity (an assumed effect) would have to measure the levels of these variables. Because commitment levels were measured rather than manipulated, the study would have little if any internal validity. Note, moreover, that the internal validity of such research would not be at all improved by a host of data analytic strategies (e.g., path analysis, structural equation modeling) that purport to allow for inferences about causal connections between and among variables (Stone-Romero, 2002; Stone-Romero & Rosopa, 2004).

Nonrandom Assignment of Participants and Absence of Conditions

In nonexperiments, there are typically no explicitly defined research conditions. For example, a researcher interested in assessing the relation between job satisfaction (an assumed cause) and organizational commitment (an assumed effect) would simply measure the level of both such variables. Because participants were not randomly assigned to conditions in which the level of job satisfaction was manipulated, the researcher would be left in the uncomfortable position of not having information about the many variables that were confounded with job satisfaction. Thus, the internal validity of the study would be a major concern. Moreover, even if the study involved the comparison of scores on one or more dependent variables across existing conditions over which the researcher had no control, the researcher would have no control over the assignment of participants to the conditions. For example, a researcher investigating the assumed effects of incentive systems on firm productivity in several manufacturing firms would have no control over the attributes of such systems. Again, this would serve to greatly diminish the internal validity of the study.

Measurement of Assumed Dependent Variables

In nonexperimental research, assumed dependent variables are measured. Note that the same is true of both randomized experiments and quasi-experiments. However, there are very important differences among the three experimental design types that warrant attention. More specifically, in the case of well-conducted randomized experiments, the researcher can be highly confident that the scores on the dependent variable(s) were a function of the study’s manipulations. Moreover, in quasi-experiments with appropriate design features, the investigator can be fairly confident that the study’s manipulations were responsible for observed differences on the dependent variable(s). However, in nonexperimental studies, the researcher is placed in the uncomfortable position of having to assume that what he or she views as dependent variables are indeed effects. Regrettably, in virtually all nonexperimental research, this assumption rests on a very shaky foundation. Thus, for example, in a study of the assumed effect of job satisfaction on intentions to quit a job, what the researcher assumes to be the effect may in fact be the cause. That is, individuals who have decided to quit for reasons that were not based on job satisfaction could, in the interest of cognitive consistency, view their jobs as not being satisfying.

Control Over Extraneous or Confounding Variables

Because nonexperimental research does not benefit from the controls (e.g., random assignment to conditions) that are common to studies using randomized experimental designs, there is relatively little potential to control extraneous variables. As a result, the results of nonexperimental research tend to have little, if any, internal validity. For instance, assume that a researcher did a nonexperimental study of the assumed causal relation between negative affectivity and job-related strain and found these variables to be positively related. It would be inappropriate to conclude that these variables were causally related. At least one important reason for this is that the measures of these constructs have common items. Thus, any detected relation between them could well be spurious (Stone-Romero, 2005).

In hopes of bolstering causal inference, researchers who do nonexperimental studies often measure variables that are assumed to be confounds and then use such procedures as hierarchical multiple regression, path analysis, and structural equation modeling to control them. Regrettably, such procedures have little potential to control confounds. There are at least four reasons for this. First, researchers are seldom aware of all of the relevant confounds. Second, even if all of them were known, it is seldom possible to measure more than a few of them in any given study and use them as controls. Third, to the degree that the measures of confounds are unreliable, procedures such as multiple regression will fail to fully control for the effects of measured confounds. Fourth, and finally, because a large number of causal models may be consistent with a given set of covariances among a set of variables, statistical procedures are incapable of providing compelling evidence about the superiority of any given model over alternative models.
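The third reason can be illustrated numerically. The sketch below is not drawn from the cited sources. It assumes a simple hypothetical model in which a single confound C is the only reason X and Y are related, with correlations a (between X and C) and b (between Y and C), and in which C is measured with a given reliability. It uses the standard first-order partial correlation formula and the classical attenuation result that an observed correlation shrinks by the square root of the measure's reliability. The function names are illustrative.

```python
from math import sqrt


def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of X and Y, controlling for Z."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def residual_after_control(a, b, reliability):
    """Correlation left between X and Y after controlling for an
    imperfectly measured confound, when the true X-Y relation is
    entirely spurious (r_xy = a * b)."""
    r_xy = a * b
    attenuation = sqrt(reliability)
    return partial_correlation(r_xy, a * attenuation, b * attenuation)


perfect = residual_after_control(0.6, 0.6, reliability=1.0)
unreliable = residual_after_control(0.6, 0.6, reliability=0.5)
print(f"perfectly measured confound: {perfect:.3f}")
print(f"reliability of 0.5:          {unreliable:.3f}")
```

Under these assumptions, controlling for a perfectly measured confound removes the spurious relation entirely, whereas controlling for an unreliable measure leaves a nonzero correlation that could be mistaken for a causal effect.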

References:

  • Cook, T. D., & Campbell, D. T. (1979). Quasi-experimentation: Design and analysis issues for field settings. Boston: Houghton Mifflin.
  • Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and quasi-experimental designs for generalized causal inference. Boston: Houghton Mifflin.
  • Stone-Romero, E. F. (2002). The relative validity and usefulness of various empirical research designs. In S. G. Rogelberg (Ed.), Handbook of research methods in industrial and organizational psychology (pp. 77-98). Malden, MA: Blackwell.
  • Stone-Romero, E. F. (2005). Personality-based stigmas and unfair discrimination in work organizations. In R. L. Dipboye & A. Colella (Eds.), Discrimination at work: The psychological and organizational bases (pp. 255-280). Mahwah, NJ: Lawrence Erlbaum.
  • Stone-Romero, E. F., & Rosopa, P. (2004). Inference problems with hierarchical multiple regression-based tests of mediating effects. Research in Personnel and Human Resources Management, 23, 249-290.


Random Assignment in Experiments | Introduction & Examples

Published on March 8, 2021 by Pritha Bhandari. Revised on June 22, 2023.

In experimental research, random assignment is a way of placing participants from your sample into different treatment groups using randomization.

With simple random assignment, every member of the sample has a known or equal chance of being placed in a control group or an experimental group. Studies that use simple random assignment are also called completely randomized designs.

Random assignment is a key part of experimental design. It helps you ensure that all groups are comparable at the start of a study: any differences between them are due to random factors, not research biases like sampling bias or selection bias.


Why does random assignment matter?

Random assignment is an important part of control in experimental research, because it helps strengthen the internal validity of an experiment and avoid biases.

In experiments, researchers manipulate an independent variable to assess its effect on a dependent variable, while controlling for other variables. To do so, they often use different levels of an independent variable for different groups of participants.

This is called a between-groups or independent measures design.

For example, in a study of drug dosage, you might use three groups of participants that are each given a different level of the independent variable:

  • a control group that’s given a placebo (no dosage, to control for a placebo effect),
  • an experimental group that’s given a low dosage,
  • a second experimental group that’s given a high dosage.

Random assignment helps you make sure that the treatment groups don’t differ in systematic ways at the start of the experiment, as this can seriously affect (and even invalidate) your work.

If you don’t use random assignment, you may not be able to rule out alternative explanations for your results. Suppose, for example, that participants are assigned to groups according to where they were recruited:

  • participants recruited from cafes are placed in the control group,
  • participants recruited from local community centers are placed in the low dosage experimental group,
  • participants recruited from gyms are placed in the high dosage group.

With this type of assignment, it’s hard to tell whether the participant characteristics are the same across all groups at the start of the study. Gym-users may tend to engage in more healthy behaviors than people who frequent cafes or community centers, and this would introduce a healthy user bias in your study.

Although random assignment helps even out baseline differences between groups, it doesn’t always make them completely equivalent. There may still be extraneous variables that differ between groups, and there will always be some group differences that arise from chance.

Most of the time, the random variation between groups is low, and, therefore, it’s acceptable for further analysis. This is especially true when you have a large sample. In general, you should always use random assignment in experiments when it is ethically possible and makes sense for your study topic.


Random sampling vs random assignment

Random sampling and random assignment are both important concepts in research, but it’s important to understand the difference between them.

Random sampling (also called probability sampling or random selection) is a way of selecting members of a population to be included in your study. In contrast, random assignment is a way of sorting the sample participants into control and experimental groups.

While random sampling is used in many types of studies, random assignment is only used in between-subjects experimental designs.

Some studies use both random sampling and random assignment, while others use only one or the other.

Random sample vs random assignment

Random sampling enhances the external validity or generalizability of your results, because it helps ensure that your sample is unbiased and representative of the whole population. This allows you to make stronger statistical inferences.

You use a simple random sample to collect data. Because you have access to the whole population (all employees), you can assign all 8000 employees a number and use a random number generator to select 300 employees. These 300 employees are your full sample.

Random assignment enhances the internal validity of the study, because it helps ensure that there are no systematic differences between the participants in each group. This helps you conclude that the outcomes can be attributed to the independent variable.

For example, suppose you assign the sampled participants to one of two groups:

  • a control group that receives no intervention,
  • an experimental group that has a remote team-building intervention every week for a month.

You use random assignment to place participants into the control or experimental group. To do so, you take your list of participants and assign each participant a number. Again, you use a random number generator to place each participant in one of the two groups.
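The two steps in this example, sampling and then assignment, can be sketched with Python's standard random module. The employee numbers, the fixed seed, and the choice to split the shuffled sample into two equal halves are assumptions made for illustration. The text only specifies that a random number generator is used for both steps.

```python
import random

# A fixed seed is used only so that this sketch gives the same result each run.
rng = random.Random(42)

# Step 1: random sampling. Number all 8000 employees and draw 300 of them.
population = list(range(1, 8001))
sample = rng.sample(population, 300)

# Step 2: random assignment. Shuffle the sample and split it in half.
shuffled = sample[:]
rng.shuffle(shuffled)
control_group = shuffled[:150]
experimental_group = shuffled[150:]

print(len(sample), len(control_group), len(experimental_group))
```

Sampling decides who is in the study, which bears on external validity. Assignment decides which condition each sampled person receives, which bears on internal validity.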

How do you use random assignment?

To use simple random assignment, you start by giving every member of the sample a unique number. Then, you can use computer programs or manual methods to randomly assign each participant to a group.

  • Random number generator: Use a computer program to generate random numbers from the list for each group.
  • Lottery method: Place all numbers individually in a hat or a bucket, and draw numbers at random for each group.
  • Flip a coin: When you only have two groups, for each number on the list, flip a coin to decide if they’ll be in the control or the experimental group.
  • Roll a die: When you have three groups, for each number on the list, roll a die to decide which of the groups they will be in. For example, assume that rolling 1 or 2 lands them in a control group; 3 or 4 in an experimental group; and 5 or 6 in a second control or experimental group.

This type of random assignment is the most powerful method of placing participants in conditions, because each individual has an equal chance of being placed in any one of your treatment groups.
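The die-roll method for three groups can be simulated as follows. This is a minimal sketch: the group labels, the participant numbers, and the function name assign_by_die_roll are hypothetical, and a seeded generator stands in for a physical die.

```python
import random


def assign_by_die_roll(participant_ids, rng):
    """Assign each participant by one simulated roll of a six-sided die:
    1-2 -> control, 3-4 -> first experimental group, 5-6 -> second."""
    groups = {"control": [], "experimental_1": [], "experimental_2": []}
    for participant_id in participant_ids:
        roll = rng.randint(1, 6)
        if roll <= 2:
            groups["control"].append(participant_id)
        elif roll <= 4:
            groups["experimental_1"].append(participant_id)
        else:
            groups["experimental_2"].append(participant_id)
    return groups


rng = random.Random(2021)  # fixed seed only for reproducibility
groups = assign_by_die_roll(range(1, 31), rng)
for name, members in groups.items():
    print(name, len(members))
```

Because each participant is assigned independently, everyone has the same chance of landing in each group, but the groups will usually end up with unequal sizes. Shuffling the full list and dealing it into groups is one way to get equal sizes.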

Random assignment in block designs

In more complicated experimental designs, random assignment is only used after participants are grouped into blocks based on some characteristic (e.g., test score or demographic variable). These groupings mean that you need a larger sample to achieve high statistical power.

For example, a randomized block design involves placing participants into blocks based on a shared characteristic (e.g., college students versus graduates), and then using random assignment within each block to assign participants to every treatment condition. This helps you assess whether the characteristic affects the outcomes of your treatment.
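A randomized block assignment of this kind can be sketched as follows. The participant list, the block labels, and the function name block_randomize are hypothetical. Dealing shuffled block members to conditions in rotation is one simple way to keep conditions balanced within each block, not the only one.

```python
import random
from collections import defaultdict


def block_randomize(participants, conditions, rng):
    """participants is a list of (participant_id, block_label) pairs.
    Within each block, shuffle the members and deal them to the
    conditions in rotation so each block is spread across conditions."""
    blocks = defaultdict(list)
    for participant_id, block_label in participants:
        blocks[block_label].append(participant_id)

    assignment = {}
    for members in blocks.values():
        rng.shuffle(members)
        for position, participant_id in enumerate(members):
            assignment[participant_id] = conditions[position % len(conditions)]
    return assignment


participants = [
    (1, "student"), (2, "student"), (3, "student"), (4, "student"),
    (5, "graduate"), (6, "graduate"), (7, "graduate"), (8, "graduate"),
]
assignment = block_randomize(participants, ["control", "treatment"], random.Random(7))
print(assignment)
```

Because every block contributes equally to every condition, the blocking characteristic cannot differ systematically between conditions.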

In an experimental matched design, you use blocking and then match up individual participants from each block based on specific characteristics. Within each matched pair or group, you randomly assign each participant to one of the conditions in the experiment and compare their outcomes.

When is random assignment not used?

Sometimes, it’s not relevant or ethical to use simple random assignment, so groups are assigned in a different way.

When comparing different groups

Sometimes, differences between participants are the main focus of a study, for example, when comparing men and women or people with and without health conditions. Participants are not randomly assigned to different groups, but instead assigned based on their characteristics.

In this type of study, the characteristic of interest (e.g., gender) is an independent variable, and the groups differ based on the different levels (e.g., men, women, etc.). All participants are tested the same way, and then their group-level outcomes are compared.

When it’s not ethically permissible

When studying unhealthy or dangerous behaviors, it’s not possible to use random assignment. For example, if you’re studying heavy drinkers and social drinkers, it’s unethical to randomly assign participants to one of the two groups and ask them to drink large amounts of alcohol for your experiment.

When you can’t assign participants to groups, you can also conduct a quasi-experimental study. In a quasi-experiment, you study the outcomes of pre-existing groups who receive treatments that you may not have any control over (e.g., heavy drinkers and social drinkers). These groups aren’t randomly assigned, but may be considered comparable when some other variables (e.g., age or socioeconomic status) are controlled for.


In experimental research, random assignment is a way of placing participants from your sample into different groups using randomization. With this method, every member of the sample has a known or equal chance of being placed in a control group or an experimental group.

Random selection, or random sampling , is a way of selecting members of a population for your study’s sample.

In contrast, random assignment is a way of sorting the sample into control and experimental groups.

Random sampling enhances the external validity or generalizability of your results, while random assignment improves the internal validity of your study.

Random assignment is used in experiments with a between-groups or independent measures design. In this research design, there’s usually a control group and one or more experimental groups. Random assignment helps ensure that the groups are comparable.

In general, you should always use random assignment in this type of experimental design when it is ethically possible and makes sense for your study topic.

To implement random assignment , assign a unique number to every member of your study’s sample .

Then, you can use a random number generator or a lottery method to randomly assign each number to a control or experimental group. You can also do so manually, by flipping a coin or rolling a die to randomly assign participants to groups.
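As a rough sketch of the numbering-and-lottery procedure just described, the Python snippet below numbers each member of a sample and then shuffles the numbers before splitting them into groups. The function name `random_assignment` and the group labels are assumptions made for illustration.

```python
import random

def random_assignment(sample_size, groups=("control", "experimental"), seed=None):
    """Map each participant number (1..sample_size) to a randomly chosen group."""
    rng = random.Random(seed)

    # Assign a unique number to every member of the sample.
    numbers = list(range(1, sample_size + 1))

    # "Lottery": put the numbers in random order.
    rng.shuffle(numbers)

    # Deal the shuffled numbers to the groups in turn, so group sizes
    # differ by at most one.
    return {number: groups[i % len(groups)] for i, number in enumerate(numbers)}

groups_by_number = random_assignment(10, seed=7)
```

Note one design choice: shuffling and dealing yields equal-sized groups, whereas a separate coin flip per participant would also be valid random assignment but could leave the groups unequal in size.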

Bhandari, P. (2023, June 22). Random Assignment in Experiments | Introduction & Examples. Scribbr. Retrieved February 22, 2024, from https://www.scribbr.com/methodology/random-assignment/

Random Selection & Assignment

Random selection is how you draw the sample of people for your study from a population. Random assignment is how you assign the sample that you draw to different groups or treatments in your study.

It is possible to have both random selection and assignment in a study. Let’s say you drew a random sample of 100 clients from a population list of 1000 current clients of your organization. That is random sampling. Now, let’s say you randomly assign 50 of these clients to get some new additional treatment and the other 50 to be controls. That’s random assignment.
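The 100-of-1000 example above can be written out as a short Python sketch. The client identifiers and the fixed seed are hypothetical; only the counts (1000, 100, 50, 50) come from the text.

```python
import random

rng = random.Random(42)  # fixed seed only so the sketch is repeatable (assumption)

# Population list of 1000 current clients (hypothetical ids).
population = [f"client_{i:04d}" for i in range(1, 1001)]

# Random selection: draw 100 clients without replacement.
sample = rng.sample(population, 100)

# Random assignment: shuffle the sample, then split it 50/50.
shuffled = sample[:]
rng.shuffle(shuffled)
treatment = shuffled[:50]
control = shuffled[50:]
```

Replacing `rng.sample(population, 100)` with `population[:100]` would give the "first 100 on the list" case: nonrandom selection followed by random assignment.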

It is also possible to have only one of these (random selection or random assignment) but not the other in a study. For instance, if you do not randomly draw the 100 cases from your list of 1000 but instead just take the first 100 on the list, you do not have random selection. But you could still randomly assign this nonrandom sample to treatment versus control. Or, you could randomly select 100 from your list of 1000 and then nonrandomly (haphazardly) assign them to treatment or control.

And, it’s possible to have neither random selection nor random assignment. In a typical nonequivalent groups design in education you might nonrandomly choose two 5th grade classes to be in your study. This is nonrandom selection. Then, you could arbitrarily assign one to get the new educational program and the other to be the control. This is nonrandom (or nonequivalent) assignment.

Random selection is related to sampling . Therefore it is most related to the external validity (or generalizability) of your results. After all, we would randomly sample so that our research participants better represent the larger group from which they’re drawn. Random assignment is most related to design . In fact, when we randomly assign participants to treatments we have, by definition, an experimental design . Therefore, random assignment is most related to internal validity . After all, we randomly assign in order to help assure that our treatment groups are similar to each other (i.e., equivalent) prior to the treatment.

PLoS Medicine, v.3(6); 2006 Jun

Does Random Treatment Assignment Cause Harm to Research Participants?

Cary P Gross

1 Section of General Internal Medicine, Yale University School of Medicine, New Haven, Connecticut, United States of America,

3 Robert Wood Johnson Clinical Scholars Program, Yale School of Medicine, New Haven, Connecticut, United States of America,

Harlan M Krumholz

2 Section of General Internal Medicine and Yale Cancer Center, Yale University School of Medicine, New Haven, Connecticut, United States of America,

4 Section of Health Policy and Administration, Department of Epidemiology and Public Health, Yale University School of Medicine, New Haven, Connecticut, United States of America,

Gretchen Van Wye

5 Section of Chronic Disease Epidemiology, Department of Epidemiology and Public Health, Yale University School of Medicine, New Haven, Connecticut, United States of America,

Ezekiel J Emanuel

6 Department of Clinical Bioethics, Warren G. Magnuson Clinical Center, National Institutes of Health, Bethesda, Maryland, United States of America

David Wendler

Some argue that by precluding individualized treatment, randomized clinical trials (RCTs) provide substandard medical care, while others claim that participation in clinical research is associated with improved patient outcomes. However, there are few data to assess the impact of random treatment assignment on RCT participants. We therefore performed a systematic review to quantify the differences in health outcomes between randomized trial participants and eligible non-participants.

Methods and Findings

Studies were identified by searching Medline, the Web of Science citation database, and manuscript references. Studies were eligible if they documented baseline characteristics and clinical outcomes of RCT participants and eligible non-participants, and allowed non-participants access to the same interventions available to trial participants. Primary study outcomes according to patient group (randomized trial participants versus eligible non-participants) were extracted from all eligible manuscripts. For 22 of the 25 studies (88%) meeting eligibility criteria, there were no significant differences in clinical outcomes between patients who received random assignment of treatment (RCT participants) and those who received individualized treatment assignment (eligible non-participants). In addition, there was no relation between random treatment assignment and clinical outcome in 15 of the 17 studies (88%) in which randomized and nonrandomized patients had similar health status at baseline.

Conclusions

These findings suggest that randomized treatment assignment as part of a clinical trial does not harm research participants.

A search was conducted for RCTs where information was recorded on outcomes for participants and those who did not enter the trial, but still had access to the same treatments. No significant difference was found between randomised and non-randomised patients.

Introduction

Despite widespread reliance on randomized clinical trials (RCTs), and claims that they represent the “gold standard” for assessing treatment efficacy, ethical concern has been raised about the impact of RCTs on participants [ 1 – 4 ]. Specifically, there is a perception that individual patients are likely to have better outcomes when treatment decisions are based on physicians' clinical judgment, rather than random assignment [ 1 , 2 ]. It has been claimed that by foregoing individualized treatment assignment, the process of choosing research participants' treatments by random assignment leads to an “inevitable compromise of personal care in the service of obtaining valid research results” [ 1 ]. Further, physician and patient concerns about random treatment assignment are among the most frequently cited reasons for refusal to enroll in RCTs [ 5 – 8 ].

While some commentators focus on the specific impact of random treatment assignment, others have investigated the broader topic of differences in clinical outcomes between research participants and “real” patients in the community setting. Some studies have suggested that research participation may be associated with improved clinical outcomes [ 9 – 14 ]. These data have led some to recommend trial participation as a means to better treatment [ 15 ]. For instance, the National Comprehensive Cancer Network's clinical practice guidelines in oncology state that “the best management for any cancer patient is in a clinical trial” [ 15 ]. Yet these conclusions are not based on strong evidence [ 16 ]. In particular, comparisons of research participants versus non-participants often include non-participants who do not meet trial eligibility criteria [ 16 ]. Because of stringent eligibility criteria, trial participants tend to be younger and healthier than non-participants in the community [ 16 – 18 ]. Trial participation may also be the only means of access to some therapies: if the investigational therapy is available only in the research setting and turns out to be superior to existing therapies, trial participants who were allocated to the newer agent would be more likely to benefit. Further, the supportive clinical care that participants receive as part of research in resource-rich settings associated with some clinical trials may also be associated with superior outcomes. Recognizing these flaws in the existing data, a recent review of the literature called for more studies that assess the impact of participation in clinical research on patient outcomes in a methodologically rigorous manner [ 16 ].

It is particularly timely to disentangle the issues surrounding the effect of research participation on patients. There has recently been increased emphasis on designing trials that compare commercially available, clinically relevant alternatives [ 19 – 21 ]. Some authors have advocated substantial increases in funding of “pragmatic” trials enrolling large numbers of patients in community practice settings [ 19 ]. Additionally, Medicare's policy has recently been modified so as to provide reimbursement for some new therapies only if patients receive them in the setting of a clinical trial [ 22 ]. Unlike in the previous paradigm, which viewed randomized trials as a tool to evaluate the efficacy of novel therapeutic agents, these innovations will likely result in many more patients encountering the decision of trial enrollment in the setting of routine clinical care. Prospective participants who are asked to participate in these pragmatic trials will have to decide whether to receive therapy that was selected via randomization or to select treatment with the input of their clinician.

Given the increased emphasis on recruiting large numbers of patients into trials, it is important to consider the question of enrollment from the perspective of patients who meet all eligibility criteria and are asked to enroll. If they agree to have their treatment selected at random, rather than by their clinicians or themselves, will they be more likely to experience adverse outcomes? We sought to answer this question by examining the potential risks associated with random treatment allocation, rather than delineating differences between trial participants and non-participants [ 16 , 18 ]. While numerous studies have demonstrated the differences between trial participants and their counterparts in the community, few have focused specifically on the impact of random treatment assignment. Specifically, we were interested in the group of patients who were eligible for participation in an RCT but could also receive either of the therapies offered in the RCT even if they refused to enroll. We conducted a systematic review of published randomized controlled trials to compare the clinical outcomes of randomized patients and nonrandomized patients who were eligible for the same trial, were cared for in the same clinical setting, and received the same agents available to trial participants.

Selection of Studies

We conducted a Medline search to identify studies that (1) included only patients who were eligible for trial participation, (2) included only patients who were cared for at the same institutions and at the same time in which the randomized trial was recruiting, (3) allowed non-participants access to the agents used in the trial, (4) provided outcome data for both trial participants and eligible non-participants, and (5) recruited all participants in a similar manner.

The Medline search employed 23 unique combinations of terms and strings of terms (see Protocol S1 ). We focused a significant portion of our Medline search on identifying studies that met our definition of comprehensive cohort study design (see Protocol S1 for terms and phrases). The comprehensive cohort study design, also called the partially randomized patient preference trial design, offers eligible research participants the chance to refuse randomization but receive either the study intervention or the control intervention per study protocol [ 23 ]. In addition, we used the references of relevant manuscripts, authors' own bibliographic libraries, and Web of Science to identify frequently cited researchers and papers.

The Medline search identified 1,505 studies; the Web of Science search identified 371 studies. Of these 1,876 studies, the titles of 1,555 were identified by the two reviewers as potentially appropriate for inclusion in the current analysis. The abstracts of these 1,555 studies were assessed by two authors for appropriate content and relevant methodology. The full texts of 48 potentially suitable manuscripts were retrieved and assessed. Of these, 25 studies met the eligibility criteria.

Data Analysis

An explicit abstraction instrument was used to obtain baseline characteristics of the RCT participants and eligible non-participants and primary clinical outcomes. Outcomes were restricted to the primary outcome listed in each manuscript; if more than one primary outcome was specified, the first one listed was used. To compare outcomes across studies, all study outcomes were standardized to “adverse” outcomes, e.g., for studies that reported survival, we converted probability of survival to probability of death. Most of the studies had dichotomous outcomes that enabled the calculation of odds ratios; those that did not were analyzed separately. In the two studies in which outcomes were expressed only as rates rather than as frequency counts, the stated proportion of people in each group who experienced the study outcome was multiplied by the number at baseline to estimate the frequency [ 24 , 25 ]. In one study, non-participants were able to select from three treatment options, only two of which were part of the RCT. For this study, we included data only from non-participants who received one of the two treatments that were part of the RCT [ 25 ].
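The two standardization steps in this paragraph (re-expressing favorable outcomes as adverse ones, and estimating frequency counts from reported rates) amount to simple arithmetic. The sketch below uses hypothetical function names and made-up input numbers; rounding the estimated count to the nearest whole person is an assumption, since the paper does not say how fractional counts were handled.

```python
def adverse_probability(p_favorable):
    """Convert the probability of a favorable outcome (e.g., survival)
    into the probability of the adverse outcome (e.g., death)."""
    return 1.0 - p_favorable

def estimated_count(proportion, n_baseline):
    """Estimate how many people experienced the outcome when only a
    proportion is reported: proportion x number at baseline.
    Rounding to the nearest integer is an assumption of this sketch."""
    return round(proportion * n_baseline)

# Hypothetical inputs, not taken from any study in the review.
p_death = adverse_probability(0.85)   # a reported survival probability of 0.85
n_events = estimated_count(0.25, 400) # a reported rate of 25% among 400 people
```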

Because the relation between trial participation and clinical outcomes might be confounded by differences in baseline health status, we categorized the studies into three mutually exclusive groups: those in which the RCT participants were, overall, less healthy than eligible non-participants at baseline, those in which there was no clear difference in baseline health status, and those in which RCT participants were, overall, healthier at baseline. Two clinicians, using an implicit schema involving examination of baseline clinical and demographic characteristics of randomized and nonrandomized patients, independently categorized each study according to whether there was a balance of important prognostic factors between groups. Disagreements were resolved by consensus.

The odds ratios of experiencing the primary clinical outcome for RCT participants versus eligible non-participants were calculated using SAS 8.1 [ 26 ]. A Breslow–Day chi-square statistic indicated that it would be inappropriate to aggregate the results of studies with dichotomous outcomes because of heterogeneity. Thus, the outcomes are presented simply by study, according to baseline differences.
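For readers unfamiliar with the calculation, an odds ratio for a single study with a dichotomous outcome can be computed from a 2x2 table of counts, as sketched below. The authors used SAS and do not state which confidence interval method was applied; the log-based (Woolf) interval here is an assumption chosen for simplicity, and the counts in the example are hypothetical.

```python
import math

def odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI from a 2x2 table.

    a: randomized patients with the adverse outcome
    b: randomized patients without it
    c: nonrandomized patients with the adverse outcome
    d: nonrandomized patients without it
    Raises ZeroDivisionError if any cell is zero.
    """
    estimate = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf (logit) method.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper

# Hypothetical counts: 20/100 events among randomized patients,
# 10/100 among nonrandomized patients.
estimate, lower, upper = odds_ratio(20, 80, 10, 90)
significant = not (lower <= 1.0 <= upper)
```

An interval that contains 1.0 corresponds to no significant difference between randomized and nonrandomized patients at the chosen level.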

A total of 25 articles met the inclusion criteria and were selected for data abstraction. The dates of publication ranged from 1984 to 2002; the majority (80%) were published in 1990 or later. The studies covered a broad range of conditions and trial types, including surgical trials, drug trials, and trials of counseling. The most common specialties represented were oncology (six studies), cardiovascular disease (five studies), and obstetrics/gynecology (five studies). The total number of eligible patients across all studies was 17,934 (range: 79 to 3,610); the proportion of eligible patients who agreed to be randomized ranged from 29% to 89% (average: 45%; median: 47%). The primary outcomes of interest varied across studies; the most common were mortality (9/25), acceptability of treatment (5/25), and proportion of time or number of days with a given condition (2/25).

Baseline Characteristics

Table 1 shows the study intervention and enrollment data for all 25 studies, categorized according to baseline clinical and sociodemographic characteristics. There were no clear differences in baseline health status between RCT participants and eligible non-participants in 17 studies. In one study, RCT participants were healthier than eligible non-participants at baseline, and in seven studies RCT participants were less healthy at baseline than eligible non-participants. There was no significant relation between the proportion of eligible patients who agreed to be randomized and the occurrence of differences in baseline health status of randomized versus nonrandomized patients. The mean proportion of eligible patients who agreed to be randomized in the seven studies categorized as “RCT patients less healthy” was 48.9%, while the mean in the 17 studies with no baseline differences was 43.5% ( p = 0.61).

Table 1. Study Characteristics (table image not reproduced)

Differences in clinical and sociodemographic characteristics between groups also varied in magnitude and significance. For instance, in the Bypass Angioplasty Revascularization Investigation of angioplasty versus coronary artery bypass graft, RCT participants were significantly more likely than non-participants to have a history of myocardial infarction (55% versus 51%), heart failure (9% versus 5%), or diabetes (19% versus 17%) [ 24 ]. Significant differences in race were found in two studies: the study by Marcus and colleagues included more non-whites in the eligible, nonrandomized group (10% versus 24%; p = 0.008), and the Bypass Angioplasty Revascularization Investigation included more non-whites in the RCT group (10% versus 6%; p < 0.001) [ 24 , 27 , 28 ].

In 22 of the 25 studies (88%), there were no significant differences in clinical outcomes between patients whose treatment was selected by randomized allocation and those whose treatment was selected on the basis of clinical judgment and/or patient preferences ( Table 2 ; Figure 1 ). There were no significant differences in clinical outcomes between randomized and nonrandomized patients in 15 of the 17 studies (88%) in which there were no clear baseline differences in health or sociodemographic status. Similarly, there were no significant differences in clinical outcomes between randomized and nonrandomized patients in six of the seven studies in which RCT participants were sicker than non-participants at baseline (86%; chi-square test, p > 0.05 for comparison with the “no clear baseline differences” group).

Table 2. Clinical Outcome in Randomized and Nonrandomized Patients (table image not reproduced)

Asterisks indicate statistical significance. The relevant references for the studies listed along the x-axis are as follows: AVID [ 50 , 68 ], EAST [ 51 ], Cooper [ 52 ], BARI [ 24 ], Chilvers [ 53 ], Bain [ 54 ], CASS [ 55 ], Link [ 57 ], Blichert-Toft [ 30 ], Henshaw [ 58 ], Nicolaides [ 59 ], SMASH [ 63 ], Mosekilde [ 64 ], Kerry [ 67 ], Bijker [ 25 ], Melchart [ 29 ], and Antman [ 31 ].

In Feit et al.'s analysis of the data from the Bypass Angioplasty Revascularization Investigation [ 24 ], randomized patients were more likely to have risk factors for adverse outcomes at baseline: they were more likely to have congestive heart failure, prior myocardial infarction, or diabetes, and were more likely to be non-white and less educated. The 7-y mortality in the randomized group was 17.3%, compared with 14.5% in the nonrandomized group (relative risk: 1.19; 95% confidence interval [CI]: 1.03, 1.39) [ 24 ]. In Melchart et al.'s study of acupuncture versus midazolam as pretreatment for gastroscopy [ 29 ], there were no significant differences in baseline health status between randomized and nonrandomized groups. Randomized patients were more likely than nonrandomized patients to state that they would not undergo the same treatment again (34.6% versus 15.3%; relative risk: 2.27; 95% CI: 1.06, 4.84). Similarly, in Blichert-Toft's study of mastectomy versus breast-conserving surgery for breast cancer [ 30 ], randomized patients were more likely than nonrandomized patients to experience the outcome of cancer recurrence (13.7% versus 6.6%), although the difference was of borderline significance (relative risk: 2.08; 95% CI: 1.07, 4.02).
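The relative risks quoted in this paragraph can be approximately reproduced from the reported percentages, since a relative risk is the ratio of the risk in randomized patients to the risk in nonrandomized patients. The sketch below does only that division; the dictionary keys are shorthand labels introduced here, and small discrepancies from the published values are expected because the percentages are already rounded. It does not attempt the confidence intervals, which would require the underlying counts.

```python
def relative_risk(risk_randomized, risk_nonrandomized):
    """Ratio of the outcome risk in randomized vs. nonrandomized patients."""
    return risk_randomized / risk_nonrandomized

# (risk in randomized %, risk in nonrandomized %, relative risk as reported)
reported = {
    "BARI, 7-year mortality": (17.3, 14.5, 1.19),
    "Melchart, would not repeat treatment": (34.6, 15.3, 2.27),
    "Blichert-Toft, cancer recurrence": (13.7, 6.6, 2.08),
}

recomputed = {
    name: relative_risk(randomized, nonrandomized)
    for name, (randomized, nonrandomized, _) in reported.items()
}

for name, value in recomputed.items():
    print(f"{name}: recomputed RR = {value:.2f}, reported RR = {reported[name][2]:.2f}")
```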

In the single study in which randomized patients were categorized as having a better baseline health status than nonrandomized patients, there was a nonsignificant trend towards the randomized patients being less likely to experience disease recurrence or death (odds ratio for randomized versus nonrandomized: 0.35; 95% CI: 0.12, 1.01) [ 31 ].

When there are several treatment options available, and there is uncertainty about which one is superior, it is assumed that individualized treatment assignment (in which clinicians consider the health status and preferences of each patient and incorporate them into a recommendation) is more likely to yield desirable outcomes. This is why doctors don't flip coins, and this is also why some may assume that randomization as part of a trial is harmful. In 22 of the 25 published clinical trials that met inclusion criteria, there were no significant differences in the likelihood of experiencing the primary study outcomes between patients whose treatment was determined by random allocation and those whose treatment was selected on the basis of clinical judgment and/or patient preferences. More importantly, in 15 of the 17 studies in which randomized and nonrandomized patients were classified as having similar health status at baseline, there were no significant differences between these groups in clinical outcomes. These data contradict the perception that random treatment assignment as part of a clinical trial is harmful to research participants.

The finding that randomized research participants and non-participants tend to achieve similar clinical outcomes also contradicts prior studies suggesting that trial participation may be associated with superior clinical outcomes [ 9 – 14 ]. Many of the previous studies that reported such a difference failed to account for the numerous differences between clinical care and clinical research that may influence patient outcomes, including the fact that research participants are often younger, healthier, and treated by clinicians with more experience in treating patients with the condition of interest. In contrast, we restricted the present analysis to studies that included only patients who were eligible for RCT participation and had access to similar treatments whether or not they chose to enroll in the RCT. Although our study sample was therefore restricted to a relatively small subset of RCTs, our findings suggest that the purported benefit of trial participation is probably due to baseline differences between participants and non-participants, or to differences in treatments received.

All of the studies included in the present analysis allowed access to the experimental therapies to patients who refused trial enrollment. It is unclear whether our results can be generalized to randomized trials that include newer, and potentially more efficacious, therapies that are not available outside the research setting. However, a recent analysis found that only 36% of trials presented at an annual meeting of the American Society of Clinical Oncology yielded “positive” results [ 32 ]. These findings contradict the widespread assumption that access to experimental therapies is beneficial [ 33 – 38 ]. Future work should explore whether participation in randomized trials of otherwise unavailable agents is associated with superior clinical outcomes.

While our comprehensive and systematic search identified far more manuscripts than prior reviews of this topic that we are aware of, our final sample size is small relative to the number of RCTs conducted annually [ 39 ]. As a result, although our findings were consistent across disease entities and different types of intervention, they may not be generalizable. As noted in prior reviews, many of the primary studies did not control for differences in baseline health characteristics [ 16 , 39 ]. We used an implicit, dual review approach to account for this potential bias, stratifying manuscripts according to baseline differences between trial participants and non-participants. Ideally, future work employing primary data would enable multivariate analysis of patient-level information, to account for important patient characteristics that may affect patient outcomes. The increasing use of electronic medical records represents a tremendous opportunity for establishing longitudinal registry databases to facilitate follow-up of patients who are offered trial enrollment, yet decline.

Our results should be interpreted with several considerations in mind. We restricted our analysis to the primary outcomes assessed in the included studies. In particular, many studies assessed the outcome of mortality, and there may have been differences in the probability of other adverse events, satisfaction, or quality of life between RCT participants and non-participants. Similarly, clinical trials may include additional research procedures, such as blood draws and lumbar punctures that do not affect patient outcomes but that pose burdens to participants. Additionally, random assignment refers only to the investigational agent. Even among RCT participants, clinician-investigators generally have some latitude regarding other aspects of care that are administered to their patients and can therefore provide individualized care that consists of interventions that are distinct from the investigational agent. Similarly, clinicians may halt existing treatment for patients who are offered a choice of enrolling in a study. In these instances, if a patient is provided one of the treatment interventions offered in the study—whether selected with randomization or by patient choice—it is possible that the initial treatment may have been superior to either of the treatments under investigation. Further, publication bias might have yielded underestimates of differences between RCT participants and eligible non-participants, as investigators may have been reluctant to report data from the non-participants in their registries if they did not support the generalizability of their RCTs. Finally, there may have been important differences in health status between randomized and nonrandomized patients that were not reported by the investigators. 
However, given that the vast majority of the studies in our sample found no difference in health outcomes between groups, one would have to invoke a systematic over- or underestimation of health status in the randomized groups across multiple studies in order to introduce bias into this synthesis.

Numerous studies indicate that RCT participants often fail to understand that their treatments will be determined by random assignment [ 18 , 40 – 42 ]. For example, a recent analysis found that half of parents who decided whether to enroll their children in a leukemia trial did not understand that treatment allocation would be determined by chance [ 18 ]. The failure to understand randomization is often regarded as part of a broader phenomenon, termed the “therapeutic misconception,” according to which individuals assume that research treatments are based on physicians' decisions regarding what is best for them [ 1 , 43 ]. In this context, our findings have important implications for the informed consent process. In addition to explaining randomization, investigators should also explain that, in general, there is little evidence to support that participating in randomized trials is either helpful or harmful.

What do our findings say about the impact of clinical judgment and patient preferences on clinical outcomes? Although clinicians and patients may be reluctant to forego clinical decision-making, our data suggest that undergoing randomization, rather than individualized treatment recommendations by clinicians, is not harmful. This conclusion calls into question clinicians' ability to determine which therapy is superior for their patients in the setting of clinical equipoise, i.e., when there is uncertainty in the expert community about which treatment is superior for patients in general [ 44 ]. It has also been suggested that some patients who are not randomly assigned to a treatment may achieve a better outcome not because of an objective therapeutic effect, but because they were assigned to the treatment arm they preferred—a logical extension of the placebo effect [ 45 ]. To account for this possible “preference effect,” some have called for incorporating patient treatment preferences into the analysis phase of RCTs [ 45 ]. Our data provide preliminary evidence that this preference effect does not bias the outcomes of RCTs: patients who received a treatment preferred by themselves or their clinicians did not experience superior outcomes. These findings are consistent with the result of a recent review in which the authors stratified patients according to treatment received and then compared the outcome of patients who were randomized versus those who selected each therapy [ 46 ].

A critical barrier to enrolling patients in research studies is the fact that many patients are not even asked to participate [ 47 ]. One reason physicians hesitate to recruit their own patients is their reluctance to forgo individualized treatment decisions for those patients [ 7 , 48 ]. This reluctance is especially important because physician recommendations are among the strongest predictors of trial enrollment [ 49 ]. The current findings suggest that in the setting of clinical equipoise, randomized treatment allocation as part of an RCT is unlikely to be harmful. This does not imply that research is free of risk, as the risks and benefits of experimental treatment may vary substantially between studies. However, in the situation in which patients will have access to the treatments that are used in the study setting regardless of whether the patient enrolls, prospective participants and their referring physicians should be reassured: there is no evidence that random treatment assignment leads to worse clinical outcomes. Furthermore, patients who do participate in such research can contribute to the important objective of improving health and well-being for all patients.

Supporting Information

Protocol S1.

(67 KB DOC)

Editors' Summary

Background.

When researchers test a new treatment, they give it to a group of patients. If the test is to be fair and provide useful results, there should also be a control group of patients who are studied in parallel. The patients in the control group receive either a different treatment, a pretend treatment (“a placebo”), or no treatment at all. But how do researchers decide who should be in the treatment group and who should be in the control group? This is an important question because the test would not be fair if, for example, all the individuals in the treatment group were elderly men and the controls were all young women, or if everyone in the treatment group received their treatment in a well-equipped specialist hospital and the controls received care in a local general hospital. Statisticians would say that the results from such studies were “confounded” by the differences between the two groups. Instead, patients should be allocated to treatment or control groups at random. Randomization also has the advantage that it can conceal from the researchers, and from the patients, whether the treatment being given is the new one or an old one or a placebo. This is important because—again for example—researchers might hold strong beliefs about the effectiveness of a new treatment and this bias in its favor might lead them, perhaps only subconsciously, to allocate younger, stronger patients to the treatment group. For these and other reasons, randomized clinical trials (RCTs) are regarded as the “gold standard” in assessing the effectiveness of treatments.

Why Was This Study Done?

Doctors normally decide on the “best” treatment for an individual patient based on their knowledge and experience. However, if a patient has agreed to be part of an RCT, then their treatment will instead be chosen at random. Some people worry that patients who participate in RCTs may, because their treatment is less “personalized,” have a lower chance of recovery from their illness than similar patients who are not in trials. In contrast, others argue that, particularly if the trial is part of an important research program, being in an RCT is to the patient's advantage. This study aimed to find out whether either of these possibilities is true.

What Did the Researchers Do and Find?

The researchers conducted a thorough electronic search of medical journals in order to find published RCTs for which information—both before and after treatment—had been recorded not only about the patients who were enrolled in the trials, but also about other patients whose condition made them eligible to participate but who were not actually enrolled. The researchers also decided in advance that they were only interested in such RCTs if the non-enrolled patients had access to the same treatment or treatments that were given to the trial participants. Only 25 RCTs were found that met these requirements. There were nearly 18,000 patients in these studies; overall 45% had received treatment after randomization and 55% had not been randomized. Most of the RCTs were for treatments for cancer, problems of the heart and circulation, and obstetric and gynecological issues. The “clinical outcomes” recorded in the trials varied and included, for example, death/survival, recurrence of cancer, and improvement of hearing. In 22 of these trials, there were no statistically significant differences in clinical outcomes between patients who received random assignment of treatment (i.e., the RCT participants) and those who received individualized treatment assignment (eligible non-participants). In one trial the randomized patients fared better, and in the remaining two the nonrandomized patients had the better outcomes.

What Do These Findings Mean?

These findings suggest that randomized treatment assignment as part of a clinical trial does not harm research participants, nor does there appear to be an advantage to being randomized in a trial.

Additional Information

Please access these Web sites via the online version of this summary at http://dx.doi.org/10.1371/journal.pmed.0030188 .

• The James Lind Library has been created to help patients and researchers understand fair tests of treatments in health care by illustrating how fair tests have developed over the centuries

• Wikipedia, a free Internet encyclopedia that anyone can edit, has pages on RCTs

Acknowledgments

The authors would like to acknowledge Drs. Frank Miller and Stephen Straus for their thoughtful comments. The views expressed are the authors' own. They do not represent the position or policy of the National Institutes of Health or the Department of Health and Human Services.

Author contributions. CPG, EJE, and DW designed the study. GVW abstracted data from articles. CPG, GVW, EJE, and DW analyzed the data. CPG, HMK, GVW, EJE, and DW contributed to writing the paper.

Citation: Gross CP, Krumholz HM, Van Wye G, Emanuel EJ, Wendler D (2006) Does random treatment assignment cause harm to research participants? PLoS Med 3(6): e188. DOI: 10.1371/journal.pmed.0030188

Funding: This work was funded with a contract from the Department of Clinical Bioethics, National Institutes of Health. CPG's efforts were supported by a Cancer Prevention, Control, and Population Sciences Career Development Award (1K07CA-90402) and the Claude D. Pepper Older Americans Independence Center at Yale (P30AG21342). The study sponsors played no role in the collection, analysis, and interpretation of data; in the writing of the report; and in the decision to submit the paper for publication.

  • Appelbaum PS, Roth LH, Lidz CW, Benson P, Winslade W. False hopes and best data: Consent to research and the therapeutic misconception. Hastings Cent Rep. 1987; 17 :20–24. [ PubMed ] [ Google Scholar ]
  • Taylor KM, Margolese RG, Soskolne CL. Physicians' reasons for not entering eligible patients in a randomized clinical trial of surgery for breast cancer. N Engl J Med. 1984; 310 :1363–1367. [ PubMed ] [ Google Scholar ]
  • Feinstein AR. Current problems and future challenges in randomized clinical trials. Circulation. 1984; 70 :767–774. [ PubMed ] [ Google Scholar ]
  • Abel U, Koch A. The role of randomization in clinical studies: Myths and beliefs. J Clin Epidemiol. 1999; 52 :487–497. [ PubMed ] [ Google Scholar ]
  • Kemeny MM, Peterson BL, Kornblith AB, Muss HB, Wheeler J, et al. Barriers to clinical trial participation by older women with breast cancer. J Clin Oncol. 2003; 21 :2268–2275. [ PubMed ] [ Google Scholar ]
  • Jenkins V, Fallowfield L. Reasons for accepting or declining to participate in randomized clinical trials for cancer therapy. Br J Cancer. 2000; 82 :1783–1788. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Fallowfield L, Ratcliffe D, Souhami R. Clinicians' attitudes to clinical trials of cancer therapy. Eur J Cancer. 1997; 33 :2221–2229. [ PubMed ] [ Google Scholar ]
  • Taylor K, Feldstein M, Skeel R, Pandya K, Carbone P. Fundamental dilemmas of the randomized clinical trial process: Results of a survey of 1737 Eastern Cooperative Oncology Group investigators. J Clin Oncol. 1994; 12 :1796–1805. [ PubMed ] [ Google Scholar ]
  • Daugherty C, Ratain MJ, Grochowski E, Stocking C, Kodish E, et al. Perceptions of cancer patients and their physicians involved in phase I trials. J Clin Oncol. 1995; 13 :1062–1072. [ PubMed ] [ Google Scholar ]
  • Joffe S, Weeks JC. Views of American oncologists about the purposes of clinical trials. J Natl Cancer Inst. 2002; 94 :1847–1853. [ PubMed ] [ Google Scholar ]
  • Yuval R, Halon DA, Merdler A, Khader N, Karkabi B, et al. Patient comprehension and reaction to participating in a double-blind randomized clinical trial (ISIS-4) in acute myocardial infarction. Arch Intern Med. 2000; 160 :1142–1146. [ PubMed ] [ Google Scholar ]
  • Karjalainen S, Palva I. Do treatment protocols improve end results? A study of survival of patients with multiple myeloma in Finland. BMJ. 1989; 299 :1069–1072. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Davis S, Wright P, Schulman S, Hill L, Pikham R, et al. Participants in prospective randomized clinical trials for resected non-small lung cancer have improved survival compared with non-participants in such trials. Cancer. 1985; 56 :1710–1718. [ PubMed ] [ Google Scholar ]
  • Marubini E, Mariani L, Salvadori B, Veronesi U, Saccozzi R, et al. Results of a breast-cancer-surgery trial compared with observational data from routine practice. Lancet. 1996; 347 :1000–1003. [ PubMed ] [ Google Scholar ]
  • National Comprehensive Cancer Network. NCCN clinical practice guidelines in oncology. Jenkintown (Pennsylvania): National Comprehensive Cancer Network; 2006. Available: http://www.nccn.org/professionals/physician_gls/f_guidelines.asp . Accessed 4 April 2006. [ Google Scholar ]
  • Peppercorn J, Weeks JC, Cook EF, Joffe S. Comparison of outcomes in cancer patients treated within and outside clinical trials: Conceptual framework and structured review. Lancet. 2004; 363 :263–270. [ PubMed ] [ Google Scholar ]
  • Heiat A, Gross CP, Krumholz HM. Representation of the elderly, women, and minorities in heart failure clinical trials. Arch Intern Med. 2002; 162 :1682–1688. [ PubMed ] [ Google Scholar ]
  • Kodish E, Eder M, Noll RB, Ruccione K, Lange B, et al. Communication of Randomization in Childhood Leukemia Trials. JAMA. 2004; 291 :470–475. [ PubMed ] [ Google Scholar ]
  • Tunis SR, Stryer DB, Clancy CM. Practical clinical trials: Increasing the value of clinical research for decision making in clinical and health policy. JAMA. 2003; 290 :1624–1632. [ PubMed ] [ Google Scholar ]
  • ALLHAT Officers and Coordinators for the ALLHAT Collaborative Research Group. Major outcomes in high-risk hypertensive patients randomized to angiotensin-converting enzyme inhibitor or calcium channel blocker vs diuretic: The Antihypertensive and Lipid-Lowering Treatment to Prevent Heart Attack Trial (ALLHAT) JAMA. 2002; 288 :2981–2997. [ PubMed ] [ Google Scholar ]
  • Cannon CP, McCabe CH, Belder R, Breen J, Braunwald E. Design of the Pravastatin or Atorvastatin Evaluation and Infection Therapy (PROVE IT)-TIMI 22 trial. Am J Cardiol. 2002; 89 :860–861. [ PubMed ] [ Google Scholar ]
  • Kolata G. Medicare covering new treatments, but with a catch. New York Times; Sect A: 1; 2004. [ Google Scholar ]
  • Olschewski M, Scheurlen H. Comprehensive cohort study: An alternative to randomised consent. Methods Inf Med. 1985; 24 :131–134. [ PubMed ] [ Google Scholar ]
  • Feit F, Brooks M, Sopko G, Keller N, Rosen A, et al. Long-term clinical outcome in the Bypass Angioplasty Revascularization Investigation Registry. Circulation. 2000; 101 :2795–2802. [ PubMed ] [ Google Scholar ]
  • Bijker N, Peterse JL, Fentiman IS, Julien JP, Hart AA, et al. Effects of patient selection on the applicability of results from a randomised clinical trial (EORTC 10853) investigating breast-conserving therapy for DCIS. Br J Cancer. 2002; 87 :615–620. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • SAS Institute. SAS/STAT, version 8.1 [computer program] Cary (North Carolina): SAS Institute; 2000. [ Google Scholar ]
  • Marcus S. Assessing non-consent bias with parallel randomized and nonrandomized clinical trials. J Clin Epidemiol. 1997; 50 :823–828. [ PubMed ] [ Google Scholar ]
  • Paradise JL, Bluestone CD, Rogers KD, Taylor FH, Colborn DK, et al. Efficacy of adenoidectomy for recurrent otitis media in children previously treated with tympanostomy-tube placement. Results of parallel randomized and nonrandomized trials. JAMA. 1990; 263 :2066–2073. [ PubMed ] [ Google Scholar ]
  • Melchart D, Steger HG, Linde K, Makarian K, Hatahet Z, et al. Integrating patient preferences in clinical trials: A pilot study of acupuncture versus midazolam for gastroscopy. J Altern Complement Med. 2002; 8 :265–274. [ PubMed ] [ Google Scholar ]
  • Blichert-Toft M, Brincker H, Andersen JA, Andersen KW, Axelsson CK, et al. A Danish randomized trial comparing breast-preserving therapy with mastectomy in mammary carcinoma. Preliminary results. Acta Oncol. 1988; 27 :671–677. [ PubMed ] [ Google Scholar ]
  • Antman K, Amato D, Wood W, Carson J, Suit H, et al. Selection bias in clinical trials. J Clin Oncol. 1985; 3 :1142–1147. [ PubMed ] [ Google Scholar ]
  • Krzyzanowska MK, Pintilie M, Tannock IF. Factors associated with failure to publish large randomized trials presented at an oncology meeting. JAMA. 2003; 290 :495–501. [ PubMed ] [ Google Scholar ]
  • Weijer C. Selecting subjects for participation in clinical research: One sphere of justice. J Med Ethics. 1999; 25 :31–36. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • United States Public Law 103–43. NIH Revitalization Act of 1993, Subtitle B, Section. 1993 Jun 10;:131–133. [ Google Scholar ]
  • National Cancer Institute. Age alone should not prevent older patients from enrolling in clinical trials. Bethesda (Maryland): National Cancer Institute; 2003. Available: http://www.cancer.gov/clinicaltrials/developments/age-as-barrier1005 . Accessed 20 April 2006 . [ Google Scholar ]
  • Stallings FL, Ford ME, Simpson NK, Fouad M, Jernigan JC, et al. Black participation in the Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer Screening Trial. Control Clin Trials. 2000; 21 :379S–389S. [ PubMed ] [ Google Scholar ]
  • Kolata G, Eichenwald. Group of insurers to pay for experimental cancer therapy. New York Times; Sect C: 1, 9; 1999. [ PubMed ] [ Google Scholar ]
  • ECRI. Should I enter a clinical trial? A patient reference guide for adults with a serious or life-threatening illness. Plymouth Meeting (Pennsylvania): ECRI; 2002. Feb, Available: http://www.ecri.org/Patient_Information/Patient_Reference_Guide/prg.pdf . Accessed 4 April 2006 . [ Google Scholar ]
  • Braunholtz DA, Edwards SJ, Lilford RJ. Are randomized clinical trials good for us (in the short term)? Evidence for a “trial effect” J Clin Epidemiol. 2001; 54 :217–224. [ PubMed ] [ Google Scholar ]
  • Joffe S, Cook EF, Cleary PD, Clark JW, Weeks JC. Quality of informed consent in cancer clinical trials: A cross-sectional survey. Lancet. 2001; 358 :1772–1777. [ PubMed ] [ Google Scholar ]
  • Taub HA, Baker MT, Sturr JF. Informed consent for research. Effects of readability, patient age, and education. J Am Geriatr Soc. 1986; 34 :601–606. [ PubMed ] [ Google Scholar ]
  • Dunn LB, Lindamer LA, Palmer BW, Golshan S, Schneiderman LJ, et al. Improving understanding of research consent in middle-aged and elderly patients with psychotic disorders. Am J Geriatr Psychiatry. 2002; 10 :142–150. [ PubMed ] [ Google Scholar ]
  • Lidz CW, Appelbaum PS, Grisso T, Renaud M. Therapeutic misconception and the appreciation of risks in clinical trials. Soc Sci Med. 2004; 58 :1689–1697. [ PubMed ] [ Google Scholar ]
  • Sreenivasan G. Does informed consent to research require comprehension? Lancet. 2003; 362 :2016–2018. [ PubMed ] [ Google Scholar ]
  • McPherson K, Britton A. The impact of patient preferences on the interpretation of randomised controlled trials. Eur J Cancer. 1999; 35 :1598–1602. [ PubMed ] [ Google Scholar ]
  • King M, Nazareth I, Lampe F, Bower P, Chandler M, et al. Impact of participant and physician intervention preferences on randomized trials: A systematic review. JAMA. 2005; 293 :1089–1099. [ PubMed ] [ Google Scholar ]
  • Wendler D, Kington R, Madans J, Wye GV, Christ-Schmidt H, et al. Are racial and ethnic minorities less willing to participate in health research? PLoS Med. 2006; 3 :e19. doi: 10.1371/journal.pmed.0030019. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Fleming I. Clinical trials for cancer patients: The community practicing physician's perspective. Cancer. 1990; 65 :2388–2390. [ PubMed ] [ Google Scholar ]
  • Foley J, Moertel C. Improving accrual into cancer clinical trials. J Cancer Educ. 1991; 6 :165–173. [ PubMed ] [ Google Scholar ]
  • Hallstrom A, Friedman L, Denes P, Rizo-Patron C, Morris M. Do arrhythmia patients improve survival by participating in randomized clinical trials? Observations from the Cardiac Arrhythmia Suppression Trial (CAST) and the Antiarrhythmics Versus Implantable Defibrillators Trial (AVID) Control Clin Trials. 2003; 24 :341–352. [ PubMed ] [ Google Scholar ]
  • King SB, Barnhart HX, Kosinski AS, Weintraub WS, Lembo NJ, et al. Angioplasty or surgery for multivessel coronary artery disease: Comparison of eligible registry and randomized patients in the EAST trial and influence of treatment selection on outcomes. Am J Cardiol. 1997; 79 :1453–1459. [ PubMed ] [ Google Scholar ]
  • Cooper KG, Grant AM, Garratt AM. The impact of using a partially randomised patient preference design when evaluating alternative managements for heavy menstrual bleeding. Br J Obstet Gynaecol. 1997; 104 :1367–1373. [ PubMed ] [ Google Scholar ]
  • Chilvers C, Dewey M, Fielding K, Gretton V, Miller P, et al. Antidepressant drugs and generic counselling for treatment of major depression in primary care: Randomised trial with patient preference arms. BMJ. 2001; 322 :772–775. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Bain C, Cooper KG, Parkin DE. A partially randomized patient preference trial of microwave endometrial ablation using local anaesthesia and intravenous sedation or general anaesthesia: A pilot study. Gynaecol Endosc. 2001; 10 :223–228. [ Google Scholar ]
  • CASS Principal Investigators and their associates. Coronary Artery Surgery Study (CASS): A randomized trial of coronary artery bypass surgery. Comparability of entry characteristics and survival in randomized patients and nonrandomized patients meeting randomization criteria. J Am Coll Cardiol. 1984; 3 :114–128. [ PubMed ] [ Google Scholar ]
  • Paradise JL, Bluestone CD, Bachman RZ, Colborn DK, Bernard BS, et al. Efficacy of tonsillectomy for recurrent throat infection in severely affected children. Results of parallel randomized and nonrandomized clinical trials. N Engl J Med. 1984; 310 :674–683. [ PubMed ] [ Google Scholar ]
  • Link MP, Goorin AM, Miser AW, Green AA, Pratt CB, et al. The effect of adjuvant chemotherapy on relapse-free survival in patients with osteosarcoma of the extremity. N Engl J Med. 1986; 314 :1600–1606. [ PubMed ] [ Google Scholar ]
  • Henshaw RC, Naji SA, Russell IT, Templeton AA. Comparison of medical abortion with surgical vacuum aspiration: Women's preferences and acceptability of treatment. BMJ. 1993; 307 :714–717. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Nicolaides K, Brizot Mde L, Patel F, Snijders R. Comparison of chorionic villus sampling and amniocentesis for fetal karyotyping at 10–13 weeks' gestation. Lancet. 1994; 344 :435–439. [ PubMed ] [ Google Scholar ]
  • McKay JR, Alterman AI, McLellan AT, Snider EC, O'Brien CP. Effect of random versus nonrandom assignment in a comparison of inpatient and day hospital rehabilitation for male alcoholics. J Consult Clin Psychol. 1995; 63 :70–78. [ PubMed ] [ Google Scholar ]
  • Schmoor C, Olschewski M, Schumacher M. Randomized and non-randomized patients in clinical trials: Experiences with comprehensive cohort studies. Stat Med. 1996; 15 :263–271. [ PubMed ] [ Google Scholar ]
  • de C Williams AC, Nicholas MK, Richardson PH, Pither CE, Fernandes J. Generalizing from a controlled trial: The effects of patient preference versus randomization on the outcome of inpatient versus outpatient chronic pain management. Pain. 1999; 83 :57–65. [ PubMed ] [ Google Scholar ]
  • Urban P, Stauffer JC, Bleed D, Khatchatrian N, Amann W, et al. A randomized evaluation of early revascularization to treat shock complicating acute myocardial infarction. The (Swiss) Multicenter Trial of Angioplasty for Shock—(S)MASH. Eur Heart J. 1999; 20 :1030–1038. [ PubMed ] [ Google Scholar ]
  • Mosekilde L, Beck-Nielsen H, Sorensen OH, Nielsen SP, Charles P, et al. Hormonal replacement therapy reduces forearm fracture incidence in recent postmenopausal women—Results of the Danish Osteoporosis Prevention Study. Maturitas. 2000; 36 :181–193. [ PubMed ] [ Google Scholar ]
  • Rovers MM, Straatman H, Ingels K, van der Wilt GJ, van den Broek P, et al. Generalizability of trial results based on randomized versus nonrandomized allocation of OME infants to ventilation tubes or watchful waiting. J Clin Epidemiol. 2001; 54 :789–794. [ PubMed ] [ Google Scholar ]
  • Wieringa–de Waard M, Vos J, Bonsel GJ, Bindels PJ, Ankum WM. Management of miscarriage: A randomized controlled trial of expectant management versus surgical evacuation. Hum Reprod. 2002; 17 :2445–2450. [ PubMed ] [ Google Scholar ]
  • Kerry S, Hilton S, Dundas D, Rink E, Oakeshott P. Radiography for low back pain: A randomised controlled trial and observational study in primary care. Bri J Gen Pract. 2002; 52 :469–474. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Kim SG, Hallstrom A, Love JC, Rosenberg Y, Powell J, et al. Comparison of clinical characteristics and frequency of implantable defibrillator use between randomized patients in the Antiarrhythmics Vs Implantable Defibrillators (AVID) trial and nonrandomized registry patients. Am J Cardiol. 1997; 80 :454–457. [ PubMed ] [ Google Scholar ]

The Definition of Random Assignment According to Psychology

Kendra Cherry, MS, is a psychosocial rehabilitation specialist, psychology educator, and author of the "Everything Psychology Book."

Emily is a board-certified science editor who has worked with top digital publishing brands like Voices for Biodiversity, Study.com, GoodTherapy, Vox, and Verywell.

Random assignment refers to the use of chance procedures in psychology experiments to ensure that each participant has the same opportunity to be assigned to any given group in a study. The aim is to eliminate potential bias at the outset of the experiment. Participants are randomly assigned to different groups, such as the treatment group versus the control group. In clinical research, randomized clinical trials are known as the gold standard for meaningful results.

Simple random assignment techniques might involve tactics such as flipping a coin, drawing names out of a hat, rolling dice, or assigning random numbers to a list of participants. It is important to note that random assignment differs from random selection.

While random selection refers to how participants are randomly chosen from a target population as representatives of that population, random assignment refers to how those chosen participants are then assigned to experimental groups.
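The distinction between the two steps can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the function names `select_sample` and `assign_groups`, the population of 100 labelled people, and the sample size of 20 are all assumptions made up for the example.

```python
import random

def select_sample(population, n, rng):
    """Random selection: draw n participants from the population without replacement."""
    return rng.sample(population, n)

def assign_groups(participants, rng):
    """Random assignment: shuffle the chosen participants and split them in half."""
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}

rng = random.Random(42)  # fixed seed only so the sketch is reproducible
population = [f"person_{i}" for i in range(100)]  # hypothetical target population

sample = select_sample(population, 20, rng)  # step 1: who takes part
groups = assign_groups(sample, rng)          # step 2: which group each person joins
print(len(groups["treatment"]), len(groups["control"]))
```

Selection decides who enters the study; assignment decides which condition each selected person experiences. The two steps use chance for different purposes and can be carried out independently of each other.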

Random Assignment In Research

To determine if changes in one variable will cause changes in another variable, psychologists must perform an experiment. Random assignment is a critical part of the experimental design that helps ensure the reliability of the study outcomes.

Researchers often begin by forming a testable hypothesis predicting that one variable of interest will have some predictable impact on another variable.

The variable that the experimenters will manipulate in the experiment is known as the independent variable, while the variable that they will then measure for different outcomes is known as the dependent variable. While there are different ways to look at relationships between variables, an experiment is the best way to get a clear idea if there is a cause-and-effect relationship between two or more variables.

Once researchers have formulated a hypothesis, conducted background research, and chosen an experimental design, it is time to find participants for their experiment. How exactly do researchers decide who will be part of an experiment? As mentioned previously, this is often accomplished through something known as random selection.

Random Selection

In order to generalize the results of an experiment to a larger group, it is important to choose a sample that is representative of the qualities found in that population. For example, if the total population is 60% female and 40% male, then the sample should reflect those same percentages.

Choosing a representative sample is often accomplished by randomly picking people from the population to be participants in a study. Random selection means that everyone in the group stands an equal chance of being chosen to minimize any bias. Once a pool of participants has been selected, it is time to assign them to groups.

By randomly assigning the participants into groups, the experimenters can be fairly sure that each group will have the same characteristics before the independent variable is applied.

Participants might be randomly assigned to the control group, which does not receive the treatment in question. The control group may receive a placebo or receive the standard treatment. Participants may also be randomly assigned to the experimental group, which receives the treatment of interest. In larger studies, there can be multiple treatment groups for comparison.

There are simple methods of random assignment, like rolling a die. More complex techniques use random number generators to reduce human error.

There can also be random assignment to groups with pre-established rules or parameters. For example, if you want to have an equal number of men and women in each of your study groups, you might separate your sample into two groups (by sex) and then randomly assign the members of each of those groups to the treatment group and the control group.
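The rule-based approach described above is often called stratified random assignment. The following is a rough sketch under stated assumptions: the helper `stratified_assign`, the participant tuples, and the 10/10 split by sex are hypothetical and chosen only to make the idea concrete.

```python
import random
from collections import defaultdict

def stratified_assign(participants, stratum_of, rng):
    """Randomly assign within each stratum so every stratum is split evenly."""
    strata = defaultdict(list)
    for person in participants:
        strata[stratum_of(person)].append(person)

    groups = {"treatment": [], "control": []}
    for members in strata.values():
        rng.shuffle(members)          # chance decides the order within the stratum
        half = len(members) // 2      # with an odd count, control gets the extra person
        groups["treatment"].extend(members[:half])
        groups["control"].extend(members[half:])
    return groups

rng = random.Random(7)  # fixed seed only so the sketch is reproducible
# Hypothetical sample: 10 men and 10 women, each a (name, sex) tuple.
participants = [(f"p{i}", "M" if i < 10 else "F") for i in range(20)]

groups = stratified_assign(participants, lambda person: person[1], rng)
for name, members in groups.items():
    men = sum(1 for _, sex in members if sex == "M")
    print(name, "men:", men, "women:", len(members) - men)
```

Because the shuffle happens separately inside each stratum, both study groups end up with the same number of men and women, while chance still determines which individuals land in which group.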

Random assignment is essential because it increases the likelihood that the groups are the same at the outset. With all characteristics being equal between groups, other than the application of the independent variable, any differences found between group outcomes can be more confidently attributed to the effect of the intervention.

Example of Random Assignment

Imagine that a researcher is interested in learning whether or not drinking caffeinated beverages prior to an exam will improve test performance. After randomly selecting a pool of participants, each person is randomly assigned to either the control group or the experimental group.

The participants in the control group consume a placebo drink prior to the exam that does not contain any caffeine. Those in the experimental group, on the other hand, consume a caffeinated beverage before taking the test.

Participants in both groups then take the test, and the researcher compares the results to determine if the caffeinated beverage had any impact on test performance.
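One way the final comparison might be carried out is a permutation test, which asks how often a difference at least as large as the observed one would arise if the group labels were reshuffled by chance alone. The sketch below is an assumption about how a researcher could do this, not a method named in the article, and the test scores are invented purely for illustration.

```python
import random
from statistics import mean

def mean_difference(treatment, control):
    """Difference between the average scores of the two groups."""
    return mean(treatment) - mean(control)

def permutation_p_value(treatment, control, n_permutations, rng):
    """Share of random relabelings whose absolute gap is at least the observed gap."""
    observed = abs(mean_difference(treatment, control))
    pooled = list(treatment) + list(control)
    n_treat = len(treatment)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # relabel participants at random
        gap = abs(mean_difference(pooled[:n_treat], pooled[n_treat:]))
        if gap >= observed:
            extreme += 1
    return extreme / n_permutations

# Hypothetical exam scores, not real data.
caffeine_scores = [78, 85, 90, 72, 88]
placebo_scores = [75, 80, 70, 68, 82]

rng = random.Random(1)  # fixed seed only so the sketch is reproducible
difference = mean_difference(caffeine_scores, placebo_scores)
p_value = permutation_p_value(caffeine_scores, placebo_scores, 5000, rng)
print(f"difference in means: {difference:.1f}, permutation p-value: {p_value:.3f}")
```

The reshuffling step mirrors the logic of random assignment itself: if the drink had no effect, the group labels would be arbitrary, so the observed gap should look typical among the reshuffled gaps.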

A Word From Verywell

Random assignment plays an important role in the psychology research process. This process helps eliminate possible sources of bias, and it makes it easier to attribute differences in outcomes to the treatment itself. Generalizing the results of a tested sample to a larger population depends instead on random selection.

Random assignment helps ensure that the groups in the experiment are comparable at the outset, so that neither group differs systematically from the other before the treatment begins. Through the use of this technique, psychology researchers are able to study complex phenomena and contribute to our understanding of the human mind and behavior.

  • Review Article
  • Open access
  • Published: 29 July 2021

Clinical Research

Errors in the implementation, analysis, and reporting of randomization within obesity and nutrition research: a guide to their avoidance

  • Colby J. Vorland   ORCID: orcid.org/0000-0003-4225-372X 1 ,
  • Andrew W. Brown   ORCID: orcid.org/0000-0002-1758-8205 1 ,
  • John A. Dawson 2 ,
  • Stephanie L. Dickinson   ORCID: orcid.org/0000-0002-2998-4467 3 ,
  • Lilian Golzarri-Arroyo   ORCID: orcid.org/0000-0002-1221-6701 3 ,
  • Bridget A. Hannon 4 ,
  • Moonseong Heo   ORCID: orcid.org/0000-0001-7711-1209 5 ,
  • Steven B. Heymsfield   ORCID: orcid.org/0000-0003-1127-9425 6 ,
  • Wasantha P. Jayawardene   ORCID: orcid.org/0000-0002-8798-0894 1 ,
  • Chanaka N. Kahathuduwa 7 ,
  • Scott W. Keith 8 ,
  • J. Michael Oakes 9 ,
  • Carmen D. Tekwe 3 ,
  • Lehana Thabane 10 &
  • David B. Allison   ORCID: orcid.org/0000-0003-3566-9399 3  

International Journal of Obesity, volume 45, pages 2335–2346 (2021)

  • Nutrition disorders

Randomization is an important tool used to establish causal inferences in studies designed to further our understanding of questions related to obesity and nutrition. To take advantage of the inferences afforded by randomization, scientific standards must be upheld during the planning, execution, analysis, and reporting of such studies. We discuss ten errors in randomized experiments from real-world examples from the literature and outline best practices for their avoidance. These ten errors include: representing nonrandom allocation as random, failing to adequately conceal allocation, not accounting for changing allocation ratios, replacing subjects in nonrandom ways, failing to account for non-independence, drawing inferences by comparing statistical significance from within-group comparisons instead of between-groups, pooling data and breaking the randomized design, failing to account for missing data, failing to report sufficient information to understand study methods, and failing to frame the causal question as testing the randomized assignment per se. We hope that these examples will aid researchers, reviewers, journal editors, and other readers to endeavor to a high standard of scientific rigor in randomized experiments within obesity and nutrition research.

Introduction

Randomization in scientific experiments bolsters causal inference. Determining a true causal effect would require observing the difference between two outcomes within a single unit (e.g., person, animal) in one case after exogenous manipulation (e.g., “treatment”) and in another case without the manipulation, with all else, including the time of observation, held constant [ 1 ]. However, this true causal effect would require parallel universes in which the same unit at the same time undergoes manipulation in one universe but does not in the other. In the absence of parallel universes, we can estimate average causal effects by balancing all differences between multiple units, such that one group looks as similar as possible to the other group. In practice, however, balancing all variables is likely impossible. For practical application, randomization is an alternative because the selection process is independent of the individual’s pre-randomization (observed and unobserved) characteristics that could confound the outcome, and also balances in the long run the distributions of variables that would otherwise be potential confounders, thereby providing unbiased estimation of treatment effects [ 2 ]. Randomization and exogenous treatment allow inferential statistics to create unbiased effect estimates [ 3 ]. Departures from randomization may increase uncertainty and yield bias.

Randomization is a seemingly simple concept: just assign people (or more generically, “units” [e.g., mice, rats, flies, classrooms, clinics, families]) randomly to one treatment or intervention versus another. The importance of randomization may have been first recognized at the end of the nineteenth century, and formalized in the 1920s [ 4 ]. Yet since its inception there have been errors in the implementation or interpretation of randomized experiments. In 1930, the Lanarkshire Milk investigation tested whether raw or pasteurized milk altered weight and height vs. a control condition in 20,000 schoolchildren [ 5 ]. After publication of the experiment, William Gosset (writing as “Student” of “Student’s t -test” fame) critiqued the study [ 6 ], noting that while there was some random selection of students, a subset of the children were selected on the basis of being either “well fed or ill nourished,” which favored more of the smaller and lighter children being selected, rather than randomized, to the milk groups. Thus, the greater growth in individuals assigned to the milk groups could have been from receiving the milk intervention, or the result of selection bias, an invalidating design flaw. This violates the assumption that the intervention is independent of pre-randomization characteristics of the person being assigned.

Methodologists continue to improve our understanding of the implications of effective randomization, including random sequence generation, implementation (like allocation concealment and blinding), special randomization situations (e.g., randomizing groups of individuals), analysis (e.g., how to analyze an experiment with missing data), and reporting (e.g., how to describe the randomization procedures). Herein, we identify recent publications within obesity and nutrition literature that contain errors in these aspects (see Supplementary Table 1 for a structured list). These examples largely focus on errors arising in the context of null hypothesis significance testing; while there are misconceptions associated with the understanding of p values per se [ 7 , 8 ], it is the framework by which authors typically draw conclusions. The examples span randomized experiments and trials, without or with control groups (i.e., randomized controlled trials [RCTs]). We use these examples to discuss how errors can bias study findings and fail to meet best practices for performing and reporting randomized studies. We clarify that the examples represent a convenience sample, and we make no claims about the frequency of these errors other than that they are frequent enough to have caught our attention. Our categories of errors are neither exhaustive nor in any rank order of severity. Furthermore, we make no assumptions about the circumstances that led to the errors. Rather, we share these examples in the spirit of Gosset who wrote in 1931 on the Lanarkshire Milk experiment, “…but what follows is written not so much in criticism of what was done…as in the hope that in any further work full advantage may be taken of the light which may be thrown on the best methods of arrangement by the defects as well as by the merits” [ 6 ].

Errors in implementing group allocation

1. Error: representing nonrandom allocation methods as random

Description

Participants are allocated into treatment groups by use of methods that are not random, but the study is labeled as randomized.

Explanation

Allocation refers to the assignment of subjects into experimental groups. The use of random methods gives each study participant a known probability of being assigned to any experimental group. When any nonrandom allocation is used, studies should not be labeled as randomized.

Authors of studies published in a sample of Chinese journals that were labeled as randomized were interviewed about their methods, and in only ~7% was randomization determined to be properly implemented [ 9 ]. Improperly labeling studies as randomized is not uncommon in both human and animal research on topics of nutrition and obesity, and can occur in different ways.

In one instance, a vitamin D supplementation trial used a nonrandomized convenience sample from a different hospital as a control group, yet labeled the trial as randomized [ 10 ]. In a reply [ 11 ], the authors suggested that no selection bias occurred during the allocation because they detected no significant differences between groups on measured covariates. However, this assumption is unjustified because (a) unobserved or mismeasured covariates can potentially introduce bias, or measurement of a covariate may be imperfect, (b) the inferential validity of randomization rests on the assumption that the distributions of all pre-randomization variables are the same in the long run across levels of the treatment groups, not that the distributions are the same across groups in any one sample, and (c) concluding that groups are identical at baseline because no significant differences were detected entails fallaciously “accepting the null.” Regardless of the lack of observed statistical differences between groups, treatment allocation was not randomized and should not be labeled as such.

In another example, researchers first allocated all participants to the intervention to ensure a sufficient sample size and then randomized future participants [ 12 ]. This violates the assumption that every subject has some probability of being assigned to every group [ 13 ]; the participants first allocated had no probability of being in the control group. In addition, those in the initial allocation wave may have had different characteristics from those with later enrollment.

If units are not all concurrently randomized (e.g., one group is enrolled at a different time), there is also a time-associated confound [ 14 ]. This is exemplified by a study of the effects of a nutraceutical formulation on hair growth that was labeled as randomized [ 15 ]. Participants were randomized to one of two treatment groups, and then each group underwent placebo and treatment sequentially (essentially a pretest-posttest design). The sequential order suggested a hair growth-by-time confound, with hair growth differing by season [ 16 ].

Nonrandom allocation can leave a signature in baseline between-group differences. With randomization, on average, the p values of baseline group comparisons will be uniformly distributed for independent measurements. While there are limitations to applying this principle broadly to assessing literature [ 17 , 18 , 19 ], in some cases it has proved useful as a prompt for more information about how and whether randomization was actually employed. An analysis by Carlisle of baseline p value distributions in over 5000 trials flagged apparent deviations from this expectation [ 20 ], suggesting that many studies labeled as randomized may not be. One trial flagged [ 21 ] was the Primary Prevention of Cardiovascular Disease with a Mediterranean Diet (PREDIMED) trial, which highlighted the significant impact of advice to consume a Mediterranean-style diet coupled with additional intake of extra-virgin olive oil or mixed nuts on risk for cardiovascular disease, compared with advice to consume a low-fat diet [ 22 ]. An audit by the PREDIMED authors discovered that members of some of the households were nonrandomly assigned to the same group as the randomized member. Furthermore, one intervention site switched from individuals to clinics as the randomization unit [ 23 ] (see section 5, “Error: failing to account for non-independence” for discussion of non-independence). Thus, the original analysis at the individual level was inappropriate for these participants because some did not have a known probability of being assigned to one of the treatment groups or the control. A retraction and reanalysis did not change the main results or conclusions [ 23 ], although causal language in the article was tempered. Conclusions from secondary analyses were affected, however, such as the 5-year change in body weight and waist circumference, which changed statistical significance for the olive oil group [ 24 ].

Use of statistical principles to examine the likelihood that randomization was properly implemented has flagged other studies related to nutrition and obesity, too [ 25 , 26 , 27 , 28 ]. In at least four cases, publications were retracted [ 22 , 26 , 29 , 30 ].
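
The expectation that baseline p values are uniformly distributed under proper randomization can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the covariate (a hypothetical baseline BMI), its mean and standard deviation, the sample sizes, and the use of a z-test with known standard deviation are all choices made for the example, not methods taken from the studies cited above.

```python
import random
from statistics import NormalDist, mean


def baseline_p_value(group_a, group_b, sigma):
    """Two-sided z-test p value for a baseline difference, assuming a known SD."""
    se = sigma * (1 / len(group_a) + 1 / len(group_b)) ** 0.5
    z = (mean(group_a) - mean(group_b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def simulate_baseline_p_values(n_trials=2000, n_per_group=50, sigma=5.0, seed=1):
    """Simulate many properly randomized trials and collect baseline p values."""
    rng = random.Random(seed)
    p_values = []
    for _ in range(n_trials):
        # Hypothetical baseline covariate (e.g., BMI) for all enrolled participants.
        baseline = [rng.gauss(30.0, sigma) for _ in range(2 * n_per_group)]
        # Random allocation: shuffle, then split into two equal groups.
        rng.shuffle(baseline)
        group_a = baseline[:n_per_group]
        group_b = baseline[n_per_group:]
        p_values.append(baseline_p_value(group_a, group_b, sigma))
    return p_values


p_values = simulate_baseline_p_values()
# Under a uniform distribution, about 5% of p values fall below 0.05
# and about 50% fall below 0.50.
share_below_005 = sum(p < 0.05 for p in p_values) / len(p_values)
share_below_050 = sum(p < 0.50 for p in p_values) / len(p_values)
```

A clear excess or shortage of small baseline p values across many trials is the kind of signature that can prompt questions about how allocation was performed. As noted above, this is a prompt for more information rather than proof of nonrandom allocation.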

Best practices

Where randomization is impossible, methods should be clearly stated so that there is no conflation of nonrandomized with randomized experiments. Investigators should establish procedures a priori to monitor how randomization is implemented. Furthermore, although a given randomized sample may not appear balanced on all measurable baseline variables, by definition those imbalances have occurred by chance. Altering the allocation process to enforce balance with the use of nonrandom methods may introduce bias. Importantly, use of nonrandom methods may warrant changing how study results are communicated. At a practical level, most methodologists and statisticians would agree that if an RCT is properly randomized, it is reasonable to make causal claims about intervention assignment and outcomes. Whereas the purpose of most research is to seek causal effects [ 31 ], errors discussed herein break randomization, and thereby introduce additional concerns that must be satisfied to increase the confidence in unbiased estimates. While a nuanced discussion of the use of causal language is outside the scope of this review, from a purist perspective, the description of relationships as causal from nonrandom methods is inappropriate [ 32 ].

Where important pre-randomization factors are identified that could influence results if they are imbalanced (such as animal body weight), forms of restricted randomization exist to maintain the benefits of randomization with control over such factors, instead of using haphazard methods that may introduce bias. These include blocking and stratification [ 33 , 34 ], which necessitate additional consideration at the analysis stage beyond a simple randomization scheme (see section 5, “Error: failing to account for non-independence”).
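
As an illustration of restricted randomization, the following minimal sketch combines stratification with permuted blocks. The function names, the stratum labels ("heavy" and "light"), the block size of 4, and the seed are assumptions made for this example; a real trial would generate and store such a sequence centrally and keep it concealed.

```python
import random


def permuted_block(arms, block_size, rng):
    """Return one block with each arm repeated equally often, in random order."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    block = list(arms) * (block_size // len(arms))
    rng.shuffle(block)
    return block


def stratified_block_randomization(strata, arms=("control", "treatment"),
                                   block_size=4, seed=2021):
    """Assign each unit within its stratum using permuted blocks.

    `strata` maps a unit identifier to its stratum label.
    """
    rng = random.Random(seed)
    pending = {}     # stratum -> assignments left in the current block
    allocation = {}  # unit -> assigned arm
    for unit, stratum in strata.items():
        if not pending.get(stratum):
            # Start a new permuted block when the stratum has none left.
            pending[stratum] = permuted_block(arms, block_size, rng)
        allocation[unit] = pending[stratum].pop()
    return allocation


# Hypothetical example: 16 mice stratified by baseline body weight.
strata = {f"mouse_{i:02d}": ("heavy" if i % 2 else "light") for i in range(16)}
allocation = stratified_block_randomization(strata)
```

Every unit still has a known probability of receiving each arm, but the arms are balanced within each stratum after every completed block. The stratification factor should then be included in the statistical model, as discussed in section 5.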

2. Error: failing to adequately conceal allocation from investigators

Group assignments are inadequately concealed from the investigators who assign treatments and from the participants who receive them.

Allocation concealment, when implemented properly, prevents researchers from foreknowing the allocation of the next participant. Furthermore, it prevents participants, who may choose to drop out if they do not receive a preferred treatment, from foreknowing their assignment. Thus, concealment prevents selection bias and confounding [ 35 , 36 , 37 ]. Whereas randomization is a method to create unbiased estimates of effect, allocation concealment is necessary to remove the human element of decisions (whether conscious or unconscious) when participants are assigned to groups, and both are important for a rigorous trial. When concealment is broken, sample estimates can become biased in different ways.

Even with the use of random allocation methods, the failure to conceal allocation means that the researchers, and sometimes participants, will know upcoming assignments. The audit of PREDIMED, as discussed in section 1, “Error: representing nonrandom allocation methods as random,” also clarified that allocation was not concealed [ 23 ], despite using computer-generated randomization tables. In the case of the Lanarkshire study as described above [ 5 , 6 ], the failure to conceal allocation led to conscious bias in how schoolchildren were assigned to the interventions. In other cases, researchers may unconsciously bias allocations if they have any involvement in the allocation. For example, if the researcher who is doing the allocation is using a physical method of randomization such as rolling a die or flipping a coin in the presence of the subject, their perception of how the die or coin is rolled or flipped, or how it falls, leaves room to redo it in ways that may select for certain subjects being allocated to particular assignments.

Nonrandom allocation also may make concealment impossible; examples and explanations are presented in Table 1 .

Appropriate concealment strategies may vary by study, but it is ideal that concealment be implemented. The random generation and storage of allocation codes is essential to allocation concealment, using generic numerals or letters unknown to the investigator. Electronic generation and storage of allocations in a protected centralized database is sometimes preferred [ 33 , 38 ] to opaque sealed envelopes [ 39 , 40 ], which are not completely immune to breach and can bias the results if poorly carried out or intentionally compromised [ 41 , 42 , 43 ]. Furthermore, if feasible, real-time generation may be favored over pre-generated allocations [ 44 ]. Regardless of physical or electronic concealment, the allocation codes and other important information about the assignment scheme, such as block size in permuted block randomization [ 45 ], should remain concealed from all research staff and participants. Initial allocation concealment can still be implemented and would improve the rigor of trials even if blinding (i.e., preventing post-randomization knowledge of group assignments) throughout the trial cannot be maintained.

3. Error: not accounting for changes in allocation ratios

The allocation ratio or number of treatment groups is changed partway through a study, but the change is not accounted for in the statistical analysis.

Over the course of a study, researchers may intentionally change treatment group allocations, such as adding, dropping, or combining treatment arms, for various reasons. When researchers change allocation ratios mid-study, this must be taken into account during statistical analysis [ 46 ]. Allocation ratios also change in “adaptive trials,” which have specific methods and concerns beyond what we can cover here (see [ 47 ] for more information).

A study evaluating effects of weight loss on telomere length performed one phase by randomizing participants to three treatment groups (in-person counseling, telephone counseling, and usual care) with 1:1:1 allocation. After no significant difference was found between in-person and telephone counseling, participants in the next phase of the study were randomized with 1:1 allocation into a combined intervention of in-person and telephone counseling or usual care [ 48 ]. In addition to the authors’ choice of analyzing interim results before starting another phase (which risks increasing false-positive findings and should be accounted for in statistical analysis [ 49 ]), the analysis combined these two phases, effectively analyzing 2:1 and 1:1 allocations together [ 50 ]. Another study of low-calorie sweeteners and sucrose and weight-related outcomes [ 51 ] started by randomly allocating participants evenly to five treatment groups with 1:1:1:1:1 allocation, but changed to 2:1:1:1:1 midway through after one group had a higher attrition rate. Neither of these two studies reported accounting for these different phases of study in the statistical analysis. Using different allocation ratios for different groups can bias study results [ 46 , 50 ]. This is because differences may exist between the different periods of recruitment in participant characteristics, such as baseline BMI [ 46 , 50 ]. Thus, baseline differences in the wave of participants allocated at the 2:1 ratio, when pooled with the ratio of those allocated at the 1:1 ratio, would exaggerate the differences when analyzed as though all participants were allocated at the same time.
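
A small worked example shows how pooling phases with different allocation ratios can produce an apparent effect where none exists. The numbers below are entirely hypothetical: phase 1 allocates 2:1 and phase 2 allocates 1:1, the intervention has no effect in either phase, and the only difference is that participants recruited in phase 2 have a higher mean outcome (for instance, a higher BMI). Weighting the within-phase differences by phase size is one simple choice made for illustration, not a recommendation from the cited references.

```python
from dataclasses import dataclass


@dataclass
class Phase:
    n_intervention: int
    n_control: int
    mean_intervention: float
    mean_control: float


def pooled_difference(phases):
    """Naive analysis: ignore phase and compare the overall arm means."""
    n_i = sum(p.n_intervention for p in phases)
    n_c = sum(p.n_control for p in phases)
    mean_i = sum(p.n_intervention * p.mean_intervention for p in phases) / n_i
    mean_c = sum(p.n_control * p.mean_control for p in phases) / n_c
    return mean_i - mean_c


def stratified_difference(phases):
    """Compare arms within each phase, then average weighted by phase size."""
    total = sum(p.n_intervention + p.n_control for p in phases)
    weighted = sum(
        (p.n_intervention + p.n_control) * (p.mean_intervention - p.mean_control)
        for p in phases
    )
    return weighted / total


# Hypothetical trial with no true intervention effect in either phase.
phases = [
    Phase(n_intervention=200, n_control=100, mean_intervention=30.0, mean_control=30.0),  # 2:1
    Phase(n_intervention=100, n_control=100, mean_intervention=34.0, mean_control=34.0),  # 1:1
]

naive = pooled_difference(phases)          # nonzero despite no true effect
adjusted = stratified_difference(phases)   # zero: arms are compared within phase
```

In the naive analysis the intervention arm is dominated by phase 1 participants, so the difference between recruitment periods leaks into the estimated treatment effect. Accounting for phase removes that confounding in this example.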

When allocation ratios change within studies or between randomized experiments that are pooled, caution should be used in combining data. Changes in allocation ratios must be properly taken into account in statistical analysis (see section 7, “Error: improper pooling of data”).

4. Error: replacements are not randomly selected

Participants who drop out are replaced in ways that are nonrandom, for instance, by allocating individuals to a single treatment that experienced a high percentage of participant dropout.

Nonrandom replacement of dropouts is another example of changing allocation ratios. Dropout is common in real-world studies and often leads to missing data, bias, and potentially the loss of power. A meta-analysis of pharmaceutical trials for obesity estimated an average 1-year dropout rate of 37% [ 52 ]. Similarly, a secondary analysis of a diet intervention estimated that the probability of completing the trial was only 60% after just 12 weeks [ 53 ]. Analytical approaches like intention-to-treat [ITT] analysis and imputation of data (described in the Errors in analysis section below) may obviate the need to consider replacing subjects after the initial randomization [ 52 , 54 ]. Yet replacement is sometimes observed in the literature and failing to use random methods to do so introduces another source of potential bias.

In a properly implemented simple RCT, every subject will have the same a priori probability of belonging to any group as any other subject. When a subject who has dropped out is replaced with the next person enrolled instead of by using randomization for assignment, the new participant did not have the same chances as the other subjects in the study of being allocated to that group. This corrupts the process of randomization, potentially introducing bias, and compromises causal inference. Furthermore, allocating participants this way makes allocation concealment impossible.

It is vital to account for dropout in the calculation of sample size and allocation ratios when designing the study. Nevertheless, if dropout was not accounted for a priori, one option is to enroll as many new participants as there were dropouts and to randomly assign each new participant to groups with the same allocation ratios as the originals [ 55 ]. Note that if dropouts are higher from a particular group and if completers only are analyzed, this may result in an imbalance in the final sample group allocation, but this is not an issue if the ITT principle is adhered to (see section 8, “Error: failing to account for missing data”).

Often, studies do not specify the methods used to replace subjects and use nondescript sentences similar to “subjects who dropped out were replaced” [ 56 , 57 , 58 , 59 ]. As discussed in regard to a trial on green tea ointment and pain and wound healing [ 60 ], such vagueness might suggest introduction of bias and lead to questionable conclusions.

Although replacing subjects may indeed help with the problem of power, the consequences can be detrimental if not properly implemented. Therefore, the decision to replace participants should be thoroughly considered, preplanned if at all possible, and performed by using correct methods, if found to be necessary.

Errors in the analysis of randomized experiments

5. Error: failing to account for non-independence

Groups of subjects (e.g., classrooms, schools, cages of animals) are randomly assigned to experimental conditions together, but the data are analyzed as if the subjects had been randomized individually. Alternatively, subjects are randomized individually, but repeated within-subject measures, or measures from subjects who are treated in groups, are analyzed as if they were independent.

The use of cluster randomized trial (cRCT) designs is increasing in nutrition and obesity studies, particularly for the study of school-based interventions, and in contexts where participants are exposed to the other group(s) and as such there is a lack of independence. Similarly, animals are commonly housed together (e.g., in cages, tanks) or grouped by litter. If investigators randomize all animals to treatments by groups instead of individually, this correlation must be addressed in the analysis, but is often unrecognized or ignored. These concerns also exist in cell culture experiments, for example, if treatments are randomized to an entire plate instead of individual wells. In cluster designs, the unit of randomization is the cluster, and not the individual. A frequent error in such interventions is to power and analyze the study at the individual (e.g., person, animal) level instead of the cluster level. Failing to account for within-cluster correlation (often measured by the intraclass correlation coefficient) and cluster-level impacts during study planning leads to an overestimation of statistical power [ 61 ] and typically leads to p values and associated confidence intervals that are artificially small [ 62 , 63 ].

If cRCTs are implemented incorrectly to start, valid inferential analysis for treatment effects is not possible without untestable assumptions [ 61 ]. For instance, randomly assigning one school to an intervention and one to a control yields no degrees of freedom, akin to randomizing one individual to treatment and one to control and treating multiple measurements on each of the two individuals as though those measurements were independent [ 61 ].

Studies that randomize at the individual level may also have correlated observations that should be considered in the analysis, and so it is important to identify potential sources of clustering. For example, outcome measures may be correlated when animals are individually randomized but then group housed for treatment. Likewise, participants individually randomized may be treated in group sessions (such as classes related to the intervention), or may be grouped together within surgeons that do not equally operate in all study arms. These types of scenarios require consideration in statistical analysis [ 64 ]. When repeated measurements are taken on subjects, they similarly must account for within-subject correlation. Taking multiple measurements within individuals (e.g., measuring eyesight in the left and right eye or longitudinal data within person over time) and treating them as independent will lead to invalid inferences [ 64 ].

A distinct issue exists when using forms of restricted randomization (e.g., stratification, blocking, minimization) that are employed to have tighter control over particular factors of interest. In such situations, it is important to include the factors on which randomization restrictions occur as covariates in the statistical model to account for the added correlation between groups [ 65 , 66 ]. Not doing so can result in p values and associated confidence intervals that are artificially large and reduced statistical power. On the other hand, given that one is likely employing restricted randomization because of a small number of units of randomization, losing even a few “denominator” degrees of freedom due to the inclusion of additional covariates in the model may also adversely affect power [ 67 , 68 ].

Failing to account for clustering is one of the most pervasive errors in nutrition and obesity studies that we observe [ 6 , 61 , 69 , 70 , 71 , 72 , 73 , 74 , 75 , 76 , 77 , 78 , 79 ]. A review of school-based randomized trials with weight-related outcomes found that only 21.5% of studies used intracluster correlation coefficients in their power analysis, and only 68.6% applied multilevel models to account for clustering [ 80 ]. In the most severe cases that we observe, a failure to appropriately focus on the cluster as the unit of randomization invalidates any hope of deriving causal inferences [ 70 , 75 , 81 ]. For additional discussion of errors in implementation and reporting in cRCTs, see ref. [ 61 ].

In an example of clustering within participants, a study of vitamin E on diabetic neuropathy randomized participants to the intervention or placebo, but for outcomes related to nerve conduction, the authors conducted measurements in limbs, stating that “left and right sides were treated independently” [ 82 ]. Because these measures were taken within the same participants, within-subject correlations must be taken into account in statistical analyses. Treating non-independent measurements as independent in statistical analysis is sometimes called “pseudoreplication” and is also a common error in animal and cell culture experiments [ 83 ].

When planning cRCTs, it is critical to perform a power calculation that incorporates the number of clusters in the design [ 61 ]. Moreover, analyses of such designs, as well as individually randomized designs, need to include the correlations from clustering for proper treatment inferences, just as repeated measurements of outcomes within subjects must be treated as non-independent.
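
The cost of within-cluster correlation for power can be approximated with the standard design effect for equal cluster sizes, 1 + (m − 1) × ICC, where m is the cluster size. The sketch below applies that formula; the number of clusters, cluster size, and ICC value are hypothetical, and a real power calculation would also need to consider unequal cluster sizes and the analysis model.

```python
def design_effect(cluster_size, icc):
    """Variance inflation from clustering, assuming equal cluster sizes."""
    return 1 + (cluster_size - 1) * icc


def effective_sample_size(n_clusters, cluster_size, icc):
    """Number of independent observations the clustered sample is worth."""
    return n_clusters * cluster_size / design_effect(cluster_size, icc)


# Hypothetical school-based trial: 20 classrooms of 25 children, ICC = 0.05.
deff = design_effect(cluster_size=25, icc=0.05)
n_eff = effective_sample_size(n_clusters=20, cluster_size=25, icc=0.05)
# 500 children provide roughly the information of 227 independent children,
# so a power calculation based on 500 individuals would be too optimistic.
```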

6. Error: basing conclusions on within-group statistical tests instead of between-groups tests

Experimental groups are analyzed separately for significant differences in the change from baseline and a difference is concluded if one is significant and the other(s) not, instead of comparing directly between groups.

The probative comparison for RCTs is between groups. Sometimes, however, researchers use pre-post within-group tests and draw conclusions based on whether the within-group significance is different, for example, significant in one group but not the other (the so-called “Difference in Nominal Significance” or DINS error [ 84 ]). Using these within-group tests to imply differences between groups raises the false-positive rate from the nominal 5% to as much as 50% for equal group sizes (and higher for unequal groups) [ 85 ] and is therefore invalid.
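
One simplified way to see how the false-positive rate can reach 50% is the following sketch. It assumes two independent groups of equal size with the same true within-group change, so that there is no true between-group difference and each within-group test has the same probability of being significant. This is an illustrative calculation under those assumptions, not necessarily the derivation used in ref. [ 85 ]; the function name is hypothetical.

```python
def dins_false_positive_rate(within_group_power):
    """Probability that exactly one of two within-group tests is significant.

    Assumes independent groups and the same probability of a significant
    within-group result in each group (no true between-group difference).
    """
    p = within_group_power
    # Group A significant and B not, or B significant and A not.
    return 2 * p * (1 - p)


# Evaluate across a range of hypothetical within-group power values.
rates = {power: dins_false_positive_rate(power) for power in (0.05, 0.25, 0.50, 0.75, 0.95)}
worst_case = max(rates.values())
```

Under these assumptions, the chance of a "significant in one group but not the other" pattern is 9.5% even when neither group changes at all (within-group power equal to the 5% significance level), and it peaks at 50% when each within-group test has 50% power.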

The DINS error was identified in an RCT testing isomaltulose vs. sucrose in the context of effects of an energy-reduced diet on weight and fat mass, where some conclusions, such as the outcome of fat mass, were drawn from within-group comparisons but the between-group comparison was not statistically different [ 86 ]. We observe this error frequently in nutrition and obesity research [ 87 , 88 , 89 , 90 , 91 , 92 , 93 , 94 , 95 , 96 , 97 , 98 , 99 , 100 , 101 , 102 , 103 ]. Sometimes using this logic still reaches the correct conclusions (i.e., the between-group and within-group comparisons are both statistically significant or not), but often it does not, and therefore it is an unreliable approach for inferences.

For proper analysis of RCTs, within-group testing should not be represented as the comparison of interest [ 71 , 84 , 85 , 87 , 102 ]. Journal editors, reviewers, and readers should request that conclusions be drawn from between-group comparisons.

7. Error: improper pooling of data

Data for a single RCT are pooled without maintaining the randomized design, or data from multiple RCTs are pooled (i.e., meta-analysis) without accounting for study in statistical analysis.

Data for statistical analysis can be pooled either within one or multiple RCTs, but errors can arise when the random elements of assignment are disregarded. Pooling within one study refers to the process of combining data across different groups, subgroups, or sites to include in a single analysis. When a single RCT is performed across multiple sites or subgroups and the same allocation ratio is not used across all sites or subgroups, or the randomization allocation to study arms changes during the course of an RCT, these different sites, subgroups, or phases of the study need to be taken into account during data analysis. This is because assignment probability is confounded with subset. If data are pooled simply with no account for subsets, any differences between subsets can bias effect estimation [ 50 ].

When combining multiple RCTs, individual participant data (IPD) can be used (i.e., IPD meta-analysis). However, if they are treated as though they came from a single RCT without accounting for site, at best it will increase the residual variance and make the analysis inefficient, and at worst will confound the results and make the effect estimates biased [ 104 ]. Another error in IPD meta-analyses is the use of data pooled across trials to compare intervention effects in one subgroup of participants with another (e.g., to test the interaction between intervention and pre-randomization subgroups) without accounting for trial in the analysis. This increases the risk of bias, owing to lack of knowledge of individual within- and across-trial interaction effects and inability to separate them, as well as inappropriate standard errors for the interaction effect [ 105 ]. This differs from “typical” meta-analyses because the effect estimates already account for the fact that both treatment groups existed in the same study.

In the trial of how weight loss affects telomere length in women with breast cancer (see subsection “Examples” under section 3, “Error: not accounting for changes in allocation ratios”), data were pooled from two different phases of an RCT that had different allocation ratios, which was not taken into account in the analysis [ 50 ]. Another example is a pooling study that combined IPD from multiple RCTs to examine the effects of a school-based weight management program on summer weight gain among students but ignored “study” as a factor in the analysis [ 106 ].

When pooling data under the umbrella of one study (e.g., allocation ratio change during the study), statistical analysis should include variables for subgroups to prevent confounding [ 46 ]. When pooling IPD from multiple RCTs, care must be taken to include a term for “study” when group conditions or group allocation ratios are not identical across all included RCTs [ 106 ]. For additional information on methods for IPD meta-analysis, see ref. [ 105 ].

8. Error: failing to account for missing data

Missing data (due to dropouts, errors in measurement, or other reasons) are not accounted for in an RCT.

The integrity of the randomization of subjects must be maintained throughout a study. Any post-randomization exclusion of subjects or observations, or any instances of missingness in post-randomization measurements, violates both randomization and the ITT principle (analyzing all subjects according to their original treatment assignments) and thus potentially compromises the validity of any statistical analyses and the conclusions drawn from them. There are two main reasons for this. Whereas randomization minimizes potential confounding by providing similar distributions in baseline participant characteristics, data that are not missing completely at random break the randomization, introduce potential bias in various ways, and degrade the confidence that the effect (or lack thereof) is the result only of the experimental condition [ 107 , 108 ]. Consider as an example reported income. If individuals with very low or very high incomes are less likely to report their incomes, then non-missing income values and their corresponding covariate values cannot provide valid inference for individuals who did not report income, because the populations are simply not the same. Missing data are extremely common in RCTs, as discussed in section 4, “Error: replacements are not randomly selected.” Regardless of the intervention, investigators need to be prepared to handle missing data based on assumptions about how data are missing.

One review found that only 50% of trials use adequate methods to account for missing data [ 109 ], and studies of obesity and nutrition are no exception. For example, in a trial of intermittent vs. continuous energy restriction on body composition and resting metabolic rate with a 50% dropout rate, reanalysis of all participants halved the magnitude of effect estimates compared with analyses of completers only [ 99 ]. As in this case, investigators will often report analyses performed only on participants who have completed the study, without also reporting an ITT analysis that includes all subjects who were randomized. Investigators may dismiss ITT analyses because they perceive them as “diluting” the effect of the treatment [ 110 ]. However, this presumes that there is an effect of treatment at all. Dropouts may result in an apparent effect that is actually an artifact. If dropouts are nonrandom, then groups may simply appear different because people remaining in the treatment group are different people from those who dropped out. Attempts to estimate whether those who dropped out differ from those who stayed in are often underpowered.
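The artifact described above can be reproduced in a small simulation. This is a sketch under stated assumptions, not an analysis of any cited trial: the sample size, the outcome distribution, and the dropout rule (treated participants who gained more than 2 kg leave the study) are all hypothetical. Unlike a real trial, the simulation can see the outcomes of those who dropped out.

```python
import random
from statistics import mean


def simulate_dropout(n_per_arm=2000, seed=7):
    """Return (difference using everyone randomized, difference using completers only)."""
    rng = random.Random(seed)
    # Weight change in kg; both arms share one distribution, so the true effect is zero.
    treatment = [rng.gauss(0, 5) for _ in range(n_per_arm)]
    control = [rng.gauss(0, 5) for _ in range(n_per_arm)]
    # Assumed nonrandom dropout: treated participants who gained more than 2 kg leave.
    treatment_completers = [y for y in treatment if y <= 2]
    full_difference = mean(treatment) - mean(control)
    completer_difference = mean(treatment_completers) - mean(control)
    return full_difference, completer_difference


full_difference, completer_difference = simulate_dropout()
```

With everyone randomized, the difference between arms is close to zero. Among completers only, the treatment arm appears to have lost weight relative to control, purely because of who remained.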

Furthermore, some investigators may not understand ITT and mislabel their analysis. For instance, in an RCT of a ketogenic diet in patients with breast cancer, the authors reported that “[s]tatistical analysis was carried out according to the intention-to-treat protocol” of the 80 randomized participants, yet the flow diagram and results suggest that the analyses were restricted to completers only [ 111 ]. Surveys of ITT practices suggest that there is a general lack of adequate reporting of information pertaining to how missing data are handled [ 112 ].

Many analyses can be conducted on randomized data, including “per protocol” (removing data from noncompliant subjects) and ITT. However, simply comparing per protocol to ITT analyses as a sensitivity analysis is suboptimal; they estimate different things [ 113 ]. As such, the Food and Drug Administration has recently focused on the concept of estimands to clearly establish the question being tested [ 114 ]. ITT can estimate the effect of assignment, not treatment per se, in an unbiased manner, whereas the per protocol analysis can estimate its target only in a way that allows for the possibility of bias.

In an oft-paraphrased maxim of Lachin [ 108 ], “the best way to deal with the problem [of missing data] is to have as little missing data as possible.” This goal may be furthered through diligent administrative follow-up and constant contact with subjects; further considerations on minimization of loss-to-follow-up and other missingness may be found elsewhere [ 115 , 116 ]. However, having no missing data whatsoever is often not achievable in practice, especially for large, randomized studies. Thus, something must be done when missing data exist. In general, the simplest and best way to mitigate the problem of missing data is through the ITT principle when conducting the statistical analysis.

Statistical approaches for handling missing data require untestable assumptions, assumptions that lack face validity and hence are unfounded, or both [ 108 ]. Complete case analyses, in which subjects with missing data are excluded, require the assumption that data are missing completely at random and are not recommended [ 108 ]. Multiple imputation fills in missing data repeatedly, with relationships and predictions guided by other covariates, and is recommended under the assumption that data are missing at random (MAR); that is, whether an observation is missing is not directly affected by its true value. Methods commonly used in obesity trials such as last observation carried forward (LOCF) [ 117 ] or baseline observation carried forward (BOCF) are not recommended because of the strict or unreasonable assumptions required to yield valid conclusions [ 108 , 117 , 118 ]. Where values are missing not at random (MNAR; this set of assumptions may also be referred to as “not missing at random”, NMAR), explicit modeling of the missingness process is required [ 119 ], which demands stronger assumptions that may not be valid.
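The three missingness mechanisms named in this section are commonly written formally as below. The notation is an assumption introduced for illustration and does not appear in the source: $R$ indicates whether an outcome is observed, $Y_{\mathrm{obs}}$ and $Y_{\mathrm{mis}}$ are the observed and missing parts of the data, and $X$ denotes fully observed covariates. Some texts define the completely-at-random case slightly differently.

```latex
\begin{align*}
\text{MCAR:}\quad & P(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}, X) = P(R) \\
\text{MAR:}\quad  & P(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}, X) = P(R \mid Y_{\mathrm{obs}}, X) \\
\text{MNAR:}\quad & P(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}, X) \text{ depends on } Y_{\mathrm{mis}}
\end{align*}
```

Because $Y_{\mathrm{mis}}$ is never observed, the data alone cannot distinguish MAR from MNAR, which is why these assumptions are untestable.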

Finally, when it is apparent that data are MNAR, when the integrity of randomization is no longer intact, or both, estimates can no longer be represented as the causal effect afforded by randomization, and causal language should be tempered accordingly. Even in cases where the assumptions are violated, however, ignoring the missingness (e.g., completers only analyses) is generally not recommended.

In summary, minimizing missing data should be a key goal in any randomized study. But when data are missing, thoughtful approaches are necessary to respect the ITT principle and produce unbiased effect estimates. Additional discussion about best practices to handle missing data in the nutrition context is available at ref. [ 107 ].

Errors in the reporting of randomization

9. Error: failing to fully describe randomization

Published reports fail to provide sufficient information so that readers can assess the methods used for randomization.

Studies cannot be adequately evaluated unless methods used for randomization are reported in sufficient detail. Indeed, many examples described herein were obscured by poor reporting until we or others were able to gain clarification from the study authors through personal communication or post-publication discourse. Accepted guidelines that define the standards of reporting the results of clinical trials (i.e., Consolidated Standards of Reporting Trials for human trials (CONSORT) [ 120 ]), animal research (i.e., Animal Research: Reporting of In Vivo Experiments (ARRIVE) [ 121 ]), and others [ 122 ] have emphasized the importance of adequate reporting of randomization methods. Researchers should, to the fullest extent possible, report according to accepted guidelines as part of responsible research conduct [ 123 ].

Most authors (including historically us), however, do not report adequately, and this includes randomization sequence generation and allocation concealment in human and animal research [ 124 , 125 ]. We have noted specific examples of a failure to include sufficient details about the method of randomization and allocation ratio in a study of dairy- and berry-based snacks on nutritional status and grip strength [ 126 ], which were clarified in a reply [ 127 ]. In a personal communication regarding another trial of a nutritional intervention on outcomes in individuals with autism spectrum disorder, we learned that the authors had used additional blocking factors, and randomized some siblings as pairs, neither of which was reported in the paper or accounted for in the statistical analysis [ 128 ]. In another study that pooled RCTs of school-based weight management programs, the reported number of participants of the included studies was inconsistent with the original publications [ 106 ]. In other cases, the methods used to account for clustering may not be appropriately described for readers to assess them [ 129 , 130 ]. In one case, the authors reported randomizing in pairs, yet the number randomized was an odd number and differed between groups (n = 21 and n = 24) [ 131 ], to which the authors reported a coding error [ 132 ]. Other vague descriptions include statements such as “the samples were randomly divided into two groups” [ 27 ].

The use of non-specific language to describe allocation methods may also lead to confusion as to whether randomized methods were actually used. For example, we observed the term “semi-random” used to reflect stratified randomization [ 133 ] or minimization [ 134 ], whereas elsewhere it may describe methods that are nonrandom or not clearly stated [ 135 ].

Neglecting to report essential components of how randomization was implemented hinders a reader's ability to fully evaluate the trial and hence to interpret the validity of the reported findings. We emphasize that reporting guidelines such as CONSORT [ 120 ] should be consulted during the study planning and publication preparation stages to ensure that essential components related to randomization are reported, such as methods used to generate the allocation sequence, implement randomization, and conceal allocation; any matching or blocking procedures used; accuracy and consistency of the numbers in flow diagrams; and baseline demographic and clinical variables. With regard to the last point, a common error is to report p values of baseline statistical comparisons and conclude covariate imbalance between groups if they are <0.05. An example of this type of thinking is as follows: “[a]s randomization was not fully successful concerning age, it was included as covariate in the main analyses.” [ 136 ], or conversely, “The similarity between the exercise plus supplement and exercise plus placebo groups for both demographic composition and pre-intervention fitness and cognitive scores provides strong evidence that participants were randomly assigned into groups” [ 137 ]. However, as discussed in section 1, “Error: representing nonrandom allocation methods as random,” the distribution of p values from baseline group comparisons is uniform in the long run with randomization and therefore we would expect on average that 1/20 p values will be <0.05 by chance, with some caveats [ 17 , 18 , 19 ]. In other words, per CONSORT, “[s]uch significance tests assess the probability that observed baseline differences could have occurred by chance; however, we already know that any differences are caused by chance” [ 120 ], and such tests should not be reported.
Baseline p values do not reflect whether imbalances might affect the results; imbalances in prognostic variables that do not reach p < 0.05 can still have a strong effect on the result [ 138 , 139 ]. Thus, statistical tests should not be used to determine prognostic covariates; such covariates should preferably be identified and included in an analysis plan prior to executing the study [ 139 ].
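The expectation that about 1 in 20 baseline comparisons falls below 0.05 under proper randomization can be checked with a short simulation. The sketch below rests on assumptions that are not from the source: a single continuous baseline covariate (labelled age), 50 participants per arm, simple randomization by shuffling, and a large-sample z-test in place of whichever test a given trial might report.

```python
import random
from statistics import NormalDist, mean, variance


def baseline_p_value(group_a, group_b):
    """Two-sided large-sample z-test p value for a difference in means."""
    se = (variance(group_a) / len(group_a) + variance(group_b) / len(group_b)) ** 0.5
    z = (mean(group_a) - mean(group_b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def share_significant(n_trials=2000, n_per_arm=50, seed=1):
    """Fraction of simulated, properly randomized trials with baseline p < 0.05."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        # Hypothetical baseline covariate for all enrolled participants.
        ages = [rng.gauss(50, 10) for _ in range(2 * n_per_arm)]
        rng.shuffle(ages)  # simple randomization into two equal arms
        if baseline_p_value(ages[:n_per_arm], ages[n_per_arm:]) < 0.05:
            hits += 1
    return hits / n_trials


rate = share_significant()
```

The returned fraction should sit near 0.05: "significant" baseline differences arise at roughly the nominal rate even though every simulated trial was randomized correctly.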

10. Error: failing to properly communicate inferences from randomized studies

The causal question is not framed as testing the randomized assignment per se.

The appropriate execution and analysis of a randomized experiment tests the effect of treatment assignment on the outcome of interest. The causal effect being tested is what participants are assigned to, not what they actually did. That is, if some participants drop out, do not comply with the intervention, are accidentally given the wrong treatment, or in other ways do not complete the intended treatment, the proper analysis maintains the randomized assignment of the subjects and tests the effect of assigning subjects to the treatment, which includes factors beyond the treatment itself. Indeed, it may be that dropout or non-compliance is caused by the assignment itself. This distinction is particularly important in nutrition trials, which often suffer from poor compliance, and is discussed in part in subsection “Explanation” under section 8, “Error: failing to account for missing data” with respect to the ITT principle. For instance, researchers may be interested in discussing the effect of eating their diet, when in fact what was tested was being assigned to eat the diet.
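The difference between the effect of assignment and the effect of what participants actually did can be sketched with a simulation. Every quantity here is hypothetical: a prognosis variable that drives the outcome, a treatment that lowers the outcome by 2 units when actually received, and a compliance rule under which only participants with better prognosis comply.

```python
import random
from statistics import mean


def simulate_noncompliance(n_per_arm=5000, effect_if_received=-2.0, seed=11):
    """Return (ITT estimate, per-protocol estimate) for one simulated trial."""
    rng = random.Random(seed)

    def outcome(prognosis, treated):
        # Outcome depends on prognosis, noise, and treatment actually received.
        return 3 * prognosis + rng.gauss(0, 1) + (effect_if_received if treated else 0.0)

    control = [outcome(rng.gauss(0, 1), treated=False) for _ in range(n_per_arm)]
    assigned, compliers = [], []
    for _ in range(n_per_arm):
        prognosis = rng.gauss(0, 1)
        complies = prognosis < 0  # assumed: only those with better prognosis comply
        y = outcome(prognosis, treated=complies)
        assigned.append(y)
        if complies:
            compliers.append(y)
    itt = mean(assigned) - mean(control)            # everyone analyzed as randomized
    per_protocol = mean(compliers) - mean(control)  # noncompliers removed
    return itt, per_protocol


itt, per_protocol = simulate_noncompliance()
```

Under these assumptions the ITT estimate is close to -1, the effect of being assigned to a treatment that half the arm receives. The per-protocol estimate is far larger in magnitude than even the -2 effect of treatment received, because the compliers differ in prognosis from the control group.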

As discussed in section 8, “Error: failing to account for missing data,” there is often a perception among authors that including subjects who are, e.g., noncompliant or incorrectly assigned will preclude an understanding of the true effect of the intervention on the outcome(s) of interest. But the realization of unbiased effect estimates that the principles of randomization afford us is only achieved when subjects are analyzed as they are randomized. For example, the random assignment to 25% energy restriction of participants in a 2-year trial resulted in an average reduction of about 12% (~300 kcal) [ 140 ]. The public discussion of this trial advertised that “Cutting 300 Calories a Day Shows Health Benefits” [ 141 ]. Yet it is possible that assigning participants to cut only 300 kcal would not have produced the same benefits if they once again achieved only half of that assigned. In another example, the random assignment of high phytate bread did not lead to a statistically significant difference in whole body iron status as compared to dephytinized bread when missing data were imputed, but it was significantly higher when dropouts were excluded [ 98 , 142 , 143 ]. A difference cannot be concluded from these data based on the causal question of the assignment of high phytate bread, particularly because dropout was significantly higher in one group, which may create an artificial effect.

The appropriate framing of the treatment assignment (i.e., following the ITT principle) as the causal effect of interest is important when communicating and interpreting results of RCTs. From this perspective, maximizing the validity of randomized studies through planning, execution, and analysis is a matter of maintaining the randomized assignments to the greatest extent possible. To this end, reports of randomized studies should make clear that the causal question concerns assignment to treatment.

Randomization is a powerful tool to examine causal relationships in nutrition and obesity research. Empirical evidence supports the use of both randomization and allocation concealment for unbiased effect estimates. Trials with inadequate concealment are associated with larger effect estimates than are those with adequate concealment [ 144 , 145 , 146 , 147 ], likely reflecting bias. Despite such undesirable potential consequences, many randomized studies of humans and animals do not adequately conceal allocation [ 43 , 124 , 148 ]. Although more difficult to compare in human studies, the results of nonrandomized studies sometimes differ from those of randomized trials [ 149 ], while nonrandomized animal studies are associated with increased effect sizes [ 148 ]. These empirical observations are suggestive of biased estimates, and when coupled with the theoretical arguments, indicate that randomization should be implemented whenever possible. For these reasons, where randomization is implemented per the best practices described herein, the use of causal language to communicate results is appropriate. But where it is not correctly implemented or maintained, the greater potential for bias in the effect estimates and the additional assumptions that need to be met to increase confidence in causal relationships invariably change how such effects should be communicated.

Even when randomization is implemented, errors related to randomization are common, suggesting that researchers in nutrition and obesity may benefit from statistical support during the design, execution, analysis, and reporting of randomized experiments for more rigorous, reproducible, and replicable research [ 150 ]. When errors are discovered, authors and editors have a responsibility to correct the scientific record, and journals should have procedures in place to do so expeditiously [ 151 ]. The severity of the error, ranging from invalidating the conclusions [ 152 ] to simply requiring clarification, means that different considerations exist for each type of error. For example, some invalidating errors are consequent to the design and cannot be fixed, and retractions have been issued [ 29 , 153 , 154 ]. For other examples such as PREDIMED, for which errors in randomization required a reanalysis as a quasi-experimental design, the reanalysis, retraction, and republication serve as an important example of scientific questioning and transparency of research methods [ 155 ]. Other cases require reanalysis or reporting of the appropriate statistical analyses but are otherwise not invalidated by design flaws [ 88 , 156 ]. Yet others need clarity on the methods, for instance when a study did not really use random allocation but was reported as randomized [ 157 ].

The involvement of professional biostatisticians and others with methodological expertise from the planning stages of a study will prevent many of these errors. The use of trial and analysis plan preregistration can aid in thinking through decisions a priori while simultaneously increasing transparency and guarding against unpublished results and inflated false positives from analytic flexibility by pre-specifying outcomes and analyses [ 71 ]. Being cognizant of these errors and becoming familiar with CONSORT and other reporting guidelines enhance the value of the time, effort, and financial investment we devote to obesity and nutrition research.

Imbens GW, Rubin DB. Rubin causal model. In: Durlauf SN, Blume LE, (eds.) Microeconometrics. London: Springer; 2010. p. 229–41 https://doi.org/10.1057/9780230280816 .

Senn S. Seven myths of randomisation in clinical trials. Stat Med. 2013;32:1439–50.

Greenland S. Randomization, statistics, and causal inference. Epidemiology. 1990;1:421–9.

Hróbjartsson A, Gøtzsche PC, Gluud C. The controlled clinical trial turns 100 years: Fibiger’s trial of serum treatment of diphtheria. BMJ. 1998;317:1243–5.

Leighton G, McKinlay PL. Milk consumption and the growth of school children. Report on an investigation in Lanarkshire schools. Scotland, Edinburgh: H.M.S.O.; 1930. p. 20.

Student. The Lanarkshire Milk experiment. Biometrika. 1931;23:398–406.

Wasserstein RL, Lazar NA. The ASA statement on p-values: context, process, and purpose. Am Stat. 2016;70:129–33.

Greenland S, Senn SJ, Rothman KJ, Carlin JB, Poole C, Goodman SN, et al. Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations. Eur J Epidemiol. 2016;31:337–50.

Wu T, Li Y, Bian Z, Liu G, Moher D. Randomized trials published in some Chinese journals: how many are randomized? Trials. 2009;10:1–8.

Shub A, McCarthy EA. Letter to the Editor: “Effectiveness of prenatal vitamin D deficiency screening and treatment program: a stratified randomized field trial”. J Clin Endocrinol Metab. 2018;104:337–8.

Ramezani Tehrani F, Minooee S, Rostami M, Bidhendi Yarandi R, Hosseinpanah F. Response to Letter to the Editor: “Effectiveness of prenatal vitamin D deficiency screening and treatment program: a stratified randomized field trial”. J Clin Endocrinol Metab. 2018;104:339–40.

Williams LK, Abbott G, Thornton LE, Worsley A, Ball K, Crawford D. Improving perceptions of healthy food affordability: results from a pilot intervention. Int J Behav Nutr Phys Act. 2014;11:33.

Westreich D, Cole SR. Invited commentary: positivity in practice. Am J Epidemiol. 2010;171:674–7.

Campbell DT, Stanley JC. Experimental and quasi-experimental designs for research. Wilmington, MA: Houghton Mifflin Company; 1963.

Tenore GC, Caruso D, Buonomo G, D’Avino M, Santamaria R, Irace C, et al. Annurca apple nutraceutical formulation enhances keratin expression in a human model of skin and promotes hair growth and tropism in a randomized clinical trial. J Med Food. 2018;21:90–103.

Keith SW, Brown AW, Heo M, Heymsfield SB, Allison DB. Re: “Annurca apple nutraceutical formulation enhances keratin expression in a human model of skin and promotes hair growth and tropism in a randomized clinical trial” by Tenore et al. (J Med Food 2018;21:90–103). J Med Food. 2019;22:1301–2.

Bolland MJ, Gamble GD, Avenell A, Grey A. Rounding, but not randomization method, non-normality, or correlation, affected baseline P-value distributions in randomized trials. J Clin Epidemiol. 2019;110:50–62.

Bolland MJ, Gamble GD, Avenell A, Grey A, Lumley T. Baseline P value distributions in randomized trials were uniform for continuous but not categorical variables. J Clin Epidemiol. 2019;112:67–76.

Mascha EJ, Vetter TR, Pittet J-F. An appraisal of the Carlisle-Stouffer-Fisher method for assessing study data integrity and fraud. Anesth Analg. 2017;125:1381–5.

Carlisle JB. Data fabrication and other reasons for non-random sampling in 5087 randomised, controlled trials in anaesthetic and general medical journals. Anaesthesia. 2017;72:944–52.

The Editors of the Lancet Diabetes & Endocrinology. Retraction and republication—effect of a high-fat Mediterranean diet on bodyweight and waist circumference: a prespecified secondary outcomes analysis of the PREDIMED randomised controlled trial. Lancet Diabetes Endocrinol. 2019;7:334.

Estruch R, Ros E, Salas-Salvadó J, Covas M-I, Corella D, Arós F, et al. Primary prevention of cardiovascular disease with a Mediterranean diet. N Engl J Med. 2013;368:1279–90.

Estruch R, Ros E, Salas-Salvado J, Covas MI, Corella D, Aros F, et al. Primary prevention of cardiovascular disease with a Mediterranean diet supplemented with extra-virgin olive oil or nuts. N Engl J Med. 2018;378:e34.

Estruch R, Martínez-González MA, Corella D, Salas-Salvadó J, Fitó M, Chiva-Blanch G, et al. Effect of a high-fat Mediterranean diet on bodyweight and waist circumference: a prespecified secondary outcomes analysis of the PREDIMED randomised controlled trial. Lancet Diabetes Endocrinol. 2019;7:e6–17.

Mestre LM, Dickinson SL, Golzarri-Arroyo L, Brown AW, Allison DB. Data anomalies and apparent reporting errors in ‘Randomized controlled trial testing weight loss and abdominal obesity outcomes of moxibustion’. Biomed Eng Online. 2020;19:1–3.

Abou-Raya A, Abou-Raya S, Helmii M. The effect of vitamin D supplementation on inflammatory and hemostatic markers and disease activity in patients with systemic lupus erythematosus: a randomized placebo-controlled trial. J Rheumatol. 2013;40:265–72.

George BJ, Brown AW, Allison DB. Errors in statistical analysis and questionable randomization lead to unreliable conclusions. J Paramed Sci. 2015;6:153–4.

Bolland M, Gamble GD, Grey A, Avenell A. Empirically generated reference proportions for baseline p values from rounded summary statistics. Anaesthesia. 2020;75:1685–7.

Hsieh C-H, Tseng C-C, Shen J-Y, Chuang P-Y. Retraction Note to: randomized controlled trial testing weight loss and abdominal obesity outcomes of moxibustion. Biomed Eng Online. 2020;19:1.

Hosseini R, Mirghotbi M, Pourvali K, Kimiagar SM, Rashidkhani B, Mirghotbi T. The effect of food service system modifications on staff body mass index in an industrial organization. J Paramed Sci. 2015;6:2008–4978.

Hernán MA. The C-word: scientific euphemisms do not improve causal inference from observational data. Am J Public Health. 2018;108:616–9.

Lazarus C, Haneef R, Ravaud P, Boutron I. Classification and prevalence of spin in abstracts of non-randomized studies evaluating an intervention. BMC Med Res Methodol. 2015;15:85.

Moher D, Hopewell S, Schulz KF, Montori V, Gotzsche PC, Devereaux PJ, et al. CONSORT 2010 explanation and elaboration: updated guidelines for reporting parallel group randomised trials. BMJ. 2010;340:c869.

Altman DG, Bland JM. How to randomise. BMJ. 1999;319:703–4.

Kahan BC, Rehal S, Cro S. Risk of selection bias in randomised trials. Trials. 2015;16:405.

McKenzie JE. Randomisation is more than a coin toss: the role of allocation concealment. BJOG. 2019;126:1288.

Chalmers I. Why transition from alternation to randomisation in clinical trials was made. BMJ. 1999;319:1372.

Torgerson DJ, Roberts C. Understanding controlled trials. Randomisation methods: concealment. BMJ. 1999;319:375–6.

Doig GS, Simpson F. Randomization and allocation concealment: a practical guide for researchers. J Crit Care. 2005;20:187–91. discussion 91–3.

Swingler GH, Zwarenstein M. An effectiveness trial of a diagnostic test in a busy outpatients department in a developing country: issues around allocation concealment and envelope randomization. J Clin Epidemiol. 2000;53:702–6.

Altman DG, Schulz KF. Statistics notes: concealing treatment allocation in randomised trials. BMJ. 2001;323:446–7.

Kennedy ADM, Torgerson DJ, Campbell MK, Grant AM. Subversion of allocation concealment in a randomised controlled trial: a historical case study. Trials. 2017;18:204.

Clark L, Fairhurst C, Torgerson DJ. Allocation concealment in randomised controlled trials: are we getting better? BMJ. 2016;355:i5663.

Zhao W. Selection bias, allocation concealment and randomization design in clinical trials. Contemp Clin Trials. 2013;36:263–5.

Broglio K. Randomization in clinical trials: permuted blocks and stratification. JAMA. 2018;319:2223–4.

Altman DG. Avoiding bias in trials in which allocation ratio is varied. J R Soc Med. 2018;111:143–4.

Pallmann P, Bedding AW, Choodari-Oskooei B, Dimairo M, Flight L, Hampson LV, et al. Adaptive designs in clinical trials: why use them, and how to run and report them. BMC Med. 2018;16:1–15.

Sanft T, Usiskin I, Harrigan M, Cartmel B, Lu L, Li F-Y, et al. Randomized controlled trial of weight loss versus usual care on telomere length in women with breast cancer: the lifestyle, exercise, and nutrition (LEAN) study. Breast Cancer Res Treat. 2018;172:105–12.

Demets DL, Lan KG. Interim analysis: the alpha spending function approach. Stat Med. 1994;13:1341–52.

Dickinson SL, Golzarri-Arroyo L, Brown AW, McComb B, Kahathuduwa CN, Allison DB. Change in study randomization allocation needs to be included in statistical analysis: comment on ‘Randomized controlled trial of weight loss versus usual care on telomere length in women with breast cancer: the lifestyle, exercise, and nutrition (LEAN) study’. Breast Cancer Res Treat. 2019;175:263–4.

Higgins KA, Mattes RD. A randomized controlled trial contrasting the effects of 4 low-calorie sweeteners and sucrose on body weight in adults with overweight or obesity. Am J Clin Nutr. 2019;109:1288–301.

Elobeid MA, Padilla MA, McVie T, Thomas O, Brock DW, Musser B, et al. Missing data in randomized clinical trials for weight loss: scope of the problem, state of the field, and performance of statistical methods. PLoS One. 2009;4:e6624.

Landers PS, Landers TL. Survival analysis of dropout patterns in dieting clinical trials. J Am Diet Assoc. 2004;104:1586–8.

Wood AM, White IR, Thompson SG. Are missing outcome data adequately handled? A review of published randomized controlled trials in major medical journals. Clin Trials. 2004;1:368–76.

Biswal S, Jose VM. An overview of clinical trial operation: fundamentals of clinical trial planning and management in drug development. 2nd ed. 2018.

Lichtenstein AH, Jalbert SM, Adlercreutz H, Goldin BR, Rasmussen H, Schaefer EJ, et al. Lipoprotein response to diets high in soy or animal protein with and without isoflavones in moderately hypercholesterolemic subjects. Arterioscler Thromb Vasc Biol. 2002;22:1852–8.

Shahrahmani H, Kariman N, Jannesari S, Rafieian‐Kopaei M, Mirzaei M, Ghalandari S, et al. The effect of green tea ointment on episiotomy pain and wound healing in primiparous women: a randomized, double‐blind, placebo‐controlled clinical trial. Phytother Res. 2018;32:522–30.

Draijer R, de Graaf Y, Slettenaar M, de Groot E, Wright C. Consumption of a polyphenol-rich grape-wine extract lowers ambulatory blood pressure in mildly hypertensive subjects. Nutrients. 2015;7:3138–53.

de Clercq NC, van den Ende T, Prodan A, Hemke R, Davids M, Pedersen HK, et al. Fecal microbiota transplantation from overweight or obese donors in cachectic patients with advanced gastroesophageal cancer: a randomized, double-blind, placebo-controlled, phase II study. Clin Cancer Res. 2021;27:3784–92.

Golzarri-Arroyo L, Dickinson SL, Allison DB. Replacement of dropouts may bias results: Comment on “The effect of green tea ointment on episiotomy pain and wound healing in primiparous women: A randomized, double-blind, placebo-controlled clinical trial”. Phytother Res. 2019;33:1955–6.

Brown AW, Li P, Bohan Brown MM, Kaiser KA, Keith SW, Oakes JM, et al. Best (but oft-forgotten) practices: designing, analyzing, and reporting cluster randomized controlled trials. Am J Clin Nutr. 2015;102:241–8.

Donner A, Klar N. Pitfalls of and controversies in cluster randomization trials. Am J Public Health. 2004;94:416–22.

Campbell M, Donner A, Klar N. Developments in cluster randomized trials and Statistics in Medicine. Stat Med. 2007;26:2–19.

Kahan BC, Morris TP. Assessing potential sources of clustering in individually randomised trials. BMC Med Res Methodol. 2013;13:1–9.

Kahan BC, Morris TP. Reporting and analysis of trials using stratified randomisation in leading medical journals: review and reanalysis. BMJ. 2012;345:e5840.

Kahan BC, Morris TP. Improper analysis of trials randomised using stratified blocks or minimisation. Stat Med. 2012;31:328–40.

Allison DB. When is it worth measuring a covariate in a randomized clinical trial? J Consult Clin Psychol. 1995;63:339.

Kahan BC, Jairath V, Doré CJ, Morris TP. The risks and rewards of covariate adjustment in randomized trials: an assessment of 12 outcomes from 8 studies. Trials. 2014;15:1–7.

Vorland CJ, Brown AW, Dickinson SL, Gelman A, Allison DB. Comment on: Comprehensive nutritional and dietary intervention for autism spectrum disorder—a randomized, controlled 12-month trial, Nutrients 2018, 10, 369. Nutrients. 2019;11:1126.

Koretz RL. JPEN Journal Club 45. Cluster randomization. JPEN J Parenter Enter Nutr. 2019;43:941–3.

Brown AW, Altman DG, Baranowski T, Bland JM, Dawson JA, Dhurandhar NV, et al. Childhood obesity intervention studies: a narrative review and guide for investigators, authors, editors, reviewers, journalists, and readers to guard against exaggerated effectiveness claims. Obes Rev. 2019;20:1523–41.

Golzarri-Arroyo L, Oakes JM, Brown AW, Allison DB. Incorrect analyses of cluster-randomized trials that do not take clustering and nesting into account likely lead to p-values that are too small. Child Obes. 2020;16:65–6.

Li P, Brown AW, Oakes JM, Allison DB. Comment on “Intervention effects of a school-based health promotion programme on obesity related behavioural outcomes”. J Obes. 2015;2015:708181.

Vorland CJ, Brown AW, Kahathuduwa CN, Dawson JA, Gletsu-Miller N, Kyle TK, et al. Questions on ‘Intervention effects of a kindergarten-based health promotion programme on obesity related behavioural outcomes and BMI percentiles’. Prev Med Rep. 2019;17:101022.

Golzarri-Arroyo L, Vorland CJ, Thabane L, Oakes JM, Hunt ET, Brown AW, et al. Incorrect design and analysis render conclusion unsubstantiated: comment on “A digital movement in the world of inactive children: favourable outcomes of playing active video games in a pilot randomized trial”. Eur J Pediatr. 2020;179:1487–8.

Golzarri-Arroyo L, Chen X, Dickinson SL, Short KR, Thompson DM, Allison DB. Corrected analysis of ‘Using financial incentives to promote physical activity in American Indian adolescents: a randomized controlled trial’ confirms conclusions. PLoS One. 2020;15:e0233273.

Li P, Brown AW, Oakes JM, Allison DB. Comment on “School-based obesity prevention intervention in chilean children: effective in controlling, but not reducing obesity”. J Obes. 2015;2015:183528.

Wood AC, Brown AW, Li P, Oakes JM, Pavela G, Thomas DM, et al. A Comment on Scherr et al “A multicomponent, school-based intervention, the shaping healthy choices program, improves nutrition-related outcomes”. J Nutr Educ Behav. 2018;50:324–5.

Mietus-Snyder M, Narayanan N, Krauss RM, Laine-Graves K, McCann JC, Shigenaga MK, et al. Randomized nutrient bar supplementation improves exercise-associated changes in plasma metabolome in adolescents and adult family members at cardiometabolic risk. PLoS One. 2020;15:e0240437.

Heo M, Nair SR, Wylie-Rosett J, Faith MS, Pietrobelli A, Glassman NR, et al. Trial characteristics and appropriateness of statistical methods applied for design and analysis of randomized school-based studies addressing weight-related issues: a literature review. J Obes. 2018;2018:8767315.

Meurer ST, Lopes ACS, Almeida FA, Mendonça RdD, Benedetti TRB. Effectiveness of the VAMOS strategy for increasing physical activity and healthy dietary habits: a randomized controlled community trial. Health Educ Behav. 2019;46:406–16.

Ng YT, Phang SCW, Tan GCJ, Ng EY, Botross Henien NP, Palanisamy UDM. et al. The effects of tocotrienol-rich vitamin E (Tocovid) on diabetic neuropathy: a phase II randomized controlled trial. Nutrients.2020;12:1522

Article   CAS   PubMed Central   Google Scholar  

Lazic SE, Clarke-Williams CJ, Munafò MR. What exactly is ‘N’ in cell culture and animal experiments? PLoS Biol. 2018;16:e2005282.

George BJ, Beasley TM, Brown AW, Dawson J, Dimova R, Divers J, et al. Common scientific and statistical errors in obesity research. Obesity (Silver Spring). 2016;24:781–90.

Bland JM, Altman DG. Comparisons against baseline within randomised groups are often used and can be highly misleading. Trials. 2011;12:264.

Vorland CJ, Kyle TK, Brown AW. Comparisons of within-group instead of between-group affect the conclusions. Comment on: “Changes in weight and substrate oxidation in overweight adults following isomaltulose intake during a 12-week weight loss intervention: a randomized, double-blind, controlled trial”. Nutrients 2019, 11 (10), 2367. Nutrients. 2020;12:2335.

Bland JM, Altman DG. Best (but oft forgotten) practices: testing for treatment effects in randomized trials by separate analyses of changes from baseline in each group is a misleading approach. Am J Clin Nutr. 2015;102:991–4.

Kroeger CM, Brown AW, Allison DB. Differences in Nominal Significance (DINS) Error leads to invalid conclusions: Letter regarding, “Diet enriched with fresh coconut decreases blood glucose levels and body weight in normal adults”. J Complement Integr Med. 2019;16:2.

Koretz RL. JPEN Journal Club 40. Differences in nominal significance. JPEN J Parenter Enter Nutr. 2019;43:311.

Dickinson SL, Brown AW, Mehta T, Heymsfield SB, Ebbeling CB, Ludwig DS, et al. Incorrect analyses were used in “Different enteral nutrition formulas have no effect on glucose homeostasis but on diet-induced thermogenesis in critically ill medical patients: a randomized controlled trial” and corrected analyses are requested. Eur J Clin Nutr. 2019;73:152–3.

Brown AW, Allison DB. Letter to the Editor and Response Letter to the Editor and Author Response of assessment of a health promotion model on obese turkish children. The Journal of Nursing Research, 25, 436-446. J Nurs Res. 2018;26:373–4.

Kaiser KA, George BJ, Allison DB. Re: Errors in Zhao et al (2015), Impact of enteral nutrition on energy metabolism in patients with Crohn’s disease. World J Gastroenterol. 2016;22:2867.

Allison DB, Brown AW, George BJ, Kaiser KA. Reproducibility: a tragedy of errors. Nature. 2016;530:27–9.

Dawson JA, Brown AW, Allison DB. The stated conclusions are contradicted by the data, based on inappropriate statistics, and should be corrected: Comment on “Intervention for childhood obesity based on parents only or parents and child compared with follow-up alone”. Pediatr Obes. 2018;13:656.

Allison D. The conclusions are unsupported by the data, are based on invalid analyses, are incorrect, and should be corrected: Letter regarding “Sleep quality and body composition variations in obese male adults after 14 weeks of yoga intervention: a randomized controlled trial”. Int J Yoga. 2018;11:83–4.

Dimova RB, Allison DB. Inappropriate statistical method in a parallel-group randomized controlled trial results in unsubstantiated conclusions. Nutr J. 2015;15:58.

Dickinson SL, Foote G, Allison DB. Commentary: studying a possible placebo effect of an imaginary low-calorie diet. Front Psychiatry. 2020;11:329.

Vorland CJ, Mestre LM, Mendis SS, Brown AW. Within-group comparisons led to unsubstantiated conclusions in “Low-phytate wholegrain bread instead of high-phytate wholegrain bread in a total diet context did not improve iron status of healthy Swedish females: a 12-week, randomized, parallel-design intervention study”. Eur J Nutr. 2020;59:2813–4.

Peos J, Brown AW, Vorland CJ, Allison DB, Sainsbury A. Contrary to the conclusions stated in the paper, only dry fat-free mass was different between groups upon reanalysis. Comment on: “Intermittent energy restriction attenuates the loss of fat-free mass in resistance trained individuals. a randomized controlled trial”. J Funct Morphol Kinesiol. 2020;5:85.

Eckert I. Letter to the editor: Inadequate statistical inferences in the randomized clinical trial by Canheta et al. Clin Nutr. 2021;40:338.

Vorland CJ, Foote G, Dickinson SL, Mayo-Wilson E, Allison DB, Brown AW. Letter to the Editor Medicine Correspondence. Blog2020. https://journals.lww.com/md-journal/Blog/MedicineCorrespondenceBlog/pages/post.aspx?PostID=126 .

Sainani K. Misleading comparisons: the fallacy of comparing statistical significance. PM R. 2010;2:559–62.

Allison DB, Antoine LH, George BJ. Incorrect statistical method in parallel-groups RCT led to unsubstantiated conclusions. Lipids Health Dis. 2016;15:1–5.

Tierney JF, Vale C, Riley R, Smith CT, Stewart L, Clarke M, et al. Individual participant data (IPD) meta-analyses of randomised controlled trials: guidance on their use. PLos Med. 2015;12:e1001855.

Fisher D, Copas A, Tierney J, Parmar M. A critical review of methods for the assessment of patient-level interactions in individual participant data meta-analysis of randomized trials, and guidance for practitioners. J Clin Epidemiol. 2011;64:949–67.

Jayawardene WP, Brown AW, Dawson JA, Kahathuduwa CN, McComb B, Allison DB. Conditioning on “study” is essential for valid inference when combining individual data from multiple randomized controlled trials: a comment on Reesor et al’s School-based weight management program curbs summer weight gain among low-income Hispanic middle school students. J Sch Health. 2019;89(1):59–67. J Sch Health. 2019;89:515–8.

Li P, Stuart EA. Best (but oft-forgotten) practices: missing data methods in randomized controlled nutrition trials. Am J Clin Nutr. 2019;109:504–8.

Lachin JM. Statistical considerations in the intent-to-treat principle. Control Clin Trials. 2000;21:167–89.

Powney M, Williamson P, Kirkham J, Kolamunnage-Dona R. A review of the handling of missing longitudinal outcome data in clinical trials. Trials. 2014;15:237.

Hoppe M, Ross AB, Svelander C, Sandberg AS, Hulthen L. Reply to the comments by Vorland et al. on our paper: “low-phytate wholegrain bread instead of high-phytate wholegrain bread in a total diet context did not improve iron status of healthy Swedish females: a 12-week, randomized, parallel-design intervention study”. Eur J Nutr. 2020;59:2815–7.

Khodabakhshi A, Akbari ME, Mirzaei HR, Seyfried TN, Kalamian M, Davoodi SH. Effects of Ketogenic metabolic therapy on patients with breast cancer: a randomized controlled clinical trial. Clin Nutr. 2021;40:751–8.

Hollis S, Campbell F. What is meant by intention to treat analysis? Survey of published randomised controlled trials. BMJ. 1999;319:670–4.

Morris TP, Kahan BC, White IR. Choosing sensitivity analyses for randomised trials: principles. BMC Med Res Methodol. 2014;14:1–5.

ICH Expert Working Group. Addendum on estimands and sensitivity analysis in clinical trials to the guideline on statistical principles for clinical trials; E9(R1) 2019. https://database.ich.org/sites/default/files/E9-R1_Step4_Guideline_2019_1203.pdf .

Gupta SK. Intention-to-treat concept: a review. Perspect Clin Res. 2011;2:109.

Lichtenstein AH, Petersen K, Barger K, Hansen KE, Anderson CA, Baer DJ, et al. Perspective: design and conduct of human nutrition randomized controlled trials. Adv Nutr. 2021;12:4–20.

Gadbury G, Coffey C, Allison D. Modern statistical methods for handling missing repeated measurements in obesity trial data: beyond LOCF. Obes Rev. 2003;4:175–84.

Veberke G, Molenberghs G, Bijnens L, Shaw D. Linear mixed models in practice. New York: Springer; 1997.

Linero AR, Daniels MJ. Bayesian approaches for missing not at random outcome data: the role of identifying restrictions. Statist Sci. 2018;33:198.

Schulz KF, Altman DG, Moher D. CONSORT 2010 statement: updated guidelines for reporting parallel group randomised trials. BMC Med. 2010;8:18.

Percie du Sert N, Hurst V, Ahluwalia A, Alam S, Avey MT, Baker M, et al. The ARRIVE guidelines 2.0: updated guidelines for reporting animal research. J Cereb Blood Flow Metab. 2020;40:1769–77.

Enhancing the QUAlity and Transparency Of health Research. https://www.equator-network.org/ .

Altman DG, Simera I. Responsible reporting of health research studies: transparent, complete, accurate and timely. J Antimicrob Chemother. 2010;65:1–3.

Dechartres A, Trinquart L, Atal I, Moher D, Dickersin K, Boutron I, et al. Evolution of poor reporting and inadequate methods over time in 20 920 randomised controlled trials included in Cochrane reviews: research on research study. BMJ. 2017;357:j2490.

Kilkenny C, Parsons N, Kadyszewski E, Festing MF, Cuthill IC, Fry D, et al. Survey of the quality of experimental design, statistical analysis and reporting of research using animals. PLoS One. 2009;4:e7824.

Kahathuduwa CN, Allison DB. Letter to the editor: Insufficient reporting of randomization procedures and unexplained unequal allocation: a commentary on “Dairy-based and energy-enriched berry-based snacks improve or maintain nutritional and functional status in older people in home care. J Nutr Health Aging. 2019;23:396.

Nykänen I. Insufficient reporting of randomization procedures and unexplained unequal allocation: a commentary on “Dairy-based and energy-enriched berry-based snacks improve or maintain nutritional and functional status in older people in home care”. J Nutr Health Aging. 2019;23:397.

Vorland CJ, Brown AW, Dickinson SL, Gelman A, Allison DB. The implementation of randomization requires corrected analyses. Comment on “Comprehensive nutritional and dietary intervention for autism spectrum disorder—a randomized, controlled 12-month trial, Nutrients 2018, 10, 369”. Nutrients. 2019;11:1126.

Tekwe CD, Allison DB. Randomization by cluster, but analysis by individual without accommodating clustering in the analysis is incorrect: comment. Ann Behav Med. 2020;54:139.

Morgan PJ, Young MD, Barnes AT, Eather N, Pollock ER, Lubans DR. Correction that the analyses were adjusted for clustering: a response to Tekwe et al. Ann Behav Med. 2020;54:140.

Barnard ND, Levin SM, Gloede L, Flores R. Turning the waiting room into a classroom: weekly classes using a vegan or a portion-controlled eating plan improve diabetes control in a randomized translational study. J Acad Nutr Diet. 2018;118:1072–9.

Erratum. J Acad Nutr Diet. 2019;119:1391–3.

Douglas SM, Byers AW, Leidy HJ. Habitual breakfast patterns do not influence appetite and satiety responses in normal vs. high-protein breakfasts in overweight adolescent girls. Nutrients. 2019;11:1223.

Dalenberg JR, Patel BP, Denis R, Veldhuizen MG, Nakamura Y, Vinke PC, et al. Short-term consumption of sucralose with, but not without, carbohydrate impairs neural and metabolic sensitivity to sugar in humans. Cell Metab. 2020;31:493–502 e7.

Quin C, Erland BM, Loeppky JL, Gibson DL. Omega-3 polyunsaturated fatty acid supplementation during the pre and post-natal period: a meta-analysis and systematic review of randomized and semi-randomized controlled trials. J Nutr Intermed Metab. 2016;5:34–54.

Folkvord F, Anschütz D, Geurts M. Watching TV cooking programs: effects on actual food intake among children. J Nutr Educ Behav. 2020;52:3–9.

Zwilling CE, Strang A, Anderson E, Jurcsisn J, Johnson E, Das T, et al. Enhanced physical and cognitive performance in active duty Airmen: evidence from a randomized multimodal physical fitness and nutritional intervention. Sci Rep. 2020;10:1–13.

Altman DG. Comparability of randomised groups. J R Stat Soc Series D. 1985;34:125–36.

Senn S. Testing for baseline balance in clinical trials. Stat Med. 1994;13:1715–26.

Kraus WE, Bhapkar M, Huffman KM, Pieper CF, Das SK, Redman LM, et al. 2 years of calorie restriction and cardiometabolic risk (CALERIE): exploratory outcomes of a multicentre, phase 2, randomised controlled trial. Lancet Diabetes Endocrinol. 2019;7:673–83.

O’Connor A. Cutting 300 calories a day shows health benefits. 2019. https://www.nytimes.com/2019/07/16/well/eat/cutting-300-calories-a-day-shows-health-benefits.html .

Hoppe M, Ross AB, Svelander C, Sandberg AS, Hulthen L. Correction to: Low-phytate wholegrain bread instead of high-phytate wholegrain bread in a total diet context did not improve iron status of healthy Swedish females: a 12 week, randomized, parallel-design intervention study. Eur J Nutr. 2020;59:2819–20.

Hoppe M, Ross AB, Svelander C, Sandberg A-S, Hulthén L. Low-phytate wholegrain bread instead of high-phytate wholegrain bread in a total diet context did not improve iron status of healthy Swedish females: a 12-week, randomized, parallel-design intervention study. Eur J Nutr. 2019;58:853–64.

Schulz KF, Chalmers I, Hayes RJ, Altman DG. Empirical evidence of bias. Dimensions of methodological quality associated with estimates of treatment effects in controlled trials. JAMA. 1995;273:408–12.

Hewitt C, Hahn S, Torgerson DJ, Watson J, Bland JM. Adequacy and reporting of allocation concealment: review of recent trials published in four general medical journals. BMJ. 2005;330:1057–8.

Savović J, Jones HE, Altman DG, Harris RJ, Jüni P, Pildal J, et al. Influence of reported study design characteristics on intervention effect estimates from randomized, controlled trials. Ann Intern Med. 2012;157:429–38.

Page MJ, Higgins JP, Clayton G, Sterne JA, Hróbjartsson A, Savović J. Empirical evidence of study design biases in randomized trials: systematic review of meta-epidemiological studies. PLoS One. 2016;11:e0159267.

Hirst JA, Howick J, Aronson JK, Roberts N, Perera R, Koshiaris C, et al. The need for randomization in animal trials: an overview of systematic reviews. PLoS One. 2014;9:e98856.

Peinemann F, Tushabe DA, Kleijnen J. Using multiple types of studies in systematic reviews of health care interventions—a systematic review. PLoS One. 2013;8:e85035.

National Academies of Sciences, Engineering, and Medicine. Reproducibility and replicability in science. Washington, DC: The National Academies Press; 2019. p. 218.

Vorland CJ, Brown AW, Ejima K, Mayo-Wilson E, Valdez D, Allison DB. Toward fulfilling the aspirational goal of science as self-correcting: a call for editorial courage and diligence for error correction. Eur J Clin Invest. 2020;50:e13190.

Brown AW, Kaiser KA, Allison DB. Issues with data and analyses: errors, underlying themes, and potential solutions. Proc Natl Acad Sci USA. 2018;115:2563–70.

Retraction Statement. LA sprouts randomized controlled nutrition, cooking and gardening program reduces obesity and metabolic risk in Latino youth. Obesity (Silver Spring). 2015;23:2522.

The effect of vitamin D supplementation on inflammatory and hemostatic markers and disease activity in patients with systemic lupus erythematosus: a randomized placebo-controlled trial. J Rheumatol. 2018;45:1713.

Estruch R, Ros E, Salas-Salvadó J, Covas M-I, Corella D, Arós F, et al. Retraction and republication: primary prevention of cardiovascular disease with a Mediterranean diet. N Engl J Med. 2013;368:1279–90.

Kroeger CM, Brown AW, Allison DB. Unsubstantiated conclusions in randomized controlled trial of bingeeating program due to Differences in Nominal Significance (DINS) Error. https://pubpeer.com/publications/3596ABE0460E074A8FA5063606FFAB .

Zhang J, Wei Y, Allison DB. Comment on: "Chronic exposure to air pollution particles increases the risk ofobesity and metabolic syndrome: findings from a natural experiment in Beijing". https://hypothes.is/a/AQKsEg1lEeiymitN4n0bQQ .

Hannon BA, Oakes JM, Allison DB. Alternating assignment was incorrectly labeled as randomization. J Alzheimers Dis. 2019;62:1767–75.

Ito N, Saito H, Seki S, Ueda F, Asada T. Effects of composite supplement containing astaxanthin and sesamin on cognitive functions in people with mild cognitive impairment: a randomized, double-blind, placebo-controlled trial. J Alzheimers Dis. 2018;62:1767–75.

Rae P, Robb P. Megaloblastic anaemia of pregnancy: a clinical and laboratory study with particular reference to the total and labile serum folate levels. J Clin Pathol. 1970;23:379–91.

Griffen WO Jr, Young VL, Stevenson CC. A prospective comparison of gastric and jejunoileal bypass procedures for morbid obesity. Surg Obes Relat Dis. 2005;1:163–72.

Lang TA, Secic M. How to report statistics in medicine: annotated guidelines for authors, editors, and reviewers. Philadelphia, PA: ACP Press; 2006.

Altman DG. Randomisation. BMJ. 1991;302:1481.


Acknowledgements

We thank Zad Rafi for insightful feedback on an earlier draft and Jennifer Holmes for editing our manuscript.

CJV is supported in part by the Gordon and Betty Moore Foundation. DBA and AWB are supported in part by NIH grants R25HL124208 and R25DK099080. SBH is supported in part by National Institutes of Health NORC Center Grants P30DK072476, Pennington/Louisiana and P30DK040561, Harvard. CDT research is supported by National Cancer Institute Supplemental Award Number U01-CA057030-29S2. Other authors received no specific funding for this work. The opinions expressed are those of the authors and do not necessarily represent those of the NIH or any other organization.

Author information

Authors and affiliations.

Department of Applied Health Science, Indiana University School of Public Health-Bloomington, Bloomington, IN, USA

Colby J. Vorland, Andrew W. Brown & Wasantha P. Jayawardene

Department of Nutritional Sciences, Texas Tech University, Lubbock, TX, USA

John A. Dawson

Department of Epidemiology and Biostatistics, Indiana University School of Public Health-Bloomington, Bloomington, IN, USA

Stephanie L. Dickinson, Lilian Golzarri-Arroyo, Carmen D. Tekwe & David B. Allison

Division of Nutritional Sciences, University of Illinois at Urbana-Champaign, Urbana, IL, USA

Bridget A. Hannon

Department of Public Health Sciences, Clemson University, Clemson, SC, USA

Moonseong Heo

Pennington Biomedical Research Center, Louisiana State University, Baton Rouge, LA, USA

Steven B. Heymsfield

Department of Psychiatry, School of Medicine, Texas Tech University Health Sciences Center, Lubbock, TX, USA

Chanaka N. Kahathuduwa

Department of Pharmacology and Experimental Therapeutics, Division of Biostatistics, Thomas Jefferson University, Philadelphia, PA, USA

Scott W. Keith

Department of Epidemiology, School of Public Health, University of Minnesota, Minneapolis, MN, USA

J. Michael Oakes

Department of Health Research Methods, Evidence and Impact, McMaster University, Hamilton, ON, Canada

Lehana Thabane


Contributions

Conceptualization, D.B.A.; writing - original draft preparation & review and editing, all authors. All authors have read and agreed to the published version of the manuscript.

Corresponding authors

Correspondence to Colby J. Vorland or David B. Allison.

Ethics declarations

Competing interests.

In the 36 months prior to the initial submission, DBA has received personal payments or promises for the same from: American Society for Nutrition; Alkermes, Inc.; American Statistical Association; Biofortis; California Walnut Commission; Clark Hill PLC; Columbia University; Fish & Richardson, P.C.; Frontiers Publishing; Gelesis; Henry Stewart Talks; IKEA; Indiana University; Arnold Ventures (formerly the Laura and John Arnold Foundation); Johns Hopkins University; Law Offices of Ronald Marron; MD Anderson Cancer Center; Medical College of Wisconsin; National Institutes of Health (NIH); Medpace; National Academies of Science; Sage Publishing; The Obesity Society; Sports Research Corp.; The Elements Agency, LLC; Tomasik, Kotin & Kasserman LLC; University of Alabama at Birmingham; University of Miami; Nestle; WW (formerly Weight Watchers International, LLC). Donations to a foundation have been made on his behalf by the Northarvest Bean Growers Association. DBA was previously a member (unpaid) of the International Life Sciences Institute North America Board of Trustees. In the last 36 months prior to the initial submission, AWB has received travel expenses from University of Louisville; speaking fees from Kentuckiana Health Collaborative, Purdue University, and Rippe Lifestyle Institute, Inc.; consulting fees from Epigeum (Oxford University Press), LA NORC, and Pennington Biomedical Research Center. The institution of DBA, AWB, CJV, SLD, and LG-A, Indiana University, has received funds to support their research or educational activities from: NIH; USDA; Soleno Therapeutics; National Cattlemen’s Beef Association; Eli Lilly and Co.; Reckitt Benckiser Group PLC; Alliance for Potato Research and Education; American Federation for Aging Research; Dairy Management Inc; Arnold Ventures; the Gordon and Betty Moore Foundation; the Alfred P. Sloan Foundation; Indiana CTSI, and numerous other for-profit and non-profit organizations to support the work of the School of Public Health and the university more broadly. BAH is an employee of Abbott Nutrition in Columbus, OH. SBH is on the Medical Advisory Board of Medifast Corporation. Other authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary table – highlighted changes

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ .


About this article

Cite this article.

Vorland, C.J., Brown, A.W., Dawson, J.A. et al. Errors in the implementation, analysis, and reporting of randomization within obesity and nutrition research: a guide to their avoidance. Int J Obes 45, 2335–2346 (2021). https://doi.org/10.1038/s41366-021-00909-z


Received: 10 November 2020

Revised: 26 June 2021

Accepted: 06 July 2021

Published: 29 July 2021

Issue Date: November 2021

DOI: https://doi.org/10.1038/s41366-021-00909-z




define nonrandom assignment of research participants

COMMENTS

  1. Nonrandomized control: design, measures, classic example

    A nonrandomized control study is a clinical trial where participants are not assigned by randomization. Each group in the study is assigned by the researcher or chosen by the participant to receive a certain treatment, procedure, or intervention.

  2. 8.2 Non-Equivalent Groups Designs

    When participants are not randomly assigned to conditions, however, the resulting groups are likely to be dissimilar in some ways. For this reason, researchers consider them to be nonequivalent. A nonequivalent groups design, then, is a between-subjects design in which participants have not been randomly assigned to conditions.

  3. PDF ap 2005 psychology cover

    Nonrandom assignment of research participants Optimistic explanatory style Proactive interference General Issues Answers must be written in sentences (subject and verb), not outlines. Expect answers to use psychological, not merely common, knowledge. Defining a concept is not sufficient.

  4. Errors in the implementation, analysis, and reporting of randomization

    These ten errors include: representing nonrandom allocation as random, failing to adequately conceal allocation, not accounting for changing allocation ratios, replacing subjects in nonrandom ways, failing to account for non-independence, drawing inferences by comparing statistical significance from within-group comparisons instead of between-gr...

  5. Random versus nonrandom assignment in controlled experiments ...

    Abstract Psychotherapy meta-analyses commonly combine results from controlled experiments that use random and nonrandom assignment without examining whether the 2 methods give the same answer. Results from this article call this practice into question.

  6. Nonexperimental Designs

    Nonrandom Assignment of Participants and Absence of Conditions In nonexperiments, there are typically no explicitly defined research conditions. For example, a researcher interested in assessing the relation between job satisfaction (an assumed cause) and organizational commitment (an assumed effect) would simply measure the level of both such ...

  7. PDF Randomized and Nonrandomized Experiments Comparing Random to Nonrandom

    the best approximation to this true counterfactual may be a group of participants whose assignment method (random or nonrandom) was itself randomly assigned to them, where all other features of the experiment are held equal. This was not done in Dehejia and Wahba (1999), Heckman et al. (1997) and Hill et al. (2004), or any other such studies.

  8. PDF Can Nonrandomized Experiments Yield Accurate Answers? A Randomized

    unclear—they confound assignment method with other study features. These confounds are problematic. Adjustments such as propensity score analysis are attempting to estimate what the effect would have been had the participants in a nonrandomized experiment instead been randomly assigned to the same

  9. Nonrandomized Trials: Designs and Methodology

    Abstract The randomized controlled trial is the study design of choice in order to assess the effectiveness of an intervention. Sometimes, the nonrandomized trial study design is used, whereby participants are allocated to treatment groups using nonrandom methods.

  10. Nonrandomized controlled trials

    Hence, nonrandomized controlled trials (NCTs)—a quasi-experimental study design that does not utilize random assignment—can be good alternatives when RCTs are not feasible. It should be noted that NCTs can also fall under prospective studies in addition to experimental studies. This chapter will focus on NCTs including basic principles ...

  11. Reporting Standards for Research in Psychology

    Some research areas refer to the use of random assignment of participants, whereas others use the term random allocation. Another example involves the terms multilevel model, hierarchical linear model, and mixed effects model, all of which are used to identify a similar approach to data analysis.

  12. PDF Randomized and Nonrandomized Studies

    sometimes want to consider nonrandom allocations. For example, it is possible to use systematic processes such as allocating every second subject or all the subjects with odd birth years to one of the two groups. Such processes may be much easier to administer than is randomization, and generally these systematic

  13. Random Assignment in Experiments

    In experimental research, random assignment is a way of placing participants from your sample into different treatment groups using randomization. With simple random assignment, every member of the sample has a known or equal chance of being placed in a control group or an experimental group.

  14. Sampling, Nonrandom

    Abstract. Nonrandom sampling, also called "nonprobabilistic" or "nonprobability sampling," is any sampling method in which the process that determines whether a member of the population is selected for inclusion in the sample is guided by a nonchance or nonrandom process. Such nonrandom processes can include the investigator choosing ...

  15. Nonrandom Assignment in ANCOVA

    A specific form of nonrandom assignment to treatment groups, the "alternate ranks" design, was investigated. This design eliminates the possibility of a correlation between the covariate and ...

  16. Random Selection & Assignment

    This is nonrandom (or nonequivalent) assignment. Random selection is related to sampling. Therefore it is most related to the external validity (or generalizability) of your results. After all, we would randomly sample so that our research participants better represent the larger group from which they're drawn.

  17. Treatment Assignment: Random vs. Nonrandom

    July 1984 · Statistics in Medicine. R McHugh. Employing randomization theory, this note examines the manner in which a clinical trial employing Zelen's randomized consent design yields an ...

  18. Can Nonrandomized Experiments Yield Accurate Answers? A Randomized

    This hypothesis has not been consistently supported by empirical studies; however, previous methods used to study this hypothesis have confounded assignment method with other study features. To avoid these confounding factors, this study randomly assigned participants to be in a randomized experiment or a nonrandomized experiment.

  19. Preference in Random Assignment: Implications for the Interpretation of

    Impact of Assignment Preference on Service Engagement and Retention. Research participants who are disappointed in their random assignment to a non-preferred experimental condition may refuse to participate, or else withdraw from assigned services or treatment early in the study (Hofmann et al. 1998; Kearney and Silverman 1998; Laengle et al. 2000; Macias et al. 2005; Shadish et al. 2000 ...

  20. (PDF) Can Nonrandomized Experiments Yield Accurate Answers? A

    fects of adjustments for nonrandom assignment unconfounded with assumptions about missing outcome data, partial treatment implementation, or other differences between the randomized

  21. Does Random Treatment Assignment Cause Harm to Research Participants?

    It has been claimed that by foregoing individualized treatment assignment, the process of choosing research participants' treatments by random assignment leads to an "inevitable compromise of personal care in the service of obtaining valid research results" [ 1 ].

  22. The Definition of Random Assignment In Psychology

    Random assignment refers to the use of chance procedures in psychology experiments to ensure that each participant has the same opportunity to be assigned to any given group in a study to eliminate any potential bias in the experiment at the outset.

  23. Errors in the implementation, analysis, and reporting of ...

    Randomization is an important tool used to establish causal inferences in studies designed to further our understanding of questions related to obesity and nutrition. To take advantage of the ...
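
Several of the entries above contrast random assignment with systematic allocation, such as placing every second subject in one group. The difference can be illustrated with a minimal Python sketch. The function names `random_assignment` and `alternating_assignment`, the participant labels, and the two-group setup are all assumptions made for illustration; none of them comes from the sources listed. The random version shuffles the sample and splits it in half, which is one common way to give each participant an equal chance of landing in either group.

```python
import random


def random_assignment(participants, seed=None):
    """Randomly assign participants to two groups of (near-)equal size.

    Hypothetical helper: shuffles a copy of the sample, then splits it in half.
    """
    rng = random.Random(seed)   # a seed makes the allocation reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}


def alternating_assignment(participants):
    """Systematic (nonrandom) allocation: every second participant is treated.

    Hypothetical helper: group membership is fully determined by list order,
    so anything correlated with that order can differ between the groups.
    """
    ordered = list(participants)
    return {"treatment": ordered[0::2], "control": ordered[1::2]}


if __name__ == "__main__":
    people = [f"P{i}" for i in range(1, 9)]  # illustrative participant labels
    print(random_assignment(people, seed=1))
    print(alternating_assignment(people))
```

Under alternating assignment anyone who knows the enrollment order can predict the next allocation, which is one reason such schemes are not regarded as random.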