
Quasi-Experimental Design | Definition, Types & Examples

Published on July 31, 2020 by Lauren Thomas. Revised on January 22, 2024.

Like a true experiment, a quasi-experimental design aims to establish a cause-and-effect relationship between an independent and dependent variable.

However, unlike a true experiment, a quasi-experiment does not rely on random assignment. Instead, subjects are assigned to groups based on non-random criteria.

Quasi-experimental design is a useful tool in situations where true experiments cannot be used for ethical or practical reasons.

Quasi-experimental design vs. experimental design


There are several common differences between true and quasi-experimental designs.

Assignment to treatment. True experiment: the researcher randomly assigns subjects to control and treatment groups. Quasi-experiment: some other, non-random method is used to assign subjects to groups.

Control over treatment. True experiment: the researcher usually designs the treatment. Quasi-experiment: the researcher often does not design the treatment, but instead studies pre-existing groups that received different treatments after the fact.

Use of control groups. True experiment: requires the use of control groups. Quasi-experiment: control groups are not required (although they are commonly used).

Example of a true experiment vs. a quasi-experiment

Suppose you want to test the effectiveness of a new therapy at a mental health clinic. In a true experiment, you would randomly assign the clinic's patients either to the new therapy or to the standard course of treatment. However, for ethical reasons, the directors of the mental health clinic may not give you permission to randomly assign their patients to treatments. In this case, you cannot run a true experiment.

Instead, you can use a quasi-experimental design. You can use the pre-existing groups, patients already receiving the new therapy and patients receiving the standard course of treatment, to study the symptom progression of the two groups.


Many types of quasi-experimental designs exist. Here we explain three of the most common types: nonequivalent groups design, regression discontinuity, and natural experiments.

Nonequivalent groups design

In a nonequivalent groups design, the researcher chooses existing groups that appear similar, but where only one of the groups experiences the treatment.

In a true experiment with random assignment, the control and treatment groups are considered equivalent in every way other than the treatment. But in a quasi-experiment, where assignment to groups is not random, the groups may differ in other ways—they are nonequivalent groups.

When using this kind of design, researchers try to account for any confounding variables by controlling for them in their analysis or by choosing groups that are as similar as possible.

This is the most common type of quasi-experimental design.

Regression discontinuity

Many potential treatments that researchers wish to study are designed around an essentially arbitrary cutoff, where those above the threshold receive the treatment and those below it do not.

Near this threshold, the differences between the two groups are often so minimal as to be nearly nonexistent. Therefore, researchers can use individuals just below the threshold as a control group and those just above as a treatment group.

Suppose, for example, that students who score above a certain cutoff on an entrance exam are admitted to a selective school, while those who score below it are not. Since the exact cutoff score is arbitrary, the students near the threshold—those who just barely pass the exam and those who fail by a very small margin—tend to be very similar, with the small differences in their scores mostly due to random chance. You can therefore conclude that any outcome differences between these two groups must come from the school they attended.

Natural experiments

In both laboratory and field experiments, researchers normally control which group the subjects are assigned to. In a natural experiment, an external event or situation (“nature”) results in the random or random-like assignment of subjects to the treatment group.

Even though some natural experiments involve random or random-like assignment, they are not considered to be true experiments because they are observational in nature: the researcher does not control the assignment.

Although the researchers have no control over the independent variable, they can exploit this event after the fact to study the effect of the treatment.

In the Oregon Health Study, for example, the state government could not afford to cover everyone it deemed eligible for the program, so it instead allocated spots in the program based on a random lottery.

Although true experiments have higher internal validity, you might choose to use a quasi-experimental design for ethical or practical reasons.

Sometimes it would be unethical to provide or withhold a treatment on a random basis, so a true experiment is not feasible. In this case, a quasi-experiment can allow you to study the same causal relationship without the ethical issues.

The Oregon Health Study is a good example. It would be unethical to randomly provide some people with health insurance but purposely prevent others from receiving it solely for the purposes of research.

However, since the Oregon government faced financial constraints and decided to provide health insurance via lottery, studying this event after the fact is a much more ethical approach to studying the same problem.

True experimental design may be infeasible to implement or simply too expensive, particularly for researchers without access to large funding streams.

At other times, too much work is involved in recruiting and properly designing an experimental intervention for an adequate number of subjects to justify a true experiment.

In either case, quasi-experimental designs allow you to study the question by taking advantage of data that has previously been paid for or collected by others (often the government).

Quasi-experimental designs have various pros and cons compared to other types of studies.

Advantages:

  • Higher external validity than most true experiments, because they often involve real-world interventions instead of artificial laboratory settings.
  • Higher internal validity than other non-experimental types of research, because they allow you to better control for confounding variables than other types of studies do.

Disadvantages:

  • Lower internal validity than true experiments—without randomization, it can be difficult to verify that all confounding variables have been accounted for.
  • Retrospective data that has already been collected for other purposes can be inaccurate, incomplete, or difficult to access.


Frequently asked questions about quasi-experimental designs

A quasi-experiment is a type of research design that attempts to establish a cause-and-effect relationship. The main difference with a true experiment is that the groups are not randomly assigned.

In experimental research, random assignment is a way of placing participants from your sample into different groups using randomization. With this method, every member of the sample has a known or equal chance of being placed in a control group or an experimental group.
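As a minimal sketch of how this might look in practice (the code and participant labels here are illustrative, not part of the original article), random assignment can be implemented by shuffling the sample and dealing participants into groups:

    import random

    def randomly_assign(participants, n_groups=2, seed=42):
        """Shuffle the sample, then deal participants into groups round-robin,
        so every participant has an equal chance of landing in any group."""
        rng = random.Random(seed)  # fixed seed only to make the example reproducible
        shuffled = participants[:]
        rng.shuffle(shuffled)
        return [shuffled[i::n_groups] for i in range(n_groups)]

    control, treatment = randomly_assign([f"P{i}" for i in range(10)])
    print(control, treatment)  # two groups of 5, assigned purely by chance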

Quasi-experimental design is most useful in situations where it would be unethical or impractical to run a true experiment.

Quasi-experiments have lower internal validity than true experiments, but they often have higher external validity, as they can use real-world interventions instead of artificial laboratory settings.



Quasi Experimental Design Overview & Examples

By Jim Frost

What is a Quasi Experimental Design?

A quasi experimental design is a method for identifying causal relationships that does not randomly assign participants to the experimental groups. Instead, researchers use a non-random process. For example, they might use an eligibility cutoff score or preexisting groups to determine who receives the treatment.


Quasi-experimental research is a design that closely resembles experimental research but is different. The term “quasi” means “resembling,” so you can think of it as a cousin to actual experiments. In these studies, researchers can manipulate an independent variable — that is, they change one factor to see what effect it has. However, unlike true experimental research, participants are not randomly assigned to different groups.

Learn more about Experimental Designs: Definition & Types.

When to Use Quasi-Experimental Design

Researchers typically use a quasi-experimental design because they can’t randomize due to practical or ethical concerns. For example:

  • Practical Constraints: A school interested in testing a new teaching method can only implement it in preexisting classes and cannot randomly assign students.
  • Ethical Concerns: A medical study might not be able to randomly assign participants to a treatment group for an experimental medication when they are already taking a proven drug.

Quasi-experimental designs also come in handy when researchers want to study the effects of naturally occurring events, like policy changes or environmental shifts, where they can’t control who is exposed to the treatment.

Quasi-experimental designs occupy a unique position in the spectrum of research methodologies, sitting between observational studies and true experiments. This middle ground offers a blend of both worlds, addressing some limitations of purely observational studies while navigating the constraints often accompanying true experiments.

A significant advantage of quasi-experimental research over purely observational studies and correlational research is that it addresses the issue of directionality, determining which variable is the cause and which is the effect. In quasi-experiments, an intervention typically occurs during the investigation, and the researchers record outcomes before and after it, increasing the confidence that it causes the observed changes.

However, it’s crucial to recognize its limitations as well. Controlling confounding variables is a larger concern for a quasi-experimental design than a true experiment because it lacks random assignment.

In sum, quasi-experimental designs offer a valuable research approach when random assignment is not feasible, providing a more structured and controlled framework than observational studies while acknowledging and attempting to address potential confounders.

Types of Quasi-Experimental Designs and Examples

Quasi-experimental studies use various methods, depending on the scenario.

Natural Experiments

This design uses naturally occurring events or changes to create the treatment and control groups. Researchers compare outcomes between those whom the event affected and those it did not affect. Analysts use statistical controls to account for confounders that the researchers must also measure.

Natural experiments are related to observational studies, but they allow for a clearer causality inference because the external event or policy change provides both a form of quasi-random group assignment and a definite start date for the intervention.

For example, in a natural experiment utilizing a quasi-experimental design, researchers study the impact of a significant economic policy change on small business growth. The policy is implemented in one state but not in neighboring states. This scenario creates an unplanned experimental setup, where the state with the new policy serves as the treatment group, and the neighboring states act as the control group.

Researchers are primarily interested in small business growth rates but need to record various confounders that can impact growth rates. Hence, they record state economic indicators, investment levels, and employment figures. By recording these metrics across the states, they can include them in the model as covariates and control them statistically. This method allows researchers to estimate differences in small business growth due to the policy itself, separate from the various confounders.
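As a sketch of what "including them in the model as covariates" can look like, the code below simulates stand-in data (all variable names and effect sizes are invented) and fits an ordinary least squares model; the coefficient on the policy indicator estimates the policy effect net of the measured confounders:

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(0)

    # Simulated stand-in for the state data described above: a 0/1 policy
    # indicator plus measured confounders (economic indicator, investment,
    # employment), with a true policy effect of 2.0 built in.
    n = 300
    policy = rng.integers(0, 2, n)
    econ = rng.normal(0, 1, n)
    invest = rng.normal(0, 1, n)
    employ = rng.normal(0, 1, n)
    growth = 2.0 * policy + 1.5 * econ + 0.8 * invest + 0.5 * employ + rng.normal(0, 1, n)
    df = pd.DataFrame({"growth": growth, "policy": policy, "econ": econ,
                       "invest": invest, "employ": employ})

    # Controlling statistically: the covariates absorb the confounders' effects,
    # so the `policy` coefficient isolates the policy's contribution.
    fit = smf.ols("growth ~ policy + econ + invest + employ", data=df).fit()
    print(fit.params["policy"])  # ≈ 2.0 by construction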

Nonequivalent Groups Design

This method involves matching existing groups that are similar but not identical. Researchers attempt to find groups that are as equivalent as possible, particularly for factors likely to affect the outcome.

For instance, researchers use a nonequivalent groups quasi-experimental design to evaluate the effectiveness of a new teaching method in improving students’ mathematics performance. A school district considering the teaching method is planning the study. Students are already divided into schools, preventing random assignment.

The researchers match two schools with similar demographics, baseline academic performance, and resources. The school using the traditional method serves as the control, while the other uses the new approach. The researchers then evaluate differences in educational outcomes between the two methods.

They perform a pretest to identify differences between the schools that might affect the outcome and include them as covariates to control for confounding. They also record outcomes before and after the intervention to have a larger context for the changes they observe.
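One plausible implementation of this adjustment is an ANCOVA-style regression in which the pretest enters as a covariate. The sketch below uses simulated stand-in data, not the district's actual study; the group indicator, effect size, and preexisting gap are all invented:

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(0)

    # Simulated students from two schools: new_method = 1 at the school using
    # the new teaching approach, with a small preexisting pretest gap and a
    # true teaching effect of 5 points built in.
    n = 200
    new_method = np.repeat([0, 1], n // 2)
    pretest = rng.normal(70, 10, n) + 2 * new_method
    posttest = 0.9 * pretest + 5 * new_method + rng.normal(0, 5, n)
    df = pd.DataFrame({"posttest": posttest, "pretest": pretest,
                       "new_method": new_method})

    # With the pretest as a covariate, the new_method coefficient reflects the
    # group difference net of the schools' preexisting gap.
    fit = smf.ols("posttest ~ new_method + pretest", data=df).fit()
    print(fit.params["new_method"])  # ≈ 5 by construction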

Regression Discontinuity

This process assigns subjects to a treatment or control group based on a predetermined cutoff point (e.g., a test score). The analysis primarily focuses on participants near the cutoff point, as they are likely similar except for the treatment received. By comparing participants just above and below the cutoff, the design controls for confounders that vary smoothly around the cutoff.

For example, in a regression discontinuity quasi-experimental design focusing on a new medical treatment for depression, researchers use depression scores as the cutoff point. Individuals with depression scores just above a certain threshold are assigned to receive the latest treatment, while those just below the threshold do not receive it. This method creates two closely matched groups: one that barely qualifies for treatment and one that barely misses out.

By comparing the mental health outcomes of these two groups over time, researchers can assess the effectiveness of the new treatment. The assumption is that the only significant difference between the groups is whether they received the treatment, thereby isolating its impact on depression outcomes.
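Below is a deliberately simplified sketch of that comparison on simulated data (the cutoff, effect size, and noise are invented): keep only individuals near the threshold and fit a regression with a treatment indicator. Dedicated regression discontinuity tooling goes further, fitting separate slopes on each side of the cutoff and choosing the bandwidth systematically.

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(0)

    # Simulated stand-in data: the depression score is the running variable;
    # scoring at or above the cutoff (20) assigns the new treatment, which
    # lowers (improves) the follow-up outcome by 5 points by construction.
    n = 2000
    score = rng.normal(20, 5, n)
    treated = (score >= 20).astype(int)
    outcome = 0.8 * score - 5 * treated + rng.normal(0, 2, n)
    df = pd.DataFrame({"score": score, "treated": treated, "outcome": outcome})

    # Compare only individuals within a narrow bandwidth around the cutoff,
    # where the two groups should be similar except for treatment status.
    near = df[(df.score > 17) & (df.score < 23)]
    fit = smf.ols("outcome ~ treated + score", data=near).fit()
    print(fit.params["treated"])  # ≈ -5: the treatment effect at the cutoff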

Controlling Confounders in a Quasi-Experimental Design

Accounting for confounding variables is a challenging but essential task for a quasi-experimental design.

In a true experiment, the random assignment process equalizes confounders across the groups to nullify their overall effect. It’s the gold standard because it works on all confounders, known and unknown.

Unfortunately, the lack of random assignment can allow differences between the groups to exist before the intervention. These confounding factors might ultimately explain the results rather than the intervention.

Consequently, researchers must use other methods to roughly equalize the groups, such as matching or cutoff values, or they must statistically adjust for preexisting differences that they can measure, to reduce the impact of confounders.
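For the matching route, a minimal sketch on invented data (real studies typically match on several variables at once, often via propensity scores) pairs each treated subject with the untreated subject whose pre-intervention score is closest:

    import numpy as np

    rng = np.random.default_rng(2)

    # Hypothetical pre-intervention scores: 30 treated subjects and a larger
    # pool of untreated candidates drawn from a somewhat different distribution.
    treated_pre = rng.normal(55, 8, 30)
    pool_pre = rng.normal(50, 10, 200)

    # Nearest-neighbor matching (with replacement): for each treated subject,
    # pick the untreated candidate with the closest pre-intervention score.
    matches = [int(np.argmin(np.abs(pool_pre - t))) for t in treated_pre]
    matched_pre = pool_pre[matches]

    # The matched control group is now roughly equalized with the treated group.
    print(treated_pre.mean(), matched_pre.mean())  # means should be close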

A key strength of quasi-experiments is their frequent use of “pre-post testing.” This approach involves testing participants before the intervention to check for preexisting differences between groups that could impact the study’s outcome. By identifying these variables early on and including them as covariates, researchers can more effectively control potential confounders in their statistical analysis.

Additionally, researchers frequently track outcomes before and after the intervention to better understand the context for changes they observe.

Statisticians consider these methods to be less effective than randomization. Hence, quasi-experiments fall somewhere in the middle when it comes to internal validity, or how well the study can identify causal relationships versus mere correlation. They’re more conclusive than correlational studies but not as solid as true experiments.

In conclusion, quasi-experimental designs offer researchers a versatile and practical approach when random assignment is not feasible. This methodology bridges the gap between controlled experiments and observational studies, providing a valuable tool for investigating cause-and-effect relationships in real-world settings. Researchers can address ethical and logistical constraints by understanding and leveraging the different types of quasi-experimental designs while still obtaining insightful and meaningful results.

Cook, T. D., & Campbell, D. T. (1979). Quasi-experimentation: Design & analysis issues in field settings. Boston, MA: Houghton Mifflin.


A Modern Guide to Understanding and Conducting Research in Psychology

Chapter 7: Quasi-Experimental Research

Learning Objectives

  • Explain what quasi-experimental research is and distinguish it clearly from both experimental and correlational research.
  • Describe three different types of quasi-experimental research designs (nonequivalent groups, pretest-posttest, and interrupted time series) and identify examples of each one.

The prefix quasi means “resembling.” Thus quasi-experimental research is research that resembles experimental research but is not true experimental research. Although the independent variable is manipulated, participants are not randomly assigned to conditions or orders of conditions (Cook & Campbell, 1979). Because the independent variable is manipulated before the dependent variable is measured, quasi-experimental research eliminates the directionality problem. But because participants are not randomly assigned—making it likely that there are other differences between conditions—quasi-experimental research does not eliminate the problem of confounding variables. In terms of internal validity, therefore, quasi-experiments are generally somewhere between correlational studies and true experiments.

Quasi-experiments are most likely to be conducted in field settings in which random assignment is difficult or impossible. They are often conducted to evaluate the effectiveness of a treatment—perhaps a type of psychotherapy or an educational intervention. There are many different kinds of quasi-experiments, but we will discuss just a few of the most common ones here, focusing first on nonequivalent groups, pretest-posttest, interrupted time series, and combination designs before turning to single-subject designs (including reversal and multiple-baseline designs).

7.1 Nonequivalent Groups Design

Recall that when participants in a between-subjects experiment are randomly assigned to conditions, the resulting groups are likely to be quite similar. In fact, researchers consider them to be equivalent. When participants are not randomly assigned to conditions, however, the resulting groups are likely to be dissimilar in some ways. For this reason, researchers consider them to be nonequivalent. A nonequivalent groups design, then, is a between-subjects design in which participants have not been randomly assigned to conditions.

Imagine, for example, a researcher who wants to evaluate a new method of teaching fractions to third graders. One way would be to conduct a study with a treatment group consisting of one class of third-grade students (say, Ms. Williams’s class) and a control group consisting of another (Mr. Jones’s class). This would be a nonequivalent groups design because the students are not randomly assigned to classes by the researcher, which means there could be important differences between them. For example, the parents of higher achieving or more motivated students might have been more likely to request that their children be assigned to Ms. Williams’s class. Or the principal might have assigned the “troublemakers” to Mr. Jones’s class because he is a stronger disciplinarian. Of course, the teachers’ styles, and even the classroom environments, might be very different and might cause different levels of achievement or motivation among the students. If at the end of the study there was a difference in the two classes’ knowledge of fractions, it might have been caused by the difference between the teaching methods—but it might have been caused by any of these confounding variables.

Of course, researchers using a nonequivalent groups design can take steps to ensure that their groups are as similar as possible. In the present example, the researcher could try to select two classes at the same school, where the students in the two classes have similar scores on a standardized math test and the teachers are the same sex, are close in age, and have similar teaching styles. Taking such steps would increase the internal validity of the study because it would eliminate some of the most important confounding variables. But without true random assignment of the students to conditions, there remains the possibility of other important confounding variables that the researcher was not able to control.

7.2 Pretest-Posttest Design

In a pretest-posttest design, the dependent variable is measured once before the treatment is implemented and once after it is implemented. Imagine, for example, a researcher who is interested in the effect of a STEM education program on elementary school students’ attitudes toward science, technology, engineering, and math. The researcher could measure the attitudes of students at a particular elementary school during one week, implement the STEM program during the next week, and finally, measure their attitudes again the following week. The pretest-posttest design is much like a within-subjects experiment in which each participant is tested first under the control condition and then under the treatment condition. It is unlike a within-subjects experiment, however, in that the order of conditions is not counterbalanced because it typically is not possible for a participant to be tested in the treatment condition first and then in an “untreated” control condition.

If the average posttest score is better than the average pretest score, then it makes sense to conclude that the treatment might be responsible for the improvement. Unfortunately, one often cannot conclude this with a high degree of certainty because there may be other explanations for why the posttest scores are better. One category of alternative explanations goes under the name of history. Other things might have happened between the pretest and the posttest. Perhaps a science program aired on television and many of the students watched it, or perhaps a major scientific discovery occurred and many of the students heard about it. Another category of alternative explanations goes under the name of maturation. Participants might have changed between the pretest and the posttest in ways that they were going to anyway because they are growing and learning. If it were a yearlong program, participants might be exposed to more STEM subjects in class or become better reasoners, and this might be responsible for the change.

Another alternative explanation for a change in the dependent variable in a pretest-posttest design is regression to the mean. This refers to the statistical fact that an individual who scores extremely on a variable on one occasion will tend to score less extremely on the next occasion. For example, a bowler with a long-term average of 150 who suddenly bowls a 220 will almost certainly score lower in the next game. Her score will “regress” toward her mean score of 150. Regression to the mean can be a problem when participants are selected for further study because of their extreme scores. Imagine, for example, that only students who scored especially low on a test of fractions are given a special training program and then retested. Regression to the mean all but guarantees that their scores will be higher even if the training program has no effect.

A closely related concept—and an extremely important one in psychological research—is spontaneous remission. This is the tendency for many medical and psychological problems to improve over time without any form of treatment. The common cold is a good example. If one were to measure symptom severity in 100 common cold sufferers today, give them a bowl of chicken soup every day, and then measure their symptom severity again in a week, they would probably be much improved. This does not mean that the chicken soup was responsible for the improvement, however, because they would have been much improved without any treatment at all. The same is true of many psychological problems. A group of severely depressed people today is likely to be less depressed on average in 6 months. In reviewing the results of several studies of treatments for depression, researchers Michael Posternak and Ivan Miller found that participants in waitlist control conditions improved an average of 10 to 15% before they received any treatment at all (Posternak & Miller, 2001). Thus one must generally be very cautious about inferring causality from pretest-posttest designs.
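A small simulation makes the logic concrete (the numbers here are invented): each observed score is modeled as stable ability plus occasion-to-occasion luck, and the lowest scorers on a first test “improve” on a second test even though no treatment occurred in between.

    import numpy as np

    rng = np.random.default_rng(1)

    # Observed score = stable true ability + random luck on that occasion.
    n = 10_000
    ability = rng.normal(100, 10, n)
    test1 = ability + rng.normal(0, 10, n)
    test2 = ability + rng.normal(0, 10, n)  # no treatment between the two tests

    # Select only the extreme low scorers on test 1, as in the fractions example.
    low = test1 < 80
    print(test1[low].mean())  # ≈ 73: far below the overall mean of 100
    print(test2[low].mean())  # ≈ 86: much "improved" with no treatment at all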

Finally, it is possible that the act of taking a pretest can sensitize participants to the measurement process or heighten their awareness of the variable under investigation. This heightened sensitivity, called a testing effect, can subsequently lead to changes in their posttest responses, even in the absence of any external intervention effect.

7.3 Interrupted Time Series Design

A variant of the pretest-posttest design is the interrupted time-series design. A time series is a set of measurements taken at intervals over a period of time. For example, a manufacturing company might measure its workers’ productivity each week for a year. In an interrupted time-series design, a time series like this is “interrupted” by a treatment. In a recent COVID-19 study, the intervention involved the implementation of state-issued mask mandates and restrictions on on-premises restaurant dining. The researchers examined the impact of these measures on COVID-19 cases and deaths (Guy Jr. et al., 2021). Since there was a rapid reduction in daily case and death growth rates following the implementation of mask mandates, and this effect persisted for an extended period, the researchers concluded that the implementation of mask mandates was the cause of the decrease in COVID-19 transmission. This study employed an interrupted time-series design, similar to a pretest-posttest design, as it involved measuring the outcomes before and after the intervention. However, unlike the pretest-posttest design, it incorporated multiple measurements before and after the intervention, providing a more comprehensive analysis of the policy impacts.

Figure 7.1 shows data from a hypothetical interrupted time-series study. The dependent variable is the number of student absences per week in a research methods course. The treatment is that the instructor begins publicly taking attendance each day so that students know that the instructor is aware of who is present and who is absent. The top panel of Figure 7.1 shows how the data might look if this treatment worked. There is a consistently high number of absences before the treatment, and there is an immediate and sustained drop in absences after the treatment. The bottom panel of Figure 7.1 shows how the data might look if this treatment did not work. On average, the number of absences after the treatment is about the same as the number before. This figure also illustrates an advantage of the interrupted time-series design over a simpler pretest-posttest design. If there had been only one measurement of absences before the treatment at Week 7 and one afterward at Week 8, then it would have looked as though the treatment were responsible for the reduction. The multiple measurements both before and after the treatment suggest that the reduction between Weeks 7 and 8 is nothing more than normal week-to-week variation.


Figure 7.1: Hypothetical interrupted time-series design. The top panel shows data that suggest that the treatment caused a reduction in absences. The bottom panel shows data that suggest that it did not.
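One common way to analyze data like these is segmented regression. The sketch below is illustrative (the absence counts are invented to resemble the top panel of Figure 7.1); it estimates the pre-existing trend, the immediate level change at the interruption, and any change in trend afterward:

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    # Invented weekly absence counts: high for weeks 1-7 (before the treatment),
    # low for weeks 8-14 (after the instructor begins taking attendance).
    absences = [6, 5, 7, 6, 5, 6, 7, 2, 1, 2, 1, 2, 1, 2]
    week = np.arange(1, 15)
    after = (week >= 8).astype(int)            # 1 once the treatment is in place
    weeks_since = np.clip(week - 8, 0, None)   # time elapsed since the interruption

    df = pd.DataFrame({"absences": absences, "week": week,
                       "after": after, "weeks_since": weeks_since})

    # Segmented regression: `week` captures the pre-existing trend, `after` the
    # immediate level change, and `weeks_since` any post-interruption trend change.
    fit = smf.ols("absences ~ week + after + weeks_since", data=df).fit()
    print(fit.params["after"])  # the immediate drop in absences at week 8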

7.4 Combination Designs

A type of quasi-experimental design that is generally better than either the nonequivalent groups design or the pretest-posttest design is one that combines elements of both. There is a treatment group that is given a pretest, receives a treatment, and then is given a posttest. But at the same time there is a control group that is given a pretest, does not receive the treatment, and then is given a posttest. The question, then, is not simply whether participants who receive the treatment improve but whether they improve more than participants who do not receive the treatment.

Imagine, for example, that students in one school are given a pretest on their current level of engagement in pro-environmental behaviors (i.e., recycling, eating less red meat, abstaining from single-use plastics, etc.), then are exposed to a pro-environmental program in which they learn about the effects of human-caused climate change on the planet, and finally are given a posttest. Students in a similar school are given the pretest, not exposed to a pro-environmental program, and finally are given a posttest. Again, if students in the treatment condition become more involved in pro-environmental behaviors, this could be an effect of the treatment, but it could also be a matter of history or maturation. If it really is an effect of the treatment, then students in the treatment condition should engage in more pro-environmental behaviors than students in the control condition. But if it is a matter of history (e.g., news of a forest fire or drought) or maturation (e.g., improved reasoning or sense of responsibility), then students in the two conditions would be likely to show similar amounts of change. This type of design does not completely eliminate the possibility of confounding variables, however. Something could occur at one of the schools but not the other (e.g., a local heat wave with record high temperatures), so students at the first school would be affected by it while students at the other school would not.
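The arithmetic behind this comparison is a difference of differences: influences shared by both schools (history, maturation) appear in both groups' pre-to-post changes and cancel when one change is subtracted from the other. A minimal sketch with invented numbers:

    # Invented mean pro-environmental behavior scores for the two schools.
    treat_pre, treat_post = 3.1, 4.6        # school exposed to the program
    control_pre, control_post = 3.0, 3.4    # similar school, no program

    # Each group's change includes history and maturation; only the treatment
    # group's change also includes the program's effect.
    effect = (treat_post - treat_pre) - (control_post - control_pre)
    print(round(effect, 2))  # 1.1: improvement beyond shared history/maturation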

Finally, if participants in this kind of design are randomly assigned to conditions, it becomes a true experiment rather than a quasi-experiment. In fact, this kind of design has now been conducted many times—for example, to demonstrate the effectiveness of psychotherapy.

KEY TAKEAWAYS

  • Quasi-experimental research involves the manipulation of an independent variable without the random assignment of participants to conditions or orders of conditions. Among the important types are nonequivalent groups designs, pretest-posttest, and interrupted time-series designs.
  • Quasi-experimental research eliminates the directionality problem because it involves the manipulation of the independent variable. It does not eliminate the problem of confounding variables, however, because it does not involve random assignment to conditions. For these reasons, quasi-experimental research is generally higher in internal validity than correlational studies but lower than true experiments.
  • Practice: Imagine that two college professors decide to test the effect of giving daily quizzes on student performance in a statistics course. They decide that Professor A will give quizzes but Professor B will not. They will then compare the performance of students in their two sections on a common final exam. List five other variables that might differ between the two sections that could affect the results.

7.5 Single-Subject Research

Learning Objectives

  • Explain what single-subject research is, including how it differs from other types of psychological research and who uses single-subject research and why.
  • Design simple single-subject studies using reversal and multiple-baseline designs.
  • Explain how single-subject research designs address the issue of internal validity.
  • Interpret the results of simple single-subject studies based on the visual inspection of graphed data.
  • Explain some of the points of disagreement between advocates of single-subject research and advocates of group research.

Researcher Vance Hall and his colleagues were faced with the challenge of increasing the extent to which six disruptive elementary school students stayed focused on their schoolwork (Hall et al., 1968). For each of several days, the researchers carefully recorded whether or not each student was doing schoolwork every 10 seconds during a 30-minute period. Once they had established this baseline, they introduced a treatment. The treatment was that when the student was doing schoolwork, the teacher gave him or her positive attention in the form of a comment like “good work” or a pat on the shoulder. The result was that all of the students dramatically increased their time spent on schoolwork and decreased their disruptive behavior during this treatment phase. For example, a student named Robbie originally spent 25% of his time on schoolwork and the other 75% “snapping rubber bands, playing with toys from his pocket, and talking and laughing with peers” (p. 3). During the treatment phase, however, he spent 71% of his time on schoolwork and only 29% on other activities. Finally, when the researchers had the teacher stop giving positive attention, the students all decreased their studying and increased their disruptive behavior. This was consistent with the claim that it was, in fact, the positive attention that was responsible for the increase in studying. This was one of the first studies to show that attending to positive behavior—and ignoring negative behavior—could be a quick and effective way to deal with problem behavior in an applied setting.


Figure 7.2: Single-subject research has shown that positive attention from a teacher for studying can increase studying and decrease disruptive behavior. Photo by Jerry Wang on Unsplash.

Most of this book is about what can be called group research, which typically involves studying a large number of participants and combining their data to draw general conclusions about human behavior. The study by Hall and his colleagues, in contrast, is an example of single-subject research, which typically involves studying a small number of participants and focusing closely on each individual. In this section, we consider this alternative approach. We begin with an overview of single-subject research, including some assumptions on which it is based, who conducts it, and why they do. We then look at some basic single-subject research designs and how the data from those designs are analyzed. Finally, we consider some of the strengths and weaknesses of single-subject research as compared with group research and see how these two approaches can complement each other.

Overview of Single-Subject Research

What Is Single-Subject Research?

Single-subject research is a type of quantitative, quasi-experimental research that involves studying in detail the behavior of each of a small number of participants. Note that the term single-subject does not mean that only one participant is studied; it is more typical for there to be somewhere between two and 10 participants. (This is why single-subject research designs are sometimes called small-n designs, where n is the statistical symbol for the sample size.) Single-subject research can be contrasted with group research, which typically involves studying large numbers of participants and examining their behavior primarily in terms of group means, standard deviations, and so on. The majority of this book is devoted to understanding group research, which is the most common approach in psychology. But single-subject research is an important alternative, and it is the primary approach in some areas of psychology.

Before continuing, it is important to distinguish single-subject research from two other approaches, both of which involve studying in detail a small number of participants. One is qualitative research, which focuses on understanding people’s subjective experience by collecting relatively unstructured data (e.g., detailed interviews) and analyzing those data using narrative rather than quantitative techniques. Single-subject research, in contrast, focuses on understanding objective behavior through experimental manipulation and control, collecting highly structured data, and analyzing those data quantitatively.

It is also important to distinguish single-subject research from case studies. A case study is a detailed description of an individual, which can include both qualitative and quantitative analyses. (Case studies that include only qualitative analyses can be considered a type of qualitative research.) The history of psychology is filled with influential case studies, such as Sigmund Freud’s description of “Anna O.” (see box “The Case of ‘Anna O.’”) and John Watson and Rosalie Rayner’s description of Little Albert (Watson & Rayner, 1920), who learned to fear a white rat—along with other furry objects—when the researchers made a loud noise while he was playing with the rat. Case studies can be useful for suggesting new research questions and for illustrating general principles. They can also help researchers understand rare phenomena, such as the effects of damage to a specific part of the human brain. As a general rule, however, case studies cannot substitute for carefully designed group or single-subject research studies. One reason is that case studies usually do not allow researchers to determine whether specific events are causally related, or even related at all. For example, if a patient is described in a case study as having been sexually abused as a child and then as having developed an eating disorder as a teenager, there is no way to determine whether these two events had anything to do with each other. A second reason is that an individual case can always be unusual in some way and therefore be unrepresentative of people more generally. Thus case studies have serious problems with both internal and external validity.

The Case of “Anna O.”

Sigmund Freud used the case of a young woman he called “Anna O.” to illustrate many principles of his theory of psychoanalysis (Freud, 1957). (Her real name was Bertha Pappenheim, and she was an early feminist who went on to make important contributions to the field of social work.) Anna had come to Freud’s colleague Josef Breuer around 1880 with a variety of odd physical and psychological symptoms. One of them was that for several weeks she was unable to drink any fluids. According to Freud,

She would take up the glass of water that she longed for, but as soon as it touched her lips she would push it away like someone suffering from hydrophobia.…She lived only on fruit, such as melons, etc., so as to lessen her tormenting thirst (p. 9).

But according to Freud, a breakthrough came one day while Anna was under hypnosis.

[S]he grumbled about her English “lady-companion,” whom she did not care for, and went on to describe, with every sign of disgust, how she had once gone into this lady’s room and how her little dog—horrid creature!—had drunk out of a glass there. The patient had said nothing, as she had wanted to be polite. After giving further energetic expression to the anger she had held back, she asked for something to drink, drank a large quantity of water without any difficulty, and awoke from her hypnosis with the glass at her lips; and thereupon the disturbance vanished, never to return.

Freud’s interpretation was that Anna had repressed the memory of this incident along with the emotion that it triggered and that this was what had caused her inability to drink. Furthermore, her recollection of the incident, along with her expression of the emotion she had repressed, caused the symptom to go away.

As an illustration of Freud’s theory, the case study of Anna O. is quite effective. As evidence for the theory, however, it is essentially worthless. The description provides no way of knowing whether Anna had really repressed the memory of the dog drinking from the glass, whether this repression had caused her inability to drink, or whether recalling this “trauma” relieved the symptom. It is also unclear from this case study how typical or atypical Anna’s experience was.

"Anna O." was the subject of a famous case study used by Freud to illustrate the principles of psychoanalysis. Source: Wikimedia Commons

Figure 7.3: “Anna O.” was the subject of a famous case study used by Freud to illustrate the principles of psychoanalysis. Source: Wikimedia Commons

Assumptions of Single-Subject Research

Again, single-subject research involves studying a small number of participants and focusing intensively on the behavior of each one. But why take this approach instead of the group approach? There are two important assumptions underlying single-subject research, and it will help to consider them now.

First and foremost is the assumption that it is important to focus intensively on the behavior of individual participants. One reason for this is that group research can hide individual differences and generate results that do not represent the behavior of any individual. For example, a treatment that has a positive effect for half the people exposed to it but a negative effect for the other half would, on average, appear to have no effect at all. Single-subject research, however, would likely reveal these individual differences. A second reason to focus intensively on individuals is that sometimes it is the behavior of a particular individual that is primarily of interest. A school psychologist, for example, might be interested in changing the behavior of a particular disruptive student. Although previous published research (both single-subject and group research) is likely to provide some guidance on how to do this, conducting a study on this student would be more direct and probably more effective.

Another assumption of single-subject research is that it is important to study strong and consistent effects that have biological or social importance. Applied researchers, in particular, are interested in treatments that have substantial effects on important behaviors and that can be implemented reliably in the real-world contexts in which they occur. This is sometimes referred to as social validity (Wolf, 1978). The study by Hall and his colleagues, for example, had good social validity because it showed strong and consistent effects of positive teacher attention on a behavior that is of obvious importance to teachers, parents, and students. Furthermore, the teachers found the treatment easy to implement, even in their often chaotic elementary school classrooms.

Who Uses Single-Subject Research?

Single-subject research has been around as long as the field of psychology itself. In the late 1800s, one of psychology’s founders, Wilhelm Wundt, studied sensation and consciousness by focusing intensively on each of a small number of research participants. Hermann Ebbinghaus’s research on memory and Ivan Pavlov’s research on classical conditioning are other early examples, both of which are still described in almost every introductory psychology textbook.

In the middle of the 20th century, B. F. Skinner clarified many of the assumptions underlying single-subject research and refined many of its techniques (Skinner, 1938). He and other researchers then used it to describe how rewards, punishments, and other external factors affect behavior over time. This work was carried out primarily using nonhuman subjects—mostly rats and pigeons. This approach, which Skinner called the experimental analysis of behavior, remains an important subfield of psychology and continues to rely almost exclusively on single-subject research. For examples of this work, look at any issue of the Journal of the Experimental Analysis of Behavior. By the 1960s, many researchers were interested in using this approach to conduct applied research primarily with humans—a subfield now called applied behavior analysis (Baer et al., 1968). Applied behavior analysis plays a significant role in contemporary research on developmental disabilities, education, organizational behavior, and health, among many other areas. Examples of this work (including the study by Hall and his colleagues) can be found in the Journal of Applied Behavior Analysis. The single-subject approach can also be used by clinicians who take any theoretical perspective—behavioral, cognitive, psychodynamic, or humanistic—to study processes of therapeutic change with individual clients and to document their clients’ improvement (Kazdin, 2019).

Single-Subject Research Designs

General Features of Single-Subject Designs

Before looking at any specific single-subject research designs, it will be helpful to consider some features that are common to most of them. Many of these features are illustrated in Figure 7.4, which shows the results of a generic single-subject study. First, the dependent variable (represented on the y-axis of the graph) is measured repeatedly over time (represented by the x-axis) at regular intervals. Second, the study is divided into distinct phases, and the participant is tested under one condition per phase. The conditions are often designated by capital letters: A, B, C, and so on. Thus Figure 7.4 represents a design in which the participant was tested first in one condition (A), then tested in another condition (B), and finally retested in the original condition (A). (This is called a reversal design and will be discussed in more detail shortly.)


Figure 7.4: Results of a generic single-subject study illustrating several principles of single-subject research.

Another important aspect of single-subject research is that the change from one condition to the next does not usually occur after a fixed amount of time or number of observations. Instead, it depends on the participant’s behavior. Specifically, the researcher waits until the participant’s behavior in one condition becomes fairly consistent from observation to observation before changing conditions. This is sometimes referred to as the steady state strategy (Sidman, 1960). The idea is that when the dependent variable has reached a steady state, then any change across conditions will be relatively easy to detect. Recall that we encountered this same principle when discussing experimental research more generally. The effect of an independent variable is easier to detect when the “noise” in the data is minimized.

Reversal Designs

The most basic single-subject research design is the reversal design, also called the ABA design. During the first phase, A, a baseline is established for the dependent variable. This is the level of responding before any treatment is introduced, and therefore the baseline phase is a kind of control condition. When steady state responding is reached, phase B begins as the researcher introduces the treatment. Again, the researcher waits until that dependent variable reaches a steady state so that it is clear whether and how much it has changed. Finally, the researcher removes the treatment and again waits until the dependent variable reaches a steady state. This basic reversal design can also be extended with the reintroduction of the treatment (ABAB), another return to baseline (ABABA), and so on. The study by Hall and his colleagues was an ABAB reversal design (Figure 7.5).


Figure 7.5: An approximation of the results for Hall and colleagues’ participant Robbie in their ABAB reversal design. The percentage of time he spent studying (the dependent variable) was low during the first baseline phase, increased during the first treatment phase until it leveled off, decreased during the second baseline phase, and again increased during the second treatment phase.

Why is the reversal—the removal of the treatment—considered to be necessary in this type of design? If the dependent variable changes after the treatment is introduced, it is not always clear that the treatment was responsible for the change. It is possible that something else changed at around the same time and that this extraneous variable is responsible for the change in the dependent variable. But if the dependent variable changes with the introduction of the treatment and then changes back with the removal of the treatment, it is much clearer that the treatment (and removal of the treatment) is the cause. In other words, the reversal greatly increases the internal validity of the study.

Multiple-Baseline Designs

There are two potential problems with the reversal design—both of which have to do with the removal of the treatment. One is that if a treatment is working, it may be unethical to remove it. For example, if a treatment seemed to reduce the incidence of self-injury in a developmentally disabled child, it would be unethical to remove that treatment just to show that the incidence of self-injury increases. The second problem is that the dependent variable may not return to baseline when the treatment is removed. For example, when positive attention for studying is removed, a student might continue to study at an increased rate. This could mean that the positive attention had a lasting effect on the student’s studying, which of course would be good, but it could also mean that the positive attention was not really the cause of the increased studying in the first place.

One solution to these problems is to use a multiple-baseline design, which is represented in Figure 7.6. In one version of the design, a baseline is established for each of several participants, and the treatment is then introduced for each one. In essence, each participant is tested in an AB design. The key to this design is that the treatment is introduced at a different time for each participant. The idea is that if the dependent variable changes when the treatment is introduced for one participant, it might be a coincidence. But if the dependent variable changes when the treatment is introduced for multiple participants—especially when the treatment is introduced at different times for the different participants—then it is less likely to be a coincidence.


Figure 7.6: Results of a generic multiple-baseline study. The multiple baselines can be for different participants, dependent variables, or settings. The treatment is introduced at a different time on each baseline.

As an example, consider a study by Scott Ross and Robert Horner (Ross et al., 2009). They were interested in how a school-wide bullying prevention program affected the bullying behavior of particular problem students. At each of three different schools, the researchers studied two students who had regularly engaged in bullying. During the baseline phase, they observed the students for 10-minute periods each day during lunch recess and counted the number of aggressive behaviors they exhibited toward their peers. (The researchers used handheld computers to help record the data.) After 2 weeks, they implemented the program at one school. After 2 more weeks, they implemented it at the second school. And after 2 more weeks, they implemented it at the third school. They found that the number of aggressive behaviors exhibited by each student dropped shortly after the program was implemented at his or her school. Notice that if the researchers had only studied one school or if they had introduced the treatment at the same time at all three schools, then it would be unclear whether the reduction in aggressive behaviors was due to the bullying program or something else that happened at about the same time it was introduced (e.g., a holiday, a television program, a change in the weather). But with their multiple-baseline design, this kind of coincidence would have to happen three separate times—an unlikely occurrence—to explain their results.

Data Analysis in Single-Subject Research

In addition to its focus on individual participants, single-subject research differs from group research in the way the data are typically analyzed. As we have seen throughout the book, group research involves combining data across participants. Inferential statistics are used to help decide whether the result for the sample is likely to generalize to the population. Single-subject research, by contrast, relies heavily on a very different approach called visual inspection. This means plotting individual participants’ data as shown throughout this chapter, looking carefully at those data, and making judgments about whether and to what extent the independent variable had an effect on the dependent variable. Inferential statistics are typically not used.

In visually inspecting their data, single-subject researchers take several factors into account. One of them is changes in the level of the dependent variable from condition to condition. If the dependent variable is much higher or much lower in one condition than another, this suggests that the treatment had an effect. A second factor is trend, which refers to gradual increases or decreases in the dependent variable across observations. If the dependent variable begins increasing or decreasing with a change in conditions, then again this suggests that the treatment had an effect. It can be especially telling when a trend changes directions—for example, when an unwanted behavior is increasing during baseline but then begins to decrease with the introduction of the treatment. A third factor is latency, which is the time it takes for the dependent variable to begin changing after a change in conditions. In general, if a change in the dependent variable begins shortly after a change in conditions, this suggests that the treatment was responsible.
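Although visual inspection is ultimately a judgment, the first two factors can be summarized numerically. A small sketch with invented observations computes the change in level between phases and the trend (fitted slope) within each phase; latency is then judged from how soon after the phase change a shift appears:

    import numpy as np

    # Invented observations from a baseline (A) phase and a treatment (B) phase.
    baseline = np.array([7, 6, 7, 8, 6, 7])
    treatment = np.array([3, 2, 2, 1, 2, 1])

    # Level: the difference in typical responding between conditions.
    level_change = treatment.mean() - baseline.mean()

    # Trend: the slope of a straight line fit to each phase's observations.
    baseline_slope = np.polyfit(np.arange(len(baseline)), baseline, 1)[0]
    treatment_slope = np.polyfit(np.arange(len(treatment)), treatment, 1)[0]

    print(level_change)                      # -5.0: a large drop in level
    print(baseline_slope, treatment_slope)   # near-zero vs. slightly decreasing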

In the top panel of Figure 7.7 , there are fairly obvious changes in the level and trend of the dependent variable from condition to condition. Furthermore, the latencies of these changes are short; the change happens immediately. This pattern of results strongly suggests that the treatment was responsible for the changes in the dependent variable. In the bottom panel of Figure 7.7 , however, the changes in level are fairly small. And although there appears to be an increasing trend in the treatment condition, it looks as though it might be a continuation of a trend that had already begun during baseline. This pattern of results strongly suggests that the treatment was not responsible for any changes in the dependent variable—at least not to the extent that single-subject researchers typically hope to see.

Figure 7.7: Visual inspection of the data suggests an effective treatment in the top panel but an ineffective treatment in the bottom panel.

The results of single-subject research can also be analyzed using statistical procedures—and this is becoming more common. There are many different approaches, and single-subject researchers continue to debate which are the most useful. One approach parallels what is typically done in group research. The mean and standard deviation of each participant’s responses under each condition are computed and compared, and inferential statistical tests such as the t test or analysis of variance are applied ( Fisch, 2001 ) . (Note that averaging across participants is less common.) Another approach is to compute the percentage of nonoverlapping data (PND) for each participant ( Scruggs & Mastropieri, 2021 ) . This is the percentage of responses in the treatment condition that are more extreme than the most extreme response in a relevant control condition. In the study of Hall and his colleagues, for example, all measures of Robbie’s study time in the first treatment condition were greater than the highest measure in the first baseline, for a PND of 100%. The greater the percentage of nonoverlapping data, the stronger the treatment effect. Still, formal statistical approaches to data analysis in single-subject research are generally considered a supplement to visual inspection, not a replacement for it.
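
Because its definition is so simple, PND is easy to compute directly. Here is a minimal sketch with hypothetical data, assuming, as in the Robbie example, that higher scores are desirable; for a behavior one hopes to reduce, the comparison is reversed.

```python
def percentage_of_nonoverlapping_data(baseline, treatment, higher_is_better=True):
    """PND: the percentage of treatment observations more extreme than the
    most extreme observation in a relevant baseline condition."""
    if higher_is_better:
        cutoff = max(baseline)
        nonoverlapping = [x for x in treatment if x > cutoff]
    else:
        cutoff = min(baseline)
        nonoverlapping = [x for x in treatment if x < cutoff]
    return 100.0 * len(nonoverlapping) / len(treatment)

# Hypothetical percentages of intervals spent studying.
baseline = [30, 25, 40, 35, 20]
treatment = [55, 70, 65, 80, 75]
print(percentage_of_nonoverlapping_data(baseline, treatment))  # 100.0
```

A PND of 100%, as in this made-up example, means that every treatment observation exceeded the most extreme baseline observation.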

The Single-Subject Versus Group “Debate”

Single-subject research is similar to group research—especially experimental group research—in many ways. They are both quantitative approaches that try to establish causal relationships by manipulating an independent variable, measuring a dependent variable, and controlling extraneous variables. Their differences, however, have sometimes led to disagreements between single-subject and group researchers, centering on the issues of data analysis and external validity. As we will see, single-subject research and group research are probably best conceptualized as complementary approaches.

Data Analysis

One set of disagreements revolves around the issue of data analysis. Some advocates of group research worry that visual inspection is inadequate for deciding whether and to what extent a treatment has affected a dependent variable. One specific concern is that visual inspection is not sensitive enough to detect weak effects. A second is that visual inspection can be unreliable, with different researchers reaching different conclusions about the same set of data ( Danov & Symons, 2008 ) . A third is that the results of visual inspection—an overall judgment of whether or not a treatment was effective—cannot be clearly and efficiently summarized or compared across studies (unlike the measures of relationship strength typically used in group research).

In general, single-subject researchers share these concerns. However, they also argue that their use of the steady state strategy, combined with their focus on strong and consistent effects, minimizes most of them. If the effect of a treatment is difficult to detect by visual inspection because the effect is weak or the data are noisy, then single-subject researchers look for ways to increase the strength of the effect or reduce the noise in the data by controlling extraneous variables (e.g., by administering the treatment more consistently). If the effect is still difficult to detect, then they are likely to consider it neither strong enough nor consistent enough to be of further interest. Many single-subject researchers also point out that statistical analysis is becoming increasingly common and that many of them are using it as a supplement to visual inspection—especially for the purpose of comparing results across studies ( Scruggs & Mastropieri, 2021 ) .

Turning the tables, some advocates of single-subject research worry about the way that group researchers analyze their data. Specifically, they point out that focusing on group means can be highly misleading. Again, imagine that a treatment has a strong positive effect on half the people exposed to it and an equally strong negative effect on the other half. In a traditional between-subjects experiment, the positive effect on half the participants in the treatment condition would be statistically cancelled out by the negative effect on the other half. The mean for the treatment group would then be the same as the mean for the control group, making it seem as though the treatment had no effect when in fact it had a strong effect on every single participant!

But again, group researchers share this concern. Although they do focus on group statistics, they also emphasize the importance of examining distributions of individual scores. For example, if some participants were positively affected by a treatment and others negatively affected by it, this would produce a bimodal distribution of scores and could be detected by looking at a histogram of the data. The use of within-subjects designs is another strategy that allows group researchers to observe effects at the individual level and even to specify what percentage of individuals exhibit strong, medium, weak, and even negative effects.

External Validity

The second issue about which single-subject and group researchers sometimes disagree has to do with external validity—the ability to generalize the results of a study beyond the people and situation actually studied. In particular, advocates of group research point out the difficulty in knowing whether results for just a few participants are likely to generalize to others in the population. Imagine, for example, that in a single-subject study, a treatment has been shown to reduce self-injury for each of two developmentally disabled children. Even if the effect is strong for these two children, how can one know whether this treatment is likely to work for other developmentally disabled children?

Again, single-subject researchers share this concern. In response, they note that the strong and consistent effects they are typically interested in—even when observed in small samples—are likely to generalize to others in the population. Single-subject researchers also note that they place a strong emphasis on replicating their research results. When they observe an effect with a small sample of participants, they typically try to replicate it with another small sample—perhaps with a slightly different type of participant or under slightly different conditions. Each time they observe similar results, they rightfully become more confident in the generality of those results. Single-subject researchers can also point to the fact that the principles of classical and operant conditioning—most of which were discovered using the single-subject approach—have been successfully generalized across an incredibly wide range of species and situations.

And again turning the tables, single-subject researchers have concerns of their own about the external validity of group research. One extremely important point they make is that studying large groups of participants does not entirely solve the problem of generalizing to other individuals. Imagine, for example, a treatment that has been shown to have a small positive effect on average in a large group study. It is likely that although many participants exhibited a small positive effect, others exhibited a large positive effect, and still others exhibited a small negative effect. When it comes to applying this treatment to another large group , we can be fairly sure that it will have a small effect on average. But when it comes to applying this treatment to another individual , we cannot be sure whether it will have a small, a large, or even a negative effect. Another point that single-subject researchers make is that group researchers also face a similar problem when they study a single situation and then generalize their results to other situations. For example, researchers who conduct a study on the effect of cell phone use on drivers on a closed oval track probably want to apply their results to drivers in many other real-world driving situations. But notice that this requires generalizing from a single situation to a population of situations. Thus the ability to generalize is based on much more than just the sheer number of participants one has studied. It requires a careful consideration of the similarity of the participants and situations studied to the population of participants and situations that one wants to generalize to ( Shadish et al., 2002 ) .

Single-Subject and Group Research as Complementary Methods

As with quantitative and qualitative research, it is probably best to conceptualize single-subject research and group research as complementary methods that have different strengths and weaknesses and that are appropriate for answering different kinds of research questions ( Kazdin, 2019 ) . Single-subject research is particularly good for testing the effectiveness of treatments on individuals when the focus is on strong, consistent, and biologically or socially important effects. It is especially useful when the behavior of particular individuals is of interest. Clinicians who work with only one individual at a time may find that it is their only option for doing systematic quantitative research.

Group research, on the other hand, is good for testing the effectiveness of treatments at the group level. Among the advantages of this approach is that it allows researchers to detect weak effects, which can be of interest for many reasons. For example, finding a weak treatment effect might lead to refinements of the treatment that eventually produce a larger and more meaningful effect. Group research is also good for studying interactions between treatments and participant characteristics. For example, if a treatment is effective for those who are high in motivation to change and ineffective for those who are low in motivation to change, then a group design can detect this much more efficiently than a single-subject design. Group research is also necessary to answer questions that cannot be addressed using the single-subject approach, including questions about independent variables that cannot be manipulated (e.g., number of siblings, extroversion, culture).

Key Takeaways

  • Single-subject research—which involves testing a small number of participants and focusing intensively on the behavior of each individual—is an important alternative to group research in psychology.
  • Single-subject studies must be distinguished from case studies, in which an individual case is described in detail. Case studies can be useful for generating new research questions, for studying rare phenomena, and for illustrating general principles. However, they cannot substitute for carefully controlled experimental or correlational studies because they are low in internal and external validity.
  • Single-subject research designs typically involve measuring the dependent variable repeatedly over time and changing conditions (e.g., from baseline to treatment) when the dependent variable has reached a steady state. This approach allows the researcher to see whether changes in the independent variable are causing changes in the dependent variable.
  • Single-subject researchers typically analyze their data by graphing them and making judgments about whether the independent variable is affecting the dependent variable based on level, trend, and latency.
  • Differences between single-subject research and group research sometimes lead to disagreements between single-subject and group researchers. These disagreements center on the issues of data analysis and external validity (especially generalization to other people). Single-subject research and group research are probably best seen as complementary methods, with different strengths and weaknesses, that are appropriate for answering different kinds of research questions.

Exercises

  • Practice: Design a simple single-subject study (using either a reversal design or a multiple-baseline design) to answer one of the following questions:
  • Does positive attention from a parent increase a child’s toothbrushing behavior?
  • Does self-testing while studying improve a student’s performance on weekly spelling tests?
  • Does regular exercise help relieve depression?
  • Practice: Create a graph that displays the hypothetical results for the study you designed in Exercise 1. Write a paragraph in which you describe what the results show. Be sure to comment on level, trend, and latency.
  • Discussion: Imagine you have conducted a single-subject study showing a positive effect of a treatment on the behavior of a man with social anxiety disorder. Your research has been criticized on the grounds that it cannot be generalized to others. How could you respond to this criticism?
  • Discussion: Imagine you have conducted a group study showing a positive effect of a treatment on the behavior of a group of people with social anxiety disorder, but your research has been criticized on the grounds that “average” effects cannot be generalized to individuals. How could you respond to this criticism?

7.6 Glossary

ABA design

The simplest reversal design, in which there is a baseline condition (A), followed by a treatment condition (B), followed by a return to baseline (A).

applied behavior analysis

A subfield of psychology that uses single-subject research and applies the principles of behavior analysis to real-world problems in areas that include education, developmental disabilities, organizational behavior, and health behavior.

baseline

A condition in a single-subject research design in which the dependent variable is measured repeatedly in the absence of any treatment. Most designs begin with a baseline condition, and many return to the baseline condition at least once.

case study

A detailed description of an individual case.

experimental analysis of behavior

A subfield of psychology founded by B. F. Skinner that uses single-subject research—often with nonhuman animals—to study relationships primarily between environmental conditions and objectively observable behaviors.

group research

A type of quantitative research that involves studying a large number of participants and examining their behavior in terms of means, standard deviations, and other group-level statistics.

interrupted time-series design

A research design in which a series of measurements of the dependent variable are taken both before and after a treatment.

item-order effect

The effect of responding to one survey item on responses to a later survey item.

maturation

Refers collectively to extraneous developmental changes in participants that can occur between a pretest and posttest or between the first and last measurements in a time series. It can provide an alternative explanation for an observed change in the dependent variable.

multiple-baseline design

A single-subject research design in which multiple baselines are established for different participants, different dependent variables, or different contexts and the treatment is introduced at a different time for each baseline.

naturalistic observation

An approach to data collection in which the behavior of interest is observed in the environment in which it typically occurs.

nonequivalent groups design

A between-subjects research design in which participants are not randomly assigned to conditions, usually because participants are in preexisting groups (e.g., students at different schools).

nonexperimental research

Research that lacks the manipulation of an independent variable or the random assignment of participants to conditions or orders of conditions.

open-ended item

A questionnaire item that asks a question and allows respondents to respond in whatever way they want.

percentage of nonoverlapping data

A statistic sometimes used in single-subject research. The percentage of observations in a treatment condition that are more extreme than the most extreme observation in a relevant baseline condition.

pretest-posttest design

A research design in which the dependent variable is measured (the pretest), a treatment is given, and the dependent variable is measured again (the posttest) to see if there is a change in the dependent variable from pretest to posttest.

quasi-experimental research

Research that involves the manipulation of an independent variable but lacks the random assignment of participants to conditions or orders of conditions. It is generally used in field settings to test the effectiveness of a treatment.

rating scale

An ordered set of response options to a closed-ended questionnaire item.

regression to the mean

The statistical fact that an individual who scores extremely on one occasion will tend to score less extremely on the next occasion.

respondent

A term often used to refer to a participant in survey research.

reversal design

A single-subject research design that begins with a baseline condition with no treatment, followed by the introduction of a treatment, and after that a return to the baseline condition. It can include additional treatment conditions and returns to baseline.

single-subject research

A type of quantitative research that involves examining in detail the behavior of each of a small number of participants.

single-variable research

Research that focuses on a single variable rather than on a statistical relationship between variables.

social validity

The extent to which a single-subject study focuses on an intervention that has a substantial effect on an important behavior and can be implemented reliably in the real-world contexts (e.g., by teachers in a classroom) in which that behavior occurs.

spontaneous remission

Improvement in a psychological or medical problem over time without any treatment.

steady state strategy

In single-subject research, allowing behavior to become fairly consistent from one observation to the next before changing conditions. This makes any effect of the treatment easier to detect.

survey research

A quantitative research approach that uses self-report measures and large, carefully selected samples.

testing effect

A bias in participants’ responses in which scores on the posttest are influenced by simple exposure to the pretest.

visual inspection

The primary approach to data analysis in single-subject research, which involves graphing the data and making a judgment as to whether and to what extent the independent variable affected the dependent variable.


14 - Quasi-Experimental Research

Published online by Cambridge University Press: 25 May 2023

In this chapter, we discuss the logic and practice of quasi-experimentation. Specifically, we describe four quasi-experimental designs – one-group pretest–posttest designs, non-equivalent group designs, regression discontinuity designs, and interrupted time-series designs – and their statistical analyses in detail. Both simple quasi-experimental designs and embellishments of these simple designs are presented. Potential threats to internal validity are illustrated along with means of addressing their potentially biasing effects so that these effects can be minimized. In contrast to quasi-experiments, randomized experiments are often thought to be the gold standard when estimating the effects of treatment interventions. However, circumstances frequently arise where quasi-experiments can usefully supplement randomized experiments or when quasi-experiments can fruitfully be used in place of randomized experiments. Researchers need to appreciate the relative strengths and weaknesses of the various quasi-experiments so they can choose among pre-specified designs or craft their own unique quasi-experiments.


Reichardt CS, Storage D, Abraham D, 2023. Quasi-Experimental Research, in: Nichols AL, Edlund J (Eds.), The Cambridge Handbook of Research Methods and Statistics for the Social and Behavioral Sciences. Cambridge University Press. https://doi.org/10.1017/9781009010054.015


Experimental and Quasi-Experimental Designs in Implementation Research

Christopher J. Miller

a VA Boston Healthcare System, Center for Healthcare Organization and Implementation Research (CHOIR), United States Department of Veterans Affairs, Boston, MA, USA

b Department of Psychiatry, Harvard Medical School, Boston, MA, USA

Shawna N. Smith

c Department of Psychiatry, University of Michigan Medical School, Ann Arbor, MI, USA

d Survey Research Center, Institute for Social Research, University of Michigan, Ann Arbor, MI, USA

Marianne Pugatch

Implementation science is focused on maximizing the adoption, appropriate use, and sustainability of effective clinical practices in real world clinical settings. Many implementation science questions can be feasibly answered by fully experimental designs, typically in the form of randomized controlled trials (RCTs). Implementation-focused RCTs, however, usually differ from traditional efficacy- or effectiveness-oriented RCTs on key parameters. Other implementation science questions are more suited to quasi-experimental designs, which are intended to estimate the effect of an intervention in the absence of randomization. These designs include pre-post designs with a non-equivalent control group, interrupted time series (ITS), and stepped wedges, the last of which require all participants to receive the intervention, but in a staggered fashion. In this article we review the use of experimental designs in implementation science, including recent methodological advances for implementation studies. We also review the use of quasi-experimental designs in implementation science, and discuss the strengths and weaknesses of these approaches. This article is therefore meant to be a practical guide for researchers who are interested in selecting the most appropriate study design to answer relevant implementation science questions, and thereby increase the rate at which effective clinical practices are adopted, spread, and sustained.

Background

The first documented clinical trial was conducted in 1747 by James Lind, a Royal Navy physician, who tested the hypothesis that citrus fruit could cure scurvy. Since then, based on foundational work by Fisher and others (1935), the randomized controlled trial (RCT) has emerged as the gold standard for testing the efficacy of a treatment versus a control condition for individual patients. Randomization of patients is seen as crucial to reducing the impact of measured or unmeasured confounding variables, in turn allowing researchers to draw conclusions regarding causality in clinical trials.

As described elsewhere in this special issue, implementation science is ultimately focused on maximizing the adoption, appropriate use, and sustainability of effective clinical practices in real world clinical settings. As such, some implementation science questions may be addressed by experimental designs. For our purposes here, we use the term “experimental” to refer to designs that feature two essential ingredients: first, manipulation of an independent variable; and second, random assignment of subjects. This corresponds to the definition of randomized experiments originally championed by Fisher (1925). From this perspective, experimental designs usually take the form of RCTs—but implementation-oriented RCTs typically differ in important ways from traditional efficacy- or effectiveness-oriented RCTs. Other implementation science questions require different methodologies entirely: specifically, several forms of quasi-experimental designs may be used for implementation research in situations where an RCT would be inappropriate. These designs are intended to estimate the effect of an intervention despite a lack of randomization. Quasi-experimental designs include pre-post designs with a non-equivalent control group, interrupted time series (ITS), and stepped wedge designs. Stepped wedges are studies in which all participants receive the intervention, but in a staggered fashion. It is important to note that quasi-experimental designs are not unique to implementation science. As we will discuss below, however, each of them has strengths that make them particularly useful in certain implementation science contexts.

Our goal for this manuscript is two-fold. First, we will summarize the use of experimental designs in implementation science. This will include discussion of ways that implementation-focused RCTs may differ from efficacy- or effectiveness-oriented RCTs. Second, we will summarize the use of quasi-experimental designs in implementation research. This will include discussion of the strengths and weaknesses of these types of approaches in answering implementation research questions. For both experimental and quasi-experimental designs, we will discuss a recent implementation study as an illustrative example of one approach.

1. Experimental Designs in Implementation Science

RCTs in implementation science share the same basic structure as efficacy- or effectiveness-oriented RCTs, but typically feature important distinctions. In this section we will start by reviewing key factors that separate implementation RCTs from more traditional efficacy- or effectiveness-oriented RCTs. We will then discuss optimization trials, which are a type of experimental design that is especially useful for certain implementation science questions. We will then briefly turn our attention to single subject experimental designs (SSEDs) and on-off-on (ABA) designs.

The first common difference that sets apart implementation RCTs from more traditional clinical trials is the primary research question they aim to address. For most implementation trials, the primary research question is not the extent to which a particular treatment or evidence-based practice is more effective than a comparison condition, but instead the extent to which a given implementation strategy is more effective than a comparison condition. For more detail on this pivotal issue, see Drs. Bauer and Kirchner in this special issue.

Second, as a corollary of this point, implementation RCTs typically feature different outcome measures than efficacy or effectiveness RCTs, with an emphasis on the extent to which a health intervention was successfully implemented rather than an evaluation of the health effects of that intervention ( Proctor et al., 2011 ). For example, typical implementation outcomes might include the number of patients who receive the intervention, or the number of providers who administer the intervention as intended. A variety of evaluation-oriented implementation frameworks may guide the choices of such measures (e.g. RE-AIM; Gaglio et al., 2013 ; Glasgow et al., 1999 ). Hybrid implementation-effectiveness studies attend to both effectiveness and implementation outcomes ( Curran et al., 2012 ); these designs are also covered in more detail elsewhere in this issue (Landes, this issue).

Third, given their focus, implementation RCTs are frequently cluster-randomized (i.e. with sites or clinics as the unit of randomization, and patients nested within those sites or clinics). For example, consider a hypothetical RCT that aims to evaluate the implementation of a training program for cognitive behavioral therapy (CBT) in community clinics. Randomizing at the patient level for such a trial would be inappropriate due to the risk of contamination, as providers trained in CBT might reasonably be expected to incorporate CBT principles into their treatment of even those patients assigned to the control condition. Randomizing at the provider level would also risk contamination, as providers trained in CBT might discuss this treatment approach with their colleagues. Thus, many implementation trials are cluster randomized at the site or clinic level. While such clustering minimizes the risk of contamination, it can unfortunately create commensurate problems with confounding, especially for trials with very few sites to randomize. Stratification may be used to at least partially address confounding issues in cluster-randomized and more traditional trials alike, by ensuring that intervention and control groups are broadly similar on certain key variables. Furthermore, such allocation schemes typically require analytic models that account for this clustering and the resulting correlations among error structures (e.g., generalized estimating equations [GEE] or mixed-effects models; Schildcrout et al., 2018).
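
To illustrate the analytic point, here is a minimal sketch of a GEE analysis of a simulated cluster-randomized implementation trial. All data and variable names are invented; the key idea is that the model is told which patients share a site, so the standard error for the implementation effect reflects the clustering.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)

# Simulate 20 clinics (the unit of randomization) with 30 patients each.
n_sites, n_patients = 20, 30
site = np.repeat(np.arange(n_sites), n_patients)
treated = np.repeat(rng.permutation([0, 1] * (n_sites // 2)), n_patients)
site_effect = np.repeat(rng.normal(0, 0.5, n_sites), n_patients)  # clustering

# Binary implementation outcome: did the patient receive the new practice?
logit = -0.5 + 0.8 * treated + site_effect
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))
df = pd.DataFrame({"site": site, "treated": treated, "y": y})

# GEE with an exchangeable working correlation accounts for the
# within-site correlation that cluster randomization induces.
model = sm.GEE.from_formula("y ~ treated", groups="site", data=df,
                            family=sm.families.Binomial(),
                            cov_struct=sm.cov_struct.Exchangeable())
print(model.fit().summary())
```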

1.1. Optimization trials

Key research questions in implementation science often involve determining which implementation strategies to provide, to whom, and when, to achieve optimal implementation success. As such, trials designed to evaluate comparative effectiveness, or to optimize provision of different types or intensities of implementation strategies, may be more appealing than traditional effectiveness trials. The methods described in this section are not unique to implementation science, but their application in the context of implementation trials may be particularly useful for informing implementation strategies.

While two-arm RCTs can be used to evaluate comparative effectiveness, trials focused on optimizing implementation support may use alternative experimental designs ( Collins et al., 2005 ; Collins et al., 2007 ). For example, in certain clinical contexts, multi-component “bundles” of implementation strategies may be warranted (e.g. a bundle consisting of clinician training, technical assistance, and audit/feedback to encourage clinicians to use a new evidence-based practice). In these situations, implementation researchers might consider using factorial or fractional-factorial designs. In the context of implementation science, these designs randomize participants (e.g. sites or providers) to different combinations of implementation strategies, and can be used to evaluate the effectiveness of each strategy individually to inform an optimal combination (e.g. Coulton et al., 2009 ; Pellegrini et al., 2014 ; Wyrick, et al., 2014 ). Such designs can be particularly useful in informing multi-component implementation strategies that are not redundant or overly burdensome ( Collins et al., 2014a ; Collins et al., 2009 ; Collins et al., 2007 ).
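
As a simple illustration of the structure of such a design, the sketch below enumerates the eight cells of a hypothetical 2×2×2 factorial in which each site is randomized to receive, or not receive, each of three implementation strategies (the strategy names are invented).

```python
from itertools import product

# Three on/off implementation strategy factors (hypothetical names).
strategies = ["training", "technical_assistance", "audit_feedback"]

for cell, combo in enumerate(product([0, 1], repeat=len(strategies)), start=1):
    active = [s for s, on in zip(strategies, combo) if on]
    print(f"Condition {cell}: " + (" + ".join(active) if active else "none"))
```

The main effect of each strategy is then estimated by averaging over the levels of the other factors, which is what makes factorial designs an efficient way to screen components of a multi-component implementation strategy.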

Researchers interested in optimizing sequences of implementation strategies that adapt to ongoing needs over time may be interested in a variant of factorial designs known as the sequential, multiple-assignment randomized trial (SMART; Almirall et al., 2012 ; Collins et al., 2014b ; Kilbourne et al., 2014b ; Lei et al., 2012 ; Nahum-Shani et al., 2012 ; NeCamp et al., 2017 ). SMARTs are multistage randomized trials in which some or all participants are randomized more than once, often based on ongoing information (e.g., treatment response). In implementation research, SMARTs can inform optimal sequences of implementation strategies to maximize downstream clinical outcomes. Thus, such designs are well-suited to answering questions about what implementation strategies should be used, in what order, to achieve the best outcomes in a given context.

One example of an implementation SMART is the Adaptive Implementation of Effective Program Trial (ADEPT; Kilbourne et al., 2014a ). ADEPT was a clustered SMART ( NeCamp et al., 2017 ) designed to inform an adaptive sequence of implementation strategies for implementing an evidence-based collaborative chronic care model, Life Goals ( Kilbourne et al., 2014c ; Kilbourne et al., 2012a ), into community-based practices. Life Goals, the clinical intervention being implemented, has proven effective at improving physical and mental health outcomes for patients with unipolar and bipolar depression by encouraging providers to instruct patients in self-management, and improving clinical information systems and care management across physical and mental health providers ( Bauer et al., 2006 ; Kilbourne et al., 2012a ; Kilbourne et al., 2008 ; Simon et al., 2006 ). However, in spite of its established clinical effectiveness, community-based clinics experienced a number of barriers in trying to implement the Life Goals model, and there were questions about how best to efficiently and effectively augment implementation strategies for clinics that struggled with implementation.

The ADEPT study was thus designed to determine the best sequence of implementation strategies to offer sites interested in implementing Life Goals. The ADEPT study involved use of three different implementation strategies. First, all sites received implementation support based on Replicating Effective Programs (REP), which offered an implementation manual, brief training, and low-level technical support (Kilbourne et al., 2007; Kilbourne et al., 2012b; Neumann and Sogolow, 2000). REP implementation support had been previously found to be low-cost and readily scalable, but also insufficient for uptake in many community-based settings (Kilbourne et al., 2015). For sites that failed to implement Life Goals under REP, two additional implementation strategies were considered as augmentations to REP: External Facilitation (EF; Kilbourne et al., 2014b; Stetler et al., 2006), consisting of phone-based mentoring in strategic skills from a study team member; and Internal Facilitation (IF; Kirchner et al., 2014), which supported protected time for a site employee to address barriers to program adoption.

The ADEPT study was designed to evaluate the best way to augment support for sites that were not able to implement Life Goals under REP, specifically querying whether it was better to augment REP with EF only or with the more intensive EF/IF, and whether augmentations should be provided all at once or staged. Intervention assignments are mapped in Figure 1. Seventy-nine community-based clinics across Michigan and Colorado were provided with initial implementation support under REP. After six months, implementation of the clinical intervention, Life Goals, was evaluated at all sites. Sites that had failed to reach an adequate level of delivery (defined as those sites enrolling fewer than ten patients in Life Goals, or those at which fewer than 50% of enrolled patients had received at least three Life Goals sessions) were considered non-responsive to REP and randomized to receive additional support through either EF or combined EF/IF. After six further months, Life Goals implementation at these sites was again evaluated. Sites surpassing the implementation response benchmark had their EF or EF/IF support discontinued. EF/IF sites that remained non-responsive continued to receive EF/IF for an additional six months. EF sites that remained non-responsive were randomized a second time to either continue with EF or further augment with IF. This design thus allowed for comparison of three different adaptive sequences of implementation support for sites that were initially non-responsive under REP (a schematic sketch of these assignment rules follows the list below):

Figure 1: SMART design from ADEPT trial.

  • Provide EF for 6 months; continue EF for a further six months for sites that remain non-responsive; discontinue EF for sites that are responsive;
  • Provide EF/IF for 6 months; continue EF/IF for a further six months for sites that remain non-responsive; discontinue EF/IF for sites that are responsive; and
  • Provide EF for 6 months; step up to EF/IF for a further six months for sites that remain non-responsive; discontinue EF for sites that are responsive.
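
As a schematic illustration (not study code), the sketch below encodes the staging rules just described. The function and its response-checking argument are hypothetical simplifications of ADEPT's actual benchmarks and procedures.

```python
import random

def adept_assignment(responsive_after):
    """Schematic of the ADEPT staging rules. `responsive_after(support)` is a
    hypothetical stand-in for the trial's implementation benchmark, returning
    True if the site responded under that level of support."""
    history = ["REP"]                        # months 0-6: REP for all sites
    if responsive_after("REP"):
        return history                       # responders get no augmentation
    first = random.choice(["EF", "EF/IF"])   # first randomization
    history.append(first)                    # months 6-12
    if responsive_after(first):
        return history                       # responders: support discontinued
    if first == "EF":
        # second randomization: continue EF or step up to EF/IF
        history.append(random.choice(["EF", "EF/IF"]))
    else:
        history.append("EF/IF")              # non-responsive EF/IF continues
    return history                           # months 12-18

print(adept_assignment(lambda support: False))  # a site that never responds
```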

While analyses of this study are still ongoing, including the comparison of these three adaptive sequences of implementation strategies, results have shown that patients at sites that were randomized to receive EF as the initial augmentation to REP saw more improvement in clinical outcomes (SF-12 mental health quality of life and PHQ-9 depression scores) after 12 months than patients at sites that were randomized to receive the more intensive EF/IF augmentation.

1.2. Single Subject Experimental Designs and On-Off-On (ABA) Designs

We also note that there are a variety of Single Subject Experimental Designs (SSEDs; Byiers et al., 2012 ), including withdrawal designs and alternating treatment designs, that can be used in testing evidence-based practices. Similarly, an implementation strategy may be used to encourage the use of a specific treatment at a particular site, followed by that strategy’s withdrawal and subsequent reinstatement, with data collection throughout the process (on-off-on or ABA design). A weakness of these approaches in the context of implementation science, however, is that they usually require reversibility of the intervention (i.e. that the withdrawal of implementation support truly allows the healthcare system to revert to its pre-implementation state). When this is not the case—for example, if a hypothetical study is focused on training to encourage use of an evidence-based psychotherapy—then these designs may be less useful.

2. Quasi-Experimental Designs in Implementation Science

In some implementation science contexts, policy-makers or administrators may not be willing to have a subset of participating patients or sites randomized to a control condition, especially for high-profile or high-urgency clinical issues. Quasi-experimental designs allow implementation scientists to conduct rigorous studies in these contexts, albeit with certain limitations. We briefly review the characteristics of these designs here; other recent review articles are available for the interested reader (e.g. Handley et al., 2018 ).

2.1. Pre-Post with Non-Equivalent Control Group

The pre-post design with a non-equivalent control group uses a control group in the absence of randomization. Ideally, the control group is chosen to be as similar to the intervention group as possible (e.g. by matching on factors such as clinic type, patient population, geographic region, etc.). Theoretically, both groups are exposed to the same trends in the environment, making it more plausible to attribute differential change to the intervention. Measurement of both treatment and control conditions classically occurs pre- and post-intervention, with differential improvement between the groups attributed to the intervention. This design is popular due to its practicality, especially if data collection points can be kept to a minimum. It may be especially useful for capitalizing on naturally occurring experiments such as may occur in the context of certain policy initiatives or rollouts—specifically, rollouts in which it is plausible that a control group can be identified. For example, Kirchner and colleagues (2014) used this type of design to evaluate the integration of mental health services into primary care clinics at seven US Department of Veterans Affairs (VA) medical centers and seven matched controls.
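
Analytically, this design is often handled as a difference-in-differences: the pre-to-post change in the intervention group is compared with the pre-to-post change in the control group. Here is a minimal sketch with hypothetical data and invented variable names.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical outcome scores, measured pre (time=0) and post (time=1)
# in a treatment group and a matched, non-randomized control group.
df = pd.DataFrame({
    "group": ["treat"] * 4 + ["control"] * 4,
    "time":  [0, 0, 1, 1] * 2,
    "y":     [50, 52, 64, 66,  51, 49, 55, 53],
})

# The group-by-time interaction is the difference-in-differences estimate:
# how much more the treatment group improved than the control group.
fit = smf.ols("y ~ C(group, Treatment('control')) * time", data=df).fit()
print(fit.params)
```

In these made-up data the treatment group improves about 10 points more than the control group, which the interaction coefficient recovers.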

One overarching drawback of this design is that it is especially vulnerable to threats to internal validity ( Shadish, 2002 ), because pre-existing differences between the treatment and control group could erroneously be attributed to the intervention. While unmeasured differences between treatment and control groups are always a possibility in healthcare research, such differences are especially likely to occur in the context of these designs due to the lack of randomization. Similarly, this design is particularly sensitive to secular trends that may differentially affect the treatment and control groups ( Cousins et al., 2014 ; Pape et al., 2013 ), as well as regression to the mean confounding study results ( Morton and Torgerson, 2003 ). For example, if a study site is selected for the experimental condition precisely because it is underperforming in some way, then regression to the mean would suggest that the site will show improvement regardless of any intervention; in the context of a pre-post with non-equivalent control group study, however, this improvement would erroneously be attributed to the intervention itself (Type I error).

There are, however, various ways that implementation scientists can mitigate these weaknesses. First, as mentioned briefly above, it is important to select a control group that is as similar as possible to the intervention site(s), which can include matching at both the health care network and clinic level (e.g. Kirchner et al., 2014 ). Second, propensity score weighting (e.g. Morgan, 2018 ) can statistically mitigate internal validity concerns, although this approach may be of limited utility when comparing secular trends between different study cohorts ( Dimick and Ryan, 2014 ). More broadly, qualitative methods (e.g. periodic interviews with staff at intervention and control sites) can help uncover key contextual factors that may be affecting study results above and beyond the intervention itself.
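
To make the propensity score idea concrete, here is a minimal sketch on simulated data; the covariates, effect sizes, and variable names are all invented for illustration.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)

# Hypothetical clinics: uptake of the intervention depends on clinic size
# and rurality, so treated and control clinics are not directly comparable.
n = 200
df = pd.DataFrame({"size": rng.normal(0, 1, n), "rural": rng.binomial(1, 0.4, n)})
p_treat = 1 / (1 + np.exp(-(0.8 * df["size"] - 0.6 * df["rural"])))
df["treated"] = rng.binomial(1, p_treat)

# Step 1: model each clinic's probability of treatment from observed covariates.
ps_model = smf.logit("treated ~ size + rural", data=df).fit(disp=False)
df["ps"] = ps_model.predict(df)

# Step 2: inverse-probability-of-treatment weights reweight the sample so the
# measured covariates are balanced across groups, mimicking randomization
# with respect to those covariates only.
df["w"] = np.where(df["treated"] == 1, 1 / df["ps"], 1 / (1 - df["ps"]))
print(df.groupby("treated")["w"].mean())
```

The weighted outcome analysis can then proceed as if treatment were approximately independent of the measured covariates, though, as noted above, no amount of weighting corrects for covariates that were never measured.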

2.2. Interrupted Time Series

Interrupted time series (ITS; Shadish, 2002 ; Taljaard et al., 2014 ; Wagner et al., 2002 ) designs represent one of the most robust categories of quasi-experimental designs. Rather than relying on a non-equivalent control group, ITS designs rely on repeated data collections from intervention sites to determine whether a particular intervention is associated with improvement on a given metric relative to the pre-intervention secular trend. They are particularly useful in cases where a comparable control group cannot be identified—for example, following widespread implementation of policy mandates, quality improvement initiatives, or dissemination campaigns ( Eccles et al., 2003 ). In ITS designs, data are collected at multiple time points both before and after an intervention (e.g., policy change, implementation effort), and analyses explore whether the intervention was associated with the outcome beyond any pre-existing secular trend. More formally, ITS evaluations focus on identifying whether there is discontinuity in the trend (change in slope or level) after the intervention relative to before the intervention, using segmented regression to model pre- and post-intervention trends ( Gebski et al., 2012 ; Penfold and Zhang, 2013 ; Taljaard et al., 2014 ; Wagner et al., 2002 ). A number of recent implementation studies have used ITS designs, including an evaluation of implementation of a comprehensive smoke-free policy in a large UK mental health organization to reduce physical assaults ( Robson et al., 2017 ); the impact of a national policy limiting alcohol availability on suicide mortality in Slovenia ( Pridemore and Snowden, 2009 ); and the effect of delivery of a tailored intervention for primary care providers to increase psychological referrals for women with mild to moderate postnatal depression ( Hanbury et al., 2013 ).
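
Here is a minimal sketch of the segmented regression model just described, fit to a simulated monthly series; all numbers are hypothetical.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)

# Hypothetical monthly outcome: 12 points before and 12 after an intervention.
t = np.arange(24)
post = (t >= 12).astype(int)            # 1 after the interruption
time_since = np.where(post, t - 11, 0)  # months elapsed post-intervention
y = 50 + 0.5 * t - 6 * post - 0.8 * time_since + rng.normal(0, 1.5, 24)
df = pd.DataFrame({"y": y, "t": t, "post": post, "time_since": time_since})

# Segmented regression: `t` carries the pre-existing secular trend, `post`
# the immediate change in level, `time_since` the change in slope.
fit = smf.ols("y ~ t + post + time_since", data=df).fit()
print(fit.params)
```

Real ITS analyses also need to handle autocorrelation in the series, for example with Newey-West standard errors or ARIMA-type models.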

ITS designs are appealing in implementation work for several reasons. Relative to uncontrolled pre-post analyses, ITS analyses reduce the chances that intervention effects are confounded by secular trends ( Bernal et al., 2017 ; Eccles et al., 2003 ). Time-varying confounders, such as seasonality, can also be adjusted for, provided adequate data ( Bernal et al., 2017 ). Indeed, recent work has confirmed that ITS designs can yield effect estimates similar to those derived from cluster-randomized RCTs ( Fretheim et al., 2013 ; Fretheim et al., 2015 ). Relative to an RCT, ITS designs can also allow for a more comprehensive assessment of the longitudinal effects of an intervention (positive or negative), as effects can be traced over all included time points ( Bernal et al., 2017 ; Penfold and Zhang, 2013 ).

ITS designs also present a number of challenges. First, the segmented regression approach requires clear delineation between pre- and post-intervention periods; interventions with indeterminate implementation periods are likely not good candidates for ITS. While ITS designs that include multiple ‘interruptions’ (e.g. introductions of new treatment components) are possible, they will require collection of enough time points between interruptions to ensure that each intervention’s effects can be ascertained individually ( Bernal et al., 2017 ). Second, collecting data from sufficient time points across all sites of interest, especially for the pre-intervention period, can be challenging ( Eccles et al., 2003 ): a common recommendation is at least eight time points both pre- and post-intervention ( Penfold and Zhang, 2013 ). This may be onerous, particularly if the data are not routinely collected by the health system(s) under study. Third, ITS cannot protect against confounding effects from other interventions that begin contemporaneously and may impact similar outcomes ( Eccles et al., 2003 ).

2.3. Stepped Wedge Designs

Stepped wedge trials are another type of quasi-experimental design. In a stepped wedge, all participants receive the intervention, but are assigned to the timing of the intervention in a staggered fashion ( Betran et al., 2018 ; Brown and Lilford, 2006 ; Hussey and Hughes, 2007 ), typically at the site or cluster level. Stepped wedge designs have their analytic roots in balanced incomplete block designs, in which all pairs of treatments occur an equal number of times within each block ( Hanani, 1961 ). Traditionally, all sites in stepped wedge trials have outcome measures assessed at all time points, thus allowing sites that receive the intervention later in the trial to essentially serve as controls for early intervention sites. A recent special issue of the journal Trials includes more detail on these designs ( Davey et al., 2015 ), which may be ideal for situations in which it is important for all participating patients or sites to receive the intervention during the trial. Stepped wedge trials may also be useful when resources are scarce enough that intervening at all sites at once (or even half of the sites as in a standard treatment-versus-control RCT) would not be feasible. If desired, the administration of the intervention to sites in waves allows for lessons learned in early sites to be applied to later sites (via formative evaluation; see Elwy et al., this issue).
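
A common analytic model for stepped wedge data, following Hussey and Hughes (2007), includes fixed effects for time period, an indicator for whether the site has crossed over to the intervention, and a random intercept for site. The sketch below fits that model to simulated data; the design dimensions and effect sizes are invented.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)

# Hypothetical stepped wedge: 6 sites cross from control to intervention
# one at a time over 7 measurement periods.
rows = []
for site in range(6):
    crossover = site + 1                 # site 0 crosses at period 1, etc.
    site_effect = rng.normal(0, 1)
    for period in range(7):
        treated = int(period >= crossover)
        y = 10 + 0.3 * period + 2.0 * treated + site_effect + rng.normal(0, 1)
        rows.append({"site": site, "period": period, "treated": treated, "y": y})
df = pd.DataFrame(rows)

# Period fixed effects absorb secular trends; the random site intercept
# accounts for clustering; `treated` estimates the intervention effect.
fit = smf.mixedlm("y ~ treated + C(period)", df, groups=df["site"]).fit()
print(fit.summary())
```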

The Behavioral Health Interdisciplinary Program (BHIP) Enhancement Project is a recent example of a stepped-wedge implementation trial ( Bauer et al., 2016 ; Bauer et al., 2019 ). This study involved using blended facilitation (including internal and external facilitators; Kirchner et al., 2014 ) to implement care practices consistent with the collaborative chronic care model (CCM; Bodenheimer et al., 2002a , b ; Wagner et al., 1996 ) in nine outpatient mental health teams in VA medical centers. Figure 2 illustrates the implementation and stepdown periods for that trial, with black dots representing primary data collection points.

Figure 2: BHIP Enhancement Project stepped wedge (adapted from Bauer et al., 2019).

The BHIP Enhancement Project was conducted as a stepped wedge for several reasons. First, the stepped wedge design allowed the trial to reach nine sites despite limited implementation resources (i.e. intervening at all nine sites simultaneously would not have been feasible given study funding). Second, the stepped wedge design aided in recruitment and retention, as all participating sites were certain to receive implementation support during the trial: at worst, sites that were randomized to later-phase implementation had to endure waiting periods totaling about eight months before implementation began. This was seen as a major strength of the design by its operational partner, the VA Office of Mental Health and Suicide Prevention. To keep sites engaged during the waiting period, the BHIP Enhancement Project offered a guiding workbook and monthly technical support conference calls.

Three additional features of the BHIP Enhancement Project deserve special attention. First, data collection for late-implementing sites did not begin until immediately before the onset of implementation support (see Figure 2 ). While this reduced statistical power, it also significantly reduced data collection burden on the study team. Second, onset of implementation support was staggered such that wave 2 began at the end of month 4 rather than month 6. This had two benefits: first, this compressed the overall amount of time required for implementation during the trial. Second, it meant that the study team only had to collect data from one site at a time, with data collection periods coming every 2–4 months. More traditional stepped wedge approaches typically have data collection across sites temporally aligned (e.g. Betran et al., 2018 ). Third, the BHIP Enhancement Project used a balancing algorithm ( Lew et al., 2019 ) to assign sites to waves, retaining some of the benefits of randomization while ensuring balance on key site characteristics (e.g. size, geographic region).
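
Balancing algorithms of this general kind are often implemented as constrained randomization: generate many candidate allocations, score each for balance on key site characteristics, and then choose randomly among the best-balanced candidates. The sketch below illustrates the idea with invented site data and an ad hoc imbalance score; it is not the specific algorithm of Lew et al. (2019).

```python
import random

# Hypothetical site characteristics: (size, 1 if urban else 0).
sites = {"A": (120, 1), "B": (80, 0), "C": (200, 1),
         "D": (60, 0), "E": (150, 1), "F": (90, 0)}

def imbalance(waves):
    """Ad hoc score: spread of mean size and mean urbanicity across waves."""
    means = [(sum(sites[s][0] for s in w) / len(w),
              sum(sites[s][1] for s in w) / len(w)) for w in waves]
    size_spread = max(m[0] for m in means) - min(m[0] for m in means)
    urban_spread = max(m[1] for m in means) - min(m[1] for m in means)
    return size_spread + 100 * urban_spread  # arbitrary relative weighting

def constrained_randomization(n_candidates=1000, keep_best=50):
    names = list(sites)
    scored = []
    for _ in range(n_candidates):
        random.shuffle(names)
        waves = [names[0:2], names[2:4], names[4:6]]  # three waves of two
        scored.append((imbalance(waves), waves))
    scored.sort(key=lambda c: c[0])
    return random.choice(scored[:keep_best])[1]       # random among the best

print(constrained_randomization())
```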

Despite their utility, stepped wedges have some important limitations. First, because they feature delayed implementation at some sites, stepped wedges typically take longer than similarly-sized parallel group RCTs. This increases the chances that secular trends, policy changes, or other external forces impact study results. Second, as with RCTs, imbalanced site assignment can confound results. This may occur deliberately in some cases—for example, if sites that develop their implementation plans first are assigned to earlier waves. Even if sites are randomized, however, early and late wave sites may still differ on important characteristics such as size, rurality, and case mix. The resulting confounding between site assignment and time can threaten the internal validity of the study—although, as above, balancing algorithms can reduce this risk. Third, the use of formative evaluation (Elwy, this issue), while useful for maximizing the utility of implementation efforts in a stepped wedge, can mean that late-wave sites receive different implementation strategies than early-wave sites. Similarly, formative evaluation may inform midstream adaptations to the clinical innovation being implemented. In either case, these changes may again threaten internal validity. Overall, then, stepped wedges represent useful tools for evaluating the impact of health interventions that (as with all designs) are subject to certain weaknesses and limitations.

3. Conclusions and Future Directions

Implementation science is focused on maximizing the extent to which effective healthcare practices are adopted, used, and sustained by clinicians, hospitals, and systems. Answering questions in these domains frequently requires different research methods than those employed in traditional efficacy- or effectiveness-oriented randomized clinical trials (RCTs). Implementation-oriented RCTs typically feature cluster or site-level randomization, and emphasize implementation outcomes (e.g. the number of patients receiving the new treatment as intended) rather than traditional clinical outcomes. Hybrid implementation-effectiveness designs incorporate both types of outcomes; more details on these approaches can be found elsewhere in this special issue (Landes, this issue). Other methodological innovations, such as factorial designs or sequential, multiple-assignment randomized trials (SMARTs), can address questions about multi-component or adaptive interventions, still under the umbrella of experimental designs. These types of trials may be especially important for demystifying the “black box” of implementation—that is, determining what components of an implementation strategy are most strongly associated with implementation success. In contrast, pre-post designs with non-equivalent control groups, interrupted time series (ITS), and stepped wedge designs are all examples of quasi-experimental designs that may serve implementation researchers when experimental designs would be inappropriate. A major theme cutting across each of these designs is that there are relative strengths and weaknesses associated with any study design decision. Determining what design to use ultimately will need to be informed by the primary research question to be answered, while simultaneously balancing the need for internal validity, external validity, feasibility, and ethics.

New innovations in study design are constantly being developed and refined. Several such innovations are covered in other articles within this special issue (e.g. Kim et al., this issue). One future direction relevant to the study designs presented in this article is the potential for adaptive trial designs, which allow information gleaned during the trial to inform the adaptation of components like treatment allocation, sample size, or study recruitment in the later phases of the same trial ( Pallmann et al., 2018 ). These designs are becoming increasingly popular in clinical treatment ( Bhatt and Mehta, 2016 ) but could also hold promise for implementation scientists, especially as interest grows in rapid-cycle testing of implementation strategies or efforts. Adaptive designs could potentially be incorporated into both SMART designs and stepped wedge studies, as well as traditional RCTs to further advance implementation science ( Cheung et al., 2015 ). Ideally, these and other innovations will provide researchers with increasingly robust and useful methodologies for answering timely implementation science questions.

Highlights

  • Many implementation science questions can be addressed by fully experimental designs (e.g. randomized controlled trials [RCTs]).
  • Implementation trials differ in important ways, however, from more traditional efficacy- or effectiveness-oriented RCTs.
  • Adaptive designs represent a recent innovation to determine optimal implementation strategies within a fully experimental framework.
  • Quasi-experimental designs can be used to answer implementation science questions in the absence of randomization.
  • The choice of study designs in implementation science requires careful consideration of scientific, pragmatic, and ethical issues.

Acknowledgments

This work was supported by Department of Veterans Affairs grants QUE 15–289 (PI: Bauer) and CIN 13403 and National Institutes of Health grant R01 MH 099898 (PI: Kilbourne).


References

  • Almirall D, Compton SN, Gunlicks-Stoessel M, Duan N, Murphy SA, 2012. Designing a pilot sequential multiple assignment randomized trial for developing an adaptive treatment strategy. Stat Med 31(17), 1887–1902.
  • Bauer MS, McBride L, Williford WO, Glick H, Kinosian B, Altshuler L, Beresford T, Kilbourne AM, Sajatovic M, Cooperative Studies Program 430 Study Team, 2006. Collaborative care for bipolar disorder: Part II. Impact on clinical outcome, function, and costs. Psychiatr Serv 57(7), 937–945.
  • Bauer MS, Miller C, Kim B, Lew R, Weaver K, Coldwell C, Henderson K, Holmes S, Seibert MN, Stolzmann K, Elwy AR, Kirchner J, 2016. Partnering with health system operations leadership to develop a controlled implementation trial. Implement Sci 11, 22.
  • Bauer MS, Miller CJ, Kim B, Lew R, Stolzmann K, Sullivan J, Riendeau R, Pitcock J, Williamson A, Connolly S, Elwy AR, Weaver K, 2019. Effectiveness of implementing a collaborative chronic care model for clinician teams on patient outcomes and health status in mental health: a randomized clinical trial. JAMA Netw Open 2(3), e190230.
  • Bernal JL, Cummins S, Gasparrini A, 2017. Interrupted time series regression for the evaluation of public health interventions: a tutorial. Int J Epidemiol 46(1), 348–355.
  • Betran AP, Bergel E, Griffin S, Melo A, Nguyen MH, Carbonell A, Mondlane S, Merialdi M, Temmerman M, Gulmezoglu AM, 2018. Provision of medical supply kits to improve quality of antenatal care in Mozambique: a stepped-wedge cluster randomised trial. Lancet Glob Health 6(1), e57–e65.
  • Bhatt DL, Mehta C, 2016. Adaptive designs for clinical trials. N Engl J Med 375(1), 65–74.
  • Bodenheimer T, Wagner EH, Grumbach K, 2002a. Improving primary care for patients with chronic illness. JAMA 288(14), 1775–1779.
  • Bodenheimer T, Wagner EH, Grumbach K, 2002b. Improving primary care for patients with chronic illness: the chronic care model, Part 2. JAMA 288(15), 1909–1914.
  • Brown CA, Lilford RJ, 2006. The stepped wedge trial design: a systematic review. BMC Med Res Methodol 6(1), 54.
  • Byiers BJ, Reichle J, Symons FJ, 2012. Single-subject experimental design for evidence-based practice. Am J Speech Lang Pathol 21(4), 397–414.
  • Cheung YK, Chakraborty B, Davidson KW, 2015. Sequential multiple assignment randomized trial (SMART) with adaptive randomization for quality improvement in depression treatment program. Biometrics 71(2), 450–459.
  • Collins LM, Dziak JJ, Kugler KC, Trail JB, 2014a. Factorial experiments: efficient tools for evaluation of intervention components. Am J Prev Med 47(4), 498–504.
  • Collins LM, Dziak JJ, Li R, 2009. Design of experiments with multiple independent variables: a resource management perspective on complete and reduced factorial designs. Psychol Methods 14(3), 202–224.
  • Collins LM, Murphy SA, Bierman KL, 2004. A conceptual framework for adaptive preventive interventions. Prev Sci 5(3), 185–196.
  • Collins LM, Murphy SA, Nair VN, Strecher VJ, 2005. A strategy for optimizing and evaluating behavioral interventions. Ann Behav Med 30(1), 65–73.
  • Collins LM, Murphy SA, Strecher V, 2007. The multiphase optimization strategy (MOST) and the sequential multiple assignment randomized trial (SMART): new methods for more potent eHealth interventions. Am J Prev Med 32(5 Suppl), S112–S118.
  • Collins LM, Nahum-Shani I, Almirall D, 2014b. Optimization of behavioral dynamic treatment regimens based on the sequential, multiple assignment, randomized trial (SMART). Clin Trials 11(4), 426–434.
  • Coulton S, Perryman K, Bland M, Cassidy P, Crawford M, Deluca P, Drummond C, Gilvarry E, Godfrey C, Heather N, Kaner E, Myles J, Newbury-Birch D, Oyefeso A, Parrott S, Phillips T, Shenker D, Shepherd J, 2009. Screening and brief interventions for hazardous alcohol use in accident and emergency departments: a randomised controlled trial protocol. BMC Health Serv Res 9, 114.
  • Cousins K, Connor JL, Kypri K, 2014. Effects of the Campus Watch intervention on alcohol consumption and related harm in a university population. Drug Alcohol Depend 143, 120–126.
  • Curran GM, Bauer M, Mittman B, Pyne JM, Stetler C, 2012. Effectiveness-implementation hybrid designs: combining elements of clinical effectiveness and implementation research to enhance public health impact. Med Care 50(3), 217–226.
  • Davey C, Hargreaves J, Thompson JA, Copas AJ, Beard E, Lewis JJ, Fielding KL, 2015. Analysis and reporting of stepped wedge randomised controlled trials: synthesis and critical appraisal of published studies, 2010 to 2014. Trials 16(1), 358.
  • Dimick JB, Ryan AM, 2014. Methods for evaluating changes in health care policy: the difference-in-differences approach. JAMA 312(22), 2401–2402.
  • Eccles M, Grimshaw J, Campbell M, Ramsay C, 2003. Research designs for studies evaluating the effectiveness of change and improvement strategies. Qual Saf Health Care 12(1), 47–52.
  • Fisher RA, 1925. Theory of statistical estimation. Mathematical Proceedings of the Cambridge Philosophical Society 22(5), 700–725.
  • Fisher RA, 1935. The design of experiments. Oliver and Boyd, Edinburgh.
  • Fretheim A, Soumerai SB, Zhang F, Oxman AD, Ross-Degnan D, 2013. Interrupted time-series analysis yielded an effect estimate concordant with the cluster-randomized controlled trial result. J Clin Epidemiol 66(8), 883–887.
  • Fretheim A, Zhang F, Ross-Degnan D, Oxman AD, Cheyne H, Foy R, Goodacre S, Herrin J, Kerse N, McKinlay RJ, Wright A, Soumerai SB, 2015. A reanalysis of cluster randomized trials showed interrupted time-series studies were valuable in health system evaluation. J Clin Epidemiol 68(3), 324–333.
  • Gaglio B, Shoup JA, Glasgow RE, 2013. The RE-AIM framework: a systematic review of use over time. Am J Public Health 103(6), e38–e46.
  • Gebski V, Ellingson K, Edwards J, Jernigan J, Kleinbaum D, 2012. Modelling interrupted time series to evaluate prevention and control of infection in healthcare. Epidemiol Infect 140(12), 2131–2141.
  • Glasgow RE, Vogt TM, Boles SM, 1999. Evaluating the public health impact of health promotion interventions: the RE-AIM framework. Am J Public Health 89(9), 1322–1327.
  • Hanani H, 1961. The existence and construction of balanced incomplete block designs. Ann Math Stat 32(2), 361–386.
  • Hanbury A, Farley K, Thompson C, Wilson PM, Chambers D, Holmes H, 2013. Immediate versus sustained effects: interrupted time series analysis of a tailored intervention. Implement Sci 8, 130.
  • Handley MA, Lyles CR, McCulloch C, Cattamanchi A, 2018. Selecting and improving quasi-experimental designs in effectiveness and implementation research. Annu Rev Public Health 39, 5–25.
  • Hussey MA, Hughes JP, 2007. Design and analysis of stepped wedge cluster randomized trials. Contemp Clin Trials 28(2), 182–191.
  • Kilbourne AM, Almirall D, Eisenberg D, Waxmonsky J, Goodrich DE, Fortney JC, Kirchner JE, Solberg LI, Main D, Bauer MS, Kyle J, Murphy SA, Nord KM, Thomas MR, 2014a. Protocol: Adaptive Implementation of Effective Programs Trial (ADEPT): cluster randomized SMART trial comparing a standard versus enhanced implementation strategy to improve outcomes of a mood disorders program. Implement Sci 9, 132.
  • Kilbourne AM, Almirall D, Goodrich DE, Lai Z, Abraham KM, Nord KM, Bowersox NW, 2014b. Enhancing outreach for persons with serious mental illness: 12-month results from a cluster randomized trial of an adaptive implementation strategy. Implement Sci 9, 163.
  • Kilbourne AM, Bramlet M, Barbaresso MM, Nord KM, Goodrich DE, Lai Z, Post EP, Almirall D, Verchinina L, Duffy SA, Bauer MS, 2014c. SMI life goals: description of a randomized trial of a collaborative care model to improve outcomes for persons with serious mental illness. Contemp Clin Trials 39(1), 74–85.
  • Kilbourne AM, Goodrich DE, Lai Z, Clogston J, Waxmonsky J, Bauer MS, 2012a. Life Goals Collaborative Care for patients with bipolar disorder and cardiovascular disease risk. Psychiatr Serv 63(12), 1234–1238.
  • Kilbourne AM, Goodrich DE, Nord KM, Van Poppelen C, Kyle J, Bauer MS, Waxmonsky JA, Lai Z, Kim HM, Eisenberg D, Thomas MR, 2015. Long-term clinical outcomes from a randomized controlled trial of two implementation strategies to promote collaborative care attendance in community practices. Adm Policy Ment Health 42(5), 642–653.
  • Kilbourne AM, Neumann MS, Pincus HA, Bauer MS, Stall R, 2007. Implementing evidence-based interventions in health care: application of the replicating effective programs framework. Implement Sci 2, 42.
  • Kilbourne AM, Neumann MS, Waxmonsky J, Bauer MS, Kim HM, Pincus HA, Thomas M, 2012b. Public-academic partnerships: evidence-based implementation: the role of sustained community-based practice and research partnerships. Psychiatr Serv 63(3), 205–207.
  • Kilbourne AM, Post EP, Nossek A, Drill L, Cooley S, Bauer MS, 2008. Improving medical and psychiatric outcomes among individuals with bipolar disorder: a randomized controlled trial. Psychiatr Serv 59(7), 760–768.
  • Kirchner JE, Ritchie MJ, Pitcock JA, Parker LE, Curran GM, Fortney JC, 2014. Outcomes of a partnered facilitation strategy to implement primary care-mental health. J Gen Intern Med 29(Suppl 4), 904–912.
  • Lei H, Nahum-Shani I, Lynch K, Oslin D, Murphy SA, 2012. A “SMART” design for building individualized treatment sequences. Annu Rev Clin Psychol 8, 21–48.
  • Lew RA, Miller CJ, Kim B, Wu H, Stolzmann K, Bauer MS, 2019. A robust method to reduce imbalance for site-level randomized controlled implementation trial designs. Implement Sci 14, 46.
  • Morgan CJ, 2018. Reducing bias using propensity score matching. J Nucl Cardiol 25(2), 404–406.
  • Morton V, Torgerson DJ, 2003. Effect of regression to the mean on decision making in health care. BMJ 326(7398), 1083–1084.
  • Nahum-Shani I, Qian M, Almirall D, Pelham WE, Gnagy B, Fabiano GA, Waxmonsky JG, Yu J, Murphy SA, 2012. Experimental design and primary data analysis methods for comparing adaptive interventions. Psychol Methods 17(4), 457–477.
  • NeCamp T, Kilbourne A, Almirall D, 2017. Comparing cluster-level dynamic treatment regimens using sequential, multiple assignment, randomized trials: regression estimation and sample size considerations. Stat Methods Med Res 26(4), 1572–1589.
  • Neumann MS, Sogolow ED, 2000. Replicating effective programs: HIV/AIDS prevention technology transfer. AIDS Educ Prev 12(5 Suppl), 35–48.
  • Pallmann P, Bedding AW, Choodari-Oskooei B, Dimairo M, Flight L, Hampson LV, Holmes J, Mander AP, Odondi L, Sydes MR, Villar SS, Wason JMS, Weir CJ, Wheeler GM, Yap C, Jaki T, 2018. Adaptive designs in clinical trials: why use them, and how to run and report them. BMC Med 16(1), 29.
  • Pape UJ, Millett C, Lee JT, Car J, Majeed A, 2013. Disentangling secular trends and policy impacts in health studies: use of interrupted time series analysis. J R Soc Med 106(4), 124–129.
  • Pellegrini CA, Hoffman SA, Collins LM, Spring B, 2014. Optimization of remotely delivered intensive lifestyle treatment for obesity using the Multiphase Optimization Strategy: Opt-IN study protocol. Contemp Clin Trials 38(2), 251–259.
  • Penfold RB, Zhang F, 2013. Use of interrupted time series analysis in evaluating health care quality improvements. Acad Pediatr 13(6 Suppl), S38–S44.
  • Pridemore WA, Snowden AJ, 2009. Reduction in suicide mortality following a new national alcohol policy in Slovenia: an interrupted time-series analysis. Am J Public Health 99(5), 915–920.
  • Proctor E, Silmere H, Raghavan R, Hovmand P, Aarons G, Bunger A, Griffey R, Hensley M, 2011. Outcomes for implementation research: conceptual distinctions, measurement challenges, and research agenda. Adm Policy Ment Health 38(2), 65–76.
  • Robson D, Spaducci G, McNeill A, Stewart D, Craig TJK, Yates M, Szatkowski L, 2017. Effect of implementation of a smoke-free policy on physical violence in a psychiatric inpatient setting: an interrupted time series analysis. Lancet Psychiatry 4(7), 540–546.
  • Schildcrout JS, Schisterman EF, Mercaldo ND, Rathouz PJ, Heagerty PJ, 2018. Extending the case-control design to longitudinal data: stratified sampling based on repeated binary outcomes. Epidemiology 29(1), 67–75.
  • Shadish WR, Cook TD, Campbell DT, 2002. Experimental and quasi-experimental designs for generalized causal inference. Houghton Mifflin, Boston, MA.
  • Simon GE, Ludman EJ, Bauer MS, Unutzer J, Operskalski B, 2006. Long-term effectiveness and cost of a systematic care program for bipolar disorder. Arch Gen Psychiatry 63(5), 500–508.
  • Stetler CB, Legro MW, Rycroft-Malone J, Bowman C, Curran G, Guihan M, Hagedorn H, Pineros S, Wallace CM, 2006. Role of “external facilitation” in implementation of research findings: a qualitative evaluation of facilitation experiences in the Veterans Health Administration. Implement Sci 1, 23.
  • Taljaard M, McKenzie JE, Ramsay CR, Grimshaw JM, 2014. The use of segmented regression in analysing interrupted time series studies: an example in pre-hospital ambulance care. Implement Sci 9, 77.
  • Wagner AK, Soumerai SB, Zhang F, Ross-Degnan D, 2002. Segmented regression analysis of interrupted time series studies in medication use research. J Clin Pharm Ther 27(4), 299–309.
  • Wagner EH, Austin BT, Von Korff M, 1996. Organizing care for patients with chronic illness. Milbank Q 74(4), 511–544.
  • Wyrick DL, Rulison KL, Fearnow-Kenney M, Milroy JJ, Collins LM, 2014. Moving beyond the treatment package approach to developing behavioral interventions: addressing questions that arose during an application of the Multiphase Optimization Strategy (MOST). Transl Behav Med 4(3), 252–259.


Chapter 7: Nonexperimental Research

Quasi-Experimental Research

Learning Objectives

  • Explain what quasi-experimental research is and distinguish it clearly from both experimental and correlational research.
  • Describe three different types of quasi-experimental research designs (nonequivalent groups, pretest-posttest, and interrupted time series) and identify examples of each one.

The prefix  quasi  means “resembling.” Thus quasi-experimental research is research that resembles experimental research but is not true experimental research. Although the independent variable is manipulated, participants are not randomly assigned to conditions or orders of conditions (Cook & Campbell, 1979). [1] Because the independent variable is manipulated before the dependent variable is measured, quasi-experimental research eliminates the directionality problem. But because participants are not randomly assigned—making it likely that there are other differences between conditions—quasi-experimental research does not eliminate the problem of confounding variables. In terms of internal validity, therefore, quasi-experiments are generally somewhere between correlational studies and true experiments.

Quasi-experiments are most likely to be conducted in field settings in which random assignment is difficult or impossible. They are often conducted to evaluate the effectiveness of a treatment—perhaps a type of psychotherapy or an educational intervention. There are many different kinds of quasi-experiments, but we will discuss just a few of the most common ones here.

Nonequivalent Groups Design

Recall that when participants in a between-subjects experiment are randomly assigned to conditions, the resulting groups are likely to be quite similar. In fact, researchers consider them to be equivalent. When participants are not randomly assigned to conditions, however, the resulting groups are likely to be dissimilar in some ways. For this reason, researchers consider them to be nonequivalent. A  nonequivalent groups design , then, is a between-subjects design in which participants have not been randomly assigned to conditions.

Imagine, for example, a researcher who wants to evaluate a new method of teaching fractions to third graders. One way would be to conduct a study with a treatment group consisting of one class of third-grade students (taught by Ms. Williams) and a control group consisting of another class (taught by Mr. Jones). This design would be a nonequivalent groups design because the students are not randomly assigned to classes by the researcher, which means there could be important differences between them. For example, the parents of higher-achieving or more motivated students might have been more likely to request that their children be assigned to Ms. Williams’s class. Or the principal might have assigned the “troublemakers” to Mr. Jones’s class because he is a stronger disciplinarian. Of course, the teachers’ styles, and even the classroom environments, might be very different and might cause different levels of achievement or motivation among the students. If at the end of the study there was a difference in the two classes’ knowledge of fractions, it might have been caused by the difference between the teaching methods—but it might have been caused by any of these confounding variables.

Of course, researchers using a nonequivalent groups design can take steps to ensure that their groups are as similar as possible. In the present example, the researcher could try to select two classes at the same school, where the students in the two classes have similar scores on a standardized math test and the teachers are the same sex, are close in age, and have similar teaching styles. Taking such steps would increase the internal validity of the study because it would eliminate some of the most important confounding variables. But without true random assignment of the students to conditions, there remains the possibility of other important confounding variables that the researcher was not able to control.
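One concrete step of this kind is to match students across the two classes on their pretest scores. The sketch below uses greedy nearest-neighbor matching with purely hypothetical student IDs and scores; real studies often use more sophisticated propensity-score methods, but the underlying idea is the same.

```python
# A minimal sketch of greedy nearest-neighbor matching on pretest scores,
# one way to make nonequivalent groups more comparable. All student IDs
# and scores are hypothetical.
treatment = {"T1": 62, "T2": 74, "T3": 81, "T4": 55}  # pretest scores
control = {"C1": 60, "C2": 59, "C3": 75, "C4": 83, "C5": 49}

available = dict(control)
for t_id, t_score in sorted(treatment.items(), key=lambda kv: kv[1]):
    # Pick the still-unmatched control student with the closest pretest.
    c_id = min(available, key=lambda c: abs(available[c] - t_score))
    gap = abs(available[c_id] - t_score)
    print(f"{t_id} (score {t_score}) matched to {c_id} (gap {gap})")
    del available[c_id]
```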

Pretest-Posttest Design

In a  pretest-posttest design , the dependent variable is measured once before the treatment is implemented and once after it is implemented. Imagine, for example, a researcher who is interested in the effectiveness of an antidrug education program on elementary school students’ attitudes toward illegal drugs. The researcher could measure the attitudes of students at a particular elementary school during one week, implement the antidrug program during the next week, and finally, measure their attitudes again the following week. The pretest-posttest design is much like a within-subjects experiment in which each participant is tested first under the control condition and then under the treatment condition. It is unlike a within-subjects experiment, however, in that the order of conditions is not counterbalanced because it typically is not possible for a participant to be tested in the treatment condition first and then in an “untreated” control condition.

If the average posttest score is better than the average pretest score, then it makes sense to conclude that the treatment might be responsible for the improvement. Unfortunately, one often cannot conclude this with a high degree of certainty because there may be other explanations for why the posttest scores are better. One category of alternative explanations goes under the name of  history . Other things might have happened between the pretest and the posttest. Perhaps an antidrug program aired on television and many of the students watched it, or perhaps a celebrity died of a drug overdose and many of the students heard about it. Another category of alternative explanations goes under the name of  maturation . Participants might have changed between the pretest and the posttest in ways that they were going to anyway because they are growing and learning. If it were a yearlong program, participants might become less impulsive or better reasoners and this might be responsible for the change.

Another alternative explanation for a change in the dependent variable in a pretest-posttest design is  regression to the mean . This refers to the statistical fact that an individual who scores extremely on a variable on one occasion will tend to score less extremely on the next occasion. For example, a bowler with a long-term average of 150 who suddenly bowls a 220 will almost certainly score lower in the next game. Her score will “regress” toward her mean score of 150. Regression to the mean can be a problem when participants are selected for further study  because  of their extreme scores. Imagine, for example, that only students who scored especially low on a test of fractions are given a special training program and then retested. Regression to the mean all but guarantees that their scores will be higher even if the training program has no effect. A closely related concept—and an extremely important one in psychological research—is  spontaneous remission . This is the tendency for many medical and psychological problems to improve over time without any form of treatment. The common cold is a good example. If one were to measure symptom severity in 100 common cold sufferers today, give them a bowl of chicken soup every day, and then measure their symptom severity again in a week, they would probably be much improved. This does not mean that the chicken soup was responsible for the improvement, however, because they would have been much improved without any treatment at all. The same is true of many psychological problems. A group of severely depressed people today is likely to be less depressed on average in 6 months. In reviewing the results of several studies of treatments for depression, researchers Michael Posternak and Ivan Miller found that participants in waitlist control conditions improved an average of 10 to 15% before they received any treatment at all (Posternak & Miller, 2001) [2] . Thus one must generally be very cautious about inferring causality from pretest-posttest designs.
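A short simulation makes this selection artifact visible. In the sketch below, each test score is a stable ability plus occasion-specific noise (all numbers are invented); students selected for extreme low pretest scores improve on retest even though no treatment occurs.

```python
# A small simulation of regression to the mean: score = stable ability +
# independent noise on each occasion. Selecting on extreme pretest scores
# guarantees apparent "improvement" at posttest. Numbers are hypothetical.
import random

rng = random.Random(1)
students = []
for _ in range(5000):
    ability = rng.gauss(100, 10)           # stable component
    pretest = ability + rng.gauss(0, 10)   # occasion-specific noise
    posttest = ability + rng.gauss(0, 10)  # fresh noise, no treatment
    students.append((pretest, posttest))

selected = [s for s in students if s[0] < 85]  # chosen for low pretest scores
mean_pre = sum(s[0] for s in selected) / len(selected)
mean_post = sum(s[1] for s in selected) / len(selected)
print(f"selected pretest mean:  {mean_pre:.1f}")
print(f"selected posttest mean: {mean_post:.1f}  (higher, with no treatment)")
```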

Does Psychotherapy Work?

Early studies on the effectiveness of psychotherapy tended to use pretest-posttest designs. In a classic 1952 article, researcher Hans Eysenck summarized the results of 24 such studies showing that about two thirds of patients improved between the pretest and the posttest (Eysenck, 1952) [3] . But Eysenck also compared these results with archival data from state hospital and insurance company records showing that similar patients recovered at about the same rate  without  receiving psychotherapy. This parallel suggested to Eysenck that the improvement that patients showed in the pretest-posttest studies might be no more than spontaneous remission. Note that Eysenck did not conclude that psychotherapy was ineffective. He merely concluded that there was no evidence that it was, and he wrote of “the necessity of properly planned and executed experimental studies into this important field” (p. 323). You can read the entire article here: Classics in the History of Psychology .

Fortunately, many other researchers took up Eysenck’s challenge, and by 1980 hundreds of experiments had been conducted in which participants were randomly assigned to treatment and control conditions, and the results were summarized in a classic book by Mary Lee Smith, Gene Glass, and Thomas Miller (Smith, Glass, & Miller, 1980) [4] . They found that overall psychotherapy was quite effective, with about 80% of treatment participants improving more than the average control participant. Subsequent research has focused more on the conditions under which different types of psychotherapy are more or less effective.

Interrupted Time Series Design

A variant of the pretest-posttest design is the  interrupted time-series design . A time series is a set of measurements taken at intervals over a period of time. For example, a manufacturing company might measure its workers’ productivity each week for a year. In an interrupted time-series design, a time series like this one is “interrupted” by a treatment. In one classic example, the treatment was the reduction of the work shifts in a factory from 10 hours to 8 hours (Cook & Campbell, 1979) [5] . Because productivity increased rather quickly after the shortening of the work shifts, and because it remained elevated for many months afterward, the researcher concluded that the shortening of the shifts caused the increase in productivity. Notice that the interrupted time-series design is like a pretest-posttest design in that it includes measurements of the dependent variable both before and after the treatment. It is unlike the pretest-posttest design, however, in that it includes multiple pretest and posttest measurements.

Figure 7.3 shows data from a hypothetical interrupted time-series study. The dependent variable is the number of student absences per week in a research methods course. The treatment is that the instructor begins publicly taking attendance each day so that students know that the instructor is aware of who is present and who is absent. The top panel of  Figure 7.3 shows how the data might look if this treatment worked. There is a consistently high number of absences before the treatment, and there is an immediate and sustained drop in absences after the treatment. The bottom panel of  Figure 7.3 shows how the data might look if this treatment did not work. On average, the number of absences after the treatment is about the same as the number before. This figure also illustrates an advantage of the interrupted time-series design over a simpler pretest-posttest design. If there had been only one measurement of absences before the treatment at Week 7 and one afterward at Week 8, then it would have looked as though the treatment were responsible for the reduction. The multiple measurements both before and after the treatment suggest that the reduction between Weeks 7 and 8 is nothing more than normal week-to-week variation.

[Figure 7.3: Hypothetical interrupted time-series data; see the full description under “Image Descriptions” below.]
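Interrupted time-series data like these are commonly analyzed with segmented regression, which estimates a baseline trend plus a level change and a slope change at the interruption. The sketch below fits such a model by ordinary least squares to invented weekly absence counts resembling the top panel of Figure 7.3; all numbers are hypothetical.

```python
# A minimal segmented-regression sketch for interrupted time series data,
# assuming hypothetical weekly absence counts with the intervention
# (public attendance-taking) beginning at week 8.
import numpy as np

absences = np.array([6, 7, 5, 7, 6, 7, 6,   # weeks 1-7: baseline
                     2, 1, 3, 2, 1, 2, 2])  # weeks 8-14: treatment
weeks = np.arange(1, 15)
post = (weeks >= 8).astype(float)         # 1 after the interruption
time_since = np.clip(weeks - 7, 0, None)  # weeks elapsed since interruption

# Design matrix: intercept, baseline trend, level change, slope change.
X = np.column_stack([np.ones_like(weeks, dtype=float), weeks, post, time_since])
coefs, *_ = np.linalg.lstsq(X, absences.astype(float), rcond=None)
intercept, trend, level_change, slope_change = coefs
print(f"baseline trend:         {trend:+.2f} absences/week")
print(f"level change at week 8: {level_change:+.2f}")
print(f"slope change after:     {slope_change:+.2f}")
```

A large negative level-change coefficient with little slope change corresponds to the “immediate and sustained drop” pattern described above.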

Combination Designs

A type of quasi-experimental design that is generally better than either the nonequivalent groups design or the pretest-posttest design is one that combines elements of both. There is a treatment group that is given a pretest, receives a treatment, and then is given a posttest. But at the same time there is a control group that is given a pretest, does  not  receive the treatment, and then is given a posttest. The question, then, is not simply whether participants who receive the treatment improve but whether they improve  more  than participants who do not receive the treatment.

Imagine, for example, that students in one school are given a pretest on their attitudes toward drugs, then are exposed to an antidrug program, and finally are given a posttest. Students in a similar school are given the pretest, not exposed to an antidrug program, and finally are given a posttest. Again, if students in the treatment condition become more negative toward drugs, this change in attitude could be an effect of the treatment, but it could also be a matter of history or maturation. If it really is an effect of the treatment, then students in the treatment condition should become more negative than students in the control condition. But if it is a matter of history (e.g., news of a celebrity drug overdose) or maturation (e.g., improved reasoning), then students in the two conditions would be likely to show similar amounts of change. This type of design does not completely eliminate the possibility of confounding variables, however. Something could occur at one of the schools but not the other (e.g., a student drug overdose), so students at the first school would be affected by it while students at the other school would not.
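The logic of this combination design can be captured in a difference-in-differences calculation: subtract the control group’s pre-to-post change (history and maturation) from the treatment group’s change. The attitude scores below are hypothetical (higher = more negative toward drugs).

```python
# A minimal difference-in-differences sketch for the combination design,
# assuming hypothetical mean anti-drug attitude scores at two schools.
pre_treat, post_treat = 3.1, 4.2  # school with the antidrug program
pre_ctrl, post_ctrl = 3.0, 3.4    # comparison school, no program

change_treat = post_treat - pre_treat  # treatment + history/maturation
change_ctrl = post_ctrl - pre_ctrl     # history/maturation alone
did = change_treat - change_ctrl       # treatment effect estimate
print(f"treatment-group change: {change_treat:+.1f}")
print(f"control-group change:   {change_ctrl:+.1f}")
print(f"difference-in-differences estimate: {did:+.1f}")
```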

Finally, if participants in this kind of design are randomly assigned to conditions, it becomes a true experiment rather than a quasi-experiment. In fact, it is the kind of experiment that Eysenck called for—and that has now been conducted many times—to demonstrate the effectiveness of psychotherapy.

Key Takeaways

  • Quasi-experimental research involves the manipulation of an independent variable without the random assignment of participants to conditions or orders of conditions. Among the important types are nonequivalent groups designs, pretest-posttest, and interrupted time-series designs.
  • Quasi-experimental research eliminates the directionality problem because it involves the manipulation of the independent variable. It does not eliminate the problem of confounding variables, however, because it does not involve random assignment to conditions. For these reasons, quasi-experimental research is generally higher in internal validity than correlational studies but lower than true experiments.
  • Practice: Imagine that two professors decide to test the effect of giving daily quizzes on student performance in a statistics course. They decide that Professor A will give quizzes but Professor B will not. They will then compare the performance of students in their two sections on a common final exam. List five other variables that might differ between the two sections that could affect the results.

Image Descriptions

Figure 7.3 image description: Two line graphs charting the number of absences per week over 14 weeks. The first 7 weeks are without treatment and the last 7 weeks are with treatment. In the first line graph, there are between 4 to 8 absences each week. After the treatment, the absences drop to 0 to 3 each week, which suggests the treatment worked. In the second line graph, there is no noticeable change in the number of absences per week after the treatment, which suggests the treatment did not work. [Return to Figure 7.3]

  • Cook, T. D., & Campbell, D. T. (1979). Quasi-experimentation: Design & analysis issues in field settings. Boston, MA: Houghton Mifflin.
  • Posternak, M. A., & Miller, I. (2001). Untreated short-term course of major depression: A meta-analysis of outcomes from studies using wait-list control groups. Journal of Affective Disorders, 66, 139–146.
  • Eysenck, H. J. (1952). The effects of psychotherapy: An evaluation. Journal of Consulting Psychology, 16, 319–324.
  • Smith, M. L., Glass, G. V., & Miller, T. I. (1980). The benefits of psychotherapy. Baltimore, MD: Johns Hopkins University Press.

Glossary

Nonequivalent groups design: A between-subjects design in which participants have not been randomly assigned to conditions.

Pretest-posttest design: A design in which the dependent variable is measured once before the treatment is implemented and once after it is implemented.

History: A category of alternative explanations for differences between scores, such as events that happened between the pretest and posttest that are unrelated to the study.

Maturation: An alternative explanation referring to ways participants might have changed between the pretest and posttest simply because they are growing and learning.

Regression to the mean: The statistical fact that an individual who scores extremely on a variable on one occasion will tend to score less extremely on the next occasion.

Spontaneous remission: The tendency for many medical and psychological problems to improve over time without any form of treatment.

Interrupted time series: A set of measurements taken at intervals over a period of time that is interrupted by a treatment.

Research Methods in Psychology - 2nd Canadian Edition Copyright © 2015 by Paul C. Price, Rajiv Jhangiani, & I-Chant A. Chiang is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.



Quasi-experimental Research: What It Is, Types & Examples

Quasi-experimental research is research that appears to be experimental but is not.

Much like an actual experiment, quasi-experimental research tries to demonstrate a cause-and-effect link between a dependent and an independent variable. A quasi-experiment, on the other hand, does not depend on random assignment, unlike an actual experiment. The subjects are sorted into groups based on non-random variables.

What is Quasi-Experimental Research?

“Resemblance” is the definition of “quasi.” Individuals are not randomly allocated to conditions or orders of conditions, even though the independent variable is manipulated. As a result, quasi-experimental research is research that appears to be experimental but is not.

Because the independent variable is manipulated before the dependent variable is measured, quasi-experimental research avoids the directionality problem. However, because individuals are not randomly assigned, there are likely to be additional differences across conditions in quasi-experimental research.

As a result, in terms of internal validity, quasi-experiments fall somewhere between correlational research and true experiments.

The key component of a true experiment is random assignment to groups. This means that each person has an equal chance of being assigned to the experimental group (which receives the manipulation) or the control group (which does not).

Simply put, a quasi-experiment is not a true experiment: it lacks randomly assigned groups, which are the defining component of a real experiment. Why are randomly assigned groups so crucial, given that they constitute the main distinction between quasi-experimental and true  experimental research ?

Let’s use an example to illustrate our point. Suppose we want to discover how a new psychological therapy affects depressed patients. In a true experiment, you would randomly split the patients on a psychiatric ward into two groups, with half receiving the new psychotherapy and the other half receiving standard  depression treatment .

The physicians would then compare the outcomes of the new treatment with those of the standard treatment to see whether it is more effective. Doctors, however, may be unwilling to run such an experiment if they believe it is unethical to give one group a promising treatment while withholding it from another.

A quasi-experimental study will be useful in this case. Instead of allocating these patients at random, you identify pre-existing psychotherapist groups in the hospitals. There will likely be counselors who are eager to try the new therapy as well as others who prefer to stick to the standard approach.

These pre-existing groups can be used to compare the symptom progression of individuals who received the novel therapy with that of those who received the normal course of treatment, even though the groups were not chosen at random.

If the groups are otherwise comparable, you can be reasonably confident that any differences in outcomes are attributable to the treatment rather than to other extraneous variables.

As we mentioned before, quasi-experimental research entails manipulating an independent variable without randomly assigning people to conditions or sequences of conditions. Non-equivalent group designs, pretest-posttest designs, and regression discontinuity designs are a few of the most important types.

What are quasi-experimental research designs?

Quasi-experimental research designs resemble true experimental designs but do not give the researcher full control over the independent variable(s) in the way that true experimental designs do.

In a quasi-experimental design, the researcher manipulates or observes an independent variable, but the participants are not assigned to groups at random. Instead, people are placed into groups based on characteristics they already have, such as their age, their gender, or how many times they have been exposed to a certain stimulus.

Because the assignments are not random, it is harder to draw conclusions about cause and effect than in a true experiment. However, quasi-experimental designs are still useful when randomization is not possible or ethical.

The true experimental design may be impossible to accomplish or just too expensive, especially for researchers with few resources. Quasi-experimental designs enable you to investigate an issue by utilizing data that has already been paid for or gathered by others (often the government). 

Because quasi-experimental studies are often conducted in real-world settings with pre-existing data, they tend to have higher external validity than many true experiments. And because they allow better control of confounding variables than other non-experimental approaches, they have higher  internal validity  than other non-experimental research, though still lower than true experiments.

Is quasi-experimental research quantitative or qualitative?

Quasi-experimental research is a quantitative research method. It involves numerical data collection and statistical analysis. Quasi-experimental research compares groups with different circumstances or treatments to find cause-and-effect links. 

It draws statistical conclusions from quantitative data. Qualitative data can enhance quasi-experimental research by revealing participants’ experiences and opinions, but quantitative data is the method’s foundation.

Quasi-experimental research types

There are many different sorts of quasi-experimental designs. Three of the most popular varieties are described below: the design of non-equivalent groups, regression discontinuity, and natural experiments.

Design of Non-equivalent Groups

In a non-equivalent groups design, the researcher compares pre-existing groups that appear similar but were not formed by random assignment, such as the psychotherapist groups in the example above, while trying to control for any known differences between them.

Discontinuity in Regression

In a regression discontinuity design, treatment is assigned by an essentially arbitrary cutoff score, so individuals just below the cutoff can serve as a comparison group for individuals just above it.

Natural Experiments

In a natural experiment, an external event or policy, rather than the researcher, determines who receives the treatment. For example, consider a program whose administrators could not afford to pay for everyone who qualified, so they had to use a random lottery to distribute slots.

Researchers were able to investigate the program’s impact by using enrolled people as a treatment group and those who were qualified but did not win the lottery as a comparison group.
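Because the lottery itself is random, the analysis can be as simple as comparing mean outcomes between winners and eligible non-winners. Here is a toy sketch with invented outcome scores:

```python
# A toy sketch of analyzing a lottery-based natural experiment, assuming
# hypothetical outcome scores for enrolled (lottery-winner) and eligible
# non-winner groups. Because the lottery is random, a simple difference
# in means estimates the program's effect.
winners = [72, 68, 75, 80, 71, 77]
nonwinners = [65, 70, 62, 66, 69, 64]

effect = sum(winners) / len(winners) - sum(nonwinners) / len(nonwinners)
print(f"estimated program effect: {effect:+.1f} points")
```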

How does QuestionPro help in quasi-experimental research?

QuestionPro can be a useful tool in quasi-experimental research because it includes features that can assist you in designing and analyzing your research study. Here are some ways in which QuestionPro can help in quasi-experimental research:

  • Design surveys
  • Randomize participants
  • Collect data over time
  • Analyze data
  • Collaborate with your team

With QuestionPro, you have access to a mature market research platform that helps you collect and analyze the insights that matter most. With InsightsHub, the unified hub for data management, you can organize, explore, search, and discover your  research data  in one organized repository.

Optimize Your quasi-experimental research with QuestionPro. Get started now!




Neag School of Education

Educational Research Basics by Del Siegle

Single Subject Research

Single subject research (also known as single case experiments) is popular in the fields of special education and counseling. This research design is useful when the researcher is attempting to change the behavior of an individual or a small group of individuals and wishes to document that change. Unlike true experiments where the researcher randomly assigns participants to a control and treatment group, in single subject research the participant serves as both the control and treatment group. The researcher uses line graphs to show the effects of a particular intervention or treatment. An important factor of single subject research is that only one variable is changed at a time. Single subject research designs are “weak when it comes to external validity….Studies involving single-subject designs that show a particular treatment to be effective in changing behavior must rely on replication–across individuals rather than groups–if such results are to be found worthy of generalization” (Fraenkel & Wallen, 2006, p. 318).

Suppose a researcher wished to investigate the effect of praise on reducing disruptive behavior over many days. First she would need to establish a baseline of how frequently the disruptions occurred. She would measure how many disruptions occurred each day for several days. In the example below, the target student was disruptive seven times on the first day, six times on the second day, and seven times on the third day. Note how the sequence of time is depicted on the x-axis (horizontal axis) and the dependent variable (outcome variable) is depicted on the y-axis (vertical axis).

[Figure: line graph of the three baseline days, showing 7, 6, and 7 disruptions]

Once a baseline of behavior has been established (when a consistent pattern emerges with at least three data points), the intervention begins. The researcher continues to plot the frequency of behavior while implementing the intervention of praise.

[Figure: line graph continuing past the baseline, showing fewer disruptions once the praise intervention begins]

In this example, we can see that the frequency of disruptions decreased once praise began. The design in this example is known as an A-B design. The baseline period is referred to as A and the intervention period is identified as B.

[Figure: the same graph labeled as an A-B design, with baseline phase A and intervention phase B]
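A minimal sketch of how the A-B data above might be summarized numerically, assuming the hypothetical disruption counts described in this example. Alongside visual inspection of the graph, one common aid is counting intervention points that do not overlap the baseline range.

```python
# A minimal sketch of summarizing A-B single-subject data. Phase A is the
# baseline, phase B the praise intervention; all counts are hypothetical.
baseline = [7, 6, 7]            # disruptions per day, phase A
intervention = [4, 3, 2, 2, 1]  # disruptions per day, phase B

mean_a = sum(baseline) / len(baseline)
mean_b = sum(intervention) / len(intervention)
# Non-overlap: intervention days with fewer disruptions than any baseline day.
non_overlap = sum(1 for x in intervention if x < min(baseline))
print(f"phase A mean: {mean_a:.2f}, phase B mean: {mean_b:.2f}")
print(f"{non_overlap}/{len(intervention)} intervention points below all of baseline")
```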

Another design is the A-B-A design. An A-B-A design (also known as a reversal design) involves discontinuing the intervention and returning to a nontreatment condition.

[Figure: A-B-A reversal design graph, with the intervention withdrawn in the final phase]

Sometimes an individual’s behavior is so severe that the researcher cannot wait to establish a baseline and must begin with an intervention. In this case, a B-A-B design is used. The intervention is implemented immediately (before establishing a baseline). This is followed by a measurement without the intervention and then a repeat of the intervention.

[Figure: B-A-B design graph, beginning with the intervention phase]

Multiple-Baseline Design

Sometimes, a researcher may be interested in addressing several issues for one student or a single issue for several students. In this case, a multiple-baseline design is used.

“In a multiple baseline across subjects design, the researcher introduces the intervention to different persons at different times. The significance of this is that if a behavior changes only after the intervention is presented, and this behavior change is seen successively in each subject’s data, the effects can more likely be credited to the intervention itself as opposed to other variables. Multiple-baseline designs do not require the intervention to be withdrawn. Instead, each subject’s own data are compared between intervention and nonintervention behaviors, resulting in each subject acting as his or her own control (Kazdin, 1982). An added benefit of this design, and all single-case designs, is the immediacy of the data. Instead of waiting until postintervention to take measures on the behavior, single-case research prescribes continuous data collection and visual monitoring of that data displayed graphically, allowing for immediate instructional decision-making. Students, therefore, do not linger in an intervention that is not working for them, making the graphic display of single-case research combined with differentiated instruction responsive to the needs of students.” (Geisler, Hessler, Gardner, & Lovelace, 2009)

[Figure: multiple-baseline design graphs, with the intervention introduced to different subjects at different times]
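The staggering that defines a multiple-baseline design can be sketched as a simple phase schedule, as below; the student names, session count, and start sessions are hypothetical. Because the intervention begins at a different session for each student, a change that appears only after each student’s own start point is hard to attribute to anything but the intervention.

```python
# A minimal sketch of a multiple-baseline-across-subjects phase schedule:
# each student starts the intervention (B) at a different, staggered
# session. All names and session numbers are hypothetical.
starts = {"Student 1": 5, "Student 2": 8, "Student 3": 11}
n_sessions = 14

for student, start in starts.items():
    phases = ["A" if session < start else "B"
              for session in range(1, n_sessions + 1)]
    print(f"{student}: {' '.join(phases)}")
```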

Regardless of the research design, the line graphs used to illustrate the data contain a set of common elements.

[Figure: common elements of single-subject research line graphs]

Generally, in single subject research we count the number of times something occurs in a given time period and see if it occurs more or less often in that time period after implementing an intervention. For example, we might measure how many baskets someone makes while shooting for 2 minutes. We would repeat that at least three times to get our baseline. Next, we would test some intervention. We might play music while shooting, give encouragement while shooting, or video the person while shooting to see if our intervention influenced the number of shots made. After the 3 baseline measurements (3 sets of 2 minute shooting), we would measure several more times (sets of 2 minute shooting) after the intervention and plot the time points (number of baskets made in 2 minutes for each of the measured time points). This works well for behaviors that are distinct and can be counted.

Sometimes behaviors come and go over time (such as being off task in a classroom or not listening during a coaching session). The way we can record these is to select a period of time (say 5 minutes) and mark down every 10 seconds whether our participant is on task. We make a minimum of three sets of 5-minute observations for a baseline, implement an intervention, and then make more sets of 5-minute observations with the intervention in place. We use this method rather than counting how many times someone is off task because one could be continually off task, and that would only be a count of 1 since the person was off task the entire time. Someone who was off task twice for 15 seconds would receive a score of 2; however, the second person is certainly not off task twice as much as the first person. Therefore, recording whether the person is off task at 10-second intervals gives a more accurate picture. The person continually off task would have a score of 30 (off task at every 10-second interval for 5 minutes), and the person off task twice for a short time would have a score of 2 (off task during only 2 of the 10-second interval measures).
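A small sketch of that 10-second interval recording scheme, assuming a hypothetical 5-minute observation (30 checks); the score is simply the number of intervals marked off task.

```python
# A minimal sketch of 10-second interval recording over a 5-minute
# observation (30 checks). The simulated marks are hypothetical:
# True means the participant was off task at that 10-second check.
import random

rng = random.Random(3)
session = [rng.random() < 0.25 for _ in range(30)]

score = sum(session)  # number of intervals marked off task
print(f"off task in {score}/30 intervals ({100 * score / 30:.0f}% of session)")
```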

I also have additional information about how to record single-subject research data .

I hope this helps you better understand single subject research.

I have created a PowerPoint on Single Subject Research , which is also available below as a video.

I have also created instructions for creating single-subject research design graphs with Excel .

Fraenkel, J. R., & Wallen, N. E. (2006). How to design and evaluate research in education (6th ed.). Boston, MA: McGraw Hill.

Geisler, J. L., Hessler, T., Gardner, R., III, & Lovelace, T. S. (2009). Differentiated writing interventions for high-achieving urban African American elementary students. Journal of Advanced Academics, 20, 214–247.

Del Siegle, Ph.D. University of Connecticut [email protected] www.delsiegle.info

Revised 02/02/2024


A Quasi-Experimental and Single-Subject Research Approach as an Alternative to Traditional Post-Occupancy Evaluation of Learning Environments


Terry Byers

Part of the book series: Advances in Learning Environments Research ((ALER))


The past decade has seen a resurgence in the literature concerning the effectiveness of physical learning environments. A worrying characteristic of this research has been a lack of rigorous experimental methodology (Brooks, 2011; Painter et al., 2013). This may be due to the difficulties associated with randomly assigning students and staff to specific settings and problems associated with accounting for the complex intervening variables that come to play within the educative experience (Byers, Imms & Hartnell-Young, 2014).


Baguley, T. (2009). Standardized or simple effect size: What should be reported? British Journal of Psychology, 100 , 603–617. doi: 10.1348/000712608X377117

Article   Google Scholar  

Beeson, P. M., & Robey, R. R. (2006). Evaluating single-subject treatment research: Lessons learned from the aphasia literature. Neuropsychology Review, 16 (4), 161–169. doi: 10.1007/s11065-006-9013-7

Blackmore, J., Bateman, D., O’Mara, J., & Loughlin, J. (2011). Research into the connection between built learning spaces and student outcomes: Literature review . Melbourne: Victorian Department of Education and Early Childhood Development. Retrieved from http://www.eduweb.vic.gov.au/edulibrary/public/publ/research/publ/blackmore_learning_spaces.pdf

Bobrovitz, C. D., & Ottenbacher, K. J. (1998). Comparison of visual inspection and statistical analysis of single-subject data in rehabilitation research. American Journal of Physical Medicine and Rehabilitation, 77 (2), 94–102.

Brooks, D. C. (2011). Space matters: The impact of formal learning environments on student learning. British Journal of Educational Technology, 42 (5), 719–726. doi: 10.1111/j.1467-8535.2010.01098.x

Byers, T., Imms, W., & Hartnell-Young, E. (2014). Making the case for space: The effect of learning spaces on teaching and learning. Curriculum and Teaching, 29 (1), 5–19. doi: 10.7459/ct/29.1.02

Byiers, B. J., Reichle, J., & Symons, F. J. (2012). Single-subject experimental design for evidence-based practice. American Journal of Speech-Language Pathology, 21 (4), 397–414. doi: 10.1044/1058-0360(2012/11-0036 )

Cakiroglu, O. (2012). Single subject research: Applications to special education. British Journal of Special Education, 39 (1), 21–29.

Campbell, D. T. (1957). Factors relevant to the validity of experiments in social settings. Psychological Bulletin, 54 (4), 297–312. doi: 10.1037/h0040950

Campbell, D. T., & Stanley, J. C. (1963). Experimental and quasi-experimental designs for research on teaching . Chicago, IL: Rand McNally.

Google Scholar  

Casey, L. B., Meindl, J. N., Frame, K., Elswick, S., Hayes, J., & Wyatt, J. (2012). Current trends in education: How single-subject research can help middle and high school educators keep up with the zeitgeist. Clearing House: A Journal of Educational Strategies, Issues and Ideas, 85 (3), 109–116.

Clegg, S. (2005). Evidence-based practice in educational research: A critical realist critique of systematic review. British Journal of Sociology of Education, 26 (3), 415–428. doi: 10.1080/01425690500128932

Cohen, J. (1998). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Erlbaum.

Cook, T. D., & Campbell, D. T. (1979). Quasi-experimentation: Design and analysis issues for field settings . Chicago, IL: Rand McNally.

Coryn, C. L. S., Schröter, D. C., & Hanssen, C. E. (2009). Adding a time-series design element to the success case method to improve methodological rigor an application for nonprofit program evaluation. American Journal of Evaluation, 30 (1), 80–92. doi: 10.1177/1098214008326557

Creswell, J. W. (2005). Educational research: Planning, conducting, and evaluating quantitative and qualitative research (2nd ed.). Boston, MA: Pearson.

Dori, Y. J., & Belcher, J. (2005). How does technology-enabled active learning affect undergraduate students’ understanding of electromagnetism concepts? The Journal of the Learning Sciences, 14 (2), 243–279.

Dori, Y. J., Belcher, J., Bessette, M., Danziger, M., McKinney, A., & Hult, E. (2003). Technology for active learning. Materials Today, 6 (12), 44–49. doi: 10.1016/S1369-7021(03)01225-2

Faul, F., Erdfelder, E., Lang, A.-G., & Buchner, A. (2007). G*Power 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behavior Research Methods, 39 (2), 175–191.

Fredricks, J. A., McColskey, W., Meli, J., Mordica, J., Montrosse, B., & Mooney, K. (2011). Measuring student engagement in upper elementary through high school: A description of 21 instruments (p. 88). Washington, DC: Regional Educational Laboratory Southeast.

Gliem, J. A., & Gliem, R. R. (2003). Calculating, interpreting, and reporting Cronbach’s alpha reliability coefficient for Likert-type scales . Paper presented at the Midwest Research-to-Practice Conference in Adult, Continuing, and Community Education, The Ohio State University, Columbus, OH.

Harris, A. D., McGregor, J. C., Perencevich, E. N., Furuno, J. P., Zhu, J., Peterson, D. E., & Finkelstein, J. (2006). The use and interpretation of quasi-experimental studies in medical informatics. Journal of the American Medical Informatics Association, 13 (1), 16–23. doi: 10.1197/jamia.M1749

Higgins, S., Hall, E., Wall, K., Woolner, P., & McCaughey, C. (2005). The impact of school environments: A literature review . Newcastle, United Kingdom: University of Newcastle.

Horner, R. H., Carr, E. G., Halle, J., McGee, G., Odom, S., & Wolery, M. (2005). The use of single-subject research to identify evidence-based practice in special education. Exceptional Children, 71 (2), 165–179.

Horner, R. H., Swaminathan, H. S., & George, S. K. (2012). Considerations for the systematic analysis and use of single-case research. Education & Treatment of Children, 35 (2), 269.

Jenson, W. R., Clark, E., Kircher, J. C., & Kristjansson, S. D. (2007). Statistical reform: Evidence-based practice, meta-analyses, and single subject designs. Psychology in the Schools, 44 (5), 483–493.

Johnston, M. V., Ottenbacher, K. J., & Reichardt, C. S. (1995). Strong quasi-experimental designs for research on the effectiveness of rehabilitation. American Journal of Physical Medicine and Rehabilitation, 74 , 383–392.

Kinugasa, T., Cerin, E., & Hooper, S. (2004). Single-subject research designs and data analyses for assessing elite athletes’ conditioning. Sports Medicine, 34 (15), 1035–1050.

Kromrey, J. D., & Foster-Johnson, L. (1996). Determining the efficacy of intervention: The use of effect sizes for data analysis in single-subject research. The Journal of Experimental Education, (1), 73–93.

Lambert, N. M., & McCombs, B. L. (Eds.). (1998). How students learn: Reforming schools through learner-centered education . Washington, DC: APA Books.

Little, R. J. A., & Rubin, D. B. (2002). Statistical analysis with missing data (2nd ed.). Hoboken, NJ: Wiley.

Mitchell, M. L., & Jolley, J. M. (2012). Research design explained (8th ed.). Belmont, CA: Wadsworth Cengage Learning.

Nourbakhsh, M. R., & Ottenbacher, K. J. (1994). The statistical analysis of single-subject data: A comparative examination. Physical Therapy, 74 (8), 768–776.

Painter, S., Fournier, J., Grape, C., Grummon, P., Morelli, J., Whitmer, S., & Cevetello, J. (2013). Research on learning space design: Present state, future directions (The Perry Chapman Prize). Ann Arbor, MI: Society for College and University Planning.

Peugh, J. L., & Enders, C. K. (2004). Missing data in educational research: A review of reporting practices and suggestions for improvement. Review of Educational Research, 74 (4), 525–556. doi: 10.3102/00346543074004525

Pintrich, P. R., & De Groot, E. V. (1990). Motivational and self-regulated learning components of classroom academic performance. Journal of Educational Psychology, 82 (1), 33–40. doi: 10.1037/0022-0663.82.1.33

Rassafiani, M., & Sahaf, R. (2010). Single case experimental design: An overview. International Journal of Therapy & Rehabilitation, 17 (6), 285–289.

Robson, C. (2011). Real world research: A resource for social scientists and practitioner-researchers (3rd ed.). Chichester & Hoboken, NJ: Wiley-Blackwell.

Shadish, W. R., & Cook, T. D. (1999). Comment-design rules: More steps toward a complete theory of quasi-experimentation. Statistical Science, 14 (3), 294–300. doi: 10.2307/2676764

Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and quasi-experimental designs for generalized causal inference . Boston, MA: Houghton Mifflin.

Shadish, W. R., & Luellen, J. K. (2012). Quasi-experimental design. In J. L. Green, G. Camilli, & P. B. Elmore (Eds.), Handbook of complementary methods in education research (pp. 539–550). Mahwah, NJ: Routledge.

Shadish, W. R., Hedges, L. V., & Pustejovsky, J. E. (2014). Analysis and meta-analysis of single-case designs with a standardized mean difference statistic: A primer and applications. Journal of School Psychology, 52 (2), 123–147. doi: 10.1016/j.jsp.2013.11.005

Tamim, R. M., Lowerison, G., Schmid, R. F., Bernard, R. M., & Abrami, P. C. (2011). A multi-year investigation of the relationship between pedagogy, computer use and course effectiveness in postsecondary education. Journal of Computing in Higher Education, 23 (1), 1–14. doi: 10.1007/s12528-010-9041-4

Upitis, R. (2009). Complexity and design: How school architecture influences learning. Design Principles and Practices: An International Journal, 3 (2), 1–14.

Vickers, A. (2003). How many repeated measures in repeated measures designs? Statistical issues for comparative trials. BMC Medical Research Methodology, 3 (1), 22.

Walker, J. D., Brooks, D. C., & Baepler, P. (2011). Pedagogy and space: Empirical research on new learning environments. EDUCAUSE Quarterly, 34 (4). Retrieved from http://www.educause.edu/ero/article/pedagogy-and-space-empirical-research-new-learning-environments

West, S. G., & Thoemmes, F. (2010). Campbell’s and Rubin’s perspectives on causal inference. Psychological Methods, 15 (1), 18–37. doi: 10.1037/a0015917

Whiteside, A. L., Brooks, D. C., & Walker, J. D. (2010). Making the case for space: Three years of empirical research on learning environments. EDUCAUSE Quarterly, 33 (3).


About this chapter

Byers, T. (2016). A quasi-experimental and single-subject research approach as an alternative to traditional post-occupancy evaluation of learning environments. In W. Imms, B. Cleveland, & K. Fisher (Eds.), Evaluating learning environments (Advances in Learning Environments Research). Rotterdam: SensePublishers. https://doi.org/10.1007/978-94-6300-537-1_9 (Online ISBN 978-94-6300-537-1; © 2016 Sense Publishers)


10.1 Overview of Single-Subject Research

Learning Objectives

  • Explain what single-subject research is, including how it differs from other types of psychological research.
  • Explain who uses single-subject research and why.

What Is Single-Subject Research?

Single-subject research is a type of quantitative research that involves studying in detail the behavior of each of a small number of participants. Note that the term single-subject does not mean that only one participant is studied; it is more typical for there to be somewhere between two and 10 participants. (This is why single-subject research designs are sometimes called small-n designs, where n is the statistical symbol for the sample size.) Single-subject research can be contrasted with group research, which typically involves studying large numbers of participants and examining their behavior primarily in terms of group means, standard deviations, and so on. The majority of this textbook is devoted to understanding group research, which is the most common approach in psychology. But single-subject research is an important alternative, and it is the primary approach in some more applied areas of psychology.

Before continuing, it is important to distinguish single-subject research from case studies and other more qualitative approaches that involve studying in detail a small number of participants. As described in Chapter 6, case studies involve an in-depth analysis and description of an individual, which is typically primarily qualitative in nature. More broadly speaking, qualitative research focuses on understanding people’s subjective experience by observing behavior and collecting relatively unstructured data (e.g., detailed interviews) and analyzing those data using narrative rather than quantitative techniques. Single-subject research, in contrast, focuses on understanding objective behavior through experimental manipulation and control, collecting highly structured data, and analyzing those data quantitatively.

Assumptions of Single-Subject Research

Again, single-subject research involves studying a small number of participants and focusing intensively on the behavior of each one. But why take this approach instead of the group approach? There are several important assumptions underlying single-subject research, and it will help to consider them now.

First and foremost is the assumption that it is important to focus intensively on the behavior of individual participants. One reason for this is that group research can hide individual differences and generate results that do not represent the behavior of any individual. For example, a treatment that has a positive effect for half the people exposed to it but a negative effect for the other half would, on average, appear to have no effect at all. Single-subject research, however, would likely reveal these individual differences. A second reason to focus intensively on individuals is that sometimes it is the behavior of a particular individual that is primarily of interest. A school psychologist, for example, might be interested in changing the behavior of a particular disruptive student. Although previous published research (both single-subject and group research) is likely to provide some guidance on how to do this, conducting a study on this student would be more direct and probably more effective.
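To make the averaging problem concrete, here is a minimal sketch in Python, using invented effect scores, of how opposite individual effects can cancel out in a group mean:

```python
# Invented effect scores: a hypothetical treatment that helps half of the
# participants (+10 points each) and harms the other half (-10 points each).
effects = [+10, +10, +10, -10, -10, -10]

group_mean = sum(effects) / len(effects)
print(group_mean)  # 0.0 -- the group average suggests "no effect",
                   # yet every individual shows a substantial effect.
```

A single-subject design, by examining each participant's data separately, would make both subgroups immediately visible.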

A second assumption of single-subject research is that it is important to discover causal relationships through the manipulation of an independent variable, the careful measurement of a dependent variable, and the control of extraneous variables. For this reason, single-subject research is often considered a type of experimental research with good internal validity. Recall, for example, that Hall and his colleagues measured their dependent variable (studying) many times—first under a no-treatment control condition, then under a treatment condition (positive teacher attention), and then again under the control condition. Because there was a clear increase in studying when the treatment was introduced, a decrease when it was removed, and an increase when it was reintroduced, there is little doubt that the treatment was the cause of the improvement.

A third assumption of single-subject research is that it is important to study strong and consistent effects that have biological or social importance. Applied researchers, in particular, are interested in treatments that have substantial effects on important behaviors and that can be implemented reliably in the real-world contexts in which they occur. This is sometimes referred to as social validity (Wolf, 1978) [1] . The study by Hall and his colleagues, for example, had good social validity because it showed strong and consistent effects of positive teacher attention on a behavior that is of obvious importance to teachers, parents, and students. Furthermore, the teachers found the treatment easy to implement, even in their often-chaotic elementary school classrooms.

Who Uses Single-Subject Research?

Single-subject research has been around as long as the field of psychology itself. In the late 1800s, one of psychology’s founders, Wilhelm Wundt, studied sensation and consciousness by focusing intensively on each of a small number of research participants. Hermann Ebbinghaus’s research on memory and Ivan Pavlov’s research on classical conditioning are other early examples, both of which are still described in almost every introductory psychology textbook.

In the middle of the 20th century, B. F. Skinner clarified many of the assumptions underlying single-subject research and refined many of its techniques (Skinner, 1938) [2] . He and other researchers then used it to describe how rewards, punishments, and other external factors affect behavior over time. This work was carried out primarily using nonhuman subjects—mostly rats and pigeons. This approach, which Skinner called the experimental analysis of behavior, remains an important subfield of psychology and continues to rely almost exclusively on single-subject research. For excellent examples of this work, look at any issue of the Journal of the Experimental Analysis of Behavior . By the 1960s, many researchers were interested in using this approach to conduct applied research primarily with humans—a subfield now called applied behavior analysis (Baer, Wolf, & Risley, 1968) [3] . Applied behavior analysis plays an especially important role in contemporary research on developmental disabilities, education, organizational behavior, and health, among many other areas. Excellent examples of this work (including the study by Hall and his colleagues) can be found in the Journal of Applied Behavior Analysis .

Although most contemporary single-subject research is conducted from the behavioral perspective, it can in principle be used to address questions framed in terms of any theoretical perspective. For example, a studying technique based on cognitive principles of learning and memory could be evaluated by testing it on individual high school students using the single-subject approach. The single-subject approach can also be used by clinicians who take any theoretical perspective—behavioral, cognitive, psychodynamic, or humanistic—to study processes of therapeutic change with individual clients and to document their clients’ improvement (Kazdin, 1982) [4] .

Key Takeaways

  • Single-subject research—which involves testing a small number of participants and focusing intensively on the behavior of each individual—is an important alternative to group research in psychology.
  • Single-subject studies must be distinguished from qualitative research on a single person or small number of individuals. Unlike more qualitative research, single-subject research focuses on understanding objective behavior through experimental manipulation and control, collecting highly structured data, and analyzing those data quantitatively.
  • Single-subject research has been around since the beginning of the field of psychology. Today it is most strongly associated with the behavioral theoretical perspective, but it can in principle be used to study behavior from any perspective.
  • Practice: Find and read a published article in psychology that reports new single-subject research. (An archive of articles published in the Journal of Applied Behavior Analysis can be found at http://www.ncbi.nlm.nih.gov/pmc/journals/309/ ) Write a short summary of the study.
  • Wolf, M. (1978). Social validity: The case for subjective measurement or how applied behavior analysis is finding its heart. Journal of Applied Behavior Analysis, 11 , 203–214.
  • Skinner, B. F. (1938). The behavior of organisms: An experimental analysis . New York, NY: Appleton-Century-Crofts.
  • Baer, D. M., Wolf, M. M., & Risley, T. R. (1968). Some current dimensions of applied behavior analysis. Journal of Applied Behavior Analysis, 1 , 91–97.
  • Kazdin, A. E. (1982). Single-case research designs: Methods for clinical and applied settings . New York, NY: Oxford University Press.


Single-Subject Experimental Design: An Overview

CREd Library, Julie Wambaugh, and Ralf Schlosser

December 2014

DOI: 10.1044/cred-cred-ssd-r101-002

Single-subject experimental designs – also referred to as within-subject or single-case experimental designs – are among the most prevalent designs used in communication sciences and disorders (CSD) treatment research. These designs provide a framework for a quantitative, scientifically rigorous approach where each participant provides his or her own experimental control.

An Overview of Single-Subject Experimental Design

What Is Single-Subject Design?

Transcript of the video Q&A with Julie Wambaugh.

The essence of single-subject design is using repeated measurements to really understand an individual’s variability, so that we can use our understanding of that variability to determine what the effects of our treatment are. For me, one of the first steps in developing a treatment is understanding what an individual does. So, if I were doing a group treatment study, I would not necessarily be able to see or to understand what was happening with each individual patient, so that I could make modifications to my treatment and understand all the details of what’s happening in terms of the effects of my treatment. For me it’s a natural first step in the progression of developing a treatment.

Also, with the disorders that we deal with, it’s very hard to get the number of participants that we would need for the gold standard randomized controlled trial. Using single-subject designs works around the possible limiting factor of not having enough subjects in a particular area of study.

My mentor was Dr. Cynthia Thompson, who was trained by Leija McReynolds from the University of Kansas, which was where a lot of single-subject design in our field originated, and so I was fortunate to be on the cutting edge of this being implemented in our science back in the late ’70s and early ’80s. We saw, I think, a nice revolution in terms of attention to these types of designs, giving credit to the type of data that could be obtained from them, and a flourishing of these designs through the 1980s, 1990s, and 2000s. But I’ve talked with other single-subject design investigators, and now we’re seeing maybe a little bit of a lapse of attention, and a lack of training again among our young folks. Maybe people assume that people understand the foundation, but they really don’t. And more problems are occurring with the science. I think we need to re-establish the foundations in our young scientists. And this project, I think, will be a big plus toward moving us in that direction.

What is the Role of Single-Subject Design?

Transcript of the video Q&A with Ralf Schlosser.

What has happened recently, with the onset of evidence-based practice, is the adoption of a common hierarchy of evidence in terms of designs. As you noted, randomized controlled trials and meta-analyses of randomized controlled trials are at the top of common hierarchies. And that’s fine. But it doesn’t mean that single-subject designs cannot play a role.

For example, single-subject design can be implemented prior to a randomized controlled trial to get a better handle on the magnitude of the effects, the workings of the active ingredients, and all of that. It is very good preparation prior to developing a randomized controlled trial. After you have implemented the randomized controlled trial, and you then want to implement the intervention in a more naturalistic setting, it becomes very difficult to do that in randomized form or at the group level. So again, single-subject design lends itself to more practice-oriented implementation. I see it as a crucial methodology among several.

What we can do to promote what single-subject design is good for is to speak up. It is important that it is recognized for what it can do and what it cannot do.

Basic Features and Components of Single-Subject Experimental Designs

Defining Features

Single-subject designs are defined by the following features:

  • An individual “case” is the unit of intervention and unit of data analysis.
  • The case provides its own control for purposes of comparison. For example, the case’s series of outcome variables are measured prior to the intervention and compared with measurements taken during (and after) the intervention.
  • The outcome variable is measured repeatedly within and across different conditions or levels of the independent variable.

See Kratochwill et al. (2010)

Structure and Phases of the Design

Single-subject designs are typically described according to the arrangement of baseline and treatment phases.

The conditions in a single-subject experimental study are often assigned letters such as the A phase and the B phase, with A being the baseline, or no-treatment phase, and B the experimental, or treatment phase. (Other letters are sometimes used to designate other experimental phases.) Generally, the A phase serves as a time period in which the behavior or behaviors of interest are counted or scored prior to introducing treatment. In the B phase, the same behavior of the individual is counted over time under experimental conditions while treatment is administered. Decisions regarding the effect of treatment are then made by comparing an individual’s performance during the treatment (B) phase and the no-treatment (A) phase.

McReynolds and Thompson (1986)
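As an illustration of this phase logic (a sketch with invented session counts, not data from any study discussed here), the following Python snippet tabulates hypothetical A-B-A observations and compares phase means, which is the basic comparison the passage describes:

```python
# Illustrative only: invented counts of a target behavior per session
# across the three phases of a hypothetical A-B-A (withdrawal) design.
phases = {
    "A1 (baseline)":   [12, 14, 13, 15, 13],
    "B (treatment)":   [18, 21, 24, 23, 25],
    "A2 (withdrawal)": [16, 14, 13, 12, 13],
}

for label, scores in phases.items():
    mean = sum(scores) / len(scores)
    print(f"{label}: sessions={len(scores)}, mean level={mean:.1f}")

# A rise in level during B and a return toward baseline in A2 is the
# pattern that suggests the treatment, rather than some extraneous
# factor, changed the behavior.
```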

Basic Components

Important primary components of a single-subject study include the following:

  • The participant is the unit of analysis, where a participant may be an individual or a unit such as a class or school.
  • Participant and setting descriptions are provided with sufficient detail to allow another researcher to recruit similar participants in similar settings.
  • Dependent variables are (a) operationally defined and (b) measured repeatedly.
  • An independent variable is actively manipulated, with the fidelity of implementation documented.
  • A baseline condition demonstrates a predictable pattern which can be compared with the intervention condition(s).
  • Experimental control is achieved through introduction and withdrawal/reversal, staggered introduction, or iterative manipulation of the independent variable.
  • Visual analysis is used to interpret the level, trend, and variability of the data within and across phases (see the sketch after this list).
  • External validity of results is accomplished through replication of the effects.
  • Social validity is established by documenting that interventions are functionally related to change in socially important outcomes.

See Horner et al. (2005)
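The level, trend, and variability mentioned in the visual-analysis component above can each be given a simple numerical counterpart. The sketch below (hypothetical scores, plain Python) computes all three for a single phase; it is an aid to understanding, not a replacement for the visual analysis the literature describes:

```python
import statistics

# Hypothetical scores from one treatment phase (invented values).
sessions = [1, 2, 3, 4, 5, 6]
scores = [18, 20, 21, 23, 24, 25]

level = statistics.mean(scores)          # level: central location of the phase
variability = statistics.stdev(scores)   # variability: spread around the level

# Trend: slope of an ordinary least-squares line through (session, score).
mean_x = statistics.mean(sessions)
slope = sum((x - mean_x) * (y - level) for x, y in zip(sessions, scores)) \
        / sum((x - mean_x) ** 2 for x in sessions)

print(f"level={level:.1f}, trend={slope:.2f} per session, sd={variability:.2f}")
```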

Common Misconceptions

Single-Subject Experimental Designs versus Case Studies

Transcript of the video Q&A with Julie Wambaugh.

One of the biggest mistakes, and it is a huge problem, is misunderstanding that a case study is not a single-subject experimental design. There are controls that need to be implemented, and a case study does not equate to a single-subject experimental design. People misunderstand or misinterpret the term “multiple baseline” to mean that because you are measuring multiple things, that gives you the experimental control. You have to demonstrate, instead, that you’ve measured multiple behaviors and that you’ve replicated your treatment effect across those multiple behaviors. So one instance of one treatment being implemented with one behavior is not sufficient, even if you’ve measured other things. That’s a very common mistake that I see.

There’s a design — an ABA design — that’s a very strong experimental design where you measure the behavior, you implement treatment, and then, to get experimental control, you need to see that behavior go back down to baseline for you to have evidence of experimental control. It’s a hard design to implement in our field because we want our behaviors to stay up! We don’t want to see them return to baseline. Oftentimes people will say they did an ABA. But really, in effect, all they did was an AB: they measured, they implemented treatment, and the behavior changed because the treatment was successful. That does not give you experimental control. They think they used an experimentally sound design, but because the behavior didn’t do what the design requires for experimental control, they really don’t have experimental control with their design.

Single-subject studies should not be confused with case studies or other non-experimental designs.

In case study reports, procedures used in treatment of a particular client’s behavior are documented as carefully as possible, and the client’s progress toward habilitation or rehabilitation is reported. These investigations provide useful descriptions. . . . However, a demonstration of treatment effectiveness requires an experimental study. A better role for case studies is description and identification of potential variables to be evaluated in experimental studies. An excellent discussion of this issue can be found in the exchange of letters to the editor by Hoodin (1986) and Rubow and Swift (1986).

McReynolds and Thompson (1986)

Other Single-Subject Myths

Transcript of the video Q&A with Ralf Schlosser.

Myth 1: Single-subject experiments only have one participant. Obviously, the approach requires only one subject, one participant. But it is a misnomer to think that single-subject research is just about one participant. You can have as many as twenty or thirty.

Myth 2: Single-subject experiments only require one pre-test/post-test. I think a lot of students in the clinic are used to the measurement of one pre-test and one post-test because of the way the goals are written, and maybe there’s not enough time to collect continuous data. But single-case experimental designs require ongoing data collection. There’s a misperception that one baseline data point is enough. For a single-case experimental design you want to see at least three data points, because that allows you to see a trend in the data. The more data points we have, the better.

Myth 3: Single-subject experiments are easy to do. Single-subject design has its own tradition of methodology. It seems very easy to do when you read up on one design. But there are lots of things to consider, and lots of things can go wrong. It requires quite a bit of training: at least one three-credit course taken over a whole semester.

Further Reading: Components of Single-Subject Designs

Kratochwill, T. R., Hitchcock, J., Horner, R. H., Levin, J. R., Odom, S. L., Rindskopf, D. M. & Shadish, W. R. (2010). Single-case designs technical documentation. From the What Works Clearinghouse. http://ies.ed.gov/ncee/wwc/documentsum.aspx?sid=229

Further Reading: Single-Subject Design Textbooks

Kazdin, A. E. (2011). Single-case research designs: Methods for clinical and applied settings. Oxford University Press.

McReynolds, L. V. & Kearns, K. (1983). Single-subject experimental designs in communicative disorders. Baltimore: University Park Press.


Julie Wambaugh, University of Utah

Ralf Schlosser, Northeastern University

The content of this page is based on selected clips from video interviews conducted at the ASHA National Office.

Additional digested resources and references for further reading were selected and implemented by CREd Library staff.

Copyright © 2015 American Speech-Language-Hearing Association


  • Open access
  • Published: 03 July 2024

The impact of evidence-based nursing leadership in healthcare settings: a mixed methods systematic review

Maritta Välimäki, Shuang Hu, Tella Lantta, Kirsi Hipp, Jaakko Varpula, Jiarui Chen, Gaoming Liu, Yao Tang, Wenjun Chen & Xianhong Li

BMC Nursing, volume 23, Article number: 452 (2024)


Abstract

Background

The central component in impactful healthcare decisions is evidence. Understanding how nurse leaders use evidence in their own managerial decision making is still limited. This mixed methods systematic review aimed to examine how evidence is used to solve leadership problems and to describe the measured and perceived effects of evidence-based leadership on nurse leaders and their performance, organizational, and clinical outcomes.

Methods

We included articles using any type of research design. We referred to nurse leaders as nurses, nurse managers, or other nursing staff working in a healthcare context who attempt to influence the behavior of individuals or a group in an organization using an evidence-based approach. Seven databases were searched up to 11 November 2021. The JBI Critical Appraisal Checklist for Quasi-experimental Studies, the JBI Critical Appraisal Checklist for Case Series, and the Mixed Methods Appraisal Tool were used to evaluate the risk of bias in quasi-experimental studies, case series, and mixed methods studies, respectively. The JBI approach to mixed methods systematic reviews was followed, and a parallel-results convergent approach to synthesis and integration was adopted.

Results

Thirty-one publications were eligible for the analysis: case series (n = 27), mixed methods studies (n = 3), and quasi-experimental studies (n = 1). All studies were included regardless of methodological quality. Leadership problems were related to the implementation of knowledge into practice, the quality of nursing care, and resource availability. Organizational data were used in 27 studies to understand leadership problems, scientific evidence from the literature was sought in 26 studies, and stakeholders’ views were explored in 24 studies. Perceived and measured effects of evidence-based leadership focused on nurses’ performance, organizational outcomes, and clinical outcomes. Economic data were not available.

Conclusions

This is the first systematic review to examine how evidence is used to solve leadership problems and to describe its measured and perceived effects from different sites. Although a variety of perceptions and effects were identified on nurses’ performance as well as on organizational and clinical outcomes, available knowledge concerning evidence-based leadership is currently insufficient. Therefore, more high-quality research and clinical trial designs are still needed.

Trial registration

The study was registered (PROSPERO CRD42021259624).

Background

Global health demands have set new roles for nurse leaders [ 1 ]. Nurse leaders are referred to as nurses, nurse managers, or other nursing staff working in a healthcare context who attempt to influence the behavior of individuals or a group based on goals that are congruent with organizational goals [ 2 ]. They are seen as professionals “armed with data and evidence, and a commitment to mentorship and education”, and as a group in which “leaders innovate, transform, and achieve quality outcomes for patients, health care professionals, organizations, and communities” [ 3 ]. Effective leadership occurs when team members critically follow leaders and are motivated by a leader’s decisions based on the organization’s requests and targets [ 4 ]. On the other hand, problems caused by poor leadership may also occur, regarding staff relations, stress, sickness, or retention [ 5 ]. Therefore, leadership requires an understanding of the different problems to be solved by synthesizing evidence from research, clinical expertise, and stakeholders’ preferences [ 6 , 7 ]. If based on evidence, leadership decisions, also referred to as leadership decision making [ 8 ], could ensure adequate staffing [ 7 , 9 ] and produce sufficient and cost-effective care [ 10 ]. However, nurse leaders still base their decision making on personal [ 11 ] and professional experience [ 10 ] rather than research evidence, which can lead to deficiencies in the quality and safety of care delivery [ 12 , 13 , 14 ]. As all nurses should demonstrate leadership in their profession, their leadership competencies should be strengthened [ 15 ].

Evidence-informed decision making, referring to evidence appraisal and application and the evaluation of decisions [ 16 ], has been recognized as one of the core competencies for leaders [ 17 , 18 ]. The role of evidence in nurse leaders’ managerial decision making has been promoted by public authorities [ 19 , 20 , 21 ]. Evidence-based management, another concept related to evidence-based leadership, has been seen as having the potential to improve healthcare services [ 22 ]. It can guide nursing leaders in developing working conditions, staff retention, implementation practices, strategic planning, patient care, and the success of leadership [ 13 ]. Collins and Holton [ 23 ], in their systematic review and meta-analysis, examined 83 studies regarding leadership development interventions. They found that leadership training can result in significant improvement in participants’ skills, especially at the level of knowledge, although the training effects varied across studies. Cummings et al. [ 24 ] reviewed 100 papers (93 studies) and concluded that participation in leadership interventions had a positive impact on the development of a variety of leadership styles. Clavijo-Chamorro et al. [ 25 ], in their review of 11 studies, focused on leadership-related factors that facilitate evidence implementation: teamwork, organizational structures, and transformational leadership. The role of nurse managers was to facilitate evidence-based practices by transforming contexts to motivate the staff and move toward a shared vision of change.

As far as we are aware, however, only a few systematic reviews have focused on evidence-based leadership or related concepts in the healthcare context with the aim of analyzing how nurse leaders themselves use evidence in the decision-making process. Young [ 26 ] targeted definitions and acceptance of evidence-based management (EBMgt) in healthcare, while Hasanpoor et al. [ 22 ] identified facilitators and barriers, sources of evidence used, and the role of evidence in the process of decision making. Both these reviews concluded that EBMgt was of great importance but was used to a limited extent in healthcare settings due to a lack of time, a lack of research management activities, and policy constraints. A review by Williams [ 27 ] showed that the usage of evidence to support management in decision making is marginal due to a shortage of relevant evidence. Fraser [ 28 ] further indicated that potentially relevant evidence-based knowledge is not used in decision making by leaders as effectively as it could be. Non-use of evidence occurs, and leaders base their decisions mainly on single studies, real-world evidence, and experts’ opinions [ 29 ]. Systematic reviews and meta-analyses rarely provide evidence of management-related interventions [ 30 ]. Tate et al. [ 31 ] concluded, based on their systematic review and meta-analysis, that the ability of nurse leaders to use and critically appraise research evidence may influence the way policy is enacted and how resources and staff are used to meet the objectives set by policy. This can further influence staff and workforce outcomes. It is therefore important that nurse leaders have the capacity and motivation to use the strongest evidence available to effect change and guide their decision making [ 27 ].

Despite a growing body of evidence, we found only one review focusing on the impact of evidence-based knowledge. Geert et al. [ 32 ] reviewed literature from 2007 to 2016 to understand the elements of design, delivery, and evaluation of leadership development interventions that are most reliably linked to outcomes at the level of the individual and the organization, and that are of most benefit to patients. The authors concluded that it is possible to improve individual-level outcomes among leaders, such as knowledge, motivation, skills, and behavior change, using evidence-based approaches. Some of the most effective interventions included, for example, interactive workshops, coaching, action learning, and mentoring. However, these authors found limited research evidence describing how nurse leaders themselves use evidence to support their managerial decisions in nursing and what the outcomes are.

To fill this knowledge gap and complement the existing knowledge base, in this mixed methods review we aimed to (1) examine what leadership problems nurse leaders solve using an evidence-based approach and (2) how they use evidence to solve these problems. We also explored (3) the measured and (4) perceived effects of the evidence-based leadership approach in healthcare settings. Both qualitative and quantitative components of the effects of evidence-based leadership were examined to provide greater insights into the available literature [ 33 ]. Together with knowledge of the evidence-based leadership approach and its impact on nursing [ 34 , 35 ], the knowledge gained in this review can be used to inform clinical policy or organizational decisions [ 33 ]. The study is registered (PROSPERO CRD42021259624). The methods used in this review were specified in advance and documented a priori in a published protocol [ 36 ]. Key terms of the review and the search terms are defined in Table  1 (population, intervention, comparison, outcomes, context, other).

Methods

In this review, we used a mixed methods approach [ 37 ]. A mixed methods systematic review was selected as this approach has the potential to produce findings of direct relevance to policy makers and practitioners [ 38 ]. Johnson and Onwuegbuzie [ 39 ] have defined mixed methods research as “the class of research in which the researcher mixes or combines quantitative and qualitative research techniques, methods, approaches, concepts or language into a single study.” Therefore, we combined quantitative and narrative analysis to appraise and synthesize empirical evidence, and we held them as equally important in informing clinical policy or organizational decisions [ 34 ]. In this review, a comprehensive synthesis of quantitative and qualitative data was performed first and then discussed in the Discussion section (parallel-results convergent design) [ 40 ]. We hoped that the different types of analysis approaches would complement each other and that a deeper picture of the topic, in line with our research questions, could be gained [ 34 ].

Inclusion and exclusion criteria

Inclusion and exclusion criteria of the study are described in Table  1 .

Search strategy

A three-step search strategy was utilized. First, an initial limited search of MEDLINE was undertaken, followed by an analysis of the words used in the titles, abstracts, and key index terms of retrieved articles. Second, the search strategy, including the identified keywords and index terms, was adapted for each included database, and a second search was undertaken on 11 November 2021. The full search strategy for each database is described in Additional file 1 . Third, the reference lists of all studies included in the review were screened for additional studies. No year limits or language restrictions were used.

Information sources

The database search included the following: CINAHL (EBSCO), Cochrane Library (academic database for medicine, health science, and nursing), Embase (Elsevier), PsycINFO (EBSCO), PubMed (MEDLINE), Scopus (Elsevier) and Web of Science (academic database across all scientific and technical disciplines, ranging from medicine and social sciences to arts and humanities). These databases were selected as they represent typical databases in the healthcare context. Subject headings from each of the databases were included in the search strategies. Boolean operators ‘AND’ and ‘OR’ were used to combine the search terms. An information specialist from the University of Turku Library was consulted in the formation of the search strategies.
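Purely as a hedged illustration of how the Boolean operators combine search terms (the review’s actual strategies are those in Additional file 1; the terms below are hypothetical), a query might take a form such as:

```
("evidence-based" OR "evidence based" OR "evidence informed")
AND (leadership OR "nurse manager*" OR "decision making")
AND (nurs* OR "health care" OR healthcare)
```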

Study selection

All identified citations were collated and uploaded into Covidence software (Covidence systematic review software, Veritas Health Innovation, Melbourne, Australia, www.covidence.org ), and duplicates were removed by the software. Titles and abstracts were screened and assessed against the inclusion criteria independently by two of four reviewers, and any discrepancies were resolved by a third reviewer (MV, KH, TL, WC). Studies meeting the inclusion criteria were retrieved in full and archived in Covidence. Access to one full-text article was lacking: the authors of that study were contacted about the missing full text, but none was received. All remaining full texts were retrieved and assessed independently against the inclusion criteria by two of four reviewers (MV, KH, TL, WC). Studies that did not meet the inclusion criteria were excluded, and the reasons for exclusion were recorded in Covidence. Any disagreements that arose between the reviewers were resolved through discussions with XL.

Assessment of methodological quality

Eligible studies were critically appraised by two independent reviewers (YT, SH). Standardized critical appraisal instruments based on the study design were used. First, quasi-experimental studies were assessed using the JBI Critical Appraisal Checklist for Quasi-experimental studies [ 44 ]. Second, case series were assessed using the JBI Critical Appraisal Checklist for Case Series [ 45 ]. Third, mixed methods studies were appraised using the Mixed Methods Appraisal Tool [ 46 ].

To increase inter-reviewer reliability, the review agreement was calculated (SH) [ 47 ]. A kappa greater than 0.8 was considered to represent a high level of agreement (on a 0–1 scale). In our data, the agreement was 0.75. Discrepancies between the two reviewers were resolved through discussion, and modifications were confirmed by XL. As an outcome, studies that met the inclusion criteria proceeded to critical appraisal and were assessed as suitable for inclusion in the review. The scores for each item and the overall critical appraisal scores are presented.
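For readers unfamiliar with the statistic, the following minimal sketch (with invented screening decisions, not the review’s actual data) shows how Cohen’s kappa is computed from two reviewers’ inclusion decisions as kappa = (p_o - p_e) / (1 - p_e), where p_o is observed agreement and p_e is chance agreement:

```python
# Invented inclusion/exclusion decisions for eight hypothetical citations.
reviewer_1 = ["include", "exclude", "exclude", "include",
              "exclude", "exclude", "include", "exclude"]
reviewer_2 = ["include", "exclude", "include", "include",
              "exclude", "exclude", "exclude", "exclude"]

n = len(reviewer_1)
p_o = sum(a == b for a, b in zip(reviewer_1, reviewer_2)) / n  # observed agreement

labels = {"include", "exclude"}
p_e = sum((reviewer_1.count(lab) / n) * (reviewer_2.count(lab) / n)
          for lab in labels)                                   # chance agreement

kappa = (p_o - p_e) / (1 - p_e)
print(f"observed={p_o:.2f}, chance={p_e:.2f}, kappa={kappa:.2f}")
```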

Data extraction

For data extraction, specific tables were created. First, study characteristics (author(s), year, country, design, number of participants, setting) were extracted by two authors independently (JC, MV) and reviewed by TL. Second, descriptions of the interventions were extracted by two reviewers (JV, JC) using the structure of the TIDieR (Template for Intervention Description and Replication) checklist (brief name, the goal of the intervention, material and procedure, models of delivery and location, dose, modification, adherence and fidelity) [ 48 ]. The extractions were confirmed by MV.

Third, due to a lack of effectiveness data and wide heterogeneity between study designs and presentation of outcomes, no attempt was made to pool the quantitative data statistically; the findings of the quantitative data were presented in narrative form only [ 44 ]. Separate data extraction tables for each research question were designed specifically for this study. For both qualitative studies (and the qualitative components of mixed methods studies) and quantitative studies, the data were extracted and tabulated into text format according to the preplanned research questions [ 36 ]. To test the quality of the tables and the data extraction process, three authors independently extracted the data from the first five studies (in alphabetical order). After that, the authors came together to determine whether their approaches to data extraction were consistent with each other and whether the content of each table was in line with the research question. No reason was found to modify the data extraction tables or the planned process. After a consensus on the data extraction process was reached, the data were extracted in pairs by independent reviewers (WC, TY, SH, GL). Any disagreements that arose between the reviewers were resolved through discussion and with a third reviewer (MV).

Data analysis

We were not able to conduct a meta-analysis due to a lack of effectiveness data based on clinical trials. Instead, we used inductive thematic analysis with constant comparison to answer the research questions [ 46 , 49 ], using tabulated primary data from qualitative and quantitative studies as reported by the original authors in narrative form only [ 47 ]. In addition, a qualitizing process was used to transform quantitative data into qualitative data; this helped us to convert the whole data set into themes and categories. After that, we used thematic analysis for the narrative data as follows. First, the text was carefully read, line by line, to reveal topics answering each specific review question (MV). Second, the data coding was conducted, and the themes in the data were formed by data categorization. The process of deriving the themes was inductive, based on constant comparison [ 49 ]. The results of the thematic analysis and data categorization were first described in narrative format, and then the total number (and percentage) of studies in which each specific category was identified was calculated.
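A minimal sketch of that final counting step, with hypothetical studies and categories (the real categories appear in the Results), might look like this:

```python
from collections import Counter

# Hypothetical mapping of studies to inductively derived categories.
study_categories = {
    "study_01": ["implementation", "quality of care"],
    "study_02": ["resources"],
    "study_03": ["implementation"],
    "study_04": ["quality of care", "resources"],
}

total = len(study_categories)
counts = Counter(cat for cats in study_categories.values() for cat in cats)

for category, count in counts.most_common():
    print(f"{category}: {count}/{total} studies ({100 * count / total:.0f}%)")
```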

Stakeholder involvement

The method of reporting stakeholders’ involvement follows the key components outlined in [ 50 ]: (1) people involved, (2) geographical location, (3) how people were recruited, (4) format of involvement, (5) amount of involvement, (6) ethical approval, (7) financial compensation, and (8) methods for reporting involvement.

In our review, stakeholder involvement targeted nurses and nurse leaders in China. The Nurse Directors of two hospitals recommended potential participants, who received a personal invitation letter from the researchers to participate in a discussion meeting. Stakeholders’ participation was voluntary. Due to COVID-19, one online meeting (1 h) was organized (25 May 2022). Eleven participants joined the meeting. Ethical approval was not applied for, and no financial compensation was offered. At the end of the meeting, experiences of stakeholders’ involvement were explored.

The meeting started with an introductory PowerPoint presentation. The rationale, methods, and preliminary review results were shared with the participants [ 51 ]. The meeting continued with general questions for the participants: (1) Are you aware of the concepts of evidence-based practice or evidence-based leadership?; (2) How important is it to use evidence to support decisions among nurse leaders?; (3) How is the evidence-based approach used in hospital settings?; and (4) What type of evidence is currently used to support nurse leaders’ decision making (e.g., scientific literature, organizational data, stakeholder views)?

Two people took notes on the course and content of the conversation. The notes were later transcribed verbatim, and the key points of the discussions were summarized. Although the answers offered by the stakeholders were very short, the information was useful for validating the preliminary content of the results, adding to the rigor of the review, and obtaining additional perspectives. A recommendation of the stakeholders was incorporated into the Discussion part of this review, increasing the applicability of the review in the real world [ 50 ]. At the end of the discussion, the value of stakeholders’ involvement was asked about. Participants shared that the experience of participating was unique and that the topic of discussion was challenging. Two authors of the review group further represented stakeholders by working together with the research team throughout the review study.

Results

Search results

From seven different electronic databases, 6053 citations were identified as being potentially relevant to the review. Then, 3133 duplicates were removed by an automation tool (Covidence: www.covidence.org ), and one was removed manually. The titles and abstracts of 3040 citations were reviewed, and a total of 110 full texts were included (one extra citation was found on a reference list but later excluded). Based on the eligibility criteria, 31 studies (32 hits) were critically appraised and deemed suitable for inclusion in the review. The search results and selection process are presented in the PRISMA [ 52 ] flow diagram (Fig. 1). The full list of references for the included studies can be found in Additional file 2 . To avoid confusion between articles in the reference list and studies included in the analysis, the studies included in the review are referred to within the article using the reference number of each study (e.g., ref 1, ref 2).

Fig. 1. Search results and study selection and inclusion process [ 52 ]

Characteristics of included studies

The studies had multiple purposes, aiming to develop practice, implement a new approach, improve quality, or develop a model. The 31 studies (across 32 hits) comprised case series studies (n = 27), mixed methods studies (n = 3), and a quasi-experimental study (n = 1). All studies were published between 2004 and 2021, with the highest number of papers published in 2020.

Table  2 describes the characteristics of included studies and Additional file 3 offers a narrative description of the studies.

Methodological quality assessment

Quasi-experimental studies

We had one quasi-experimental study (ref 31). All questions in the critical appraisal tool were applicable. The total score of the study was 8 (out of a possible 9). Only one response of the tool was ‘no’ because no control group was used in the study (see Additional file 4 for the critical appraisal of included studies).

Case series studies. A case series study is typically defined as a collection of subjects with common characteristics. The studies do not include a comparison group and are often based on prevalent cases and on a sample of convenience [ 53 ]. Munn et al. [ 45 ] further claim that case series are best described as observational studies, lacking experimental and randomized characteristics: descriptive studies without a control or comparator group. Of the 27 case series studies included in our review, the critical appraisal scores varied from 1 to 9. Five references were conference abstracts with empirical study results, which were scored from 1 to 3. Full reports of these studies were searched for in electronic databases but not found. Critical appraisal scores for the remaining 22 studies ranged from 1 to 9 out of a possible score of 10. One question (Q3) was not applicable to 13 studies: “Were valid methods used for identification of the condition for all participants included in the case series?” Only two studies had clearly reported the demographics of the participants (Q6). Twenty studies met Criterion 8 (“Were the outcomes or follow-up results of cases clearly reported?”) and 18 studies met Criterion 7 (“Was there clear reporting of clinical information of the participants?”) (see Additional file 4 for the critical appraisal of included studies).

Mixed-methods studies

Mixed-methods studies involve a combination of qualitative and quantitative methods. This is a common design and includes convergent design, sequential explanatory design, and sequential exploratory design [ 46 ]. There were three mixed-methods studies. The critical appraisal scores for the three studies ranged from 60 to 100% out of a possible 100%. Two studies met all the criteria, while one study fulfilled 60% of the scored criteria due to a lack of information to understand the relevance of the sampling strategy well enough to address the research question (Q4.1) or to determine whether the risk of nonresponse bias was low (Q4.4) (see Additional file 4 for the critical appraisal of included studies).

Intervention or program components

The intervention or program components were categorized and described using the TIDieR checklist: name and goal, theory or background, material, procedure, provider, models of delivery, location, dose, modification, and adherence and fidelity [ 48 ]. A description of the intervention in each study is given in Additional file 5 and a narrative description in Additional file 6 .

Leadership problems

In line with the inclusion criteria, data for the leadership problems were categorized in all 31 included studies (see Additional file 7 for leadership problems). Three types of leadership problems were identified: implementation of knowledge into practice, the quality of clinical care, and resources in nursing care. A narrative summary of the results is reported below.

Implementing knowledge into practice

Eleven studies (35%) aimed to solve leadership problems related to the implementation of knowledge into practice. Studies showed how to support nurses in evidence-based practice (EBP) implementation (ref 3, ref 5), how to engage nurses in using evidence in practice (ref 4), how to convey the importance of EBP (ref 22), or how to change practice (ref 4). Other problems were how to facilitate nurses’ use of guideline recommendations (ref 7) and how nurses can make evidence-informed decisions (ref 8). General concerns also included the linkage between theory and practice (ref 1) as well as how to implement the EBP model in practice (ref 6). In addition, studies were motivated by the need for revisions or updates of protocols to improve clinical practice (ref 10), as well as the need to standardize nursing activities (ref 11, ref 14).

The quality of care

Thirteen studies (42%) focused on solving problems related to the quality of clinical care. In these studies, a high number of catheter infections had led to a failure to achieve organizational goals (ref 2, ref 9). A need to reduce patient symptoms in stem cell transplant patients undergoing high-dose chemotherapy (ref 24) was also one of the problems to be solved. In addition, the projects focused on how to prevent pressure ulcers (ref 26, ref 29), how to enhance the quality of cancer treatment (ref 25), and how to reduce the need for invasive constipation treatment (ref 30). Concerns about patient safety (ref 15), high fall rates (ref 16, ref 19), and dissatisfaction among patients (ref 16, ref 18) and nurses (ref 16, ref 30) were also problems that initiated the projects. Studies addressed concerns about how to promote good contingency care in residential aged care homes (ref 20) and how to increase recognition of human trafficking problems in healthcare (ref 21).

Resources in nursing care

Nurse leaders identified problems in their resources, especially staffing problems. These problems were identified in seven studies (23%), which involved concerns about how to prevent nurses from leaving the job (ref 31), how to ensure appropriate recruitment, staffing, and retention of nurses (ref 13), and how to decrease nurses’ burden and time spent on nursing activities (ref 12). Leadership turnover was also reported as a source of dissatisfaction (ref 17); studies addressed a lack of structured transition and training programs, which led to turnover (ref 23), as well as how to improve intershift handoff among nurses (ref 28). Optimal design for new hospitals was also examined (ref 27).

Main features of evidence-based leadership

Out of 31 studies, 17 (55%) included all four domains of an evidence-based leadership approach, and four studies (13%) included evidence of critical appraisal of the results (see Additional file 8 for the main features of evidence-based leadership) (ref 11, ref 14, ref 23, ref 27).

Organizational evidence

Twenty-seven studies (87%) reported how organizational evidence was collected and used to solve leadership problems (ref 2). Retrospective chart reviews (ref 5), reviews of the extent of specific incidents (ref 19), and chart audits (ref 7, ref 25) were conducted. A gap between guideline recommendations and actual care was identified using organizational data (ref 7), while the percentage of nurses' working time spent on patient care was analyzed using an electronic charting system (ref 12). Internal data (ref 22), institutional data, and programming metrics were also analyzed to understand the development of the nurse workforce (ref 13).

Surveys (ref 3, ref 25), interviews (ref 3, ref 25) and group reviews (ref 18) were used to better understand the leadership problem to be solved. Employee opinion surveys on leadership (ref 17), a nurse satisfaction survey (ref 30) and a variety of reporting templates (ref 28) were used for data collection. Sometimes, leadership problems were identified by evidence facilitators or a PI's team who worked with staff members (ref 15, ref 17). Problems in clinical practice were also identified by the Nursing Professional Council (ref 14), managers (ref 26) or nurses themselves (ref 24). Current practices were reviewed (ref 29) and a gap analysis was conducted (ref 4, ref 16, ref 23) together with a SWOT analysis (ref 16). In addition, hospital mission and vision statements, the established research culture, and the proportion of nursing alumni with formal EBP training were analyzed (ref 5). On the other hand, one study stated that no systematic hospital-specific sources of data regarding job satisfaction or organizational commitment were used (ref 31), and another used statements of organizational analysis on a general level only (ref 1).

Scientific evidence identified

Twenty-six studies (84%) reported the use of scientific evidence in their evidence-based leadership processes. A literature search was conducted (ref 21), and PICO questions and keywords were identified in collaboration with a librarian (ref 4). Electronic databases, including PubMed (ref 14, ref 31), Cochrane, and EMBASE (ref 31), were searched. Galiano (ref 6) used Wiley Online Library, Elsevier, CINAHL, Health Source: Nursing/Academic Edition, PubMed, and the Cochrane Library, while Hoke (ref 11) conducted an electronic search using CINAHL and PubMed to retrieve articles.

Identified journals were reviewed manually (ref 31). The findings were summarized using an ‘elevator speech’ (ref 4). In a study by Gifford et al. (ref 9), evidence facilitators worked with participants to access, appraise, and adapt the research evidence to the organizational context. Ostaszkiewicz (ref 20) conducted a scoping review of the literature and identified and reviewed frameworks and policy documents about the topic and the quality standards. Further, a team of nursing administrators, directors, staff nurses, and a patient representative reviewed the literature and made recommendations for practice changes.

Clinical practice guidelines were also used as a source of scientific evidence (ref 7, ref 19). Evidence was further retrieved from a combination of nursing policies, guidelines, journal articles, and textbooks (ref 12) as well as from published guidelines and literature (ref 13). One study synthesized internal evidence, professional practice knowledge, and relevant theories and models (ref 24), while another (ref 25) reviewed individual studies together with systematic reviews or clinical practice guidelines. Teams reviewed the research evidence (ref 3, ref 15) or conducted a literature review (ref 22, ref 28, ref 29), a literature search (ref 27), a systematic review (ref 23) or a review of the literature (ref 30), or stated that ‘the scholarly literature was reviewed’ (ref 18). In addition, ‘an extensive literature review of evidence-based best practices was carried out’ (ref 10). However, a detailed description of how each review was conducted was generally lacking.

Views of stakeholders

A total of 24 studies (77%) reported how the views of stakeholders, i.e., professionals or experts, were considered. Support to run the study was received from nursing leadership and multidisciplinary teams (ref 29). Experts and stakeholders joined the study team in some cases (ref 25, ref 30), and in other studies, their opinions were sought to facilitate project success (ref 3). Sometimes a steering committee was formed by a Chief Nursing Officer and Clinical Practice Specialists (ref 2). More specifically, stakeholders' views were considered using interviews, workshops and follow-up teleconferences (ref 7). The literature review was discussed with colleagues (ref 11), and feedback and support from physicians as well as the consensus of staff were sought (ref 16).

A summary of the project findings and suggestions for the studies was discussed at 90-minute weekly meetings by 11 charge nurses, and nurse executive directors were consulted over a 10-week period (ref 31). An implementation team (nurse, dietician, physiotherapist, occupational therapist) was formed to support the implementation of evidence-based prevention measures (ref 26). Stakeholders volunteered to join the pilot implementation (ref 28), or a stakeholder team met to determine the best strategy for change management, discuss shortcomings in evidence-based criteria, and plan strategies to address those areas (ref 5). Nursing leaders, staff members (ref 22), ‘process owners’ (ref 18) and program team members (ref 18, ref 19, ref 24) met regularly to discuss the problems. Critical input was sought from clinical educators, physicians, nutritionists, pharmacists, and nurse managers (ref 24). The unit director and senior nursing staff reviewed the contents of the product, and the final version of the clinical pathways was reviewed and approved by the Quality Control Commission of the Nursing Department (ref 12). In addition, two co-design workshops with 18 residential aged care stakeholders were organized to explore their perspectives on factors to include in a model prototype (ref 20). Further, an agreement among stakeholders on implementing continuous quality services within an open relationship was reached (ref 1).

Critical appraisal

In five studies (16%), a critical appraisal targeting the literature search was carried out. The appraisals were conducted by interns and teams who critiqued the evidence (ref 4). In Hoke's study, four areas that had emerged in the literature were critically reviewed (ref 11). Other methods were to ‘critically appraise the search results’ (ref 14); journal club team meetings were organized to grade the level and quality of evidence (ref 23); and one team ‘critically appraised relevant evidence’ (ref 27). However, the studies lacked details of how the appraisals were conducted.

The perceived effects of evidence-based leadership

Perceived effects of evidence-based leadership on nurses’ performance

Eleven studies (35%) described the perceived effects of evidence-based leadership on nurses' performance (see Additional file 9 for perceived effects of evidence-based leadership), which were categorized into four groups: awareness and knowledge, competence, ability to understand patients' needs, and engagement. First, regarding ‘awareness and knowledge’, different projects provided nurses with new learning opportunities (ref 3). Staff knowledge (ref 20, ref 28), skills, and education levels improved (ref 20), as did nurses' comprehension of the knowledge (ref 21). Second, interventions and approaches focusing on management and leadership positively influenced participants' competence to improve the quality of services: their confidence (ref 1) and motivation to change practice increased, their self-esteem improved, and they became more positive and enthusiastic in their work (ref 22). Third, some nurses were relieved that they had learned to better handle patients' needs (ref 25). For example, a systematic work approach increased nurses' awareness of patients at risk of developing health problems (ref 26). Last, nurse leaders were more engaged with staff, encouraging them to adopt the new practices and recognizing their efforts to change (ref 8).

Perceived effects on organizational outcomes

Nine studies (29%) described the perceived effects of evidence-based leadership on organizational outcomes (see Additional file 9 for perceived effects of evidence-based leadership). These were categorized into three groups: use of resources, staff commitment, and team effort. First, more appropriate use of resources was reported (ref 15, ref 20), and working time was used more efficiently (ref 16). In general, a structured approach made implementing change more manageable (ref 1). On the other hand, at the beginning of the change process, the feedback from nurses was unfavorable, and they experienced discomfort with the new work style (ref 29). New approaches were also perceived as time consuming (ref 3). Second, nurse leaders believed that fewer nursing staff than expected left the organization over the course of the study (ref 31). Third, the project helped staff in their efforts to make changes, and it validated the importance of working as a team (ref 7). Collaboration and support between the nurses increased (ref 26). On the other hand, the new work style caused challenges in teamwork (ref 3).

Perceived effects on clinical outcomes

Five studies (16%) reported the perceived effects of evidence-based leadership on clinical outcomes (see Additional file 9 for perceived effects of evidence-based leadership), which were categorized into two groups: general patient outcomes and specific clinical outcomes. First, in general, the projects assisted in connecting guideline recommendations and patient outcomes (ref 7). One project was seen as good for patients in general, and especially for improving patient safety (ref 16). On the other hand, some nurses thought that the new working style did not work at all for patients (ref 28). Second, the new approach assisted in optimizing the management of patients' clinical problems and person-centered care (ref 20). Bowel management, for example, received very good feedback (ref 30).

The measured effects of evidence-based leadership

The measured effects on nurses’ performance

Data were obtained from 20 studies (65%) (see Additional file 10 for measured effects of evidence-based leadership), and nurse performance outcomes were categorized into three groups: awareness and knowledge, engagement, and satisfaction. First, six studies (19%) measured participants' awareness and knowledge levels. An internship for staff nurses helped participants understand the process of using evidence-based practice, grow professionally, think innovatively, gain the knowledge needed to answer clinical questions with evidence-based practice, and complete an evidence-based practice project (ref 3). Regarding implementation programs for evidence-based practice, those with formal EBP training showed an improvement in knowledge, attitude, confidence, awareness and application after the intervention (ref 3, ref 11, ref 20, ref 23, ref 25). In contrast, in another study, attitudes towards EBP remained stable (p = 0.543), and the proportion of nurses applying EBP decreased, although without significant differences over the years (p = 0.879) (ref 6).

Second, 10 studies (35%) described nurses' engagement with new practices (ref 5, ref 6, ref 7, ref 10, ref 16, ref 17, ref 18, ref 21, ref 25, ref 27). Nine studies (29%) reported an improvement in participants' compliance levels (ref 6, ref 7, ref 10, ref 16, ref 17, ref 18, ref 21, ref 25, ref 27). In contrast, in DeLeskey's (ref 5) study, although improvement was found in ‘post-operative nausea and vomiting (PONV) risk factors documented’ (2.5–63%) and in ‘risk factors communicated among anaesthesia and surgical staff’ (0–62%), the improvement did not achieve the goal. The reasons for the limited improvement were analysed, and it was noted that only those patients who had been seen by the pre-admission testing nurse had risk assessments completed. Appropriate treatment/prophylaxis increased from 69 to 77% and from 30 to 49%, while routine assessment for PONV and rescue treatment (97% and 100%) were both at 100% following the project. The results were discussed with staff, but further reasons for the lack of engagement in nursing care were not reported.

Third, six studies (19%) reported nurses' satisfaction with project outcomes. The results showed that using evidence in managerial decisions improved nurses' satisfaction and attitudes toward their organization (p < 0.05) (ref 31). Nurses' overall job satisfaction improved as well (ref 17). Nurses' satisfaction with the usability of the electronic charting system significantly improved after introduction of the intervention (ref 12). In a handoff project in seven hospitals, improvement was reported in all satisfaction indicators used in the study, although the level of improvement varied between units (ref 28). In addition, positive changes were reported in nurses' ability to autonomously perform their job (“How satisfied are you with the tools and resources available for you to treat and prevent patient constipation?”; 54%, n = 17 vs. 92%, n = 35, p < 0.001) (ref 30).

The measured effects on organizational outcomes

Thirteen studies (42%) described the effects of a project on organizational outcomes (see Additional file 10 for measured effects of evidence-based leadership), which were categorized into two groups: staff compliance and changes in practices. First, studies reported improved organizational outcomes due to staff's better compliance with care (ref 4, ref 13, ref 17, ref 23, ref 27, ref 31). Second, changes in organizational practices were also described (ref 11), such as changes in patient documentation (ref 12, ref 21). Van Orne (ref 30) found a statistically significant reduction in the average rate of invasive medication administration between pre-intervention and post-intervention (p = 0.01). Salvador (ref 24) also reported an improvement in a proactive approach to mucositis prevention with an evidence-based oral care guide. In contrast, concerns were also raised, such as insufficient time for the new bedside report (ref 16) or a lack of improvement in the assessment of diabetic ulcers (ref 8).

The measured effects on clinical outcomes

A variety of improvements in clinical outcomes were reported (see Additional file 10 for measured effects of evidence-based leadership), categorized as improvements in patient clinical status and in patient satisfaction. First, regarding patient clinical status, the incidence of catheter-associated urinary tract infections (CAUTI) decreased by 27.8% between 2015 and 2019 (ref 2), while a patient-centered quality improvement project reduced CAUTI rates to 0 (ref 10). A significant decrease in the MRSA transmission rate was also reported (ref 27), and in another study the incidence of CLABSIs dropped following the introduction of CHG bathing (ref 14). Further, it was possible to decrease patient nausea from 18 to 5% and vomiting to 0% (ref 5), while the percentage of patients who left the hospital without being seen fell below 2% after the project (ref 17). In addition, a significant reduction in the prevalence of pressure ulcers was found (ref 26, ref 29), and a significant reduction in mucositis severity/distress was achieved (ref 24). Patient fall rates decreased (ref 15, ref 16, ref 19, ref 27).

Second, patient satisfaction improved after project implementation (ref 28). A consumer scale assessing healthcare providers showed improvement, but the changes were not statistically significant. Improvements in an emergency department leadership model and in methods of communication with patients improved patient satisfaction scores by 600% (ref 17). In addition, a new evidence-based unit improved patients' experiences of the unit, although not all items improved significantly (ref 18).

Stakeholder involvement in the mixed-method review

To ensure stakeholders' involvement in the review and the real-world relevance of our research [ 53 ], to achieve a higher level of meaning in our review results, and to gain new perspectives on our preliminary findings [ 50 ], a meeting with 11 stakeholders was organized. First, we asked whether participants were aware of the concepts of evidence-based practice or evidence-based leadership. Responses revealed that participants were familiar with the concept of evidence-based practice, but the topic of evidence-based leadership was totally new. Examples of nurses' and nurse leaders' responses are as follows: “I have heard a concept of evidence-based practice but never a concept of evidence-based leadership.” Another participant described: “I have heard it [evidence-based leadership] but I do not understand what it means.”

Second, as stakeholder involvement is beneficial to the relevance and impact of health research [ 54 ], we asked how important evidence is to them in supporting decisions in health care services. One participant described it as follows: “Using evidence in decisions is crucial to the wards and also to the entire hospital.” Third, we asked how the evidence-based approach is used in hospital settings. Participants expressed that literature is commonly used to solve clinical problems in patient care but not to solve leadership problems: “In [patient] medication and care, clinical guidelines are regularly used. However, I am aware only a few cases where evidence has been sought to solve leadership problems.”

Last, we asked what type of evidence is currently used to support nurse leaders' decision making (e.g., scientific literature, organizational data, stakeholder views). The participants were aware that different types of information were collected in their organization on a daily basis (e.g., patient satisfaction surveys). However, this information was seldom used to support decision making because nurse leaders did not know how to access it. Even so, the participants agreed that using evidence from different sources was important when approaching any leadership or managerial problem in the organization. Participants also suggested that all nurse leaders should receive systematic training on the topic; this could support the daily use of the evidence-based approach.

Discussion

To our knowledge, this article represents the first mixed-methods systematic review to examine leadership problems, how evidence is used to solve them, and what the perceived and measured effects of evidence-based leadership are on nurse leaders and their performance, organizational outcomes, and clinical outcomes. This review has two key findings. First, the available research data suggest that evidence-based leadership has potential in the healthcare context, not only to improve knowledge and skills among nurses, but also to improve organizational outcomes and the quality of patient care. Second, remarkably little published research was found that explores the effects of evidence-based leadership with a robust trial design. We validated the preliminary results with nurse stakeholders and confirmed that nursing staff, especially nurse leaders, were not familiar with the concept of evidence-based leadership, nor were they used to implementing evidence in their leadership decisions. Our search was based on many databases, and we screened a large number of studies. We also checked existing registers and databases and found no registered or ongoing similar reviews. Therefore, our results are unlikely to change in the near future.

We found that, after identifying the leadership problems, 26 (84%) of the 31 studies used organizational data, 25 (81%) used scientific evidence from the literature, and 21 (68%) considered the views of stakeholders to understand specific leadership problems more deeply. However, only four studies critically appraised any of these findings. Considering previous critical statements about nurse leaders' use of evidence in their decision making [ 14 , 30 , 31 , 34 , 55 ], our results are still quite promising.

Our results support a previous systematic review by Geerts et al. [ 32 ], which concluded that it is possible to improve leaders' individual-level outcomes, such as knowledge, motivation, skills, and behavior change, using evidence-based approaches. Collins and Holton [ 23 ] in particular found that leadership training resulted in significant knowledge and skill improvements, although the effects varied widely across studies. In our study, evidence-based leadership was seen to enable changes in clinical practice, especially in patient care. On the other hand, we understand that not all efforts at change were successful [ 56 , 57 , 58 ]. An evidence-based approach can also provoke negative attitudes and feelings: negative emotions have been reported in participants due to changes, such as discomfort with a new working style [ 59 ]. Another study reported inconvenience in using a new intervention and its potential risks for patient confidentiality, and sometimes making changes is more time consuming than continuing with current practice [ 60 ]. These findings may partially explain why new interventions or programs do not always fully achieve their goals. On the other hand, DuBose and Mayo [ 61 ] state that, if prepared with knowledge of resistance, nurse leaders can minimize the potential negative consequences and capitalize on the powerful impact of change adaptation.

We found that only six studies used a specific model or theory to understand the mechanism of change that could guide leadership practices. Participants' reactions to new approaches may be an important factor in predicting how a new intervention will be implemented in clinical practice. Therefore, a stronger effort should be made to better understand the use of evidence, how participants' reactions, emotions and practice changes could be predicted or supported using appropriate models or theories, and how using these models is linked with leadership outcomes. In this task, nurse leaders have an important role. At the same time, more responsibility for developing health services has been placed on the shoulders of nurse leaders, who may already be suffering from pressure and an increased burden at work. Working in a leadership position may also lead to role conflict. A study by Lalleman et al. [ 62 ] found that nurses were used to helping other people, often in ad hoc situations; this helping attitude, combined with a structured managerial role, may cause dilemmas that lead to stress. Many nurse leaders opt to leave their positions in less than 5 years [ 63 ]. To better fulfill the requirements of health services in the future, the role of nurse leaders in evidence-based leadership needs to be developed further to avoid ethical and practical dilemmas in their leadership practices.

It is worth noting that the perceived and measured effects did not strongly support each other but rather opened a new avenue for understanding evidence-based leadership. Specifically, some perceived effects (competence, ability to understand patients' needs, use of resources, team effort, and specific clinical outcomes) had no counterpart among the measured effects, while some measured effects (nurses' satisfaction with their performance, changes in practices, and satisfaction with clinical outcomes) had no counterpart among the perceived effects. These findings may indicate that different outcomes appear when the effects of evidence-based leadership are examined using different methodological approaches. Future well-designed studies, including mixed-methods studies, are encouraged to examine the consistency between the perceived and measured effects of evidence-based leadership in health care.

There is potential in nursing to support change by demonstrating conceptual and operational commitment to research-based practices [ 64 ]. Nurse leaders are well positioned to influence and lead professional governance, quality improvement, service transformation, change and shared governance [ 65 ]. In this task, evidence-based leadership could be key to solving deficiencies in the quality and safety of care [ 14 ] and inefficiencies in healthcare delivery [ 12 , 13 ]. As the WHO has noted, there are about 28 million nurses worldwide, and the demand for nurses will put nursing resources in the spotlight [ 1 ]. Indeed, evidence could be used to find solutions for economic deficits and other problems using leadership skills. This is important because, when nurses are able to show leadership and control in their own work, they are less likely to leave their jobs [ 66 ]. On the other hand, based on our discussions with stakeholders, nurse leaders are not used to using evidence in their own work. Further, evidence-based leadership is not possible if nurse leaders do not have access to a relevant, robust body of evidence, adequate funding, resources, and organizational support, and evidence-informed decision making may only offer short-term solutions [ 55 ]. We still believe that implementing evidence-based strategies into the work of nurse leaders may create opportunities to protect this critical workforce from burnout or leaving the field [ 67 ]. However, the role of the evidence-based approach in helping nurse leaders solve these problems remains a key question.

Limitations

This study aimed to use a broad search strategy to ensure a comprehensive review but, nevertheless, limitations exist: we may have missed studies not indexed in the major international databases. To keep search results manageable, we did not use specific databases to systematically search grey literature, although it is a rich source of evidence used in systematic reviews and meta-analyses [ 68 ]. We did, however, include published conference abstracts/proceedings that appeared in our scientific databases. It has been stated that conference abstracts and proceedings with empirical study results make up a great part of the studies cited in systematic reviews [ 69 ]. At the same time, the limited space reserved for published conference publications can lead to methodological issues that reduce the validity of review results [ 68 ]. We also found that a great number of the studies were carried out in Western countries, restricting the generalizability of the results beyond English-speaking countries.

The study interventions and outcomes were too heterogeneous across studies to be meaningfully pooled using statistical methods; thus, our narrative synthesis could hypothetically be biased. To increase the transparency of the data and of all decisions made, the data, their categorization and the conclusions are based on the original studies, presented in separate tables, and can be found in the Additional files. Regarding the methodological approach [ 34 ], we used a mixed methods systematic review, with the core intention of combining quantitative and qualitative data from primary studies. The aim was to create a breadth and depth of understanding that could confirm or dispute evidence and ultimately answer the review question posed [ 34 , 70 ]. Although the method is gaining traction due to its usefulness and practicality, guidance on combining quantitative and qualitative data in mixed methods systematic reviews is still limited at the theoretical stage [ 40 ]. As a consequence, it could be argued that other methodologies, for example an integrative review, could have been used to combine diverse methodologies [ 71 ]. We still believe that the results of this mixed methods review add value when compared with previous systematic reviews concerning leadership and an evidence-based approach.

Conclusions

Our mixed methods review fills the gap regarding how nurse leaders themselves use evidence to guide their leadership role and what the measured and perceived impacts of evidence-based leadership are in nursing. Although the scarcity of controlled studies on this topic is concerning, the available research data suggest that evidence-based leadership interventions can improve nurse performance, organizational outcomes, and patient outcomes. Leadership problems are also well recognized in healthcare settings. More knowledge and a deeper understanding of the role of nurse leaders, and of how they can use evidence in their own managerial leadership decisions, are still needed. Despite the limited number of studies, we believe that this narrative synthesis can provide a good foundation for developing evidence-based leadership in the future.

Implications

Based on our review results, several implications can be recommended. First, the future success of nursing depends on knowledgeable, capable, and strong leaders. Therefore, nurse leaders worldwide need to be educated about the best ways to manage challenging situations in healthcare contexts using an evidence-based approach in their decisions. This recommendation was also proposed by nurses and nurse leaders during our discussion meeting with stakeholders.

Second, curricula in educational organizations and on-the-job training for nurse leaders should be updated to support a general understanding of how to use evidence in leadership decisions. Third, patients and family members should be more involved in the evidence-based approach. It is therefore important that nurse leaders learn how to better consider patients' and family members' views as stakeholders as part of the evidence-based leadership approach.

Future studies should be prioritized as follows: establishing clear parameters for what constitutes evidence-based leadership and how it is measured; using theories or models in research to inform the mechanisms of effective practice change; conducting robust effectiveness studies using trial designs to evaluate the impact of evidence-based leadership; studying the role of patients and family members in improving the quality of clinical care; and investigating the financial impact of using an evidence-based leadership approach within the respective healthcare systems.

Data availability

The authors obtained all data for this review from published manuscripts.

References

World Health Organization. State of the world’s nursing 2020: investing in education, jobs and leadership. 2020. https://www.who.int/publications/i/item/9789240003279 . Accessed 29 June 2024.

Hersey P, Campbell R. Leadership: a behavioral science approach. The Center for Leadership Studies; 2004.

Cline D, Crenshaw JT, Woods S. Nurse leader: a definition for the 21st century. Nurse Lead. 2022;20(4):381–4. https://doi.org/10.1016/j.mnl.2021.12.017 .

Chen SS. Leadership styles and organization structural configurations. J Hum Resource Adult Learn. 2006;2(2):39–46.

McKibben L. Conflict management: importance and implications. Br J Nurs. 2017;26(2):100–3.

Haghgoshayie E, Hasanpoor E. Evidence-based nursing management: basing Organizational practices on the best available evidence. Creat Nurs. 2021;27(2):94–7. https://doi.org/10.1891/CRNR-D-19-00080 .

Majers JS, Warshawsky N. Evidence-based decision-making for nurse leaders. Nurse Lead. 2020;18(5):471–5.

Tichy NM, Bennis WG. Making judgment calls. Harvard Business Rev. 2007;85(10):94.

Sousa MJ, Pesqueira AM, Lemos C, Sousa M, Rocha Á. Decision-making based on big data analytics for people management in healthcare organizations. J Med Syst. 2019;43(9):1–10.

Guo R, Berkshire SD, Fulton LV, Hermanson PM. Use of evidence-based management in healthcare administration decision-making. Leadersh Health Serv. 2017;30(3):330–42.

Liang Z, Howard P, Rasa J. Evidence-informed managerial decision-making: what evidence counts?(part one). Asia Pac J Health Manage. 2011;6(1):23–9.

Hasanpoor E, Janati A, Arab-Zozani M, Haghgoshayie E. Using the evidence-based medicine and evidence-based management to minimise overuse and maximise quality in healthcare: a hybrid perspective. BMJ evidence-based Med. 2020;25(1):3–5.

Shingler NA, Gonzalez JZ. Ebm: a pathway to evidence-based nursing management. Nursing. 2017;47(2):43–6.

Farokhzadian J, Nayeri ND, Borhani F, Zare MR. Nurse leaders’ attitudes, self-efficacy and training needs for implementing evidence-based practice: is it time for a change toward safe care? Br J Med Med Res. 2015;7(8):662.

American Nurses Association. ANA leadership competency model. Silver Spring, MD; 2018.

Royal College of Nursing. Leadership skills. 2022. https://www.rcn.org.uk/professional-development/your-career/nurse/leadership-skills . Accessed 29 June 2024.

Kakemam E, Liang Z, Janati A, Arab-Zozani M, Mohaghegh B, Gholizadeh M. Leadership and management competencies for hospital managers: a systematic review and best-fit framework synthesis. J Healthc Leadersh. 2020;12:59.

Liang Z, Howard PF, Leggat S, Bartram T. Development and validation of health service management competencies. J Health Organ Manag. 2018;32(2):157–75.

World Health Organization. Global Strategic Directions for Nursing and Midwifery. 2021. https://apps.who.int/iris/bitstream/handle/10665/344562/9789240033863-eng.pdf . Accessed 29 June 2024.

NHS Leadership Academy. The nine leadership dimensions. 2022. https://www.leadershipacademy.nhs.uk/resources/healthcare-leadership-model/nine-leadership-dimensions/ . Accessed 29 June 2024.

Canadian Nurses Association. Evidence-informed decision-making and nursing practice: Position statement. 2018. https://hl-prod-ca-oc-download.s3-ca-central-1.amazonaws.com/CNA/2f975e7e-4a40-45ca-863c-5ebf0a138d5e/UploadedImages/documents/Evidence_informed_Decision_making_and_Nursing_Practice_position_statement_Dec_2018.pdf . Accessed 29 June 2024.

Hasanpoor E, Hajebrahimi S, Janati A, Abedini Z, Haghgoshayie E. Barriers, facilitators, process and sources of evidence for evidence-based management among health care managers: a qualitative systematic review. Ethiop J Health Sci. 2018;28(5):665–80.

Collins DB, Holton EF III. The effectiveness of managerial leadership development programs: a meta-analysis of studies from 1982 to 2001. Hum Res Dev Q. 2004;15(2):217–48.

Cummings GG, Lee S, Tate K, Penconek T, Micaroni SP, Paananen T, et al. The essentials of nursing leadership: a systematic review of factors and educational interventions influencing nursing leadership. Int J Nurs Stud. 2021;115:103842.

Clavijo-Chamorro MZ, Romero-Zarallo G, Gómez-Luque A, López-Espuela F, Sanz-Martos S, López-Medina IM. Leadership as a facilitator of evidence implementation by nurse managers: a metasynthesis. West J Nurs Res. 2022;44(6):567–81.

Young SK. Evidence-based management: a literature review. J Nurs Adm Manag. 2002;10(3):145–51.

Williams LL. What goes around comes around: evidence-based management. Nurs Adm Q. 2006;30(3):243–51.

Fraser I. Organizational research with impact: working backwards. Worldviews Evidence-Based Nurs. 2004;1:S52–9.

Roshanghalb A, Lettieri E, Aloini D, Cannavacciuolo L, Gitto S, Visintin F. What evidence on evidence-based management in healthcare? Manag Decis. 2018;56(10):2069–84.

Jaana M, Vartak S, Ward MM. Evidence-based health care management: what is the research evidence available for health care managers? Eval Health Prof. 2014;37(3):314–34.

Tate K, Hewko S, McLane P, Baxter P, Perry K, Armijo-Olivo S, et al. Learning to lead: a review and synthesis of literature examining health care managers’ use of knowledge. J Health Serv Res Policy. 2019;24(1):57–70.

Geerts JM, Goodall AH, Agius S. Evidence-based leadership development for physicians: a systematic literature review. Soc Sci Med. 2020;246:112709.

Barends E, Rousseau DM, Briner RB. Evidence-based management: The basic principles. Amsterdam; 2014. https://research.vu.nl/ws/portalfiles/portal/42141986/complete+dissertation.pdf#page=203 . Accessed 29 June 2024.

Stern C, Lizarondo L, Carrier J, Godfrey C, Rieger K, Salmond S, et al. Methodological guidance for the conduct of mixed methods systematic reviews. JBI Evid Synth. 2020;18(10):2108–18. https://doi.org/10.11124/JBISRIR-D-19-00169 .

The Lancet. 2020: unleashing the full potential of nursing. Lancet. 2019. p. 1879.

Välimäki MA, Lantta T, Hipp K, Varpula J, Liu G, Tang Y, et al. Measured and perceived impacts of evidence-based leadership in nursing: a mixed-methods systematic review protocol. BMJ Open. 2021;11(10):e055356. https://doi.org/10.1136/bmjopen-2021-055356 .

The Joanna Briggs Institute. Joanna Briggs Institute reviewers’ manual: 2014 edition. Joanna Briggs Inst. 2014; 88–91.

Pearson A, White H, Bath-Hextall F, Salmond S, Apostolo J, Kirkpatrick P. A mixed-methods approach to systematic reviews. JBI Evid Implement. 2015;13(3):121–31.

Johnson RB, Onwuegbuzie AJ. Mixed methods research: a research paradigm whose time has come. Educational Researcher. 2004;33(7):14–26.

Hong QN, Pluye P, Bujold M, Wassef M. Convergent and sequential synthesis designs: implications for conducting and reporting systematic reviews of qualitative and quantitative evidence. Syst Rev. 2017;6(1):61. https://doi.org/10.1186/s13643-017-0454-2 .

Ramis MA, Chang A, Conway A, Lim D, Munday J, Nissen L. Theory-based strategies for teaching evidence-based practice to undergraduate health students: a systematic review. BMC Med Educ. 2019;19(1):1–13.

Sackett DL, Rosenberg WM, Gray JM, Haynes RB, Richardson WS. Evidence based medicine: what it is and what it isn’t. BMJ. 1996;312:71–2.

Goodman JS, Gary MS, Wood RE. Bibliographic search training for evidence-based management education: a review of relevant literatures. Acad Manage Learn Educ. 2014;13(3):322–53.

Aromataris E, Munn Z. Chapter 3: Systematic reviews of effectiveness. JBI Manual for Evidence Synthesis. 2020; https://synthesismanual.jbi.global .

Munn Z, Barker TH, Moola S, Tufanaru C, Stern C, McArthur A, et al. Methodological quality of case series studies: an introduction to the JBI critical appraisal tool. JBI Evid Synth. 2020;18(10):2127–33.

Hong Q, Pluye P, Fàbregues S, Bartlett G, Boardman F, Cargo M, et al. Mixed methods Appraisal Tool (MMAT) Version 2018: user guide. Montreal: McGill University; 2018.

McKenna J, Jeske D. Ethical leadership and decision authority effects on nurses’ engagement, exhaustion, and turnover intention. J Adv Nurs. 2021;77(1):198–206.

Maxwell M, Hibberd C, Aitchison P, Calveley E, Pratt R, Dougall N, et al. The TIDieR (template for intervention description and replication) checklist. The patient Centred Assessment Method for improving nurse-led biopsychosocial assessment of patients with long-term conditions: a feasibility RCT. NIHR Journals Library; 2018.

Braun V, Clarke V. Using thematic analysis in psychology. Qualitative Res Psychol. 2006;3(2):77–101.

Pollock A, Campbell P, Struthers C, Synnot A, Nunn J, Hill S, et al. Stakeholder involvement in systematic reviews: a scoping review. Syst Rev. 2018;7:1–26.

Braye S, Preston-Shoot M. Emerging from out of the shadows? Service user and carer involvement in systematic reviews. Evid Policy. 2005;1(2):173–93.

Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. Syst Rev. 2021;10(1):1–11.

Porta M. Pilot investigation, study. In: A dictionary of epidemiology. Oxford: Oxford University Press; 2014. p. 215.

Kreis J, Puhan MA, Schünemann HJ, Dickersin K. Consumer involvement in systematic reviews of comparative effectiveness research. Health Expect. 2013;16(4):323–37.

Joseph ML, Nelson-Brantley HV, Caramanica L, Lyman B, Frank B, Hand MW, et al. Building the science to guide nursing administration and leadership decision making. JONA: J Nurs Adm. 2022;52(1):19–26.

Gifford W, Davies BL, Graham ID, Tourangeau A, Woodend AK, Lefebre N. Developing Leadership Capacity for Guideline Use: a pilot cluster Randomized Control Trial: Leadership Pilot Study. Worldviews Evidence-Based Nurs. 2013;10(1):51–65. https://doi.org/10.1111/j.1741-6787.2012.00254.x .

Hsieh HY, Henker R, Ren D, Chien WY, Chang JP, Chen L, et al. Improving effectiveness and satisfaction of an electronic charting system in Taiwan. Clin Nurse Specialist. 2016;30(6):E1–6. https://doi.org/10.1097/NUR.0000000000000250 .

McAllen E, Stephens K, Swanson-Biearman B, Kerr K, Whiteman K. Moving Shift Report to the Bedside: an evidence-based Quality Improvement Project. OJIN: Online J Issues Nurs. 2018;23(2). https://doi.org/10.3912/OJIN.Vol23No02PPT22 .

Thomas M, Autencio K, Cesario K. Positive outcomes of an evidence-based pressure injury prevention program. J Wound Ostomy Cont Nurs. 2020;47:S24.

Cullen L, Titler MG. Promoting evidence-based practice: an internship for Staff nurses. Worldviews Evidence-Based Nurs. 2004;1(4):215–23. https://doi.org/10.1111/j.1524-475X.2004.04027.x .

DuBose BM, Mayo AM. Resistance to change: a concept analysis. Nurs Forum. 2020. pp. 631–6.

Lalleman PCB, Smid GAC, Lagerwey MD, Shortridge-Baggett LM, Schuurmans MJ. Curbing the urge to care: a bourdieusian analysis of the effect of the caring disposition on nurse middle managers’ clinical leadership in patient safety practices. Int J Nurs Stud. 2016;63:179–88.

Martin E, Warshawsky N. Guiding principles for creating value and meaning for the next generation of nurse leaders. JONA: J Nurs Adm. 2017;47(9):418–20.

Griffiths P, Recio-Saucedo A, Dall’Ora C, Briggs J, Maruotti A, Meredith P, et al. The association between nurse staffing and omissions in nursing care: a systematic review. J Adv Nurs. 2018;74(7):1474–87. https://doi.org/10.1111/jan.13564 .

Lúanaigh PÓ, Hughes F. The nurse executive role in quality and high performing health services. J Nurs Adm Manag. 2016;24(1):132–6.

de Kok E, Weggelaar-Jansen AM, Schoonhoven L, Lalleman P. A scoping review of rebel nurse leadership: descriptions, competences and stimulating/hindering factors. J Clin Nurs. 2021;30(17–18):2563–83.

Warshawsky NE. Building nurse manager well-being by reducing healthcare system demands. JONA: J Nurs Adm. 2022;52(4):189–91.

Paez A. Gray literature: an important resource in systematic reviews. J Evidence-Based Med. 2017;10(3):233–40.

McAuley L, Tugwell P, Moher D. Does the inclusion of grey literature influence estimates of intervention effectiveness reported in meta-analyses? Lancet. 2000;356(9237):1228–31.

Sarah S. Introduction to mixed methods systematic reviews. https://jbi-global-wiki.refined.site/space/MANUAL/4689215/8.1+Introduction+to+mixed+methods+systematic+reviews . Accessed 29 June 2024.

Whittemore R, Knafl K. The integrative review: updated methodology. J Adv Nurs. 2005;52(5):546–53.

Acknowledgements

We want to thank the funding bodies, the Finnish National Agency of Education, Asia Programme, the Department of Nursing Science at the University of Turku, and Xiangya School of Nursing at the Central South University. We also would like to thank the nurses and nurse leaders for their valuable opinions on the topic.

Funding

The work was supported by the Finnish National Agency of Education, Asia Programme (grant number 26/270/2020) and the University of Turku (internal fund 26003424). The funders had no role in the study design and will not have any role during its execution, analysis, interpretation of the data, decision to publish, or preparation of the manuscript.

Author information

Authors and Affiliations

Department of Nursing Science, University of Turku, Turku, FI-20014, Finland

Maritta Välimäki, Tella Lantta, Kirsi Hipp & Jaakko Varpula

School of Public Health, University of Helsinki, Helsinki, FI-00014, Finland

Maritta Välimäki

Xiangya Nursing, School of Central South University, Changsha, 410013, China

Shuang Hu, Jiarui Chen, Yao Tang, Wenjun Chen & Xianhong Li

School of Health and Social Services, Häme University of Applied Sciences, Hämeenlinna, Finland

Hunan Cancer Hospital, Changsha, 410008, China

Gaoming Liu

Contributions

Study design: MV, XL. Literature search and study selection: MV, KH, TL, WC, XL. Quality assessment: YT, SH, XL. Data extraction: JC, MV, JV, WC, YT, SH, GL. Analysis and interpretation: MV, SH. Manuscript writing: MV. Critical revisions for important intellectual content: MV, XL. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Xianhong Li .

Ethics declarations

Ethics approval and consent to participate

No ethical approval was required for this study.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Differences from the original protocol

We modified the criteria for the included studies: we included published conference abstracts/proceedings, which form a relatively broad part of the scientific knowledge base. We originally planned to conduct a survey with open-ended questions followed by a face-to-face meeting to discuss the preliminary results of the review. However, to avoid placing an extra burden on nurses during COVID-19, we limited the validation process to the online discussion only.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Supplementary Material 2

Supplementary Material 3

Supplementary Material 4

Supplementary Material 5

Supplementary Material 6

Supplementary Material 7

Supplementary Material 8

Supplementary Material 9

Supplementary Material 10

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Reprints and permissions

About this article

Cite this article

Välimäki, M., Hu, S., Lantta, T. et al. The impact of evidence-based nursing leadership in healthcare settings: a mixed methods systematic review. BMC Nurs 23 , 452 (2024). https://doi.org/10.1186/s12912-024-02096-4

Download citation

Received : 28 April 2023

Accepted : 13 June 2024

Published : 03 July 2024

DOI : https://doi.org/10.1186/s12912-024-02096-4


Keywords

  • Evidence-based leadership
  • Health services administration
  • Organizational development
  • Quality in healthcare


10.1 Overview of Single-Subject Research

Learning Objectives

  • Explain what single-subject research is, including how it differs from other types of psychological research.
  • Explain what case studies are, including some of their strengths and weaknesses.
  • Explain who uses single-subject research and why.

What Is Single-Subject Research?

Single-subject research is a type of quantitative research that involves studying in detail the behavior of each of a small number of participants. Note that the term single-subject does not mean that only one participant is studied; it is more typical for there to be somewhere between two and 10 participants. (This is why single-subject research designs are sometimes called small- n designs, where n is the statistical symbol for the sample size.) Single-subject research can be contrasted with group research , which typically involves studying large numbers of participants and examining their behavior primarily in terms of group means, standard deviations, and so on. The majority of this book is devoted to understanding group research, which is the most common approach in psychology. But single-subject research is an important alternative, and it is the primary approach in some areas of psychology.

Before continuing, it is important to distinguish single-subject research from two other approaches, both of which involve studying in detail a small number of participants. One is qualitative research, which focuses on understanding people’s subjective experience by collecting relatively unstructured data (e.g., detailed interviews) and analyzing those data using narrative rather than quantitative techniques. Single-subject research, in contrast, focuses on understanding objective behavior through experimental manipulation and control, collecting highly structured data, and analyzing those data quantitatively.

It is also important to distinguish single-subject research from case studies. A case study is a detailed description of an individual, which can include both qualitative and quantitative analyses. (Case studies that include only qualitative analyses can be considered a type of qualitative research.) The history of psychology is filled with influential case studies, such as Sigmund Freud’s description of “Anna O.” (see Note 10.5 “The Case of “Anna O.”” ) and John Watson and Rosalie Rayner’s description of Little Albert (Watson & Rayner, 1920), who learned to fear a white rat—along with other furry objects—when the researchers made a loud noise while he was playing with the rat. Case studies can be useful for suggesting new research questions and for illustrating general principles. They can also help researchers understand rare phenomena, such as the effects of damage to a specific part of the human brain. As a general rule, however, case studies cannot substitute for carefully designed group or single-subject research studies. One reason is that case studies usually do not allow researchers to determine whether specific events are causally related, or even related at all. For example, if a patient is described in a case study as having been sexually abused as a child and then as having developed an eating disorder as a teenager, there is no way to determine whether these two events had anything to do with each other. A second reason is that an individual case can always be unusual in some way and therefore be unrepresentative of people more generally. Thus case studies have serious problems with both internal and external validity.

The Case of “Anna O.”

Sigmund Freud used the case of a young woman he called “Anna O.” to illustrate many principles of his theory of psychoanalysis (Freud, 1961). (Her real name was Bertha Pappenheim, and she was an early feminist who went on to make important contributions to the field of social work.) Anna had come to Freud’s colleague Josef Breuer around 1880 with a variety of odd physical and psychological symptoms. One of them was that for several weeks she was unable to drink any fluids. According to Freud,

She would take up the glass of water that she longed for, but as soon as it touched her lips she would push it away like someone suffering from hydrophobia.…She lived only on fruit, such as melons, etc., so as to lessen her tormenting thirst (p. 9).

But according to Freud, a breakthrough came one day while Anna was under hypnosis.

[S]he grumbled about her English “lady-companion,” whom she did not care for, and went on to describe, with every sign of disgust, how she had once gone into this lady’s room and how her little dog—horrid creature!—had drunk out of a glass there. The patient had said nothing, as she had wanted to be polite. After giving further energetic expression to the anger she had held back, she asked for something to drink, drank a large quantity of water without any difficulty, and awoke from her hypnosis with the glass at her lips; and thereupon the disturbance vanished, never to return.

Freud’s interpretation was that Anna had repressed the memory of this incident along with the emotion that it triggered and that this was what had caused her inability to drink. Furthermore, her recollection of the incident, along with her expression of the emotion she had repressed, caused the symptom to go away.

As an illustration of Freud’s theory, the case study of Anna O. is quite effective. As evidence for the theory, however, it is essentially worthless. The description provides no way of knowing whether Anna had really repressed the memory of the dog drinking from the glass, whether this repression had caused her inability to drink, or whether recalling this “trauma” relieved the symptom. It is also unclear from this case study how typical or atypical Anna’s experience was.

Figure 10.2. “Anna O.” was the subject of a famous case study used by Freud to illustrate the principles of psychoanalysis. (Wikimedia Commons – public domain.)

Assumptions of Single-Subject Research

Again, single-subject research involves studying a small number of participants and focusing intensively on the behavior of each one. But why take this approach instead of the group approach? There are several important assumptions underlying single-subject research, and it will help to consider them now.

First and foremost is the assumption that it is important to focus intensively on the behavior of individual participants. One reason for this is that group research can hide individual differences and generate results that do not represent the behavior of any individual. For example, a treatment that has a positive effect for half the people exposed to it but a negative effect for the other half would, on average, appear to have no effect at all. Single-subject research, however, would likely reveal these individual differences. A second reason to focus intensively on individuals is that sometimes it is the behavior of a particular individual that is primarily of interest. A school psychologist, for example, might be interested in changing the behavior of a particular disruptive student. Although previous published research (both single-subject and group research) is likely to provide some guidance on how to do this, conducting a study on this student would be more direct and probably more effective.
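To make the averaging problem above concrete, here is a minimal sketch in Python; the ten participants and their change scores are hypothetical numbers invented for illustration, not data from any study discussed here.

```python
# Hypothetical change scores (post-treatment minus pre-treatment) for ten
# participants: the treatment helps five of them and harms the other five.
from statistics import mean

changes = [+20, +18, +22, +19, +21,   # improved with treatment
           -20, -18, -22, -19, -21]   # worsened with treatment

# A group analysis that looks only at the mean sees "no effect."
print(f"Group mean change: {mean(changes):+.1f}")

# A single-subject perspective inspects each individual and reveals
# two opposite, equally strong responses to the same treatment.
for i, change in enumerate(changes, start=1):
    print(f"Participant {i:2d}: {change:+d}")
```

The group mean here is exactly zero even though every individual shows a large effect, which is precisely the pattern that group averages can hide.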

A second assumption of single-subject research is that it is important to discover causal relationships through the manipulation of an independent variable, the careful measurement of a dependent variable, and the control of extraneous variables. For this reason, single-subject research is often considered a type of experimental research with good internal validity. Recall, for example, that Hall and his colleagues measured their dependent variable (studying) many times—first under a no-treatment control condition, then under a treatment condition (positive teacher attention), and then again under the control condition. Because there was a clear increase in studying when the treatment was introduced, a decrease when it was removed, and an increase when it was reintroduced, there is little doubt that the treatment was the cause of the improvement.
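The reversal logic described above can also be illustrated with a small sketch. The phase structure below mirrors the baseline–treatment–withdrawal–reinstatement pattern of the Hall et al. study, but the session percentages are hypothetical values invented for illustration, not the study's actual data.

```python
# Hypothetical ABAB (reversal) design: percentage of observation intervals
# in which one student was studying, recorded across daily sessions.
from statistics import mean

phases = {
    "A1 baseline":             [25, 30, 28, 22, 27],
    "B1 teacher attention":    [60, 68, 72, 75, 70],
    "A2 treatment withdrawn":  [35, 30, 28, 32, 29],
    "B2 treatment reinstated": [70, 74, 78, 76, 80],
}

# Single-subject data are judged mainly by visual inspection of the
# session-by-session record, but phase means summarize the change in level
# each time the treatment is introduced, removed, and reintroduced.
for phase, sessions in phases.items():
    print(f"{phase}: sessions = {sessions}, mean = {mean(sessions):.1f}%")
```

Studying rises when attention is introduced, falls when it is withdrawn, and rises again when it is reinstated; it is this repeated covariation within a single participant that licenses the causal conclusion.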

A third assumption of single-subject research is that it is important to study strong and consistent effects that have biological or social importance. Applied researchers, in particular, are interested in treatments that have substantial effects on important behaviors and that can be implemented reliably in the real-world contexts in which they occur. This is sometimes referred to as social validity (Wolf, 1976). The study by Hall and his colleagues, for example, had good social validity because it showed strong and consistent effects of positive teacher attention on a behavior that is of obvious importance to teachers, parents, and students. Furthermore, the teachers found the treatment easy to implement, even in their often chaotic elementary school classrooms.

Who Uses Single-Subject Research?

Single-subject research has been around as long as the field of psychology itself. In the late 1800s, one of psychology’s founders, Wilhelm Wundt, studied sensation and consciousness by focusing intensively on each of a small number of research participants. Hermann Ebbinghaus’s research on memory and Ivan Pavlov’s research on classical conditioning are other early examples, both of which are still described in almost every introductory psychology textbook.

In the middle of the 20th century, B. F. Skinner clarified many of the assumptions underlying single-subject research and refined many of its techniques (Skinner, 1938). He and other researchers then used it to describe how rewards, punishments, and other external factors affect behavior over time. This work was carried out primarily using nonhuman subjects—mostly rats and pigeons. This approach, which Skinner called the experimental analysis of behavior, remains an important subfield of psychology and continues to rely almost exclusively on single-subject research. For excellent examples of this work, look at any issue of the Journal of the Experimental Analysis of Behavior. By the 1960s, many researchers were interested in using this approach to conduct applied research primarily with humans—a subfield now called applied behavior analysis (Baer, Wolf, & Risley, 1968). Applied behavior analysis plays an especially important role in contemporary research on developmental disabilities, education, organizational behavior, and health, among many other areas. Excellent examples of this work (including the study by Hall and his colleagues) can be found in the Journal of Applied Behavior Analysis.

Although most contemporary single-subject research is conducted from the behavioral perspective, it can in principle be used to address questions framed in terms of any theoretical perspective. For example, a studying technique based on cognitive principles of learning and memory could be evaluated by testing it on individual high school students using the single-subject approach. The single-subject approach can also be used by clinicians who take any theoretical perspective—behavioral, cognitive, psychodynamic, or humanistic—to study processes of therapeutic change with individual clients and to document their clients’ improvement (Kazdin, 1982).

Key Takeaways

  • Single-subject research—which involves testing a small number of participants and focusing intensively on the behavior of each individual—is an important alternative to group research in psychology.
  • Single-subject studies must be distinguished from case studies, in which an individual case is described in detail. Case studies can be useful for generating new research questions, for studying rare phenomena, and for illustrating general principles. However, they cannot substitute for carefully controlled experimental or correlational studies because they are low in internal and external validity.
  • Single-subject research has been around since the beginning of the field of psychology. Today it is most strongly associated with the behavioral theoretical perspective, but it can in principle be used to study behavior from any perspective.

Practice: Find and read a published article in psychology that reports new single-subject research. (A good source of articles published in the Journal of Applied Behavior Analysis can be found at http://seab.envmed.rochester.edu/jaba/jabaMostPop-2011.html .) Write a short summary of the study.

Practice: Find and read a published case study in psychology. (Use case study as a key term in a PsycINFO search.) Then do the following:

  • Describe one problem related to internal validity.
  • Describe one problem related to external validity.
  • Generate one hypothesis suggested by the case study that might be interesting to test in a systematic single-subject or group study.

Baer, D. M., Wolf, M. M., & Risley, T. R. (1968). Some current dimensions of applied behavior analysis. Journal of Applied Behavior Analysis, 1, 91–97.

Freud, S. (1961). Five lectures on psycho-analysis. New York, NY: Norton.

Kazdin, A. E. (1982). Single-case research designs: Methods for clinical and applied settings. New York, NY: Oxford University Press.

Skinner, B. F. (1938). The behavior of organisms: An experimental analysis. New York, NY: Appleton-Century-Crofts.

Watson, J. B., & Rayner, R. (1920). Conditioned emotional reactions. Journal of Experimental Psychology, 3, 1–14.

Wolf, M. M. (1978). Social validity: The case for subjective measurement or how applied behavior analysis is finding its heart. Journal of Applied Behavior Analysis, 11, 203–214.

Research Methods in Psychology Copyright © 2016 by University of Minnesota is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.
