Commit c341a62

add references and glossary terms
1 parent dd16414 commit c341a62

3 files changed

+30
-15
lines changed

docs/source/design_notation.md

Lines changed: 18 additions & 13 deletions
@@ -1,18 +1,18 @@
11
# Quasi-experimental design notation
22

3-
This page provides a concise summary of the tabular notation used by Shadish Cook & Campbell (2002) and Reichardt (2009). This notation provides a compact description of various experimental designs. While it is possible to describe randomised designs using this notation, we focus purely on quasi-experimental designs here, with non-random allocation (abbreviated as `NR`). Observations are denoted by $O$. Time proceeds from left to right, so observations made through time are labelled as $O_1$, $O_2$, etc. The treatment is denoted by `X`. Rows represent different groups of units. Remember, a unit is a person, place, or thing that is the subject of the study.
3+
This page provides a concise summary of the tabular notation used by {cite:t}`shadish_cook_cambell_2002` and {cite:t}`reichardt2019quasi`. This notation provides a compact description of various experimental designs. While it is possible to describe randomised designs using this notation, we focus purely on {term}`quasi-experimental<Quasi-experiment>` designs here, with non-random allocation (abbreviated as `NR`). Observations are denoted by $O$. Time proceeds from left to right, so observations made through time are labelled as $O_1$, $O_2$, etc. The treatment is denoted by `X`. Rows represent different groups of units. Remember, a unit is a person, place, or thing that is the subject of the study.
44

55
## Pretest-posttest designs
66

7-
One of the simplest designs is the pretest-posttest design. Here we have one row, denoting a single group of units. There is an `X` which means all are treated. The pretest is denoted by $O_1$ and the posttest by $O_2$. See p99 of Reichardt (2019).
7+
One of the simplest designs is the pretest-posttest design. Here we have one row, denoting a single group of units. There is an `X` which means all are treated. The pretest is denoted by $O_1$ and the posttest by $O_2$. See p99 of {cite:t}`reichardt2019quasi`.
88

99
| | | |
1010
|----|---|----|
1111
$O_1$ | X | $O_2$ |
1212

13-
Informally, if we think about drawing conclusions about the causal impact of the treatment based on the change from $O_1$ to $O_2$, we might say that the treatment caused the change. However, this is a tenuous conclusion because we have no way of knowing what would have happened in the absence of the treatment.
13+
Informally, if we think about drawing conclusions about the {term}`causal impact` of the treatment based on the change from $O_1$ to $O_2$, we might say that the treatment caused the change. However, this is a tenuous conclusion because we have no way of knowing what would have happened in the absence of the treatment.
1414

15-
A variation of this design which may (slightly) improve this situation from the perspective of making causal claims, would be to take multiple pretest measures. This is shown below, see p107 of Reichardt (2019).
15+
A variation of this design which may (slightly) improve this situation from the perspective of making causal claims, would be to take multiple pretest measures. This is shown below, see p107 of {cite:t}`reichardt2019quasi`.
1616

1717
| | | | |
1818
|----|--|---|----|
@@ -24,7 +24,7 @@ This would allow us to estimate how the group was changing over time before the
2424

2525
In randomized experiments, with large enough groups, the randomization process should ensure that the treatment and control groups are equivalent. However, in quasi-experimental designs, with non-random (`NR`) allocation, we could expect there to be differences between the treatment and control groups. This poses some challenges in making strong causal claims about the impact of the treatment.
2626

27-
For example, in the simplest nonequivalent group design, we have two groups, one treated and one not treated, and just one posttest. See p114 of Reichardt (2019).
27+
For example, in the simplest {term}`nonequivalent group design<NEGD>`, we have two groups, one treated and one not treated, and just one posttest. See p114 of {cite:t}`reichardt2019quasi`.
2828

2929
| | | |
3030
|-----|---|----|
@@ -33,18 +33,18 @@ For example, in the simplest nonequivalent group design, we have two groups, one
3333

3434
The above design would be considered weak - the lack of a pre-test measure makes it hard to know whether differences between the groups at $O_1$ are due to the treatment or to pre-existing differences between the groups.
3535

36-
This limitation can be addressed by adding a pretest measure. See p115 of Reichardt (2019).
36+
This limitation can be addressed by adding a pretest measure. See p115 of {cite:t}`reichardt2019quasi`.
3737

3838
| | | | |
3939
|-----|----|---|----|
4040
| NR: | $O_1$ | X | $O_2$ |
4141
| NR: | $O_1$ | | $O_2$ |
4242

4343
Non-equivalent group designs like this, with a pretest and a posttest measure could be analysed in a number of ways:
44-
1. **ANCOVA:** Here, the group would be a categorical predictor, the pretest measure would be a covariate, and the posttest measure would be the outcome.
45-
2. **Difference-in-differences:** We can apply linear modeling approaches such as `y ~ group + time + group:time` to estimate the treatment effect. Here, `y` is the outcome measure, `group` is a binary variable indicating treatment or control group, and `time` is a binary variable indicating pretest or posttest.
44+
1. **{term}`ANCOVA`:** Here, the group would be a categorical predictor, the pretest measure would be a covariate, and the posttest measure would be the outcome.
45+
2. **{term}`Difference in differences`:** We can apply linear modeling approaches such as `y ~ group + time + group:time` to estimate the treatment effect. Here, `y` is the outcome measure, `group` is a binary variable indicating treatment or control group, and `time` is a binary variable indicating pretest or posttest.
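The difference-in-differences item above can be illustrated with a minimal sketch. It is not taken from this commit or from any particular library; the function name `did_estimate` and the toy data are hypothetical. It relies on the fact that, when `group` and `time` are both binary, the `group:time` coefficient in `y ~ group + time + group:time` equals the change in the treated group's mean minus the change in the control group's mean.

```python
from statistics import mean


def did_estimate(y, group, time):
    """Difference-in-differences estimate from a 2x2 (group x time) layout.

    y     : outcome values
    group : 1 for treated units, 0 for control units
    time  : 1 for posttest, 0 for pretest
    """
    def cell_mean(g, t):
        # Mean outcome for one group at one time point.
        return mean(v for v, gi, ti in zip(y, group, time) if gi == g and ti == t)

    treated_change = cell_mean(1, 1) - cell_mean(1, 0)
    control_change = cell_mean(0, 1) - cell_mean(0, 0)
    # Equivalent to the group:time coefficient of the saturated linear model.
    return treated_change - control_change


# Hypothetical toy data: two observations per group/time cell.
y = [10.0, 10.0, 12.0, 12.0, 11.0, 11.0, 16.0, 16.0]
group = [0, 0, 0, 0, 1, 1, 1, 1]
time = [0, 0, 1, 1, 0, 0, 1, 1]

effect = did_estimate(y, group, time)
```

Here the control group changes by 2 and the treated group by 5, so the estimate is the gap between those two changes. A regression fit would additionally give uncertainty estimates, which this sketch does not attempt.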
4646

47-
A limitation of the nonequivalent group designs with single pre and posttest measures is that we don't know how the groups were changing over time before the treatment was introduced. This can be addressed by adding multiple pretest measures. See p154 of Reichardt (2019).
47+
A limitation of the nonequivalent group designs with single pre and posttest measures is that we don't know how the groups were changing over time before the treatment was introduced. This can be addressed by adding multiple pretest measures. See p154 of {cite:t}`reichardt2019quasi`.
4848

4949
| | | | | |
5050
|-----|----|---|-|----|
@@ -55,15 +55,15 @@ Again, this design could be analysed using the difference-in-differences approac
5555

5656
## Interrupted time series designs
5757

58-
While there is no control group, the interrupted time series design is a powerful quasi-experimental design that can be used to estimate the causal impact of a treatment. The design involves multiple pretest and posttest measures. The treatment is introduced at a specific point in time, denoted by `X`. The design can be used to estimate the causal impact of the treatment by comparing the trajectory of the outcome variable before and after the treatment. See p203 of Reichardt (2019).
58+
While there is no control group, the {term}`interrupted time series design` is a powerful quasi-experimental design that can be used to estimate the causal impact of a treatment. The design involves multiple pretest and posttest measures. The treatment is introduced at a specific point in time, denoted by `X`. The design can be used to estimate the causal impact of the treatment by comparing the trajectory of the outcome variable before and after the treatment. See p203 of {cite:t}`reichardt2019quasi`.
5959

6060
| | | | | | | | | |
6161
|-----|----|---|----|---|----|----|----|----|
6262
| $O_1$ | $O_2$ | $O_3$ | $O_4$ | X | $O_5$ | $O_6$ | $O_7$ | $O_8$ |
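As a rough illustration of "comparing the trajectory before and after the treatment", the sketch below fits a straight line to the pretest observations, projects it forward, and subtracts the projection from each posttest observation. This is a deliberately naive assumption (a linear pre-treatment trend, no seasonality, no uncertainty), and the function name `its_effects` and the data are hypothetical rather than part of this commit.

```python
def its_effects(pre, post):
    """Observed minus projected outcome for each posttest observation.

    pre  : pretest observations O_1..O_k, equally spaced in time
    post : posttest observations O_{k+1}.., continuing the same spacing
    """
    n = len(pre)
    t = list(range(1, n + 1))
    t_bar = sum(t) / n
    y_bar = sum(pre) / n
    # Ordinary least squares slope and intercept for the pretest trend.
    slope = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, pre)) / sum(
        (ti - t_bar) ** 2 for ti in t
    )
    intercept = y_bar - slope * t_bar
    # Project the pretest trend forward as a stand-in for the untreated path.
    return [obs - (intercept + slope * (n + 1 + i)) for i, obs in enumerate(post)]


# Hypothetical data matching the table layout: four pretest, four posttest points.
effects = its_effects(pre=[1.0, 2.0, 3.0, 4.0], post=[8.0, 9.0, 10.0, 11.0])
```

With these numbers the pretest trend rises by 1 per period, so the projected posttest values are 5 to 8 and each observed value sits 3 above its projection.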
6363

6464
## Comparative interrupted time series designs
6565

66-
The comparative interrupted time series design incorporates aspects of **interrupted time series** (with only a treatment group), and **nonequivalent group designs** (with a treatment and control group). This design can be used to estimate the causal impact of a treatment by comparing the trajectory of the outcome variable before and after the treatment in the treatment group, and comparing this to the trajectory of the outcome variable in the control group. See p226 of Reichardt (2019).
66+
The {term}`comparative interrupted time-series<CITS>` design incorporates aspects of **interrupted time series** (with only a treatment group), and **nonequivalent group designs** (with a treatment and control group). This design can be used to estimate the causal impact of a treatment by comparing the trajectory of the outcome variable before and after the treatment in the treatment group, and comparing this to the trajectory of the outcome variable in the control group. See p226 of {cite:t}`reichardt2019quasi`.
6767

6868
| | | | | | | | | | |
6969
|-----|----|---|----|---|----|----|----|----|-|
@@ -73,11 +73,11 @@ The comparative interrupted time series design incorporates aspects of **interru
7373

7474
Because this design is very similar to the nonequivalent group design, simply with multiple pre and post test measures, it is well-suited to analysis under the difference-in-differences approach.
7575

76-
However, if we have many untreated units and one treated unit, then this design could be analysed with the synthetic control approach.
76+
However, if we have many untreated units and one treated unit, then this design could be analysed with the {term}`synthetic control` approach.
7777

7878
## Regression discontinuity designs
7979

80-
The design notation for regression discontinuity designs are different from the others and take a bit of getting used to. We have two groups, but allocation to the groups are determined by a units' relation to a cutoff point `C` along a running variable. Also, $O_1$ now represents the value of the running variable, and $O_2$ represents the outcome variable. See p169 of Reichardt (2019). This will make more sense if you consider the design notation alongside one of the example notebooks.
80+
The design notation for {term}`regression discontinuity designs<RDD>` are different from the others and take a bit of getting used to. We have two groups, but allocation to the groups are determined by a units' relation to a cutoff point `C` along a running variable. Also, $O_1$ now represents the value of the running variable, and $O_2$ represents the outcome variable. See p169 of {cite:t}`reichardt2019quasi`. This will make more sense if you consider the design notation alongside one of the example notebooks.
8181

8282
| | | | |
8383
|-----|----|---|----|
@@ -88,3 +88,8 @@ From an analysis perspective, regression discontinuity designs are very similar
8888

8989
## Summary
9090
This page has offered a brief overview of the tabular notation used to describe quasi-experimental designs. The notation is a useful tool for summarizing the design of a study, and can be used to help identify the strengths and limitations of a study design. But readers are strongly encouraged to consult the original sources when assessing the relative strengths and limitations of making causal claims under different quasi-experimental designs.
91+
92+
## References
93+
:::{bibliography}
94+
:filter: docname in docnames
95+
:::

docs/source/glossary.rst

Lines changed: 4 additions & 2 deletions
@@ -18,6 +18,9 @@ Glossary
1818
Change score analysis
1919
A statistical procedure where the outcome variable is the difference between the posttest and protest scores.
2020

21+
Causal impact
22+
An umbrella term for the estimated effect of a treatment on an outcome.
23+
2124
Comparative interrupted time-series
2225
CITS
2326
An interrupted time series design with added comparison time series observations.
@@ -36,7 +39,6 @@ Glossary
3639
ITS
3740
A quasi-experimental design to estimate a treatment effect where a series of observations are collected before and after a treatment. No control group is present.
3841

39-
4042
Instrumental Variable regression
4143
IV
4244
A quasi-experimental design to estimate a treatment effect where the is a risk of confounding between the treatment and the outcome due to endogeniety.
@@ -67,6 +69,7 @@ Glossary
6769
An emprical comparison used to estimate the effects of treatments where units are assigned to treatment conditions randomly.
6870

6971
Regression discontinuity design
72+
RDD
7073
A quasi–experimental comparison to estimate a treatment effect where units are assigned to treatment conditions based on a cut-off score on a quantitative assignment variable (aka running variable).
7174

7275
Regression kink design
@@ -88,7 +91,6 @@ Glossary
8891
Wilkinson notation
8992
A notation for describing statistical models :footcite:p:`wilkinson1973symbolic`.
9093

91-
9294
Two Stage Least Squares
9395
2SLS
9496
An estimation technique for estimating the parameters of an IV regression. It takes its name from the fact that it uses two OLS regressions - a first and second stage.

docs/source/references.bib

Lines changed: 8 additions & 0 deletions
@@ -68,3 +68,11 @@ @article{acemoglu2001colonial
6868
pages={1369--1401},
6969
year={2001}
7070
}
71+
72+
@book{shadish_cook_cambell_2002,
73+
title={Experimental and quasi-experimental designs for generalized causal inference},
74+
author={Cook, Thomas D and Campbell, Donald Thomas and Shadish, William},
75+
volume={1195},
76+
year={2002},
77+
publisher={Houghton Mifflin Boston, MA}
78+
}
