Add propensity weighting schemes and covariate balance plot functionality #311

Merged: 36 commits, May 6, 2024
Commits
06a0d25
adding classes for propensity models
NathanielF Mar 16, 2024
994799b
updated notebook and site map
NathanielF Mar 17, 2024
60dc6e2
update data validation
NathanielF Mar 17, 2024
3385c57
run pre-commit
NathanielF Mar 17, 2024
790a4a4
remove unused var
NathanielF Mar 17, 2024
704c698
updating notebook
NathanielF Mar 17, 2024
e9212f1
change imports
NathanielF Mar 17, 2024
04cae44
trying linter
NathanielF Mar 17, 2024
7cfed44
fixing lint
NathanielF Mar 17, 2024
60e5ee4
update interrogate badge
NathanielF Mar 17, 2024
6323c11
fixing doctest
NathanielF Mar 17, 2024
5dbc15e
add test coverage around weighting
NathanielF Mar 17, 2024
fcc2030
linting
NathanielF Mar 17, 2024
b5f3a26
fix linting
NathanielF Mar 17, 2024
eae2252
adding more test coverage
NathanielF Mar 17, 2024
0227b4a
test plot functions
NathanielF Mar 17, 2024
3c1ba4e
add sample constraint to plot in test
NathanielF Mar 17, 2024
0842735
upping sample for test
NathanielF Mar 17, 2024
50240a0
make nicer overlap plots
NathanielF Mar 22, 2024
c5cb35e
explain purple in overlap plots
NathanielF Mar 22, 2024
37c022c
remove duplicate plotting
NathanielF Mar 22, 2024
76d8d2a
improved color scheme
NathanielF Mar 22, 2024
80c07d8
fixing isort
NathanielF Mar 22, 2024
9f6a025
align colorscheme with causalpy blue and yellow
NathanielF Mar 23, 2024
a38e50a
update with Ben and Alex's feedback
NathanielF Apr 14, 2024
298fe78
Merge branch 'main' into inv_propensity_model
NathanielF Apr 14, 2024
35b6f6c
fixing lint errors
NathanielF Apr 14, 2024
d2532e4
remove trailing whitespace references
NathanielF Apr 14, 2024
3fb81fc
fix docs title underline
NathanielF Apr 14, 2024
e366ba5
Merge branch 'main' into inv_propensity_model
NathanielF May 2, 2024
29b8a0a
update with Ben's requests
NathanielF May 2, 2024
9fc57a4
fix interrogate
NathanielF May 2, 2024
a863252
fix typo add full stops
NathanielF May 3, 2024
e49b005
Merge branch 'main' into inv_propensity_model
NathanielF May 6, 2024
3ce8fdd
add readme add blanklines
NathanielF May 6, 2024
0a54532
remove trailing whitespace README.md
NathanielF May 6, 2024
11 changes: 11 additions & 0 deletions README.md
@@ -188,6 +188,17 @@ Instrumental Variable regression is an appropriate technique when you wish to es

![](https://raw.githubusercontent.com/pymc-labs/CausalPy/main/docs/source/_static/iv_reg1.png)


### Inverse Propensity Score Weighting

Propensity scores are often used to address the risk of bias or confounding that selection into the treatment condition
introduces in an observational study. Propensity scores can be used in a number of ways, but here we demonstrate
their use within corrective weighting schemes that aim to recover as-if random allocation of subjects to the treatment condition.
The technique "up-weights" or "down-weights" individual observations to better estimate a causal estimand such as the average treatment
effect.

![](https://raw.githubusercontent.com/pymc-labs/CausalPy/main/docs/source/_static/propensity_weight.png)
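
To make the up-weighting and down-weighting concrete, here is a minimal NumPy sketch of one common weighting scheme, the normalised (Hajek) inverse propensity estimator of the average treatment effect. It is an illustration on simulated data, not the CausalPy implementation added in this PR: the function name `ipw_ate`, the simulated variables, and the use of the true propensity scores (rather than scores estimated by a model) are all assumptions made for the example.

```python
import numpy as np


def ipw_ate(y, t, ps):
    """Normalised (Hajek) inverse-propensity-weighted estimate of the ATE.

    Hypothetical helper for illustration only; not part of CausalPy.
    y: outcomes, t: 0/1 treatment indicator, ps: propensity scores P(t=1 | x).
    """
    y, t, ps = (np.asarray(a, dtype=float) for a in (y, t, ps))
    # Treated units with a low propensity score are up-weighted.
    w_treated = t / ps
    # Control units with a high propensity score are up-weighted.
    w_control = (1 - t) / (1 - ps)
    mu1 = np.sum(w_treated * y) / np.sum(w_treated)
    mu0 = np.sum(w_control * y) / np.sum(w_control)
    return mu1 - mu0


# Simulated observational study: x confounds both treatment and outcome.
rng = np.random.default_rng(0)
n = 20_000
x = rng.normal(size=n)
ps_true = 1 / (1 + np.exp(-x))        # selection into treatment depends on x
t = rng.binomial(1, ps_true)
y = 2.0 * t + x + rng.normal(size=n)  # the true treatment effect is 2.0

# Naive difference in means is biased by the confounder.
naive = y[t == 1].mean() - y[t == 0].mean()
# Weighting by the inverse propensity removes most of that bias.
weighted = ipw_ate(y, t, ps_true)
print(f"naive: {naive:.2f}, weighted: {weighted:.2f}")
```

In this simulation the naive difference in means overstates the effect because treated units tend to have larger `x`, while the weighted estimate should land close to the true value of 2.0. In practice the propensity scores are unknown and must be estimated, which is the part a propensity model handles.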

## Learning resources

Here are some general resources about causal inference:
1 change: 1 addition & 0 deletions causalpy/data/datasets.py
@@ -34,6 +34,7 @@
    "anova1": {"filename": "ancova_generated.csv"},
    "geolift1": {"filename": "geolift1.csv"},
    "risk": {"filename": "AJR2001.csv"},
    "nhefs": {"filename": "nhefs.csv"},
}


1,567 changes: 1,567 additions & 0 deletions causalpy/data/nhefs.csv

Large diffs are not rendered by default.

26 changes: 26 additions & 0 deletions causalpy/data_validation.py
@@ -146,3 +146,29 @@ def _input_validation(self):
                the assumption of a simple IV experiment.
                The coefficients should be interpreted appropriately."""
            )


class PropensityDataValidator:
    """Mixin class for validating the input data and model formula for Propensity Weighting experiments."""

    def _input_validation(self):
        """Validate the input data and model formula for correctness"""
        treatment = self.formula.split("~")[0]
        test = treatment.strip() in self.data.columns
        test = test & (self.outcome_variable in self.data.columns)
        if not test:
            raise DataException(
                f"""
                The treatment variable:
                {treatment} must appear in the data to be used
                as an outcome variable. And {self.outcome_variable}
                must also be available in the data to be re-weighted
                """
            )
        T = self.data[treatment.strip()]
        check_binary = len(np.unique(T)) > 2
        if check_binary:
            raise DataException(
                """Warning. The treatment variable is not 0-1 Binary.
                """
            )
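
The checks in this mixin can be exercised outside the class. The following is a rough standalone sketch of the same logic, assuming a pandas DataFrame as input. The function name `validate_propensity_inputs`, the example column names, and the use of `ValueError` in place of the project's `DataException` are hypothetical choices made so that the snippet runs on its own.

```python
import numpy as np
import pandas as pd


def validate_propensity_inputs(data, formula, outcome_variable):
    """Standalone sketch of the mixin's checks (hypothetical helper).

    Raises ValueError where the mixin raises DataException.
    Returns the treatment column name parsed from the formula.
    """
    # The left-hand side of the formula names the treatment variable.
    treatment = formula.split("~")[0].strip()
    missing = [c for c in (treatment, outcome_variable) if c not in data.columns]
    if missing:
        raise ValueError(f"Columns missing from the data: {missing}")
    # Same test as the mixin: more than two distinct values is rejected.
    if len(np.unique(data[treatment])) > 2:
        raise ValueError(f"The treatment variable {treatment!r} is not binary.")
    return treatment


df = pd.DataFrame(
    {
        "trt": [0, 1, 1, 0],
        "age": [34, 51, 47, 29],
        "outcome": [1.2, 3.4, 2.9, 0.8],
    }
)
treatment_name = validate_propensity_inputs(df, "trt ~ 1 + age", "outcome")
```

Note that counting unique values only rules out treatments with more than two levels. A column coded as 1 and 2, or one that contains a single value, would pass this check even though it is not a 0-1 indicator.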