
Updated writeup in the low-level interface notebook #150

Merged: 1 commit, Feb 13, 2025
34 changes: 5 additions & 29 deletions demo/notebooks/prototype_interface.ipynb
@@ -20,43 +20,19 @@
"to the C++ code that doesn't require modifying any C++.\n",
"\n",
"To illustrate when such a prototype interface might be useful, consider\n",
"the classic BART algorithm:\n",
"\n",
"**INPUT**: $y$, $X$, $\\tau$, $\\nu$, $\\lambda$, $\\alpha$, $\\beta$\n",
"\n",
"**OUTPUT**: $m$ samples of a decision forest with $k$ trees and global variance parameter $\\sigma^2$\n",
"\n",
"Initialize $\\sigma^2$ via a default or a data-dependent calibration exercise\n",
"\n",
"Initialize \"forest 0\" with $k$ trees with a single root node, referring to tree $j$'s prediction vector as $f_{0,j}$\n",
"\n",
"Compute residual as $r = y - \\sum_{j=1}^k f_{0,j}$\n",
"\n",
"**FOR** $i$ **IN** $\\left\\{1,\\dots,m\\right\\}$:\n",
"\n",
" Initialize forest $i$ from forest $i-1$\n",
" \n",
" **FOR** $j$ **IN** $\\left\\{1,\\dots,k\\right\\}$:\n",
" \n",
" Add predictions for tree $j$ to residual: $r = r + f_{i,j}$ \n",
" \n",
" Update tree $j$ via Metropolis-Hastings with $r$ and $X$ as data and tree priors depending on ($\\tau$, $\\sigma^2$, $\\alpha$, $\\beta$)\n",
"\n",
" Sample leaf node parameters for tree $j$ via Gibbs (leaf node prior is $N\\left(0,\\tau\\right)$)\n",
" \n",
" Subtract (updated) predictions for tree $j$ from residual: $r = r - f_{i,j}$\n",
"\n",
" Sample $\\sigma^2$ via Gibbs (prior is $IG(\\nu/2,\\nu\\lambda/2)$)\n",
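The loop above can be sketched in runnable form. The following is a minimal Python analogue (not the package's actual implementation); the Metropolis-Hastings tree update and Gibbs leaf sampling are replaced by a trivial stand-in so the sketch is self-contained:

```python
import numpy as np

def stub_tree_update(r, X):
    # Stand-in for the MH tree update + Gibbs leaf sampling: fit the mean
    # of the partial residual. A real sampler proposes tree moves
    # (grow/prune/etc.) and draws N(0, tau) leaf parameters.
    return np.full(len(r), r.mean())

def bart_sketch(y, X, k=10, m=100, nu=3.0, lamb=1.0, seed=0):
    rng = np.random.default_rng(seed)
    n = len(y)
    sigma2 = np.var(y)                 # default calibration of sigma^2
    preds = np.zeros((k, n))           # forest 0: root-only trees predict 0
    r = y - preds.sum(axis=0)          # initial residual
    sigma2_draws = np.empty(m)
    for i in range(m):                 # outer loop: m forest samples
        for j in range(k):             # inner loop: k trees
            r = r + preds[j]           # add tree j's fit back into residual
            preds[j] = stub_tree_update(r, X)
            r = r - preds[j]           # subtract (updated) fit
        # Gibbs draw for sigma^2 with IG(nu/2, nu*lamb/2) prior:
        shape = (nu + n) / 2.0
        rate = (nu * lamb + np.sum(r ** 2)) / 2.0
        sigma2 = rate / rng.gamma(shape)   # inverse-gamma via reciprocal gamma
        sigma2_draws[i] = sigma2
    return preds, sigma2_draws
```

The residual bookkeeping (add tree $j$'s fit back in, update, subtract it out) is the key pattern the prototype interface exposes.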
"that the \"classic\" BART algorithm is essentially a Metropolis-within-Gibbs \n",
"sampler, in which the forest is sampled by MCMC, conditional on all of the \n",
"other model parameters, and then the model parameters are updated by Gibbs.\n",
"\n",
"While the algorithm itself is conceptually simple, much of the core \n",
"computation is carried out in low-level languages such as C or C++ \n",
"because of the tree data structure. As a result, any changes to this \n",
"because of the tree data structures. As a result, any changes to this \n",
"algorithm, such as supporting heteroskedasticity and categorical outcomes (Murray 2021) \n",
"or causal effect estimation (Hahn et al. 2020), require modifying low-level code. \n",
"\n",
"The prototype interface exposes the core components of the \n",
"loop above at the R level, thus making it possible to interchange \n",
"C++ computation for steps like \"update tree $j$ via Metropolis-Hastings\" \n",
"C++ computation for steps like \"update forest via Metropolis-Hastings\" \n",
"with R computation for a custom variance model, other user-specified additive \n",
"mean model components, and so on."
]
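The interchange described in this cell can be sketched as follows. This is a hedged Python illustration (the actual interface is exposed at the R level); the function names are hypothetical, but the pattern is the point: when the sampler's steps are plain callbacks, a user-supplied variance model replaces the default Gibbs draw without touching the tree loop:

```python
import numpy as np

def default_sigma2_update(r, sigma2, rng, nu=3.0, lamb=1.0):
    # Default global-variance step: Gibbs draw from the IG(nu/2, nu*lamb/2)
    # conditional, as in the classic BART sampler.
    shape = (nu + len(r)) / 2.0
    rate = (nu * lamb + np.sum(r ** 2)) / 2.0
    return rate / rng.gamma(shape)

def custom_variance_update(r, sigma2, rng):
    # Hypothetical user-supplied alternative: shrink toward the empirical
    # residual variance instead of drawing from the IG conditional.
    return 0.5 * sigma2 + 0.5 * np.mean(r ** 2)

def run_sampler(y, m, variance_update, seed=0):
    # The variance step is a plain callback, so high-level user code can be
    # interchanged with the built-in computation step by step.
    rng = np.random.default_rng(seed)
    r = y - y.mean()                   # stand-in for the forest residual
    sigma2 = np.var(y)
    draws = np.empty(m)
    for i in range(m):
        sigma2 = variance_update(r, sigma2, rng)
        draws[i] = sigma2
    return draws
```

Either callback plugs into `run_sampler` unchanged, which is the same interchangeability the prototype interface provides for the forest and variance steps.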