vignettes/articles/Scratch.Rmd (6 additions & 6 deletions)
@@ -48,15 +48,15 @@ Before proceeding, it helps to review how `parsnip` categorizes models:
* Within a model type is the _mode_. This relates to the modeling goal. Currently the two modes in the package are "regression" and "classification". Some models have methods for both modes (e.g. nearest neighbors) while others are specific to a single mode (e.g. logistic regression).
-* The computation _engine_ is a combination of the estimation method and the implementation. For example, for linear regression, one model is `"lm"` and this uses ordinary least squares analysis using the `lm` package. Another engine is `"stan"` which uses the Stan infrastructure to estimate parameters using Bayes rule.
+* The computation _engine_ is a combination of the estimation method and the implementation. For example, for linear regression, one engine is `"lm"`, which performs ordinary least squares estimation via the `lm()` function. Another engine is `"stan"`, which uses the Stan infrastructure to estimate parameters using Bayes' rule.
When adding a model into `parsnip`, the user has to specify which modes and engines are used. The package also enables users to add a new mode or engine to an existing model.
## The General Process
`parsnip` stores information about the models in an internal environment object. The environment can be accessed via the function `get_model_env()`. The package includes a variety of functions that can get or set the different aspects of the models.
-If you are adding a new model form your own package, you can use these functions to add new entries into the model environment.
+If you are adding a new model from your own package, you can use these functions to add new entries into the model environment.
## Step 1. Register the Model, Modes, and Arguments.
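The registration that Step 1 covers can be sketched with `parsnip`'s registration functions. This is not part of the diff itself; the engine name `"mda"` and the `set_dependency()` call are illustrative assumptions, while the `mixture_da` model name matches the `show_model_info("mixture_da")` calls in the hunks below.

```r
library(parsnip)

# Register the new model type, its mode, and one engine.
# The "mda" engine/package pairing is an assumption for illustration.
set_new_model("mixture_da")
set_model_mode(model = "mixture_da", mode = "classification")
set_model_engine("mixture_da", mode = "classification", eng = "mda")
set_dependency("mixture_da", eng = "mda", pkg = "mda")

# Inspect what has been registered so far
show_model_info("mixture_da")
```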
@@ -103,7 +103,7 @@ set_model_arg(
show_model_info("mixture_da")
```
-## Step 3. Create the model function
+## Step 2. Create the model function
This is a fairly simple function that can follow a basic template. The main arguments to our function will be:
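The "basic template" this step refers to might look like the following sketch. The `sub_classes` argument is illustrative (it would correspond to a `set_model_arg()` registration from Step 1); `new_model_spec()` is `parsnip`'s exported constructor for model specifications.

```r
# A minimal sketch of the user-facing model function.
mixture_da <- function(mode = "classification", sub_classes = NULL) {
  # Capture the argument as a quosure so it can be evaluated later,
  # once an engine is chosen
  args <- list(sub_classes = rlang::enquo(sub_classes))

  parsnip::new_model_spec(
    "mixture_da",
    args = args,
    eng_args = NULL,
    mode = mode,
    method = NULL,
    engine = NULL
  )
}
```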
@@ -146,7 +146,7 @@ Now that `parsnip` knows about the model, mode, and engine, we can give it the i
* `func` is the package and name of the function that will be called. If you are using a locally defined function, only `fun` is required.
-*`defaults` is an optional list of arguments to the fit function that the user can change, but whose defaults can be set here. This isn't needed in this case, but is describe later in this document.
+* `defaults` is an optional list of arguments to the fit function that the user can change, but whose defaults can be set here. This isn't needed in this case, but is described later in this document.
For the first engine:
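The vignette's actual `set_fit()` call is not shown in this diff (only its closing lines appear in the next hunk). As a hedged sketch of the general shape, assuming the engine is `mda::mda()` and the model, mode, and engine were registered in Step 1:

```r
# Illustrative sketch; the engine name "mda" and the protected
# arguments are assumptions, not the vignette's exact call.
parsnip::set_fit(
  model = "mixture_da",
  eng = "mda",
  mode = "classification",
  value = list(
    interface = "formula",           # the underlying function takes a formula
    protect = c("formula", "data"),  # arguments the user may not change
    func = c(pkg = "mda", fun = "mda"),
    defaults = list()                # no engine defaults needed here
  )
)
```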
@@ -165,7 +165,7 @@ set_fit(
show_model_info("mixture_da")
```
-## Step 3. Add Modules for Prediction
+## Step 4. Add Modules for Prediction
Similar to the fitting module, we specify the code for making different types of predictions. To make hard class predictions, the `class` object contains the details. The elements of the list are:
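A hedged sketch of what a `class` prediction module might look like, assuming the illustrative `mda` engine from earlier and that `parsnip` stores the fitted engine object in `object$fit`:

```r
# Illustrative sketch of registering hard-class predictions;
# the argument mapping mirrors a generic predict() method.
parsnip::set_pred(
  model = "mixture_da",
  eng = "mda",
  mode = "classification",
  type = "class",
  value = list(
    pre = NULL,    # optional function run on new_data before predicting
    post = NULL,   # optional function run on the raw predictions
    func = c(fun = "predict"),
    args = list(
      object = quote(object$fit),   # the fitted engine object
      newdata = quote(new_data),
      type = "class"
    )
  )
)
```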
@@ -413,7 +413,7 @@ This would **not** include making dummy variables and `model.matrix` stuff. `par
### Why would I postprocess my predictions?
-What comes back from some R functions make be somewhat... arcane or problematic. As an example, for `xgboost`, if you fit a multiclass boosted tree, you might expect the class probabilities to come back as a matrix (narrator: they don't). If you have four classes and make predictions on three samples, you get a vector of 12 probability values. You need to convert these to a rectangular data set.
+What comes back from some R functions may be somewhat... arcane or problematic. As an example, for `xgboost`, if you fit a multiclass boosted tree, you might expect the class probabilities to come back as a matrix (narrator: they don't). If you have four classes and make predictions on three samples, you get a vector of 12 probability values. You need to convert these to a rectangular data set.
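The reshaping described above is a one-liner in base R. A sketch with made-up probabilities (the `.pred_` column naming is illustrative):

```r
# Hypothetical flat vector from predict() on a multiclass xgboost
# model: 3 samples x 4 classes, all of one sample's probabilities
# before the next sample's.
pr <- c(0.10, 0.20, 0.30, 0.40,
        0.70, 0.10, 0.10, 0.10,
        0.25, 0.25, 0.25, 0.25)
num_class <- 4

# byrow = TRUE makes each row one sample's class probabilities
prob_df <- as.data.frame(matrix(pr, ncol = num_class, byrow = TRUE))
names(prob_df) <- paste0(".pred_", seq_len(num_class))
prob_df
```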
Another example is the predict method for `ranger`, which encapsulates the actual predictions in a more complex object structure.