
AutoML 2.0: Should Seed be a tunable param? #6704

Open
@torronen


One of the tunable parameters is FeatureFraction. As I understand it, this randomly selects the specified fraction of columns to use for each tree and drops the rest. I also understand that the random seed decides which columns are dropped.

In small forests with few trees, which specific feature columns are dropped may matter more than how many are dropped. One of the columns may cause overfitting, and dropping it may improve results on the validation and test sets. However, it is crucial that the right column is dropped.
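To make the concern concrete, here is a minimal Python sketch of seed-driven column sampling. It is not the FastTree or LightGBM implementation; `sample_columns` and `column_usage` are hypothetical helpers, and it assumes each tree draws its own subset of columns from a single seeded generator.

```python
import random


def sample_columns(n_features, feature_fraction, n_trees, seed):
    """Pick the columns each tree may use (assumed behaviour, not library code)."""
    rng = random.Random(seed)
    # Number of columns kept per tree; always keep at least one.
    k = max(1, round(n_features * feature_fraction))
    return [sorted(rng.sample(range(n_features), k)) for _ in range(n_trees)]


def column_usage(subsets, n_features):
    """Count how many trees are allowed to use each column."""
    counts = [0] * n_features
    for subset in subsets:
        for col in subset:
            counts[col] += 1
    return counts


# Same FeatureFraction, different seeds: with only 3 trees, a given column
# can end up visible to every tree under one seed and to none under another.
for seed in (0, 1, 2):
    subsets = sample_columns(n_features=10, feature_fraction=0.5, n_trees=3, seed=seed)
    print(seed, column_usage(subsets, 10))
```

With many trees the per-column counts even out, so the choice of seed should matter less; with a handful of trees, the seed effectively decides which columns the model ever sees.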

If my understanding is correct, I believe the seed should be one of the tunable parameters, at least for algorithms that expose FeatureFraction (FastTree and LightGBM, at least). At the moment, it is not present in, e.g., the FastTreeOption class.
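As a rough illustration of the proposal, the sketch below treats the seed as one more dimension of a search space in a plain random search. It does not use the AutoML.NET API; `sample_trial`, `random_search`, the parameter ranges, and the `evaluate` callback are all hypothetical.

```python
import random


def sample_trial(rng):
    """Draw one candidate configuration; ranges here are illustrative only."""
    return {
        "feature_fraction": rng.uniform(0.5, 1.0),
        "seed": rng.randrange(0, 1000),  # the seed is sampled like any other parameter
    }


def random_search(evaluate, n_trials, search_seed=0):
    """Return (best_score, best_params) over n_trials random configurations."""
    rng = random.Random(search_seed)
    best = None
    for _ in range(n_trials):
        params = sample_trial(rng)
        score = evaluate(params)  # e.g. a validation metric; higher is better
        if best is None or score > best[0]:
            best = (score, params)
    return best
```

Here `evaluate` stands in for training a model with the given parameters and returning its validation score.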

Agree/disagree?


Labels: AutoML.NET (Automating various steps of the machine learning process), enhancement (New feature or request), untriaged (New issue has not been triaged)
