Support Multiple Threads on ML Related DB Access  #6692

Open
@superichmann

Description

Is your feature request related to a problem? Please describe.
According to my tests, when executing multiple RegressionExperiment instances in parallel with data loaded through DatabaseSource, a bottleneck occurs in the DB access layer. This is probably due to a single shared instance of the DB access layer, even when a separate DatabaseSource is created for each experiment.

This prevents the experiments from fully utilizing the CPU and the database.

My tests were:

  1. Reviewing DB logs: when running the experiments in one process, the DB queries are executed sequentially rather than concurrently.
  2. Running from separate EXEs: when each experiment runs in its own process, the DB receives many more queries at the same time, and all experiments finish faster.
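
The log check in item 1 can be made quantitative by computing how many queries are in flight at once. The following is a minimal sketch, assuming the DB log can be exported as (start, end) timestamps per query; the function name `max_concurrency` and the sample timings are hypothetical and are not taken from the tests described above.

```python
from typing import Iterable, Tuple


def max_concurrency(intervals: Iterable[Tuple[float, float]]) -> int:
    """Return the largest number of queries in flight at the same moment."""
    events = []
    for start, end in intervals:
        events.append((start, 1))   # a query begins
        events.append((end, -1))    # a query ends
    # Sort by time; at equal timestamps, ends (-1) sort before starts (+1),
    # so back-to-back queries are not counted as overlapping.
    events.sort(key=lambda e: (e[0], e[1]))
    active = 0
    peak = 0
    for _, delta in events:
        active += delta
        peak = max(peak, active)
    return peak


# Hypothetical timings in seconds, for illustration only.
sequential = [(0.0, 1.0), (1.0, 2.0), (2.0, 3.0)]
overlapping = [(0.0, 1.0), (0.1, 1.1), (0.2, 1.2)]

print(max_concurrency(sequential))   # 1
print(max_concurrency(overlapping))  # 3
```

A peak of 1 across a run with several parallel experiments would be consistent with serialized DB access, while a peak near the number of experiments would suggest the bottleneck lies elsewhere.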

Still, I might be missing something, so please correct me if I am wrong.

Describe the solution you'd like
Allow multiple DB connections to be instantiated so that experiments can run in parallel.

Describe alternatives you've considered

  • Tried loading the entire DB data into an IDataView and then dropping columns or filtering, but loading from the DB took a very long time.

Additional context
Windows machine; both the database and the ML process run on the same machine.

Metadata

Assignees

No one assigned

Labels

  • AutoML.NET: Automating various steps of the machine learning process
  • enhancement: New feature or request
  • needs-further-triage
  • untriaged: New issue has not been triaged
