Description
Is your feature request related to a problem? Please describe.
According to my tests, when executing multiple RegressionExperiments in parallel with data loaded through DatabaseSource, a bottleneck occurs in the DB access layer, probably because a single instance of that layer is shared (even when a separate DatabaseSource is created for each experiment).
This prevents full utilization of the CPU and the database.
My tests were:
- Reviewing DB logs: when the experiments run in parallel, the DB queries are issued sequentially rather than concurrently.
- Running from separate EXEs: when each experiment runs in its own process, the DB receives many more queries at the same time, and all experiments finish faster.
Still, I might be missing something, so correct me if I am wrong.
Describe the solution you'd like
Allow instantiating multiple DB connections so that experiments can run in parallel.
Describe alternatives you've considered
- Tried loading the entire DB data into an IDataView and then applying DropColumns or a filter, but loading from the DB took a very long time.
Additional context
Windows machine; both the database and the ML code run on the same machine.