Parallelization of multi-start optimization

In its current state, Data2Dynamics only supports parallelization over experimental conditions, not over multi-start optimization runs. An easy way to parallelize over runs is to start multiple instances of MATLAB on one machine and have each of them run a small batch of the full set of multi-start optimization runs. This article introduces this straightforward approach with a small example.

Let's first initialize D2D, load model and data, compile and save the workspace:

arInit;
arLoadModel('model');
arLoadData('data_for_model');
arCompileAll;

arSave('my_workspace');

Now, let's say we have four processor cores available and want to fit with 15 multi-start runs:

multistart_runs = 15;
parallel_instances = 4;

The basic idea is to have a shell script startup.sh that starts the MATLAB instances, each of which calls a function doWork.m. The latter needs a few variables, which we store in a configuration struct:

conf.pwd = pwd;
conf.d2dpath = fileparts(which('arInit.m'));
conf.workspace = ar.config.savepath;
conf.parIn = parallel_instances; % number of MATLAB instances
% Round the total number of multi-start runs up to the next multiple of
% parallel_instances. The extra runs cost no additional wall-clock time,
% because every instance then performs the same number of fits.
conf.totNum = ceil(multistart_runs/parallel_instances)*parallel_instances;

save('parallel_conf.mat', 'conf');
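The rounding step can be checked by hand with integer arithmetic. The following is a minimal sketch in shell; the function name round_up_runs is hypothetical and not part of D2D. For 15 requested runs on 4 instances, each instance performs 4 fits, giving 16 runs in total.

```shell
#!/bin/sh
# round_up_runs RUNS INSTANCES
# Prints RUNS rounded up to the next multiple of INSTANCES.
# (Hypothetical helper for illustration; not part of D2D.)
round_up_runs() {
    runs=$1
    instances=$2
    # Integer equivalent of ceil(runs / instances)
    per_instance=$(( (runs + instances - 1) / instances ))
    echo $(( per_instance * instances ))
}

round_up_runs 15 4   # prints 16
```

A request that is already a multiple of the instance count, such as 16 runs on 4 instances, is left unchanged.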

Now call startup.sh:

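for icall = 1:parallel_instances
    % quote the path so that directories containing spaces work, and only
    % run startup.sh if changing the directory succeeded
    system(sprintf('cd "%s" && sh startup.sh %i', conf.pwd, icall));
end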
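The same launch loop could presumably also be issued from a terminal instead of from within MATLAB, provided parallel_conf.mat has already been saved and the current directory is the one stored in conf.pwd. The sketch below assumes this; the function launch_all and its dry-run convention are hypothetical.

```shell
#!/bin/sh
# launch_all N [LAUNCHER]
# Calls LAUNCHER once for every instance index 1..N.
# LAUNCHER defaults to "sh startup.sh"; passing "echo sh startup.sh"
# gives a dry run that only prints the commands.
# (Hypothetical helper for illustration; not part of D2D.)
launch_all() {
    n=$1
    launcher=${2:-sh startup.sh}
    i=1
    while [ "$i" -le "$n" ]; do
        # Intentionally unquoted so that the launcher string is split into words
        $launcher "$i"
        i=$((i + 1))
    done
}

# Dry run: prints the four commands without starting MATLAB
launch_all 4 "echo sh startup.sh"
```

Dropping the second argument runs startup.sh for real, once per instance index.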

Each call to startup.sh has to launch one detached instance of MATLAB, which can be done with the screen command:

screen -d -m /Applications/MATLAB_R2019a.app/bin/matlab -nodisplay -r "addpath('~/Projekte/d2d/arFramework3'); doWork('$1'); exit;"
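A slightly more defensive variant of startup.sh keeps the two installation-specific paths in variables and refuses to start without an instance index. This is only a sketch: the variable names MATLAB_BIN and D2D_PATH and the helper build_matlab_cmd are assumptions introduced here, and the default paths are simply those from the command above.

```shell
#!/bin/sh
# Installation-specific paths (assumed variable names); override them in
# the environment or edit the defaults.
MATLAB_BIN="${MATLAB_BIN:-/Applications/MATLAB_R2019a.app/bin/matlab}"
D2D_PATH="${D2D_PATH:-$HOME/Projekte/d2d/arFramework3}"

# build_matlab_cmd INDEX
# Prints the MATLAB statement that one instance should execute.
build_matlab_cmd() {
    printf "addpath('%s'); doWork('%s'); exit;" "$D2D_PATH" "$1"
}

if [ -n "${1:-}" ]; then
    # Start one detached screen session running MATLAB without a display
    screen -d -m "$MATLAB_BIN" -nodisplay -r "$(build_matlab_cmd "$1")"
else
    echo "usage: sh startup.sh <instance index>" >&2
fi
```

Called as `sh startup.sh 3`, this starts the third instance in the same way as the one-line version.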

Make sure you put in the correct MATLAB and d2d path. The function doWork.m can look like this:

function doWork(icall)
    % the shell passes the instance index as a string
    icall = str2double(icall);

    load('parallel_conf.mat', 'conf'); % load config

    cd(conf.pwd);
    addpath(conf.d2dpath);
    
    arInit;
    arLoad(conf.workspace);

    arFitLHS(conf.totNum/conf.parIn , icall); 
    % conf.totNum/conf.parIn is the number of fits each instance of MATLAB has to do
    % use icall as a random seed to make sure every instance fits for
    % different initial parameter vectors

    arSave(['par_result_' num2str(icall) '.mat'], 'ar');
end

Once all calculations have finished, there should be parallel_instances = 4 folders named *_par_result_* in Results/, each containing the results of ceil(multistart_runs/parallel_instances) = 4 multi-start runs, i.e. 16 runs in total. These can be conveniently collected with arMergeFitsCluster('par_result').
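Because the instances run detached, it can help to confirm that every instance has written its result folder before merging. The following sketch counts the matching folders; the helper count_results is hypothetical, and it assumes the folder naming pattern *_par_result_* described above.

```shell
#!/bin/sh
# count_results DIR PREFIX
# Prints the number of directories directly inside DIR whose names
# match *_PREFIX_*.
# (Hypothetical helper for illustration; not part of D2D.)
count_results() {
    find "$1" -mindepth 1 -maxdepth 1 -type d -name "*_$2_*" | wc -l | tr -d ' '
}

expected=4   # parallel_instances in the example above
if [ -d Results ]; then
    found=$(count_results Results par_result)
    if [ "$found" -eq "$expected" ]; then
        echo "all $expected result folders present"
    else
        echo "found $found of $expected result folders"
    fi
fi
```

If fewer folders than expected are reported, some instances are most likely still running; `screen -ls` lists the detached sessions.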
