Summary on Supporting PyTorch

Workflow description about ElasticDL

About start command:

elasticdl train --image_name=elasticdl:mnist (see tutorials/elasticdl_local.md)

Setup entry point: elasticdl=elasticdl_client.main:main
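To make the entry point line concrete, here is a minimal sketch of how a spec of the form name=module:function breaks down and how the function part can be looked up. The helper names parse_entry_point and resolve are hypothetical and are not part of ElasticDL. Because elasticdl_client may not be installed where this runs, the lookup step is shown on a standard-library target (json:dumps) instead.

```python
import importlib


def parse_entry_point(spec):
    """Split 'name=package.module:function' into (name, module, function)."""
    name, _, target = spec.partition("=")
    module_path, _, attr = target.partition(":")
    return name.strip(), module_path.strip(), attr.strip()


def resolve(module_path, attr):
    """Import module_path and return the attribute called attr."""
    module = importlib.import_module(module_path)
    return getattr(module, attr)


# The spec exactly as written in the notes above.
name, module_path, attr = parse_entry_point("elasticdl=elasticdl_client.main:main")
print(name, module_path, attr)

# Illustration only: resolve a stdlib target, since elasticdl_client
# is assumed not to be importable in this environment.
_, demo_module, demo_attr = parse_entry_point("demo=json:dumps")
dumps = resolve(demo_module, demo_attr)
print(dumps({"ok": True}))
```

In other words, running the elasticdl command ends up calling the main function in the elasticdl_client.main module.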

Create master:

elasticdl_client/api.py

Run worker:

Runs tasks (training/evaluation/prediction). The worker only computes gradients and reports them to the PS (parameter server).

worker/main.py

elastic/python/worker/worker.py

PS Client:

Push parameters to PS: elastic/python/worker/ps_client.py
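The division of work described above (the worker only computes gradients, the PS owns the parameters and applies the update) can be sketched in plain Python. This is a toy illustration under stated assumptions, not ElasticDL's actual API: ToyParameterServer, ToyWorker, pull_parameters and push_gradients are hypothetical names, the model is a one-variable linear fit, and everything runs in one process instead of over RPC.

```python
class ToyParameterServer:
    """Hypothetical PS: holds parameters and applies pushed gradients."""

    def __init__(self, params, lr=0.1):
        self.params = dict(params)
        self.lr = lr
        self.version = 0  # incremented once per applied gradient

    def pull_parameters(self):
        return dict(self.params), self.version

    def push_gradients(self, grads):
        # Plain SGD step, performed on the server side.
        for name, grad in grads.items():
            self.params[name] -= self.lr * grad
        self.version += 1
        return self.version


class ToyWorker:
    """Hypothetical worker: computes gradients, never updates parameters."""

    def __init__(self, ps):
        self.ps = ps

    def train_batch(self, xs, ys):
        params, _ = self.ps.pull_parameters()
        w, b = params["w"], params["b"]
        n = len(xs)
        # Gradients of mean squared error for the model y = w * x + b.
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grads = {
            "w": 2.0 * sum(e * x for e, x in zip(errors, xs)) / n,
            "b": 2.0 * sum(errors) / n,
        }
        return self.ps.push_gradients(grads)


ps = ToyParameterServer({"w": 0.0, "b": 0.0}, lr=0.1)
worker = ToyWorker(ps)
xs, ys = [0.0, 1.0, 2.0], [1.0, 3.0, 5.0]  # samples of y = 2x + 1
for _ in range(500):
    worker.train_batch(xs, ys)

final_params, final_version = ps.pull_parameters()
print(final_version, final_params)
```

The point of the sketch is that the optimizer step lives on the PS side, which is why a PyTorch worker would need access to raw gradients rather than calling optimizer.step() locally.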

How to print gradient information directly in PyTorch.

Usually, we train in PyTorch with an optimizer.

# training loop
for epoch in range(EPOCH):
    # each iteration yields one batch; x is normalized when iterating train_loader
    for step, (b_x, b_y) in enumerate(train_loader):
        output = cnn(b_x)[0]            # cnn output
        loss = loss_func(output, b_y)   # cross entropy loss
        optimizer.zero_grad()           # clear gradients for this training step
        loss.backward()                 # backpropagation, compute gradients
        optimizer.step()                # apply gradients
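The loop above never looks at the gradients itself. After loss.backward(), each parameter's gradient is available on its .grad attribute, so it can be printed or copied before optimizer.step() runs. The following is a minimal sketch under assumptions: a small nn.Linear model and random tensors stand in for the cnn and train_loader of the original snippet, and the grads dictionary is only an illustrative way to collect the values.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-ins for the cnn and MNIST batch used in the notes (assumptions).
model = nn.Linear(4, 2)
loss_func = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

b_x = torch.randn(8, 4)            # one fake batch of 8 samples
b_y = torch.randint(0, 2, (8,))    # fake class labels in {0, 1}

output = model(b_x)
loss = loss_func(output, b_y)
optimizer.zero_grad()
loss.backward()                    # populates param.grad for every parameter

# Copy the gradients out before the optimizer uses them.
grads = {
    name: param.grad.detach().clone()
    for name, param in model.named_parameters()
}
for name, grad in grads.items():
    print(name, tuple(grad.shape), grad.norm().item())

optimizer.step()                   # local update; a PS-based setup would send grads instead
```

In a parameter-server setup, the grads dictionary is the piece a worker would report, and the final optimizer.step() would happen on the server side rather than locally.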
