Understanding Metrics + DDP #2346
Replies: 5 comments 1 reply
-
mind having a look @justusschock @SkafteNicki ^^
-
So the functional interface does not come with DDP support; these are just native torch implementations of the respective metrics. The modular interface does come with DDP support. There you have two options, either the native or the sklearn backend. For either backend, in the init of your model you then initialize these like any other module:
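For illustration, the modular pattern looks roughly like this. Note this is a toy sketch, not the actual Lightning metrics API: `ToyAccuracy` is a hand-rolled stand-in that mimics the update/compute interface of a modular metric, with the state kept in buffers so a DDP-aware implementation could all-reduce them before computing.

```python
import torch
from torch import nn

class ToyAccuracy(nn.Module):
    """Toy stand-in for a modular metric: accumulates state across
    batches via update() and produces the result via compute().
    A real DDP-aware metric would all-reduce these buffers first."""

    def __init__(self):
        super().__init__()
        self.register_buffer("correct", torch.tensor(0.0))
        self.register_buffer("total", torch.tensor(0.0))

    def update(self, preds: torch.Tensor, target: torch.Tensor) -> None:
        self.correct += (preds == target).sum()
        self.total += target.numel()

    def compute(self) -> torch.Tensor:
        return self.correct / self.total

# In your LightningModule's __init__ you would initialize the metric
# like any other submodule, e.g.:  self.val_acc = ToyAccuracy()
metric = ToyAccuracy()
metric.update(torch.tensor([1, 0, 1]), torch.tensor([1, 1, 1]))
metric.update(torch.tensor([0, 0]), torch.tensor([0, 1]))
print(metric.compute().item())  # 3 correct out of 5 -> 0.6
```

Because the state lives in buffers on the module, it moves with the model between devices and survives across batches until you reset or recreate it.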
-
I see. Thank you!
-
Can you elaborate on what you mean by aggregate? Do you mean over multiple batches?
-
If I am using 2 GPUs with DDP, I get two processes, each calculating its own loss on its share of the data. Process 1 gives me the loss for the first half of the validation data and process 2 gives me the loss for the second half. Now I want to take the mean of both of them to get one final loss for the entire validation data.
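One common way to get that mean by hand is an all-reduce over the per-process loss with `torch.distributed`. A minimal sketch follows; it initializes a single-process gloo group just so it runs standalone (so the mean equals the local loss here), whereas under real DDP with 2 GPUs the launcher sets up a world of size 2 and the function returns the mean over both processes.

```python
import os
import torch
import torch.distributed as dist

def mean_loss_across_processes(local_loss: torch.Tensor) -> torch.Tensor:
    """Sum the per-process scalar loss over all DDP processes,
    then divide by the world size to get the global mean."""
    reduced = local_loss.clone()
    dist.all_reduce(reduced, op=dist.ReduceOp.SUM)
    return reduced / dist.get_world_size()

# Demo: single-process gloo group so the sketch runs standalone;
# under real DDP the launcher/Lightning initializes this for you.
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29501")
if not dist.is_initialized():
    dist.init_process_group("gloo", rank=0, world_size=1)

local_loss = torch.tensor(0.25)
print(mean_loss_across_processes(local_loss).item())  # 0.25: only one process here
```

Note this gives the mean of the per-process mean losses, which equals the mean over the whole validation set only when both processes see the same number of samples.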
-
To my understanding, the new metrics package aggregates the metrics between the different nodes when using DDP. So if I use DDP with 2 GPUs, then
validation_epoch_end
will be called 2 times, each time on a subset of the validation data. If I calculate the F1 score, for example, this gives me 2 different scores. Now if I use
from pytorch_lightning.metrics.functional import f1_score
then this should internally aggregate the F1 score across both processes (at least that's what I think it does). But I still get different F1 scores for each process. This is my code:
Do I have to use
f1_score_sync
differently?
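As a side note on why per-process scores need syncing at all: averaging the two processes' F1 scores is not in general the same as the F1 over the full validation set, which is why the metric state (TP/FP/FN counts) has to be reduced across processes before computing. A toy computation in plain Python, with made-up predictions, shows the gap:

```python
def f1(preds, targets):
    """Binary F1 score from flat lists of 0/1 labels."""
    tp = sum(p == 1 and t == 1 for p, t in zip(preds, targets))
    fp = sum(p == 1 and t == 0 for p, t in zip(preds, targets))
    fn = sum(p == 0 and t == 1 for p, t in zip(preds, targets))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical split of the validation data across two DDP processes
p1_preds, p1_targets = [1, 1, 1, 0], [1, 0, 0, 0]
p2_preds, p2_targets = [0, 0, 0, 1], [1, 1, 1, 1]

avg_of_f1s = (f1(p1_preds, p1_targets) + f1(p2_preds, p2_targets)) / 2
global_f1 = f1(p1_preds + p2_preds, p1_targets + p2_targets)
print(avg_of_f1s)  # 0.45
print(global_f1)   # 0.444... -- not the same as the averaged scores
```

So a correctly synced metric should reduce the raw counts and return one identical value on every process, rather than each process averaging its own local score.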