-
Notifications
You must be signed in to change notification settings - Fork 137
Fix Transposed Conv2d error & add test #288
Conversation
We dont currently support output paddings, so the logic falls to the default value which is zero, also we dont support full padding, that would possibly need control flow. |
This LGTM assuming the test case comes from a comparison with an existing framework (e.g. Keras). |
@jekbradbury, surprisingly doesn't work though |
Ohh, as in the test doesn't pass? |
Yea they don't. |
Does it give |
@jekbradbury it gives
|
OK, so what's happening is that some frameworks define transposed convolution in different (but almost equivalent) ways, and you can switch between these definitions by reflecting the filter over both spatial axes and swapping the input and output channel dimensions. (The reason for this is that one of these definitions is the standard mathematical definition for a transposed convolution, and the other allows the framework to reuse the same kernels as the normal convolution backwards pass). I'm not quite sure how you're getting the numbers you just pasted, but they're the correct result for the "standard mathematical definition" (although flipped) and the expected numbers in the test are the correct result for the "backwards pass of convolution" definition that's used by TF/Keras. I'm confused as to why the test currently includes |
I'm particularly confused because |
@jekbradbury I've not seen any direct usage of the Tensorflow transposed conv2d though, It's mostly the keras api which is used which then in return ultimately calls the Tensorflow API. What I'm unsure about is, give the inputs I have, if I manually compute the Also since you mentioned it, would you suggest any changes for the current order of parameters that we use? |
the current error for the test is now :
I think the test i wrote is wrong, because the code i wrote directly calls tf.nn.conv2d_tranpose which just calls backprop2dinput , while its in the keras api where the dimensions of the output are actually calculated |
@jekbradbury Do you have any idea as to how we could solve the latest error? |
I haven't carefully read this whole discussion, but https://bugs.swift.org/browse/TF-540 and this thread might be related: https://groups.google.com/a/tensorflow.org/forum/#!msg/swift/UUPwV01sZrE/LszG6T7dBQAJ ? |
@marcrasi that was an extension error, there seems to be some error within the layer implementation as well or it's the test. Just trying to figure it out currently. That thread did help me earlier to figure out a few errors :) thanks. |
If you revert back to the original code, that is, put back in "- (1 * paddingIndex)" instead of always subtracting 1, then your test should work. It works for me. However, I get an error if I attempt to get gradients. Not sure if that should work or not??
|
Also, not sure if this is an issue or not but the TransposedConv2D defines tensors as width then height whereas the underlying Raw.conv2DBackpropInput defines tensors as height then width for example from TransposedConv2D:
But in func conv2DBackpropInput:
|
Yeah @sjaz24 I noticed that we might have documented it wrong |
@sjaz24 Are you sure this test passes locally for you?
|
You most likely haven't undone your changes. You need to put back the original code that you changed.
|
@sjaz24 pointed out about filter shape order in documentation. Basically It's what Marc Rasi says here: In summary, the documentation of filter should be:
|
Fixed this, Ready to be reviewed. Test and build pass locally. |
Hi @Shashi456! Are you able to fix the merge conflicts here? Thanks! -Brennan |
@saeta done. |
Fix: #282
This is a pretty long error post.
The transposed conv2d layer doesn't work. There are a number of issues that have popped up.
The first of errors i found was in the logic, Keras updates its own new dimensions by calculating
So i changed the above code to reflect this.
Then what i found out was, that
Running this test in tensorflow produces the output as mentioned :
But when i ran it in swift with .same padding we get the outputs
If there's something i've inherently come to understand its that, The tensorflow conv2d_transpose is just a wrapper which is called by keras. You can check the code for it here
So running the function directly might be a cheap hack. The keras transposed conv2d is the place we actually compute the new dimensions, the code for it can be found here.
and the backend for that here.
I'm very confused by the code right now. But i think there's some place i'm committing a major mistake while trying to translate the code. Would appreciate some help, sorry for the very long introductory message.