-
Notifications
You must be signed in to change notification settings - Fork 137
Conversation
Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please visit https://cla.developers.google.com/ to sign. Once you've signed (or fixed any issues), please reply here with What to do if you already signed the CLAIndividual signers
Corporate signers
ℹ️ Googlers: Go here for more info. |
@googlebot I signed it! |
CLAs look good, thanks! ℹ️ Googlers: Go here for more info. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also please add a test, in the case of optimizers you can do that by adding a line here.
Added the test. Hoping for comments on this addition. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for adding a test!
If you have time, could you please create a new file Tests/TensorFlow/OptimizerTests.swift
and add an optimizer correctness test as well? That would be a huge help!
Here's a simple reference test for rectified Adam that we can copy.
@@ -58,6 +59,7 @@ final class SequentialTests: XCTestCase { | |||
amsgrad.update(&model, along: 𝛁model) | |||
adagrad.update(&model, along: 𝛁model) | |||
adadelta.update(&model, along: 𝛁model) | |||
radam.update(&model, along: 𝛁model) | |||
} | |||
} | |||
XCTAssertEqual(model.inferring(from: [[0, 0], [0, 1], [1, 0], [1, 1]]), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if you ran this test locally, after adding RAdam
? The expected results of model.inferring
should be updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, do check if build and test pass locally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test did pass locally. Will run it again and get back to you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test is failing now - result's going to NaN. Anyway I can debug this ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test is failing now - result's going to NaN. Anyway I can debug this ?
I'd recommend writing a standalone unit test for rectified Adam. This will help isolate the issue.
Here's a simple reference test for rectified Adam - would you like to port it to a new file Tests/TensorFlow/OptimizerTests.swift
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I will be porting it to a new file like you've suggested. I've found the issue, will make the changes and submit OptimizerTests
as a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like the test is still failing:
XCTAssertEqual failed: ("[[0.5567076],
[0.5567076],
[0.5567076],
[0.5567076]]") is not equal to ("[[0.5053005],
[0.5053005],
[0.5053005],
[0.5053005]]")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There a few issues if we want to match the implementation with the official Pytorch one (those are small differences between the math in the paper and the actual code the authors provided).
Removed redundant variables, updated RAdam test and added epsilon to secondMoments_h
I've made the changes as pointed by @sgugger. I'll be sending a separate PR for optimizer correctness for Rectified Adam and other optimizers as suggested by @dan-zheng |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! I applied some minor style changes in 0794ac4.
Thanks a lot! |
No description provided.