Skip to content
This repository was archived by the owner on Jul 1, 2023. It is now read-only.

Add: Rectified Adam optimizer #564

Merged
merged 6 commits into from
Dec 6, 2019
Merged

Add: Rectified Adam optimizer #564

merged 6 commits into from
Dec 6, 2019

Conversation

vballoli
Copy link
Contributor

No description provided.

@googlebot
Copy link

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here with @googlebot I signed it! and we'll verify it.


What to do if you already signed the CLA

Individual signers
Corporate signers

ℹ️ Googlers: Go here for more info.

@vballoli
Copy link
Contributor Author

@googlebot I signed it!

@googlebot
Copy link

CLAs look good, thanks!

ℹ️ Googlers: Go here for more info.

Copy link
Contributor

@Shashi456 Shashi456 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also please add a test, in the case of optimizers you can do that by adding a line here.

@Shashi456
Copy link
Contributor

image
Adding this here for other reviewers, is firstMoments_h really needed? and if the _h variables are just filler variables then you should not add the blank space between variables IMO.

@vballoli
Copy link
Contributor Author

Added the test. Hoping for comments on this addition.

@vballoli vballoli mentioned this pull request Nov 24, 2019
27 tasks
Copy link
Member

@dan-zheng dan-zheng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for adding a test!

If you have time, could you please create a new file Tests/TensorFlow/OptimizerTests.swift and add an optimizer correctness test as well? That would be a huge help!

Here's a simple reference test for rectified Adam that we can copy.

@@ -58,6 +59,7 @@ final class SequentialTests: XCTestCase {
amsgrad.update(&model, along: 𝛁model)
adagrad.update(&model, along: 𝛁model)
adadelta.update(&model, along: 𝛁model)
radam.update(&model, along: 𝛁model)
}
}
XCTAssertEqual(model.inferring(from: [[0, 0], [0, 1], [1, 0], [1, 1]]),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if you ran this test locally, after adding RAdam? The expected results of model.inferring should be updated.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, do check if build and test pass locally.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test did pass locally. Will run it again and get back to you.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test is failing now - result's going to NaN. Anyway I can debug this ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test is failing now - result's going to NaN. Anyway I can debug this ?

I'd recommend writing a standalone unit test for rectified Adam. This will help isolate the issue.

Here's a simple reference test for rectified Adam - would you like to port it to a new file Tests/TensorFlow/OptimizerTests.swift?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I will be porting it to a new file like you've suggested. I've found the issue, will make the changes and submit OptimizerTests as a separate PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the test is still failing:

XCTAssertEqual failed: ("[[0.5567076],
 [0.5567076],
 [0.5567076],
 [0.5567076]]") is not equal to ("[[0.5053005],
 [0.5053005],
 [0.5053005],
 [0.5053005]]")

@dan-zheng dan-zheng added the tests needed Tests are needed for this pull request label Nov 26, 2019
Copy link
Contributor

@sgugger sgugger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There a few issues if we want to match the implementation with the official Pytorch one (those are small differences between the math in the paper and the actual code the authors provided).

Removed redundant variables, updated RAdam test and added epsilon to secondMoments_h
@vballoli
Copy link
Contributor Author

vballoli commented Dec 6, 2019

I've made the changes as pointed by @sgugger. I'll be sending a separate PR for optimizer correctness for Rectified Adam and other optimizers as suggested by @dan-zheng

@vballoli vballoli requested review from sgugger and dan-zheng December 6, 2019 16:50
Copy link
Member

@dan-zheng dan-zheng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! I applied some minor style changes in 0794ac4.

@sgugger sgugger merged commit efb4596 into tensorflow:master Dec 6, 2019
@sgugger
Copy link
Contributor

sgugger commented Dec 6, 2019

Thanks a lot!

@vballoli vballoli mentioned this pull request Feb 27, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
tests needed Tests are needed for this pull request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants