[pytorch] Add strong Wolfe line search for lbfgs #8824
Conversation
# store new direction/step
old_dirs.append(y)
old_stps.append(s)
ro.append(1. / ys)
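For context, `old_dirs`, `old_stps`, and `ro` hold the curvature pairs (y, s, 1 / y·s) that the L-BFGS two-loop recursion later consumes to build the search direction. The sketch below shows that recursion in isolation; the helper name `two_loop_direction`, its signature, and the toy inputs are illustrative assumptions, not code from this PR.

```python
import torch

def two_loop_direction(flat_grad, old_dirs, old_stps, ro, H_diag=1.0):
    # Hypothetical helper: turn the stored pairs (y_i in old_dirs, s_i in
    # old_stps, ro[i] = 1 / y_i.dot(s_i)) into a direction d ~= -H * grad.
    num_old = len(old_dirs)
    al = [None] * num_old

    # first loop: newest pair to oldest
    q = flat_grad.neg()
    for i in range(num_old - 1, -1, -1):
        al[i] = old_stps[i].dot(q) * ro[i]
        q = q - al[i] * old_dirs[i]

    # scale by the initial (diagonal) Hessian approximation,
    # then run the second loop: oldest pair to newest
    d = q * H_diag
    for i in range(num_old):
        be_i = old_dirs[i].dot(d) * ro[i]
        d = d + (al[i] - be_i) * old_stps[i]
    return d

# toy usage with a single stored curvature pair
g = torch.tensor([1.0, -2.0])
y = torch.tensor([0.5, 0.5])
s = torch.tensor([0.1, 0.2])
print(two_loop_direction(g, [y], [s], [1.0 / y.dot(s)]))
```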
else:
    t = lr

# directional derivative
gtd = flat_grad.dot(d)  # g * d
# directional derivative is below tolerance
if gtd > -tolerance_change:
    break
break

-if d.mul(t).abs_().sum() <= tolerance_change:
+# lack of progress
+if d.mul(t).abs().max() <= tolerance_change:
@@ -106,16 +278,18 @@ def step(self, closure):
     state['func_evals'] += 1

     flat_grad = self._gather_flat_grad()
-    abs_grad_sum = flat_grad.abs().sum()
+    opt_cond = flat_grad.abs().max() <= tolerance_grad
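The hunk above changes the first-order stopping test from the sum of absolute gradient entries to their maximum, so the threshold no longer scales with the number of parameters. A minimal illustrative sketch (the helper name and the default `tolerance_grad` value are assumptions):

```python
import torch

def reached_first_order_optimum(flat_grad, tolerance_grad=1e-5):
    # Infinity-norm test: stop once no gradient component exceeds the tolerance.
    return bool(flat_grad.abs().max() <= tolerance_grad)

print(reached_first_order_optimum(torch.tensor([1e-7, -3e-6])))  # True
print(reached_first_order_optimum(torch.tensor([1e-2, 0.0])))    # False
```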
        min_pos = x1 - (x1 - x2) * ((g1 + d2 - d1) / (g1 - g2 + 2 * d2))
        return min(max(min_pos, xmin_bound), xmax_bound)
    else:
        return (xmin_bound + xmax_bound) / 2.
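The two branches above are the tail of a minFunc-style cubic interpolation step: when the interpolating cubic has a real minimizer, take it (clipped to the bracket); otherwise fall back to bisection. Below is a self-contained sketch of such a routine under those assumptions; the function name is illustrative and plain floats are used rather than the PR's exact code.

```python
import math

def cubic_interpolate(x1, f1, g1, x2, f2, g2, bounds=None):
    # (x1, f1, g1) and (x2, f2, g2): two trial points with function values
    # and directional derivatives along the search direction.
    if bounds is not None:
        xmin_bound, xmax_bound = bounds
    else:
        xmin_bound, xmax_bound = (x1, x2) if x1 <= x2 else (x2, x1)

    d1 = g1 + g2 - 3 * (f1 - f2) / (x1 - x2)
    d2_square = d1 ** 2 - g1 * g2
    if d2_square >= 0:
        # the cubic has a real minimizer: interpolate and clip to the bracket
        d2 = math.sqrt(d2_square)
        if x1 <= x2:
            min_pos = x2 - (x2 - x1) * ((g2 + d2 - d1) / (g2 - g1 + 2 * d2))
        else:
            min_pos = x1 - (x1 - x2) * ((g1 + d2 - d1) / (g1 - g2 + 2 * d2))
        return min(max(min_pos, xmin_bound), xmax_bound)
    else:
        # no real minimizer: bisect the bracket
        return (xmin_bound + xmax_bound) / 2.

# toy usage: f(x) = (x - 1)**2 bracketed by x = 0 and x = 2 gives 1.0
print(cubic_interpolate(0.0, 1.0, -2.0, 2.0, 1.0, 2.0))
```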
test/test_optim.py
-lambda params: optim.LBFGS(params, lr=5e-2, max_iter=5),
-wrap_old_fn(old_optim.lbfgs, learningRate=5e-2, maxIter=5)
+lambda params: optim.LBFGS(params, lr=1, max_iter=5),
+wrap_old_fn(old_optim.lbfgs, learningRate=1, maxIter=5)
I'll review this soon.
cc: @ssnl
Thanks for reviewing @vincentqb !
Thanks for the references! This PR looks good to me :)
@vincentqb is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@vincentqb merged this pull request in ad73ea2.
This has been copied from pytorch/pytorch#8824. PyTorch 1.2.0 has already merged this pull request, and we will incorporate it from the official repository into our code in the near future.

This pull request adds a line search for LBFGS. "strong Wolfe" is the default line search method in minFunc, and it is also the one recommended in the Numerical Optimization book.

The implementation is based on four sources. The Lua version is based on an old version of minFunc, which was updated in 2012. I made a couple of small changes based on the updated version. Because of that, the test comparing against the .lua version is no longer consistent (that is the reason I changed the learning rate in the test).

Differential Revision: D15740107
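For reference, a minimal usage sketch of the merged feature: `torch.optim.LBFGS` exposes the line search through its `line_search_fn` argument, with `"strong_wolfe"` as the supported value in released PyTorch. The toy least-squares problem below is only for illustration.

```python
import torch

# Toy least-squares objective: minimize ||A x - b||^2
torch.manual_seed(0)
A = torch.randn(10, 3)
b = torch.randn(10)
x = torch.zeros(3, requires_grad=True)

# With a line search enabled, lr only scales the initial trial step length.
optimizer = torch.optim.LBFGS([x], lr=1, max_iter=20,
                              line_search_fn="strong_wolfe")

def closure():
    optimizer.zero_grad()
    loss = (A @ x - b).pow(2).sum()
    loss.backward()
    return loss

for _ in range(5):
    loss = optimizer.step(closure)
print(float(loss))
```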