
torch.optim.lbfgs - added box constraint and line search methods(back… #938

Closed
wants to merge 2 commits

Conversation


@ChangYong-Oh ChangYong-Oh commented Mar 6, 2017

torch.optim.lbfgs

  • added box constraint
  • added line search methods (backtracking, goldstein, weak_wolfe)
  • The box constraint is imposed through the line search (the line search is conducted within the constraints); see the sketch after this list
  • The box constraint is specified in the same way as in scipy's fmin_l_bfgs_b
  • The line search method is selected by passing a string (the original design apparently intended a function to be passed instead)
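
A minimal sketch of the idea, assuming a standalone helper rather than the PR's actual code (the PR appears to use a `_max_alpha` method for this): before searching along a direction `d`, compute the largest step length that keeps every parameter inside its bounds, and cap the line search at that value. The function name and signature below are hypothetical.

```python
import torch

def max_feasible_step(p, d, l_bnd, u_bnd):
    """Largest alpha such that p + alpha * d stays inside [l_bnd, u_bnd].

    p, d, l_bnd, u_bnd are tensors of the same shape; use -inf/inf entries
    in l_bnd/u_bnd for unbounded coordinates.
    """
    alpha = float('inf')
    pos = d > 0  # these coordinates move towards the upper bound
    if pos.any():
        alpha = min(alpha, ((u_bnd - p)[pos] / d[pos]).min().item())
    neg = d < 0  # these coordinates move towards the lower bound
    if neg.any():
        alpha = min(alpha, ((l_bnd - p)[neg] / d[neg]).min().item())
    return alpha
```

A line search that respects the box constraint then never evaluates a trial step larger than this value.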

torch.autograd._function.reduce.Prod class

  • If prod is taken over a tensor that contains a zero, the backward pass returns inf, because the original implementation computes each gradient entry as [product of all elements] / [that element]. All of these zero cases are now handled separately (see the sketch below). One case remains: tensor.prod() still returns inf if the tensor contains only one element.

This should be fixed.
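
For illustration, a zero-aware prod gradient can be sketched as follows. This is a standalone function with hypothetical names in current PyTorch style, not the PR's actual Prod.backward: with no zeros the usual total-product-divided-by-element formula applies, with exactly one zero only that entry gets a nonzero gradient, and with two or more zeros the gradient is identically zero.

```python
import torch

def prod_grad(x, grad_output):
    """d(prod(x))/dx_i is the product of all elements except x_i."""
    zero_mask = (x == 0)
    n_zeros = int(zero_mask.sum())
    if n_zeros == 0:
        # Safe to use the fast path: total product divided by each element.
        return grad_output * x.prod() / x
    if n_zeros == 1:
        # Only the zero entry has a nonzero gradient: product of the others.
        grad = torch.zeros_like(x)
        grad[zero_mask] = x[~zero_mask].prod() * grad_output
        return grad
    # Two or more zeros: every "product of the others" contains a zero factor.
    return torch.zeros_like(x)
```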

@soumith soumith added the "ready for review (this tag is deprecated)" label Mar 13, 2017
@apaszke apaszke mentioned this pull request Mar 15, 2017
    return grad_output.new(self.input_size).fill_(0)
else:
    grad_input = grad_output.new(self.input_size).fill_(0)
    indexing_tuple = tuple(zero_loc[0].numpy())


@apaszke
Contributor

apaszke commented Mar 17, 2017

I took your prod fixes and refactored them slightly so that they reuse some intermediate results. Can you please remove this change from this PR? I'll try to get someone who knows line search methods to review your LBFGS changes soon. Thanks!

@apaszke apaszke added the requested-changes label and removed the "ready for review (this tag is deprecated)" label Mar 22, 2017
@soumith soumith added the ready label Jun 22, 2017
@soumith soumith removed the ready label Jul 3, 2017
@hassec

hassec commented May 30, 2018

@apaszke Has anyone looked at the line search part of this MR yet?
Or is this MR completely dead?

@ezyang
Contributor

ezyang commented May 30, 2018

@pytorchbot retest this please

2 similar comments

@hassec

hassec commented Jun 22, 2018

Is there anything one can do to push this along?
IMHO LBFGS would greatly benefit from having these line search functions implemented...

@soumith
Member

soumith commented Jun 22, 2018

@hassec we are blocked on finding competent reviewers for this, who know the relevant literature and can verify the implemented code against it

@hassec

hassec commented Jun 23, 2018

@soumith thanks for the quick feedback.
Mentioning in #6564 that only a review is needed might help find someone?

@fehiepsi
Contributor

@soumith @hassec I implemented strong Wolfe line search in #8824 with enough references. You might want to take a look if you are interested. :)

@ssnl
Collaborator

ssnl commented Jun 25, 2018

I'll review this in the coming days.

@weiyangfb
Contributor

@ssnl

@weiyangfb weiyangfb added the "ready for review (this tag is deprecated)" label Aug 21, 2018
@matthieuheitz

What is the state of this PR?
This L-BFGS-B would be a great addition to PyTorch!

@ianwilliamson

Wondering what the status of this pull request is. Would be great to have constraint options.

@soumith
Member

soumith commented Feb 23, 2019

@matthieuheitz @ianwilliamson would either of you be willing to review the PR for correctness against the papers? That's what it's blocked on at the moment.

@GeoffNN

GeoffNN commented Apr 9, 2019

Is anyone looking at this?

@ezyang ezyang added the "module: optimizer" label and removed the "ready for review (this tag is deprecated)" label Apr 9, 2019
@ezyang
Contributor

ezyang commented Apr 9, 2019

No, not at the moment. We need someone to review the PR for correctness (math), and it also needs some rebasing.

Contributor

@facebook-github-bot facebook-github-bot left a comment


@vincentqb has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

    assert offset == self._numel()
    return deriv

def _max_alpha(self, d):
Contributor

@vincentqb vincentqb Jun 18, 2019


Reference for bound-constrained optimization: equation 16.71 in Numerical Optimization, 2nd edition, by Nocedal and Wright.

original_param_data_list = self._copy_param()
phi_0 = closure().data[0]
phi_0_prime = self._directional_derivative(d)
alpha_k = 1.0
Contributor

@vincentqb vincentqb Jun 18, 2019


Unlike _goldstein and _weak_wolfe, _backtracking does not invoke _max_alpha here.

if u_bnd is not None:
    from_u_bnd = ((u_bnd - p.data) / p_grad)[p_grad > 0]
    min_u_bnd = torch.min(from_u_bnd) if from_u_bnd.numel() > 0 else max_alpha
max_alpha = min(max_alpha, min_l_bnd, min_u_bnd)
Contributor


This line takes the minimal alpha across all directions. This implies that, if the point is near the boundary, the next step is constrained to stay very close to it, and the descent could stall. I would instead expect a different step size in each direction; see equation 16.71 mentioned in the previous comment. Is there another reference for this?

Contributor

@vincentqb vincentqb Jun 19, 2019


The relevant part of scipy is the Fortran L-BFGS-B library's cauchy subroutine. The gradient is scaled differently for each component, as done in Nocedal equation 16.71.
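
For context, the per-component scaling referred to here comes from the gradient-projection treatment of bound constraints: each coordinate gets its own breakpoint, i.e. the step length at which x - t*g reaches one of its bounds, instead of a single global alpha shared by all coordinates. A rough sketch of that breakpoint computation, with hypothetical names and not taken from this PR or from scipy:

```python
import torch

def breakpoints(x, g, lower, upper):
    """Per-coordinate step lengths t_i at which x - t * g hits a bound.

    Each coordinate gets its own breakpoint instead of one global alpha,
    which is the per-component scaling the review comment refers to.
    """
    t = torch.full_like(x, float('inf'))
    neg = g < 0  # x_i increases along -g, so it heads towards the upper bound
    pos = g > 0  # x_i decreases along -g, so it heads towards the lower bound
    t[neg] = (x[neg] - upper[neg]) / g[neg]
    t[pos] = (x[pos] - lower[pos]) / g[pos]
    return t
```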

@vincentqb
Contributor

vincentqb commented Jun 19, 2019

I will close this PR since

@vincentqb vincentqb closed this Jun 19, 2019
Labels
module: optimizer, open source