ENH: add global optimizer Shuffled Complex Evolution (SCE) to SciPy.optimize. #18436
Conversation
Thank you for this contribution @mcuntz. When introducing functionality of this kind we ask that you bring it up for discussion on the scipy-dev mailing list first; this is to ensure that it has greater visibility in the project. It also reduces work, because the maintainer team can advise on how to go about the PR - we've had misguided PRs submitted that were never going to be merged (for one reason or another), which is a waste of time for the author. Before adding new global optimisers we ask that the functionality goes through the benchmark process first. I see from the changeset that you've modified the global optimiser benchmark. Can you run it and report the results? The review process for adding a new minimiser can be drawn out. This is because the ongoing maintenance cost needs to be low, which means that code style, structure, etc., need to be really good. If you feel as if the review is never ending, please don't get disheartened - 1200 lines of code takes a while to get through. We'll probably have to come up with a name other than sce. Lastly, there is code here that has been committed by accident, namely the submodule updates.
Thanks @andyfaff for the encouraging words. I renamed it from sce to shuffled_complex_evolution. I ran the global benchmark suite; that was not obvious given that the docs and readme are out of date. I had to set …
Lastly, I will write an e-mail to the scipy-dev mailing list now.
First of all, thanks for this promising PR!
Another thanks for fixing those issues in the test suite. It's quite encouraging that the benchmarks run successfully. To make the benchmark output easier to digest, could you reuse the script we used for benchmarking DIRECT here? With some plotting on top, we should get plots like the ones on top of the DIRECT PR. That said, the plots could also be created from the benchmark output text file.
I ran into that multiple times as well. Merging main and then using …
I would probably drop the restart facility for the first PR (or release). Restarting is something that, in principle, could be added to many, if not all, of the optimizers. If we're going to add such capabilities to one, I think it behooves us to at least think about the interface such that it can be reused for those others, too, at least in terms of the call signature. We don't want to get into the position of having slightly different conventions and argument names for each one. We could definitely add the restart facility just to this one at first, but we should first think about a design that would be likely to work for all of them. For example, I would probably enforce the use of only one file. Exposing two filenames in the signature is a quirk of this particular implementation that wouldn't be needed for other optimizers. For that matter, there isn't a strict need for two files even for this optimizer; you can add non-…
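To make the single-file idea concrete, here is a minimal sketch assuming NumPy's .npz container; the field names and stored state are illustrative, not this PR's actual restart format:

import numpy as np

# Illustrative single-file checkpoint: the population, its objective values,
# and a scalar counter all live in one .npz archive instead of two text files.
def save_state(filename, population, fvals, nfev):
    np.savez(filename, population=population, fvals=fvals, nfev=nfev)

def load_state(filename):
    with np.load(filename) as data:
        return data['population'], data['fvals'], int(data['nfev'])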
@rkern, thanks for the comments. During the review process we'll definitely get to that. At the moment I'd like to know if you think we should add the minimiser to the global optimizer stable. There's a post on scipy-dev that is testing the waters for this.
Yeah, I mentioned it just because it was listed as a feature over the existing solvers. I'd propose setting those aside for evaluation purposes. Looking at the benchmark results, I'm not seeing a strong standout role for it given SHGO. Maybe it shines against SHGO on the …
Force-pushed 6fca6c9 to cd0d70f.
@dschmitz89 I could not find the script that you used for DIRECT. Here is my script to plot mean_nfev and nsuccess (i.e. nsuccess/ntrials*100) from global-bench-results.json:
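(The script was not preserved in this excerpt; the following is a reconstruction of what such a script could look like. The JSON layout, benchmark function -> solver -> {"mean_nfev", "nsuccess", "ntrials"}, is an assumption and must be adapted to the actual global-bench-results.json.)

import json
import matplotlib.pyplot as plt

# Reconstruction, not the original script; adapt the key names to the file.
with open('global-bench-results.json') as fh:
    results = json.load(fh)

solvers = sorted({s for per_solver in results.values() for s in per_solver})
nfev = {s: [] for s in solvers}
success = {s: [] for s in solvers}
for per_solver in results.values():
    for s, r in per_solver.items():
        nfev[s].append(r['mean_nfev'])
        success[s].append(100.0 * r['nsuccess'] / r['ntrials'])

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.boxplot([nfev[s] for s in solvers], labels=solvers)
ax1.set_yscale('log')
ax1.set_ylabel('mean nfev')
ax2.boxplot([success[s] for s in solvers], labels=solvers)
ax2.set_ylabel('success rate [%]')
for ax in (ax1, ax2):
    ax.tick_params(axis='x', rotation=45)
fig.tight_layout()
plt.show()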
One can see that SCE has few function evaluations with a rather high success rate. However, SHGO has even fewer function evaluations.
(Note that SHGO and DIRECT are not stochastic optimizers (according to the benchmark) and make only one run per benchmark function.) SCE needs fewer function evaluations than SHGO in this case. But SHGO also fails for ndim=9, so it might stop due to some criterion. I also rebased on scipy main and did …
Force-pushed cd0d70f to 8e2b7e2.
I pushed an update that removed the PROPACK and boost-math submodule updates.
My impression is that the benchmark results do make a case for inclusion of the SCE solver in scipy.optimize.
A first quick review.
I cannot really review the math in detail but the API is usually the bigger discussion point.
factor1 = 1.0 - (abs(num / den)) ** 5
factor2 = 2 + (x[0] - 7.0) ** 2 + 2 * (x[1] - 7.0) ** 2
Are these and other similar changes to the benchmark functions necessary? Often an explicit double is used in power operations like here to convert the output to doubles. That is especially important for scipy's optimizers that are implemented in C or Fortran: an objective returning integers has crashed them before.
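To illustrate the concern (a small sketch, not from the PR): with integer inputs, an all-integer power expression keeps an integer dtype, while the explicit 7.0 promotes the result to double:

import numpy as np

x = np.array([1, 2])             # integer dtype
f_int = (x[0] - 7) ** 2          # stays int64
f_flt = (x[0] - 7.0) ** 2        # promoted to float64
print(f_int.dtype, f_flt.dtype)  # int64 float64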
# self.global_optimum = [[-9.99378322, -9.99918927]]
# self.fglob = -0.19990562
Good finding, the new optimum.
assert_raises(TypeError, shuffled_complex_evolution, func, x0, bounds)
bounds = [(-1, 1), (-1, 1)]
assert_raises(ValueError, shuffled_complex_evolution, func, x0,
              bounds, sampling='unknown')
# test correct bool string
assert_raises(ValueError, _strtobool, 'Ja')
# test no initial population found
func = deb03
x0 = [-0.5, -0.5]
bounds = [(-1, 0), (-1, 0.)]  # should be (0, 1) to work
assert_raises(ValueError, shuffled_complex_evolution, func, x0,
              bounds, printit=1)
Please test that the correct error messages are shown using
with pytest.raises(ValueError, match="error message")
Other necessary tests include objectives that:
- return NaN
- return inf
A sketch of such tests is given below.
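A sketch of what these tests could look like; shuffled_complex_evolution is the function proposed in this PR, and the match strings must mirror its actual error messages:

import numpy as np
import pytest

def sphere(x):
    return float(np.sum(np.asarray(x) ** 2))

def test_invalid_sampling_message():
    with pytest.raises(ValueError, match='sampling'):
        shuffled_complex_evolution(sphere, [0.0, 0.0],
                                   [(-1, 1), (-1, 1)], sampling='unknown')

def test_nan_objective():
    # behaviour for an objective returning NaN should be defined and tested
    result = shuffled_complex_evolution(lambda x: np.nan, [0.0, 0.0],
                                        [(-1, 1), (-1, 1)])
    assert np.isnan(result.fun) or not result.success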
# ToDo:
# - write tmp/population files (of Fortran code)
Leftovers?
restart=False, restartfile1='',
restartfile2=''):
I think we should drop the restart functionality. As @rkern also mentioned, this is a generally useful feature for all optimizers, so it should not be available only for one. The API would have to be discussed further if this is generally desired among scipy.optimize.
ngs=2, npg=0, nps=0, nspl=0, mings=0,
seed=None, iniflg=True,
alpha=0.8, beta=0.45, maxit=False, printit=2,
polish=True,
I know that differential_evolution does this too, but in my opinion running a local optimizer by default after a global one is not a good API. It is a good choice in many situations, but not in all.
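For reference, this is the existing behaviour being discussed, shown with the released differential_evolution API:

from scipy.optimize import differential_evolution, rosen

bounds = [(-5, 5), (-5, 5)]
# default: a final L-BFGS-B polish runs after the global search
res_polished = differential_evolution(rosen, bounds, seed=1)
# polish=False returns the raw result of the global search
res_raw = differential_evolution(rosen, bounds, seed=1, polish=False)
print(res_polished.fun, res_raw.fun)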
peps : float, optional
    Value of normalised geometric range needed for convergence
    (default: 0.001).
ngs : int, optional
    Number of complexes (default: 2).
npg : int, optional
    Number of points in each complex (default: `2*nopt+1`).
nps : int, optional
    Number of points in each sub-complex (default: `nopt+1`).
mings : int, optional
    Minimum number of complexes required if the number of complexes is
    allowed to reduce as the optimization proceeds (default: `ngs`).
nspl : int, optional
    Number of evolution steps allowed for each complex before complex
If possible, we should try to rename arguments to something simple to understand (for example n_complexes instead of ngs). I have been guilty of bad argument names myself in the past, but let's try to do better in future.
mask : array_like, optional
    Include (1, True) or exclude (0, False) parameters in minimization
    (default: include all parameters). The number of parameters ``nopt`` is
    ``sum(mask)``.
This is a big API change compared to scipy's other optimizers that do not expose such functionality. It is a useful feature in principle but needs to be discussed on a bigger scale first.
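For concreteness, a hypothetical call using the proposed mask argument; the name and semantics are taken from this PR's docstring and may change during the discussion:

def objective(x):
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2

# optimize x[0] only; x[1] stays fixed at its value in x0
res = shuffled_complex_evolution(objective, x0=[0.0, 0.5],
                                 bounds=[(-5, 5), (-5, 5)],
                                 mask=[True, False])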
Just a quick comment: new code should have keyword-only arguments where appropriate, as per https://docs.scipy.org/doc/scipy/dev/missing-bits.html#required-keyword-names
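For example, a sketch of a keyword-only signature; the parameter names are taken from this PR, and the exact positional/keyword split is up for discussion:

def shuffled_complex_evolution(func, x0, bounds, *,
                               peps=0.001, ngs=2, npg=0, nps=0, nspl=0,
                               mings=0, seed=None, iniflg=True, alpha=0.8,
                               beta=0.45, maxit=False, printit=2,
                               polish=True):
    # the bare * makes everything after bounds keyword-only
    ...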
Force-pushed 8e2b7e2 to 1eb8d2c (commit: "…es RuntimeWarning: invalid value encountered in power").
@dschmitz89 thanks for looking at the code. I addressed (almost) all of the concerns from you and others:
I did not change two things:
Note that the benchmark suite has its quirks. SHGO simply hangs on the problem Cola (leading to a timeout), so badly that the benchmark suite cannot even write out the rest of the JSON file afterwards. I simply commented out the class for the benchmarking.
This makes the runtime of the benchmark painfully slow. I will merge all the commit messages at the end of the process. In the meantime, just ignore the individual sub-commits.
Apologies, I haven't been able to look at the benchmark suite, but shgo is buggy when used with the default sampling method. If you are not already using halton or sobol instead of simplicial, they might work better.
@lesshaste I ran the test suite for SHGO with sobol and halton. Both sampling methods fail with Cola as well. Halton gave similar success rates to simplicial, but Sobol was much better, 83% vs. 63%, needing many more function evaluations, though. You can try the Cola function with the following script:
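(The script was not preserved here; the following reconstructs the kind of call meant. The import path assumes a scipy source checkout on sys.path so that the benchmark's Cola class is importable, and, as described, it is expected to hang.)

from scipy.optimize import shgo
# import path assumes a checkout of the scipy repository
from benchmarks.benchmarks.go_benchmark_functions import Cola

problem = Cola()
# hangs eventually in Qhull with 'simplicial'; 'sobol'/'halton' fail as well
res = shgo(problem.fun, problem.bounds, sampling_method='sobol')
print(res)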
It always stops in Qhull, eventually.
Thanks!
I understand that this is a useful feature. For MILP, a similar approach is used to indicate whether a variable is an integer or not. My comments about dropping it also come from the experience that such API changes make PR reviews much harder, as they often cause widespread debates. That said, if there are no objections, we can leave this functionality in but should agree on the API details. Opinions @andyfaff @tupui @mdhaber ?
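For comparison, the existing per-variable pattern in milp, where the integrality array marks each variable as continuous (0) or integer (1):

import numpy as np
from scipy.optimize import milp, LinearConstraint

# minimize x0 + 2*x1 subject to x0 + x1 >= 1, with x0 integer, x1 continuous
res = milp(c=[1, 2], integrality=[1, 0],
           constraints=LinearConstraint([[1, 1]], lb=1, ub=np.inf))
print(res.x)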
This is a very interesting observation. I guess that we should not expect global optimizers to come arbitrarily close to the optimum. I should test out how much better DIRECT becomes with such a polishing step.
The benchmark suite is not in good shape. I worked on it once and had a similar experience. I will open another issue about these functions to discuss what we can do about them. Thanks a lot for letting us know!
Usually, PRs get squash-merged by a maintainer. Depending on the complexity, git history is rewritten or not. You do not need to worry about that for now :).
What does this implement/fix?
This adds the global optimizer Shuffled Complex Evolution (SCE) to scipy.optimize. SCE is very popular in the hydrologic community and has performed well in Andrea Gavana's Global Optimization Benchmarks (https://infinity77.net/global_optimization/).
Additional information
The current implementation has some nice features that are missing from other optimizers in scipy.optimize, for example a mask argument to include or exclude individual parameters from the optimization, a restart facility, and a choice of sampling schemes for the initial population.
The practical coding (function, class, OptimizeResult) closely follows the implementation of the differential evolution code in scipy.optimize.
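As an illustration, a hypothetical call mirroring the differential_evolution usage pattern; the signature is the one proposed in this PR and may change during review:

from scipy.optimize import rosen
# shuffled_complex_evolution is the solver proposed in this PR
# from scipy.optimize import shuffled_complex_evolution

bounds = [(-5, 5), (-5, 5)]
x0 = [0.0, 0.0]
result = shuffled_complex_evolution(rosen, x0, bounds, seed=1234)
print(result.x, result.fun, result.nfev)  # an OptimizeResult, as for DE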