Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix convolve_fft nan_treatment='fill' handling #8122

Merged
merged 13 commits into from
Oct 28, 2019

Conversation

keflavich
Copy link
Contributor

@keflavich keflavich commented Nov 13, 2018

Fix for issue #8121

WIP: needs a test

  • regression test

cc @jamienoss

@astropy-bot
Copy link

astropy-bot bot commented Nov 13, 2018

Check results are now reported in the status checks at the bottom of this page.

@keflavich
Copy link
Contributor Author

It's not super clear to me how useful the nan_treatment='fill' mode is, but we'll keep it around. It probably makes sense to have a different fill_value for the edges and the NaNs.

@keflavich keflavich force-pushed the issue8121 branch 3 times, most recently from 2005274 to c36af00 Compare November 13, 2018 21:33
@bsipocz bsipocz added this to the 3.1.1 milestone Nov 13, 2018
@pllim
Copy link
Member

pllim commented Nov 13, 2018

@keflavich , 3.1.1 or 3.2? nvm

@bsipocz
Copy link
Member

bsipocz commented Nov 13, 2018

I suppose this is a bugfix, but if it's not, please move to milestone 3.2

@codecov
Copy link

codecov bot commented Nov 13, 2018

Codecov Report

Merging #8122 into master will increase coverage by 0.55%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #8122      +/-   ##
==========================================
+ Coverage   86.36%   86.91%   +0.55%     
==========================================
  Files         405      383      -22     
  Lines       60859    57859    -3000     
  Branches     1096     1056      -40     
==========================================
- Hits        52559    50291    -2268     
+ Misses       7676     6954     -722     
+ Partials      624      614      -10
Impacted Files Coverage Δ
astropy/convolution/convolve.py 85.41% <100%> (-1.11%) ⬇️
astropy/convolution/__init__.py 62.5% <0%> (-37.5%) ⬇️
astropy/wcs/__init__.py 50% <0%> (-16.67%) ⬇️
astropy/visualization/wcsaxes/coordinates_map.py 86.88% <0%> (-11.58%) ⬇️
astropy/convolution/src/convolve.c 80.68% <0%> (-8.34%) ⬇️
astropy/wcs/src/unit_list_proxy.c 53.96% <0%> (-6.62%) ⬇️
astropy/visualization/units.py 68.08% <0%> (-5.5%) ⬇️
astropy/modeling/separable.py 90.62% <0%> (-5.26%) ⬇️
astropy/constants/__init__.py 94.87% <0%> (-5.13%) ⬇️
astropy/io/misc/asdf/tags/table/table.py 90% <0%> (-4.81%) ⬇️
... and 264 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8ff9ac0...13694b1. Read the comment docs.

rifft[arrayslices][nanmaskarray] = np.nan
elif nan_treatment == 'fill':
rifft[arrayslices][nanmaskarray] = fill_value
Copy link
Contributor

@jamienoss jamienoss Nov 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@keflavich I'm pretty confused. Why are the values being filled post convolution?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to happen pre-convolution.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. This already happens pre-convolution, but not with the intended fill_value.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where?

@@ -801,7 +801,14 @@ def convolve_fft(array, kernel, boundary='fill', fill_value=0.,
rifft = ifftn(fftmult)

if preserve_nan:
if nan_treatment == 'fill':
warnings.warn("preserve_nan was set, which overrides the "
Copy link
Contributor

@jamienoss jamienoss Nov 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not semantically true. Whilst preserve_nan=True will persist the NaN values that get filled, it doesn't/shouldn't "override" nan_treatment='fill' functionality. The values should be filled pre-convolution and therefore contribute to their surrounding convolutions regardless of whether they remain as the convolved values or revert to their original NaN values.

Copy link
Contributor

@jamienoss jamienoss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't implement nan_treatment='fill'. The values need to be filled pre-convolution.

rifft[arrayslices][nanmaskarray] = np.nan
elif nan_treatment == 'fill':
rifft[arrayslices][nanmaskarray] = fill_value
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to happen pre-convolution.

if nan_treatment == 'fill':
array[nanmaskarray] = fill_value
else:
array[nanmaskarray] = 0
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jamienoss this line replaced the values with the default fill_value of zero. So, the behavior was "always (kinda) correct" for fill_value=0, and we never had a test for other cases.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(this comment is the answer to #8122 (comment))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, cool, thanks. It was right in front of me after all.

@@ -800,8 +803,7 @@ def convolve_fft(array, kernel, boundary='fill', fill_value=0.,
else:
rifft = ifftn(fftmult)

if preserve_nan:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this condition needs to stay.

@keflavich
Copy link
Contributor Author

@jamienoss thanks for the review. There's a real chance I messed something up further; I clearly did not put in enough thought the first time around. It would be helpful to check that the tests are doing the right thing, i.e., coming up with the right numbers for all 3 elements in the array for the simple 1D convolutions.

@jamienoss
Copy link
Contributor

@keflavich I'll take a look at the tests tomorrow.

fill_value=np.nan)

with warnings.catch_warnings():
warnings.simplefilter('ignore', RuntimeWarning)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's this for?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prob needs a comment outlining its intent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is just a c&p from a previous test, so you're right.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dropped the catching; it's not needed when filling.

warnings.simplefilter('ignore', RuntimeWarning)
result = convolve_fft(masked_array, kernel, boundary='fill',
nan_treatment='interpolate',
fill_value=np.nan)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so I've been trying to figure out the point to this test, specifically why fill_value=np.nan. In fact, I think it points out another bug. I'll open an issue.

In the meantime, what is the intent of this test and why fill_value=np.nan?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original intent was to test that convolution could successfully ignore edge padding by setting the external pixels to NaN, I believe. However, that might be an outdated use case now that we have two different nan_treatment approaches.

Copy link
Contributor

@jamienoss jamienoss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are a few issues lingering within the tests.

I tested them against convolve(), the results of the new tests are at least accurate.

result = convolve_fft(masked_array, kernel, boundary='fill',
nan_treatment='interpolate',
fill_value=np.nan)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so I've been trying to figure out the point to this test, specifically why fill_value=np.nan. In fact, I think it points out another bug. I'll open an issue.

In the meantime, what is the intent of this test and why fill_value=np.nan?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

array = np.array([1., np.nan, 3.], dtype='float64')
kernel = np.array([1, 1, 1])

with warnings.catch_warnings():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, again, in the fill cases, we shouldn't need to catch div-by-zero errors because there shouldn't be any. Fix incoming.

array = np.array([1., np.nan, 3.], dtype='float64')
kernel = np.array([1, 1, 1])

with warnings.catch_warnings():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And this?

fill_value=0)

# note that, because fill_value also affects boundary='fill', the edge
# pixels are treated as zero rather than being ignored.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, this is the bug, it shouldn't happen this way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this isn't so much a bug as a design failure, as highlighted in #8127

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably, however, it's not always the case that the developer gets to define what is and isn't a bug. I.e. the artist of modern abstract art doesn't get to completely define its interpretation.

If the behaviour and functionality don't intuitively match the semantics (or the documentation (loophole)) then it is a bug. Even if the fix is just readjusting the semantics by documenting it as a feature (hence the loophole), it's still a bug until fixed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to stick with "design failure" because neither the documentation nor the implementation actually make sense. As you pointed out, fill_value is incorrectly tied to both boundary and nan_treatement, which results in some nonsense parameter combinations.

That's said, this is all semantics, we have a bunch of crap that needs to be cleaned up, and I'm just arguing for punting the fixes related to this particular line to #8127, since it more accurately and completely describes the problem.

There is an open question here, though - should this PR be merged before dealing with the bigger issues in 8127, or should we try to get everything right all at once? I'm open to suggestions; I don't believe there are many/any users trying to use this corner of the convolution code yet, which gives us some flexibility, but we want it all to be right as soon as possible...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm cool to defer this particular point to #8127. I think it might be better and safer to address this piece by piece. I'm already trying to untangle the normalization and nan treatment as I did for convolve, so we already have a few parallel jobs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Thanks for doing this.

Just to give you some background: these problems mostly come from convolve and convolve_fft being developed fully independently (convolve by Tom, convolve_fft by me), then trying to match their call specifications. In trying to make convolve_fft match convolve's arguments more closely, I clearly missed/ignored several use cases that I had never considered. It's looking more likely that I caught all of the inverse, i.e., things convolve_fft did that needed to be ported over to convolve.

@@ -606,13 +653,15 @@ def test_uniform_3x3_withnan(self, boundary, nan_treatment,
"convolve boundary='fill'"):
z = convolve_fft(x, y, boundary=boundary,
nan_treatment=nan_treatment,
fill_value=np.nan if normalize_kernel else 0,
# you cannot fill w/nan, you can only interpolate over it
fill_value=np.nan if normalize_kernel and nan_treatment=='interpolate' else 0,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even though this was already conditional on normalize_kernel, I don't think it should be constraining the test space. The addition of nan_treatment=='interpolate' also constrains the test and I don't think it is a good idea to do so. We should be trying to test all parameter space and just deal with the outcome appropriately in the test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've lost the thread of this conversation... are you saying we should run this test with both np.nan and 0 for fill values, and just check the returns for each case are correct? That's reasonable, and I think the expected outcome is:
if np.isnan(fill_value) and not (normalize_kernel and nan_treatment=='interpolate'): an exception should be raised, but all cases should be allowed with fill_value=0.

I'm not certain of this, though, I should go check....

normalize_kernel=normalize_kernel,
preserve_nan=preserve_nan)
else:
z = convolve_fft(x, y, boundary=boundary,
nan_treatment=nan_treatment,
fill_value=np.nan if normalize_kernel else 0,
# you cannot fill w/nan, you can only interpolate over it
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

@@ -2,6 +2,7 @@

import itertools

import warnings
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the other warnings are removed, don't forget to remove this also.

@bsipocz bsipocz modified the milestones: 3.1.1, 3.1.2 Dec 31, 2018
@pllim
Copy link
Member

pllim commented Feb 8, 2019

From the Astropy Coordination Meeting notes for 3.2 release, this is one of the items listed but looks like it is already milestoned as 3.1.x.

@pllim pllim modified the milestones: 3.1.2, v3.1.3 Feb 20, 2019
@bsipocz
Copy link
Member

bsipocz commented Apr 18, 2019

@keflavich - This seems to be very close to be finished. Do you have time to finalize it?

@keflavich
Copy link
Contributor Author

@bsipocz it might be further from done than I realized...

@bsipocz
Copy link
Member

bsipocz commented Apr 18, 2019

@keflavich - sadly those test failures are coming from master. Anyway, if it's indeed further from done, please remilestone to 3.2.1

@bsipocz bsipocz removed this from the v3.1.3 milestone Apr 19, 2019
@pllim
Copy link
Member

pllim commented Sep 20, 2019

@keflavich , try a rebase.

@pllim
Copy link
Member

pllim commented Oct 24, 2019

@keflavich , change log needs fixing. FYI.

@bsipocz
Copy link
Member

bsipocz commented Oct 24, 2019

I'm happy to fix changelogs if they're the only leftover stuff in a PR (I need to do it anyway already).

@bsipocz
Copy link
Member

bsipocz commented Oct 24, 2019

@larrybradley @adonath - Could either of you please review and approve it if you're happy with it?

@larrybradley
Copy link
Member

@keflavich Is this PR ready for final review. It looks like there are several unresolved conversations and your last comment of "I don't know the current state" suggests to me that this may need more work? If that's the case, I'll re-milestone for 4.1.

@keflavich
Copy link
Contributor Author

@larrybradley It needs a review. I don't believe it needs more work, though.

Copy link
Member

@larrybradley larrybradley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @keflavich.

fix: #8121

@larrybradley larrybradley merged commit df61912 into astropy:master Oct 28, 2019
@larrybradley
Copy link
Member

@bsipocz This needs a new changelog entry.

@bsipocz
Copy link
Member

bsipocz commented Oct 28, 2019

This needs a new changelog entry.

is it a bugfix or a feature? I could argue for either of those...

@larrybradley
Copy link
Member

I'd call it a bugfix. The nan_treatment docs stated that fill_value would be used to fill NaN, but it wasn't doing that. Instead, the fill value for NaNs was always set to 0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants