Writing integer values through the non-numpy interface results in a wav of all 0s #405
Comments
In general, soundfile expects normalized floating point data between -1 and 1. In integer formats this is clearly not possible, so it scales between the appropriate minimum and maximum. This is documented in |
I am open to contributing a PR, but if I am going to write docs I need to be sure I understand the behavior. When you say "scales between the appropriate minimum and maximum", you mean it scales from the range -1…1 to the minimum and maximum values of the output sound file's type? This description surprises me because in my failed test above I was giving values outside the range -1…1 and they were being snapped to 0 rather than scaled or clipped to the -1…1 range. Is there special case handling for values outside -1…1? |
Integer values scale between the minimum and maximum allowable integer value. That's -2^15 ... (2^15 - 1) for int16, and -2^31 ... (2^31 - 1) for int32. Ignore the asymmetry between positive and negative values; IIRC, the scaling factor is actually always symmetrically 2^15 or 2^31, with the odd 2^15 clipped to 2^15 - 1. Thus the integer 1 scales to 1/2^15, which is very close to zero. But this only applies to numpy arrays with dtype int16 or int32. Python lists are converted to a float numpy array, whose values must be between -1 and 1. |
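To make the arithmetic above concrete, here is a minimal pure-Python sketch of the int16 case. The function names are hypothetical and the rounding rule is an assumption; this only illustrates the symmetric 2^15 factor and the clipping described in the comment, it is not the library's actual code.

```python
INT16_SCALE = 2 ** 15  # symmetric scaling factor described above


def float_to_int16(x):
    """Scale a float in -1..1 to the int16 range, clipping at the edges."""
    scaled = int(round(x * INT16_SCALE))  # rounding mode is an assumption
    # +1.0 would map to 2^15, which does not fit, so it is clipped to 2^15 - 1.
    return max(-INT16_SCALE, min(INT16_SCALE - 1, scaled))


def int16_to_float(n):
    """Scale an int16 sample back to the normalized float range."""
    return n / INT16_SCALE
```

With this model, `int16_to_float(1)` is 1/2^15, which is why small integers land so close to zero once they are interpreted on the normalized scale.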
What @bastibe mentioned above is mostly correct, but Python lists are not necessarily converted to float arrays. The code uses
>>> import numpy as np
>>> np.asarray([32767]).dtype
dtype('int64')
This means that in your example, you are essentially writing 16-bit data to the least significant bytes of a 64-bit array. For comparison, in your NumPy example, you are writing your data to a 16-bit array, which is then not scaled down, leading to your expected result.
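A rough way to picture the "least significant bytes" point: if a wider integer sample is narrowed to 16 bits by keeping only its most significant bits, small values vanish. The sketch below assumes that narrowing is a plain right shift; the helper name is hypothetical and this is a model of the effect, not a copy of the conversion code.

```python
def narrow_to_16_bits(value, source_bits):
    """Keep the top 16 bits of a signed integer that is source_bits wide."""
    return value >> (source_bits - 16)


# 32767 is full scale for int16, but it sits entirely in the low bits of a
# wider integer, so nothing is left after narrowing.
low_bits_only = narrow_to_16_bits(32767, 64)

# A value that is full scale for the wider type survives as int16 full scale.
full_scale = narrow_to_16_bits(2 ** 63 - 1, 64)
```

The same reasoning applies to a 32-bit source: only values near ±2^31 would come out as audible 16-bit samples.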
I think the misunderstanding is that you think you are using it without NumPy, but as soon as you use the I think this has been clarified already in #407: If you don't want to use NumPy, you should use the |
To get a clearer idea about the scaling and clipping behavior, you can have a look at the tests: python-soundfile/tests/test_soundfile.py Lines 256 to 271 in 0a8071d
It also happens in the other direction, but that is a very exotic use case: python-soundfile/tests/test_soundfile.py Lines 952 to 974 in 0a8071d
|
Thank you for the clarification, @mgeier. I wasn't aware that So integer lists must be treated as int64, and scale between -2^63 and 2^63. That's kind of annoying. |
Hm, yeah, from a user perspective this is an unusual enough range that if it cannot be changed to something less surprising, it might actually be better for it to error on integer lists. |
I suppose we could work around this issue by forcing any non-numpy-array iterables to float64 in soundfile.py:1019. This would technically be a backwards-incompatible change, though, which I am somewhat wary of making. Also, this is the first time I have heard of this issue, so I'd assume it is a rare occurrence. Nevertheless, it should be documented at the very least. |
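For discussion, the workaround floated above could look roughly like the following. This is a sketch with a hypothetical helper name, not the actual code at soundfile.py:1019, and it assumes NumPy is available.

```python
import numpy as np


def to_array_forcing_float(data):
    """Leave NumPy arrays untouched; convert everything else to float64."""
    if isinstance(data, np.ndarray):
        return data  # the caller chose the dtype explicitly, so respect it
    # Lists, tuples and other iterables always become float64, regardless of
    # whether they contain Python ints or floats.
    return np.asarray(data, dtype='float64')
```

Under this approach an integer list such as `[0, 1, 32767]` would be interpreted on the -1..1 float scale, which is the backwards-incompatible part.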
I am going to attempt to send you one or more docs PRs today. |
* Clarifies range of python integer input to write(), addressing issue bastibe#405. * Clarifies, which came up in issue bastibe#407. * Clarifies how to build the docs, which confused me while preparing this PR (Simply installing Sphinx is not enough)
* Clarifies range of python integer input to write(), addressing issue bastibe#405. * Clarifies endianness of buffer_write input, which came up in issue bastibe#407. * Clarifies how to build the docs, which confused me while preparing this PR (Simply installing Sphinx is not enough)
I need clarification on something before I continue with #410.
This is not exactly what I see in testing. Consider this sample program:
Result: Saves a wav file with 4 samples, first 2 are 0, third is minimum intensity, final is maximum intensity
Result:
In other words, from this and other tests I conclude the behavior for python int inputs is:
This behavior is non-ideal, but it seems to at least be consistent (I did not test on non-win10 systems). I will change my documentation in #410 to match this behavior, but before I do, does this sound right? Am I missing anything? |
What does this return?
>>> import numpy as np
>>> np.asarray([1]).dtype |
Okay.
In my WSL (linux VM) venv on the same machine:
I can't explain this. What does it do on your machine? If the scale is potentially different (32 or 64 bit) depending on system/configuration, this suggests to me the int form simply should not be used and the documentation should say something like "if the input is a python int array the behavior is undefined". Alternately, before @bastibe said he didn't want to change the behavior saying
But, if it was actually broken before (per the tests I pasted above, even if np manages to assign the array an int64, python-soundfile rejects it) maybe you don't need to worry about backward compatibility. If it never worked it seems like logically there must not be existing code to break? |
Oh, this is indeed terrible news. Thank you very much for the analysis! So essentially, there is no consistent option for handling lists of integers across platforms. We could force any non-numpy type to float, but this would break lists of integers entirely. I don't really see a good way forward beyond documenting that "any non-numpy data will be converted to numpy using np.asarray", and noting that using non-numpy integer data is a very bad idea indeed. In fact, we should consider issuing a warning whenever np.asarray returns an integer type (and it wasn't already a numpy array beforehand). What are your thoughts on that? |
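One possible shape for the warning idea, as a sketch only: the helper name, the warning category and the message text are all assumptions, and nothing here reflects what the library currently does.

```python
import warnings

import numpy as np


def asarray_warn_on_int(data):
    """Convert with np.asarray and warn if non-array input became integers."""
    was_array = isinstance(data, np.ndarray)
    array = np.asarray(data)
    if not was_array and np.issubdtype(array.dtype, np.integer):
        warnings.warn(
            "non-NumPy integer data was converted to dtype %s; the integer "
            "width chosen by np.asarray may differ between platforms"
            % array.dtype,
            UserWarning,
            stacklevel=2,
        )
    return array
```

Input that is already a NumPy array passes through silently, so code that picks its dtype explicitly would not see the warning.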
Consider this simple use of soundfile:
I expect this to create a wav containing a very slow sawtooth wave (it will not be audible but the waveform will be visible in an editor such as audacity) with a DC offset. Instead, it creates a wav in which all values are zero.
If I pass the values in as numpy rather than as python lists, it works and creates a usable, correct wav file:
It is possible that when using non-numpy input, Soundfile is expecting the values in some format (floats?) other than ints. If so, this is not documented. I assume PCM_16 values because I have specified PCM_16 subtype.
I have a non-numpy app in which I adopted python-soundfile because it had the ability to append to wavs. I understand python-soundfile is mostly intended for use with numpy, but it does advertise a non-numpy interface. Using it without numpy, I encountered several inconveniences, but this was the biggest problem because I could not resolve it by consulting the documentation. I ultimately had to just add numpy to my program.