How to maintain the same format and datatypes when storing and loading data using to_netcdf() and open_dataset() #6478
Unanswered
oswald1234
asked this question in
Q&A
Thanks for raising this! This would be a serious problem if you can demonstrate a consistent pattern of this behavior. That said, this is something xarray tests explicitly check for, and I can't reproduce the problem using a simple example:
In [4]: cube = xr.Dataset(
...: {
...: 'a': (('y', 'x'), np.random.randint(0, 100, size=(12, 10), dtype='uint8')),
...: 'b': (('x', ), np.arange(0, 50000, 5000, dtype='uint16')),
...: },
...: coords={
...: 'x': np.arange(10),
...: 'y': np.arange(12),
...: 'lat': (('y', 'x'), (np.arange(12).reshape(-1, 1) - 0.2 * np.arange(10))),
...: },
...: )
In [5]: cube
Out[5]:
<xarray.Dataset>
Dimensions: (y: 12, x: 10)
Coordinates:
* x (x) int64 0 1 2 3 4 5 6 7 8 9
* y (y) int64 0 1 2 3 4 5 6 7 8 9 10 11
lat (y, x) float64 0.0 -0.2 -0.4 -0.6 -0.8 ... 10.0 9.8 9.6 9.4 9.2
Data variables:
a (y, x) uint8 95 86 28 57 66 75 61 41 44 ... 17 59 99 67 86 9 15 78
b (x) uint16 0 5000 10000 15000 20000 25000 30000 35000 40000 45000
In [6]: cube.to_netcdf(path='local-test.nc', format='NETCDF4', mode='w')
In [7]: xr.open_dataset('local-test.nc').load()
Out[7]:
<xarray.Dataset>
Dimensions: (y: 12, x: 10)
Coordinates:
* x (x) int64 0 1 2 3 4 5 6 7 8 9
* y (y) int64 0 1 2 3 4 5 6 7 8 9 10 11
lat (y, x) float64 0.0 -0.2 -0.4 -0.6 -0.8 ... 10.0 9.8 9.6 9.4 9.2
Data variables:
a (y, x) uint8 95 86 28 57 66 75 61 41 44 ... 17 59 99 67 86 9 15 78
b (x) uint16 0 5000 10000 15000 20000 25000 30000 35000 40000 45000
A lot of times, when people run into something like this, there is a step in their workflow which is changing the data type just before they write to disk. Can you see if you're able to create a very simple Minimal, Complete, Verifiable Example like the above? Make sure you're using the latest version of xarray and include the output of
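One way to look for such a dtype-changing step is to compare the dtypes of the dataset at two points in the workflow. The following is only a minimal sketch: `dtype_table` and `dtype_changes` are hypothetical helper names (not part of xarray), and the division stands in for whatever processing step might be promoting the integers in your own code.

```python
import numpy as np
import xarray as xr


def dtype_table(ds):
    """Map every variable and coordinate name to its dtype as a string."""
    return {name: str(var.dtype) for name, var in ds.variables.items()}


def dtype_changes(before, after):
    """Return {name: (old_dtype, new_dtype)} for variables whose dtype differs."""
    old, new = dtype_table(before), dtype_table(after)
    return {k: (old[k], new[k]) for k in old if k in new and old[k] != new[k]}


cube = xr.Dataset(
    {"a": (("y", "x"), np.zeros((3, 4), dtype="uint8"))},
    coords={"x": np.arange(4), "y": np.arange(3)},
)

# Hypothetical processing step: true division promotes the uint8 data to a float dtype.
processed = cube / 2

changes = dtype_changes(cube, processed)
print(changes)
```

The same helpers could be pointed at the dataset just before `to_netcdf()` and at the result of `open_dataset()` to see whether the change happens before the write or during the read.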
When I store my dataset, the datatypes for the variables are uint8 and uint16, but when I open it again, the datatypes are float32, and some coordinates are represented as variables.
How do I prevent this?
cube.to_netcdf(path=os.path.join(savedir, filename), format='NETCDF4', mode='w')
xr.open_dataset(os.path.join(savedir, filename))
Should I specify something in the dataset before saving or reopening the file?
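One possible cause, which is an assumption here since the post does not show the variables' attributes or encoding: if an integer variable carries a CF `_FillValue` (or `scale_factor`/`add_offset`), xarray's default decoding replaces the fill value with NaN, and that requires a float dtype. The sketch below reproduces the effect in memory with `xr.decode_cf`, using a made-up dataset as a stand-in for what would be stored on disk.

```python
import numpy as np
import xarray as xr

# Stand-in for the stored file contents: uint8 data with a CF `_FillValue`
# attribute. This attribute is an assumption; the original post does not show it.
raw = xr.Dataset(
    {
        "a": (
            ("y", "x"),
            np.array([[1, 2], [3, 255]], dtype="uint8"),
            {"_FillValue": np.uint8(255)},
        )
    }
)

# Default CF decoding masks the fill value with NaN, which forces a float dtype.
decoded = xr.decode_cf(raw)

# Skipping mask_and_scale leaves the stored integers untouched.
undecoded = xr.decode_cf(raw, mask_and_scale=False)

print(raw["a"].dtype, decoded["a"].dtype, undecoded["a"].dtype)
```

`xr.open_dataset` accepts the same `mask_and_scale=False` keyword, so if this turns out to be the cause, passing it when reopening the file should keep the integer dtypes. Checking `cube["a"].attrs` and `cube["a"].encoding` before saving would show whether a fill value or scaling is attached in the first place.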