Calling dask.array.compute_chunk_sizes() with Asynchronous Client #11567
Open
Describe the issue:
The issue occurs when I call compute_chunk_sizes() on a Dask array while using an asynchronous client. It seems to happen because a coroutine returned by an async call is never awaited.
Minimal Complete Verifiable Example:
In the following code:
async def main():
    import dask.array as da
    import numpy as np
    from distributed import Client

    client = Client(threads_per_worker=12, n_workers=1, asynchronous=True)
    depth = (1, 0)
    arr = da.from_array(np.zeros((4, 4), dtype=np.int64), chunks=(2, 2))
    print(arr.chunksize)
    padded = arr.map_overlap(func=lambda x: x, depth=depth)
    print(padded.chunksize)  # wrong chunk sizes
    padded.compute_chunk_sizes()  # crashes here
    await client.close()


if __name__ == '__main__':
    import asyncio
    asyncio.run(main())
The program crashes with the message "'coroutine' object is not iterable". The same code runs fine with a synchronous client, but my use case requires an asynchronous client, so I was unable to call this function. How should this be handled?
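To illustrate what I suspect is happening, here is a minimal sketch that reproduces the same error message without Dask. It assumes the failure comes from synchronous code iterating over a coroutine that was never awaited; I have not confirmed that this is what compute_chunk_sizes() does internally. The names fetch_chunk_sizes, sync_caller, and async_caller are hypothetical stand-ins, not Dask or distributed APIs.

```python
import asyncio


async def fetch_chunk_sizes():
    # Hypothetical stand-in for an async call that would return chunk sizes.
    return ((2, 2), (2, 2))


def sync_caller():
    # Mimics synchronous code that expects a concrete result
    # but receives a coroutine object instead.
    result = fetch_chunk_sizes()  # coroutine object, never awaited
    try:
        return tuple(result)
    except TypeError as exc:
        result.close()  # suppress the "never awaited" RuntimeWarning
        return str(exc)


async def async_caller():
    # Awaiting the coroutine yields the actual value.
    return tuple(await fetch_chunk_sizes())


async def main():
    return sync_caller(), await async_caller()


error_message, sizes = asyncio.run(main())
print(error_message)
print(sizes)
```

If this is indeed the cause, the result of the underlying computation would need to be awaited somewhere before it is iterated, which plain synchronous code cannot do.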
Anything else we need to know?:
This is a repost from https://dask.discourse.group/t/calling-dask-array-compute-chunk-sizes-with-asynchronous-client/3684
Environment:
- Dask version: 2024.11.2
- Python version: 3.10.4
- Operating System: Windows 11
- Install method (conda, pip, source): pip 24.1.2