Calling dask.array.compute_chunk_sizes() with Asynchronous Client #11567

Open
@Karl5766

Description

Describe the issue:

Calling compute_chunk_sizes() on a dask array fails when the client is asynchronous. It looks like a coroutine returned by an async function is never awaited.

Minimal Complete Verifiable Example:

In the following code:

async def main():
    import dask.array as da
    import numpy as np
    from distributed import Client
    client = Client(threads_per_worker=12, n_workers=1, asynchronous=True)

    depth = (1, 0)
    arr = da.from_array(np.zeros((4, 4), dtype=np.int64), chunks=(2, 2))
    print(arr.chunksize)
    padded = arr.map_overlap(func=lambda x: x, depth=depth)
    print(padded.chunksize)  # wrong chunk sizes
    padded.compute_chunk_sizes()  # crashes here

    await client.close()


if __name__ == '__main__':
    import asyncio
    asyncio.run(main())

The program crashes with the message “‘coroutine’ object is not iterable”. The same code runs fine with a synchronous client, but I need an asynchronous client and cannot call this function. How should this be handled?
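
The error message is consistent with a coroutine being iterated without first being awaited. Below is a minimal sketch of that failure mode using only asyncio. The function `fetch_chunk_shapes` is a hypothetical stand-in and not part of dask; the sketch does not claim to show what compute_chunk_sizes() does internally.

```python
import asyncio


async def fetch_chunk_shapes():
    # Hypothetical stand-in for an async call that returns chunk shapes.
    await asyncio.sleep(0)
    return [(3, 2), (3, 2)]


def sync_caller():
    # Written as if the call were synchronous: the coroutine object is
    # iterated directly instead of being awaited.
    result = fetch_chunk_shapes()
    try:
        return list(result)
    except TypeError as exc:
        result.close()  # suppress the "coroutine was never awaited" warning
        return str(exc)


async def async_caller():
    # Awaiting the coroutine first yields the actual list of shapes.
    return list(await fetch_chunk_shapes())


if __name__ == "__main__":
    print(sync_caller())
    print(asyncio.run(async_caller()))
```

The synchronous caller fails because it treats the coroutine object as if it were the finished result, while the awaiting caller gets the real value.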

Anything else we need to know?:

This is a repost from https://dask.discourse.group/t/calling-dask-array-compute-chunk-sizes-with-asynchronous-client/3684

Environment:

  • Dask version: 2024.11.2
  • Python version: 3.10.4
  • Operating System: Windows 11
  • Install method (conda, pip, source): pip 24.1.2
