Delete numpy.delete #1010

chillenzer · 2022-09-15T12:00:10Z

Hi everybody,

While reading through Episode 2, I stumbled across the use of numpy.delete. In my experience, I have not seen a single example where using numpy.delete is justified, because it creates a new array instead of modifying the original or returning a view.

In the best case, the performance penalty does not matter and you never come back to the code thinking, "Hey, that element was deleted from the original array, so it won't be there any more." Well, it still is! Sooner or later you are bound to be surprised that the original array did not change at all, and only if you are very lucky will that not turn into a hard-to-track bug where the numbers are slightly off all the time.

In the worst case, an inexperienced Python user writes a for-loop that deletes the unwanted elements one at a time instead of using a single mask or a proper slice (only a basic slice actually returns a view; a boolean mask also copies, but just once). On any reasonably sized data, that will make you abandon Python immediately for being almost as slow as doing it by hand, or you may even run into memory issues from multiple copies of large data in memory. And then again, they might forget to assign the created copy (as opposed to an in-place change) back to the original variable, and all that.
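
To make the loop pitfall concrete, here is a minimal sketch. The array, the threshold, and the variable names are invented for illustration and are not taken from the lesson. It removes the same elements twice: once with repeated np.delete calls and once with a single boolean mask.

```python
import numpy as np

data = np.array([0.1, 0.7, 0.3, 0.9, 0.5, 0.2])  # illustrative values
threshold = 0.5  # hypothetical cutoff: drop everything below it

# Loop version: every np.delete call allocates and fills a new array.
looped = data.copy()
i = 0
while i < len(looped):
    if looped[i] < threshold:
        looped = np.delete(looped, i)  # the result must be reassigned
    else:
        i += 1

# Vectorized version: one comparison, one new array.
masked = data[data >= threshold]

print(looped)
print(masked)
print(np.array_equal(looped, masked))
```

Both versions give the same result, but the loop copies the array once per removed element, while the mask copies it once in total. The mask still produces a new array rather than a view.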

So, I would be very interested in hearing about justifications I might have overlooked. But for the time being, IMHO the best option is to remove that part or, if someone insists on mentioning numpy.delete, to clearly state its pitfalls and recommend not using it unless you really, really know what you are doing.

Best,
Julian


chillenzer · 2022-09-15T14:11:04Z

PS: I came up with one kind of reasonable scenario for using numpy.delete: functional programming! From a purist's perspective, a function should have no side effects, and that is what numpy.delete achieves by returning a copy. Whether pure functional programming is very (num)pythonic is for others to decide.
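
A rough sketch of what that could look like, with hypothetical helper names (drop_index_pure and zero_first_in_place are made up for this example and are not part of NumPy or the lesson):

```python
import numpy as np

def drop_index_pure(values, index):
    """Return a new array without values[index]; the input is untouched."""
    return np.delete(values, index)

def zero_first_in_place(values):
    """Has a side effect: overwrites the caller's data."""
    values[0] = 0
    return values

original = np.array([1, 2, 3, 4, 5])

shorter = drop_index_pure(original, 2)
print(original)  # still holds all five elements
print(shorter)

view = original[1:]        # basic slicing returns a view
zero_first_in_place(view)  # writes through to `original`
print(original)
```

The first function can be called without worrying about the caller's data. The second changes original through the view, which is exactly the kind of side effect a functional style tries to avoid.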

yueyangu · 2024-10-29T01:20:14Z

Good points on numpy.delete! Here’s a concise comparison and examples to clarify:

numpy.delete Behavior
numpy.delete creates a new array, leaving the original unchanged. This can be confusing and lead to bugs if one assumes in-place deletion:

import numpy as np

arr = np.array([1, 2, 3, 4, 5])
new_arr = np.delete(arr, 2)  # returns a copy without index 2; arr is untouched
print("Original:", arr)  # Original: [1 2 3 4 5]
print("New:", new_arr)   # New: [1 2 4 5]

Preferred Alternatives
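A mask removes any number of elements in a single vectorized step, which avoids the repeated copying of calling numpy.delete in a loop. Note that only basic slicing returns a view; boolean masking and np.concatenate still allocate a new array:

mask = np.arange(len(arr)) != 2  # True everywhere except index 2
masked_arr = arr[mask]  # [1, 2, 4, 5]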

sliced_arr = np.concatenate((arr[:2], arr[3:]))  # [1, 2, 4, 5]
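
Whether an operation returns a view or a copy can be checked with np.shares_memory. The following is a small sketch using the same five-element array; the variable names are chosen only for this illustration.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

sliced = arr[3:]                             # basic slice
masked = arr[np.arange(len(arr)) != 2]       # boolean mask
joined = np.concatenate((arr[:2], arr[3:]))  # concatenation of two slices
deleted = np.delete(arr, 2)                  # numpy.delete

# Report which results still share memory with the original array.
for name, result in [("slice", sliced), ("mask", masked),
                     ("concatenate", joined), ("delete", deleted)]:
    print(name, np.shares_memory(arr, result))
```

Only the basic slice shares memory with arr. The mask, the concatenation, and numpy.delete each allocate a new array, so the practical difference lies in how many copies are made, not in whether one is made.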

Performance Impact
Calling numpy.delete in a loop is slow and memory-heavy because every call copies the whole array. A single mask over a large array is usually faster and more memory-efficient:

arr = np.random.rand(10**6)
filtered_arr = arr[arr >= 0.5]

I hope this helps.

Thanks,
Yueyan
