Tags: Bears-R-Us/arkouda

v2025.01.13

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
Closes #3960: python interface for CommDiagnostics (#3966)

Co-authored-by: Amanda Potts <ajpotts@users.noreply.github.com>

v2024.12.06

Merge pull request #3927 from ajpotts/3926_OverMemoryLimitError_in_pdarrayclass_test

Closes #3926: OverMemoryLimitError in pdarrayclass_test

v2024.10.02

use separate variable name for benchmark 'size' in makefile to avoid altering default test size (#3806)

Signed-off-by: Jeremiah Corrado <jeremiah.corrado@hpe.com>

v2024.06.21

Closes #3339: Add multi-batch parquet read tests (#3350)

* Closes #3339: Add multi-batch parquet read tests

This PR (closes #3339) adds testing for parquet reads of arrays and strings large enough to trigger more than one batch. We also add tests for a segarray of segstrings containing empty segs and empty strings.

* add proto

---------

Co-authored-by: Tess Hayes <stress-tess@users.noreply.github.com>

v2024.04.19

Closes #3020 dataframe.dropna (#3101)

Co-authored-by: Amanda Potts <ajpotts@users.noreply.github.com>
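Arkouda's DataFrame API generally mirrors pandas, so the pandas behavior below is a reasonable sketch of what `dataframe.dropna` is expected to do (an assumption for illustration; see the arkouda docs for the exact signature):

```python
import numpy as np
import pandas as pd

# Illustrative sketch using pandas: dropna() removes any row
# containing a NaN, so only the first row survives here.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [4.0, 5.0, np.nan]})
clean = df.dropna()
print(len(clean))  # 1
```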

v2024.03.18

Closes #3017: Add documentation for our random number generation (#3044)

This PR (closes #3017) adds docs for our new random number generation using Generators.

Co-authored-by: Tess Hayes <stress-tess@users.noreply.github.com>

v2024.02.02

closes #2927 power divergence statistic (#2932)

* closes #2927 power divergence statistic

* add scipy to requirements

* add arkouda/akstats/_stats_py.pyi

* Fix F403 and F401 error codes on flake8 arkouda from arkouda/akmath/__init__.py and arkouda/akstats/__init__.py

* un-pin scipy from specific version

* add scipy license and minor changes in response to code review

* Update tests/akmath/akmath_test.py

---------

Co-authored-by: Amanda Potts <ajpotts@users.noreply.github.com>
Co-authored-by: pierce <48131946+pierce314159@users.noreply.github.com>
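The statistic added here belongs to the Cressie-Read power divergence family, the same family computed by `scipy.stats.power_divergence` (which is why scipy was added to requirements). A minimal numpy sketch of the formula, for illustration only:

```python
import numpy as np

def power_divergence_stat(obs, exp, lambda_=1.0):
    """Cressie-Read power divergence statistic (illustrative sketch).

    lambda_=1 recovers the classic Pearson chi-square statistic.
    """
    obs = np.asarray(obs, dtype=float)
    exp = np.asarray(exp, dtype=float)
    return 2.0 / (lambda_ * (lambda_ + 1.0)) * np.sum(
        obs * ((obs / exp) ** lambda_ - 1.0)
    )

obs = [10, 15, 20]
exp = [15, 15, 15]

# For lambda_=1 and equal totals this matches Pearson chi-square.
stat = power_divergence_stat(obs, exp, lambda_=1.0)
pearson = sum((o - e) ** 2 / e for o, e in zip(obs, exp))
print(abs(stat - pearson) < 1e-12)  # True
```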

v2023.11.15

Closes #2838: Expand dataframe merge functions to accept multiple columns (#2848)

This PR (closes #2838) expands the dataframe merge functions to act on multiple columns. When no value is provided for `on`, it defaults to the intersection of the columns of the left and right dataframes. `inner_join_merge` and `right_join_merge` were turned into internal helper functions not exposed to the user, to more closely match the pandas merge functionality, where these are only available through `merge`.

Co-authored-by: Pierce Hayes <pierce314159@users.noreply.github.com>
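The default-to-intersection behavior described above matches pandas, whose semantics arkouda's merge aims to follow; a pandas sketch of the multi-column case:

```python
import pandas as pd

# With on=None, merge joins on the intersection of the two frames'
# columns -- here ["a", "b"] -- using an inner join by default.
left = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6], "x": [10, 20, 30]})
right = pd.DataFrame({"a": [2, 3, 4], "b": [5, 6, 7], "y": [100, 200, 300]})

merged = left.merge(right)
print(merged["a"].tolist())  # [2, 3]
```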

v2023.10.06

Closes #2716: Add dataframe merge functionality (#2781)

* add merge functionality

* moving functionality to dataframe.py

* remove numeric import

* change exception error to TypeError

* int col float behavior

* remove extraneous code and fix type errors

* change the float cast from np to ak

* Update arkouda/dataframe.py

Co-authored-by: pierce <48131946+pierce314159@users.noreply.github.com>

* Update arkouda/dataframe.py

Co-authored-by: pierce <48131946+pierce314159@users.noreply.github.com>

* Update arkouda/dataframe.py

Co-authored-by: pierce <48131946+pierce314159@users.noreply.github.com>

* address some of Pierce's comments

* identical column suffixes

* added df.merge functions

* bug fix for the right_join_merge method

* add merge test for dataframe

* temp test fix, order is wonky but not wrong

* Update arkouda/dataframe.py

---------

Co-authored-by: Eddie <eddie@MacBook-Air.local>
Co-authored-by: pierce <48131946+pierce314159@users.noreply.github.com>
Co-authored-by: Pierce Hayes <pierce314159@users.noreply.github.com>

v2023.09.06

Fixes #2703: Sort bug with `nan`s (#2755)

* Fixes #2703: Sort bug with `nan`s

This PR (fixes #2703) addresses the following: when a `nan` is present in `a`, the value of `min reduce a` will equal `nan`, so `signbit(min reduce a)` will be false even when negatives are present. This was causing the sort to mishandle `0.0`.

I updated the code to behave as before when `min reduce a` is not a `nan`; when it is, we compute the signbit of every value and check whether any are set (i.e. `| reduce signbit(a)`).

Calling `signbit` on every value of `a` and then reducing shouldn't be much more expensive than reducing first and making a single `signbit` call, but the sort code is heavily optimized, so it would be good if ronawho could look this over and make sure I'm not doing something that will kill the performance.

* updated in response to PR feedback

---------

Co-authored-by: Pierce Hayes <pierce314159@users.noreply.github.com>
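The sign-bit behavior behind this fix can be checked directly; here numpy stands in for the Chapel reductions in the commit message above:

```python
import numpy as np

# min propagates nan, and nan's sign bit is clear, so signbit(min(a))
# misses the negative values -- reducing signbit over all of a does not.
a = np.array([-1.5, 0.0, 2.0, np.nan])

assert np.isnan(np.min(a))           # min reduce a == nan
assert not np.signbit(np.min(a))     # looks "non-negative" despite -1.5
assert np.any(np.signbit(a))         # | reduce signbit(a) catches it
assert np.signbit(np.float64(-0.0))  # -0.0 also carries a set sign bit
```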