
Not ignoring NAs with 64-bit integers and grouped operations? #4444

Closed
go-see opened this issue May 14, 2020 · 1 comment · Fixed by #4445
Labels
GForce issues relating to optimized grouping calculations (GForce)

go-see commented May 14, 2020

Problem

When I compute the grouped minimum of a 64-bit integer column with na.rm = T, data.table appears to ignore na.rm = T. This does not happen for other grouped operations on 64-bit integers (e.g., max or sum).

Minimal reproducible example

# Load packages
library(bit64)
library(data.table)
# Create data table
tmp = data.table(x = as.integer64(c(1, 2, NA)), id = 1)
# Grouped min does not work
tmp[, min(x, na.rm = T), by = id]
# Grouped max does work
tmp[, max(x, na.rm = T), by = id]

Output:

> tmp[, min(x, na.rm = T), by = id]
   id   V1
1:  1 <NA>
> # Grouped max does work
> tmp[, max(x, na.rm = T), by = id]
   id V1
1:  1  2

Output of sessionInfo()

R version 3.6.3 (2020-02-29)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Catalina 10.15.4

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] bit64_0.9-7       bit_1.1-14        data.table_1.12.9

loaded via a namespace (and not attached):
[1] compiler_3.6.3 tools_3.6.3   

@MichaelChirico (Member)

Reproduced. Also noting that max() gives the wrong result with na.rm = FALSE: it returns 2 where NA is expected:

tmp[, max(x, na.rm = F), by = id]
#       id    V1
# 1:     1     2
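
Until a fix is available, one way to sidestep the grouped min problem is to drop the missing values in i before grouping, so the aggregation never sees an NA. This is only a sketch of a workaround under that assumption, not the maintainers' fix; the names res and V1 are illustrative.

```r
library(bit64)
library(data.table)

# Same data as in the report
tmp = data.table(x = as.integer64(c(1, 2, NA)), id = 1)

# Workaround sketch: filter out NAs in i, then take the grouped min
# without relying on na.rm
res = tmp[!is.na(x), .(V1 = min(x)), by = id]
print(res)
```

Note that a group whose values are all NA is dropped from the result entirely with this approach, rather than appearing with a missing value.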

@jangorecki jangorecki added the GForce issues relating to optimized grouping calculations (GForce) label May 14, 2020
@mattdowle mattdowle added this to the 1.14.1 milestone Aug 20, 2021
@jangorecki jangorecki modified the milestones: 1.14.9, 1.15.0 Oct 29, 2023