Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rbindlist fails in the presence of AsIs #4934

Closed
OfekShilon opened this issue Mar 14, 2021 · 7 comments · Fixed by #5446
Closed

rbindlist fails in the presence of AsIs #4934

OfekShilon opened this issue Mar 14, 2021 · 7 comments · Fixed by #5446

Comments

@OfekShilon
Copy link
Contributor

OfekShilon commented Mar 14, 2021

This might be by design, but I couldn't find an explicit mention of it in NEWS, issue reports or elsewhere.
We're trying to upgrade data.table from 1.11.8 to 1.13.6, and some of our tests break on code that simplifies to this:

d <- list()
d[[1]] <- data.frame(a=1, b=I(2))
d[[2]] <- data.frame(a=3, b=4)
rbindlist(d, fill=TRUE)

The rbindlist call succeeds in 1.11.8, but in 1.13.6 fails with:

Error in rbindlist(d, fill = TRUE) :
Class attribute on column 2 of item 2 does not match with column 2 of item 1.

We suspect it's a regression, as AsIs class should not account as a class change - just inhibit coercion.


Edit: turns out this surfaced due to an R3.4 -> R4.0.3 upgrade, not a data.table upgrade. Still, suggest it be considered for fixing.

# Output of sessionInfo()

R version 4.0.3 (2020-10-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.5 LTS

Matrix products: default
BLAS: /mnt/public/istra/apps/R/R-4.0.3.mkl/lib/libRblas.so
LAPACK: /mnt/public/istra/apps/R/R-4.0.3.mkl/lib/libRlapack.so

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=C LC_COLLATE=C LC_MONETARY=C LC_MESSAGES=C
[7] LC_PAPER=C LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=C LC_IDENTIFICATION=C

attached base packages:
[1] datasets utils stats graphics grDevices methods base
...

@OfekShilon
Copy link
Contributor Author

This seems due to an R upgrade, not data.table upgrade.

@jangorecki
Copy link
Member

jangorecki commented Mar 14, 2021

Interesting, thanks @OfekShilon for tracking this. Could you share some more info like which R versions you used? I see R 4.0.3, and what was the other one?

@OfekShilon
Copy link
Contributor Author

@jangorecki We're upgrading from R3.4, where this code passes. So while this isn't a data.table regression, I think this code really shouldn't fail - addition of "AsIs" to a class attribute shouldn't count as a class change.
This is the relevant code.
What do you think?

@OfekShilon OfekShilon reopened this Mar 14, 2021
@OfekShilon OfekShilon changed the title rbindlist breaking change, with AsIs rbindlist fails in the presence of AsIs Mar 14, 2021
@MichaelChirico
Copy link
Member

Maybe we should as.data.table non-data.table inputs to rbindlist, and as.data.table.data.frame should "force" AsIs columns?

Actually this raises another bug...

@OfekShilon
Copy link
Contributor Author

OfekShilon commented Mar 15, 2021

@MichaelChirico Maybe simply remove AsIs from the class attributes before comparing them with R_compute_identical? A bit ad-hoc, but feels safe.
The unified column probably should have the AsIs class appended back, if it were initially present in any summand.

@liorso
Copy link

liorso commented Mar 15, 2021

Ran the mentioned code with R3.4 with fill=FALSE Got similar to R4 error message:

> rbindlist(d, fill=FALSE)
Error in rbindlist(d, fill = F) :
  Class attributes at column 2 of input list at position 2 does not match with column 2 of input list at position 1. Coercion of objects of class 'factor' alone is handled internally by rbind/rbindlist at the moment.

Why does the fill argument change the type logic?

@OfekShilon
Copy link
Contributor Author

Note that base::rbind does behave as expected:

d <- list()
d[[1]] <- data.frame(a=1, b=I(2))
d[[2]] <- data.frame(a=3, b=4)
r<-do.call(rbind, d)
#  a b
#1 1 2
#2 3 4
> class(r$b)
#[1] "AsIs"

I believe rbindlist should conform.

ben-schwen added a commit that referenced this issue Aug 26, 2022
@OfekShilon OfekShilon added the bug label Oct 24, 2022
@jangorecki jangorecki modified the milestone: 1.15.0 Dec 22, 2023
MichaelChirico added a commit that referenced this issue Jul 24, 2024
* add fix #5309

* fix test numbering

* add rbind for ITime

* more tests

* add merge tests

* add AsIs #4934

* add news

* news typo

* add ignore.attr argument

* fix news

* change arguments of registered rbindlist

* add attribute to usage

* move nanotime tests

* adjust test numbering

* add test coverage

* prohibit NA for ignore.att

* move news

* finish todo of #5857

* Update NEWS.md

Co-authored-by: Michael Chirico <chiricom@google.com>

* update comment

* update doc for ignore.attr

* fix nit ignoreattr

* fix test consistency

* remove setnames

* update asis test to use rbindlist

* update test comments

* update NEWS num

* NEWS wording

* more NEWS wording

* template message for i18n

* simplify condition (C boolean --> no NA to worry about)

* && not &

* correct error message

---------

Co-authored-by: Michael Chirico <chiricom@google.com>
Co-authored-by: Michael Chirico <michaelchirico4@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants