Description
I expected the two approaches shown below to produce the same output: group by Grp, then calculate A - B - C for each group. The second example is likely preferable, but is the first example a misuse or misunderstanding of .SD, or is it a bug?
> library(data.table)
data.table 1.9.4 For help type: ?data.table
*** NB: by=.EACHI is now explicit. See README to restore previous behaviour.
> library(reshape2)
> tmp = data.table(Grp=LETTERS[1:10], A=1:10, B=11:20, C=21:30)
> tmp
    Grp  A  B  C
 1:   A  1 11 21
 2:   B  2 12 22
 3:   C  3 13 23
 4:   D  4 14 24
 5:   E  5 15 25
 6:   F  6 16 26
 7:   G  7 17 27
 8:   H  8 18 28
 9:   I  9 19 29
10:   J 10 20 30
> subtract = function(DT) {
+   for (nm in names(DT)[-1])
+     set(DT, j = nm, value = DT[[nm]] * -1)
+   DT[, base::sum(.SD)]
+ }
> tmp[, list(Value = subtract(.SD)), by = Grp]
    Grp Value
 1:   A   -31
 2:   B     2
 3:   C     3
 4:   D     4
 5:   E     5
 6:   F     6
 7:   G     7
 8:   H   NaN
 9:   I     9
10:   J    10
> subtract2 = function(x) {
+   ct = length(x)
+   x[2:ct] = x[2:ct] * -1
+   sum(x)
+ }
> tmp_ = melt(tmp, id.vars = "Grp", variable.factor = FALSE)
> tmp_[, list(Value = subtract2(value)), by = Grp]
    Grp Value
 1:   A   -31
 2:   B   -32
 3:   C   -33
 4:   D   -34
 5:   E   -35
 6:   F   -36
 7:   G   -37
 8:   H   -38
 9:   I   -39
10:   J   -40
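For comparison, here is a minimal sketch of two ways to compute A - B - C per group without writing into .SD. It assumes the discrepancy in the first example comes from set() modifying .SD by reference inside j, which I have not confirmed against the data.table internals. The names subtract_copy, res_copy and res_direct are hypothetical and only used for illustration.

```r
library(data.table)

tmp <- data.table(Grp = LETTERS[1:10], A = 1:10, B = 11:20, C = 21:30)

# Variant 1: same idea as subtract(), but operating on a copy of .SD.
# Assumption: modifying .SD itself by reference is what corrupts the
# grouped result, so the copy keeps set() away from data.table's buffer.
subtract_copy <- function(SD) {
  DT <- copy(SD)
  # Negate every column except the first (here: B and C).
  for (nm in names(DT)[-1])
    set(DT, j = nm, value = DT[[nm]] * -1L)
  sum(unlist(DT))
}
res_copy <- tmp[, list(Value = subtract_copy(.SD)), by = Grp]

# Variant 2: plain column arithmetic, with no .SD and no reshaping.
res_direct <- tmp[, list(Value = A - B - C), by = Grp]

print(res_copy)
print(res_direct)
```

Both variants should leave tmp unchanged and agree with the melt-based result above. The second one avoids the helper function entirely when the column names are known in advance.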
> sessionInfo()
R version 3.1.1 (2014-07-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods
[7] base
other attached packages:
[1] reshape2_1.4 data.table_1.9.4
loaded via a namespace (and not attached):
[1] chron_2.3-45 plyr_1.8.1 Rcpp_0.11.3 stringr_0.6.2
[5] tools_3.1.1