Printing the first and last n observations for xts and/or zoo? #321
If it helps, here is one approach. It still needs testing, of course, but it works for me so far.
library(xts)
xts_print <- function(x, n = 5) {
if (is.null(colnames(x))) {
nm <- paste0("X.", 1:ncol(x))
} else {
nm <- colnames(x)
}
df <- format(fortify.zoo(x), justify = "right")
colnames(df) <- c("Index", nm)
row.names(df) <- paste(format(rownames(df), justify = "right"),
":", sep = "")
nr <- nrow(df)
if (nr <= n && nr <= 5) {
print(df)
} else {
if (nr < n * 2) {
n <- floor(nr / 2)
}
cat("\n")
print(utils::head(df, n))
ndigits <- nchar(nrow(df))
if (ndigits >= 3) {
cat(rep(" ", ndigits - 3), "---")
} else {
cat("---")
}
nm2 <- vector(mode = "character", ncol(x))  # holds blank strings, so use a character vector
for (i in 1:ncol(x)) {
nm2[i] <- formatC(" ", width = nchar(nm[i]))
}
attr(df, "names") <- c("", nm2)
print(utils::tail(df, n), right = TRUE, justify = "right")
}
}
data(sample_matrix)
samplexts <- as.xts(sample_matrix)
xts_print(samplexts)
#>
#> Index Open High Low Close
#> 1: 2007-01-02 50.03978 50.11778 49.95041 50.11778
#> 2: 2007-01-03 50.23050 50.42188 50.23050 50.39767
#> 3: 2007-01-04 50.42096 50.42096 50.26414 50.33236
#> 4: 2007-01-05 50.37347 50.37347 50.22103 50.33459
#> 5: 2007-01-06 50.24433 50.24433 50.11121 50.18112
#> ---
#> 176: 2007-06-26 47.44300 47.61611 47.44300 47.61611
#> 177: 2007-06-27 47.62323 47.71673 47.60015 47.62769
#> 178: 2007-06-28 47.67604 47.70460 47.57241 47.60716
#> 179: 2007-06-29 47.63629 47.77563 47.61733 47.66471
#> 180: 2007-06-30 47.67468 47.94127 47.67468 47.76719
xts_print(samplexts, n = 1)
#>
#> Index Open High Low Close
#> 1: 2007-01-02 50.03978 50.11778 49.95041 50.11778
#> ---
#> 180: 2007-06-30 47.67468 47.94127 47.67468 47.76719
xts_print(head(samplexts,10), n = 8)
#>
#> Index Open High Low Close
#> 1: 2007-01-02 50.03978 50.11778 49.95041 50.11778
#> 2: 2007-01-03 50.23050 50.42188 50.23050 50.39767
#> 3: 2007-01-04 50.42096 50.42096 50.26414 50.33236
#> 4: 2007-01-05 50.37347 50.37347 50.22103 50.33459
#> 5: 2007-01-06 50.24433 50.24433 50.11121 50.18112
#> ---
#> 6: 2007-01-07 50.13211 50.21561 49.99185 49.99185
#> 7: 2007-01-08 50.03555 50.10363 49.96971 49.98806
#> 8: 2007-01-09 49.99489 49.99489 49.80454 49.91333
#> 9: 2007-01-10 49.91228 50.13053 49.91228 49.97246
#> 10: 2007-01-11 49.88529 50.23910 49.88529 50.23910
# 2nd sample data
xm <- xts(cumsum(rnorm(100, 0, 0.2)), Sys.time() - 100:1)
xts_print(xm)
#>
#> Index X.1
#> 1: 2020-08-03 09:28:00 0.14533549
#> 2: 2020-08-03 09:28:01 0.26327216
#> 3: 2020-08-03 09:28:02 0.21394361
#> 4: 2020-08-03 09:28:03 0.20015489
#> 5: 2020-08-03 09:28:04 0.18350584
#> ---
#> 96: 2020-08-03 09:29:35 -1.74172313
#> 97: 2020-08-03 09:29:36 -1.66798390
#> 98: 2020-08-03 09:29:37 -1.47796503
#> 99: 2020-08-03 09:29:38 -1.16800551
#> 100: 2020-08-03 09:29:39 -1.18936443 |
I really liked your approach. I was just improving your solution for the third time, and IMO the best solution is the following:
I couldn't write better code than the authors of |
The main issue I see with both of these solutions is that they make it appear like xts objects have an 'index' column, which is not true. That's likely to cause a lot of confusion. This would also make xts inconsistent with zoo, and consistency with zoo is an objective because xts extends zoo. We need to consider differences in xts compared to zoo. I could discuss with the zoo team about adding a Also, with no disrespect to the data.table team, I'm not going to add a dependency on another package for a print method. |
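The point about the missing 'index' column can be checked directly. The following is only a small sketch on a made-up two-column object (the object `x` and its column names are assumptions for illustration): the index of an xts object is reached through `index()`, while the `Index` column only appears after converting to a data.frame with `fortify.zoo()`, as the first solution above does.

```r
library(xts)

# A made-up xts object with two data columns, "a" and "b".
x <- xts(matrix(1:6, ncol = 2, dimnames = list(NULL, c("a", "b"))),
         order.by = as.Date("2020-01-01") + 0:2)

colnames(x)   # only the data columns; there is no index column
ncol(x)       # 2
index(x)      # the time index is stored separately from the data

# Converting to a data.frame is what creates a real "Index" column.
df <- fortify.zoo(x)
names(df)
```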
print(data.table::as.data.table(x)) wouldn't make much sense because it has to copy the whole object when converting the xts (matrix) to a data.table. It is much easier to simply concatenate the printed output of head and tail of the xts object. |
But without as.data.frame:
https://github.com/eddelbuettel/dang/blob/master/R/print.R
|
The following code provides a solution for
library("xts")
check.TZ <- xts:::check.TZ
tformat <- xts:::tformat
coredata <- zoo::coredata
print.xts <- function(x,
fmt,
max = getOption("xts.max.print"),
...) {
check.TZ(x)
if (missing(fmt)) {
fmt <- tformat(x)
}
if (is.null(fmt)) {
fmt <- TRUE
}
if (NROW(x) > max*2+1) {
index <- as.character(index(x))
index <- c(index[c(1:max)], "...", index[(NROW(x)-max+1):NROW(x)])
y <- rbind(
format(as.matrix(x[1:max, ])),
format(matrix(rep("", NCOL(x)), nrow = 1)),
format(as.matrix(x[(NROW(x)-max+1):NROW(x), ]))
)
rownames(y) <- format(index, justify = "right")
colnames(y) <- colnames(x)
} else {
y <- coredata(x, fmt)
}
if (length(y) == 0) {
if (!is.null(dim(x))) {
p <- structure(vector(storage.mode(y)), dim = dim(x),
dimnames = list(format(index(x)), colnames(x)))
print(p)
} else {
cat('Data:\n')
print(vector(storage.mode(y)))
cat('\n')
cat('Index:\n')
index <- index(x)
if (length(index) == 0) {
print(index)
} else {
print(str(index(x)))
}
}
} else {
print(y, quote = FALSE, right = TRUE, ...)
}
}
print.zoo <- function (x,
style = ifelse(length(dim(x)) == 0, "horizontal", "vertical"),
quote = FALSE,
max = getOption("zoo.max.print"),
...) {
style <- match.arg(style, c("horizontal", "vertical", "plain"))
if (is.null(dim(x)) && length(x) == 0) {
style <- "plain"
}
if (length(dim(x)) > 0 && style == "horizontal") {
style <- "plain"
}
if (style == "vertical") {
if (NROW(x) > max*2+1) {
index <- index2char(index(x), frequency = attr(x, "frequency"))
index <- c(index[c(1:max)], "...", index[(NROW(x)-max+1):NROW(x)])
y <- rbind(
format(as.matrix(x[1:max, ])),
format(matrix(rep("", NCOL(x)), nrow = 1)),
format(as.matrix(x[(NROW(x)-max+1):NROW(x), ]))
)
rownames(y) <- format(index, justify = "right")
colnames(y) <- colnames(x)
} else {
y <- as.matrix(coredata(x))
if (length(colnames(y)) < 1) {
colnames(y) <- rep("", NCOL(y))
}
if (NROW(y) > 0) {
rownames(y) <- index2char(index(x), frequency = attr(x, "frequency"))
}
}
print(y, quote = quote, ...)
} else if (style == "horizontal") {
y <- as.vector(x)
names(y) <- index2char(index(x), frequency = attr(x, "frequency"))
print(y, quote = quote, ...)
} else {
cat("Data:\n")
print(coredata(x), ...)
cat("\nIndex:\n")
print(index(x), ...)
}
invisible(x)
}
data("sample_matrix", package = "xts")
samplexts <- xts::as.xts(sample_matrix)
samplezoo <- zoo::as.zoo(sample_matrix)
options("xts.max.print" = 5)
options("zoo.max.print" = 5)
print.xts(samplexts)
#> Open High Low Close
#> 2007-01-02 50.03978 50.11778 49.95041 50.11778
#> 2007-01-03 50.23050 50.42188 50.23050 50.39767
#> 2007-01-04 50.42096 50.42096 50.26414 50.33236
#> 2007-01-05 50.37347 50.37347 50.22103 50.33459
#> 2007-01-06 50.24433 50.24433 50.11121 50.18112
#> ...
#> 2007-06-26 47.44300 47.61611 47.44300 47.61611
#> 2007-06-27 47.62323 47.71673 47.60015 47.62769
#> 2007-06-28 47.67604 47.70460 47.57241 47.60716
#> 2007-06-29 47.63629 47.77563 47.61733 47.66471
#> 2007-06-30 47.67468 47.94127 47.67468 47.76719
print.zoo(samplexts)
#> Open High Low Close
#> 2007-01-02 50.03978 50.11778 49.95041 50.11778
#> 2007-01-03 50.23050 50.42188 50.23050 50.39767
#> 2007-01-04 50.42096 50.42096 50.26414 50.33236
#> 2007-01-05 50.37347 50.37347 50.22103 50.33459
#> 2007-01-06 50.24433 50.24433 50.11121 50.18112
#> ...
#> 2007-06-26 47.44300 47.61611 47.44300 47.61611
#> 2007-06-27 47.62323 47.71673 47.60015 47.62769
#> 2007-06-28 47.67604 47.70460 47.57241 47.60716
#> 2007-06-29 47.63629 47.77563 47.61733 47.66471
#> 2007-06-30 47.67468 47.94127 47.67468 47.76719
print.zoo(samplezoo)
#> Open High Low Close
#> 1 50.03978 50.11778 49.95041 50.11778
#> 2 50.23050 50.42188 50.23050 50.39767
#> 3 50.42096 50.42096 50.26414 50.33236
#> 4 50.37347 50.37347 50.22103 50.33459
#> 5 50.24433 50.24433 50.11121 50.18112
#> ...
#> 176 47.44300 47.61611 47.44300 47.61611
#> 177 47.62323 47.71673 47.60015 47.62769
#> 178 47.67604 47.70460 47.57241 47.60716
#> 179 47.63629 47.77563 47.61733 47.66471
#> 180 47.67468 47.94127 47.67468 47.76719
library("microbenchmark")
x <- microbenchmark(
zoo_old = invisible(capture.output(zoo:::print.zoo(samplexts))),
xts_old = invisible(capture.output(xts:::print.xts(samplexts))),
zoo_new = invisible(capture.output(print.zoo(samplexts))),
xts_new = invisible(capture.output(print.xts(samplexts))),
times = 1000
)
summary(x)
#> expr min lq mean median uq max neval
#> 1 zoo_old 2.3590 2.46380 2.921920 2.59965 2.89375 12.7040 1000
#> 2 xts_old 2.3931 2.50755 2.972585 2.62770 2.92450 8.7730 1000
#> 3 zoo_new 1.7792 1.84510 2.236352 1.92520 2.16320 9.9530 1000
#> 4 xts_new 1.8103 1.88250 2.300003 1.96860 2.23665 9.1413 1000 |
Looks neat
You mean you ran checks of reverse dependencies (ideally including Suggested revdeps)? That is what CRAN will expect from the maintainers of zoo and xts. If it does break any package, then it is probably better to ship this as an opt-in feature for at least one release before making it the default. |
that was misleading. I did not. Setting |
@markushhh, this looks really good! Thanks for all the effort you put into it! I've been talking with the zoo team about the potential for making this change in xts, and maybe in zoo too. No one is outright opposed, but we want to carefully consider the change. Here are a few things that came up:
|
Thanks for the proposed code @markushhh. Thanks for the summary @joshuaulrich. To expand on 2: I think it would be useful to avoid long printed chunks in the 1-d case as well. However, it is not clear to me what is a good general layout for this. A simple idea would be to print the head, a separate line with the ..., and then the tail:
My feeling is, though, that this does not necessarily convey one vector of things and might be confused with the matrix layout. Another idea would be to print it as one vector of c(head, empty, tail) where the empty element would have a ... index:
There it's really easy to miss the ... It's a bit better if it's not the end of the line but I'm also not thrilled about it.
Better ideas? |
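For readers following along, the two layouts described above can be sketched in base R on a plain named vector. This is only an illustration under assumptions: the vector `y`, the cutoff `n`, and the use of a named numeric vector in place of a real zoo object are all made up here and are not the code that was posted.

```r
# Made-up data: 20 values named by date, standing in for a 1-d zoo series.
y <- setNames(round(sin(1:20), 3), format(as.Date("2000-01-01") + 0:19))
n <- 3  # number of observations to keep at each end

# Layout 1: head, a separate "..." line, then tail.
print(head(y, n))
cat("...\n")
print(tail(y, n))

# Layout 2: one vector c(head, empty, tail), where the empty
# element carries "..." as its index.
f <- format(y)                       # format once so all widths agree
v <- c(head(f, n), "", tail(f, n))
names(v)[n + 1] <- "..."
print(v, quote = FALSE)
```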
|
@joshuaulrich Thanks for talking to them! @zeileis Thanks for joining in!
Truncation in other languages and classes:
or
any idea/advice? |
@ggrothendieck for |
I think following style is a good example where vectors could be mixed up with matrices
|
|
@ggrothendieck Thanks. When do you need the plain style? |
In Julia they don't care about
|
Thanks @markushhh for collecting all this information, very useful! Just a couple of comments:
|
Many time series are "ragged", and several columns will start with NA's. So head and tail has the advantage of showing the most recent data where one will often have a more complete sample.
I agree this is a good idea for a more informative print method.
Agreed. |
@zeileis for zero-length series, plain style is already implemented in xts. No need for the extra argument. It's open to discussion whether there's a need for it in zoo. I guess that depends on zoo's dependencies, right?
I'm down! (printing both) |
Printing dimension: I agree. I also like printing both the overall dimension and the number of elements omitted. Plain style: zoo always had this argument, not sure who actually uses it (not me). It could be debated whether we should have introduced it or not. But given that we have it, I think we ought to stick to it. Head only vs. head and tail: Convincing argument by Brian that in time series the tail is typically the most recent information and should be included. |
I've started working on this because I want it. :) I started with @markushhh's implementation (thanks again!). Here's what we still need:
Did I miss anything? Any other thoughts? |
I also started working on something similar for I'd appreciate everyone's thoughts on that too! |
+1 for leaving index and dim output in |
Refactor print.xts() to only show the first and last 'max' rows if the number of rows is > 'trunc.rows'. Also truncate the number of columns if they would wrap to a new line. See #321.
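The row rule in this commit message can be sketched on a plain matrix. This is a rough illustration, not the code from the commit: `truncate_rows` is a hypothetical helper name, the argument names `max` and `trunc.rows` are borrowed from the message, the defaults are assumptions, and the column truncation is left out.

```r
# Hypothetical helper: keep the first and last `max` rows when the
# matrix has more than `trunc.rows` rows; otherwise format everything.
truncate_rows <- function(m, max = 5, trunc.rows = 2 * max) {
  m <- as.matrix(m)
  nr <- nrow(m)
  if (nr <= trunc.rows || nr <= 2 * max) {
    return(format(m))
  }
  keep <- c(seq_len(max), seq.int(nr - max + 1, nr))
  out <- format(m[keep, , drop = FALSE])  # head and tail formatted together
  gap <- matrix("", nrow = 1, ncol = ncol(m), dimnames = list("...", NULL))
  rbind(out[seq_len(max), , drop = FALSE],
        gap,
        out[max + seq_len(max), , drop = FALSE])
}

m <- matrix(rnorm(40), ncol = 2,
            dimnames = list(paste0("r", 1:20), c("a", "b")))
print(truncate_rows(m, max = 3), quote = FALSE, right = TRUE)
```

Formatting the kept rows in one `format()` call is a design choice in this sketch so that the top and bottom blocks share the same column widths.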
I'm starting to come around to the idea of including them in the
# zoo 1-d vector
## 2000-01-01 2000-01-02 2000-01-03 2000-01-04 2000-01-05
## 0.8414710 0.9092974 0.1411200 -0.7568025 -0.9589243
## ... (zoo vector with `n` elements omitted)
## 2000-04-05 2000-04-06 2000-04-07 2000-04-08 2000-04-09
## 0.9835877 0.3796077 -0.5733819 -0.9992068 -0.5063656
# zoo matrix
## Open High Low Close
## 2007-01-02 50.03978 50.11778 49.95041 50.11778
## 2007-01-03 50.23050 50.42188 50.23050 50.39767
## 2007-01-04 50.42096 50.42096 50.26414 50.33236
## 2007-01-05 50.37347 50.37347 50.22103 50.33459
## 2007-01-06 50.24433 50.24433 50.11121 50.18112
## ... (zoo matrix with `n` rows omitted)
## 2007-06-26 47.44300 47.61611 47.44300 47.61611
## 2007-06-27 47.62323 47.71673 47.60015 47.62769
## 2007-06-28 47.67604 47.70460 47.57241 47.60716
## 2007-06-29 47.63629 47.77563 47.61733 47.66471
## 2007-06-30 47.67468 47.94127 47.67468 47.76719 |
Here's a first draft of printing zoo vectors.
diff --git a/pkg/zoo/R/zoo.R b/pkg/zoo/R/zoo.R
index 39c554b..2ae8224 100644
--- a/pkg/zoo/R/zoo.R
+++ b/pkg/zoo/R/zoo.R
@@ -71,7 +71,39 @@ print.zoo <- function (x, style = ifelse(length(dim(x)) == 0,
else if (style == "horizontal") {
y <- as.vector(x)
names(y) <- index2char(index(x), frequency = attr(x, "frequency"))
- print(y, quote = quote, ...)
+
+ beg <- NULL
+ end <- NULL
+ n_beg <- 1
+ n_end <- 1
+ while (length(beg) < 3 || length(end) < 3) {
+ if (length(beg) < 3) {
+ beg <- utils::capture.output(print.default(head(y, n_beg)))
+ n_beg <- n_beg + 1
+ }
+ if (length(end) < 3) {
+ end <- utils::capture.output(print.default(tail(y, n_end)))
+ n_end <- n_end + 1
+ }
+ }
+ beg <- utils::capture.output(print.default(head(y, n_beg-2)))
+ end <- utils::capture.output(print.default(tail(y, n_end-2)))
+
+ n_obs <- 1
+ for (i in seq_along(y)) {
+ o <- utils::capture.output(print.default(y[seq_len(i)]))
+ if (length(o) > 2) {
+ # output has wrapped to a new line
+ n_obs <- i - 1
+ break
+ }
+ }
+ o <- utils::capture.output(print.default(head(y, n_obs), quote = quote, ...))
+ p <- utils::capture.output(print.default(tail(y, n_obs), quote = quote, ...))
+ more_rows <- paste0("... zoo vector with ", length(y) - 2*n_obs,
+ " more observations")
+ z <- matrix(c(o, more_rows, p), ncol = 1)
+ writeLines(z)
}
else {
cat("Data:\n")
And the output is:
R$ z <- zoo(1:100, .Date(1:100))
R$ print(z)
1970-01-02 1970-01-03 1970-01-04 1970-01-05 1970-01-06 1970-01-07 1970-01-08 1970-01-09 1970-01-10 1970-01-11
1 2 3 4 5 6 7 8 9 10
... zoo vector with 80 more observations
1970-04-02 1970-04-03 1970-04-04 1970-04-05 1970-04-06 1970-04-07 1970-04-08 1970-04-09 1970-04-10 1970-04-11
91 92 93 94 95 96 97 98 99 100
|
Thanks for having a go at this Josh @joshuaulrich ! Comments:
In addition with a few further tweaks (naming objects, breaking from the loop, always using
|
Happy to! I thought it was most efficient to use my knowledge of doing this with
Agree about only using
Agree with all your comments here.
Agreed with allowing a number of observations before truncating. I like 50 lines because that's roughly what fits vertically on my laptop screen. That would be 25 1-d zoo vector observations because there are 2 lines/observation. I don't have strong feelings about this because changing it later shouldn't be an issue, especially if we provide a global option for users to set their personal preference. |
This is going into the 0.13.0 xts release. |
overall i like this feature and think it's a good idea. just one thing i have found a bit frustrating is that |
I encountered this too and it needs to be fixed before release. Can you create another issue with a reproducible example for this bug? |
The top/bottom rows could have a different number of decimal places, and there are often multiple varying spaces between columns. For example:
close volume ma bsi
2022-01-03 09:31:00 476.470 803961.000 NA 54191.000
2022-01-03 09:32:00 476.700 179476.000 NA 53444.791
2022-01-03 09:33:00 476.540 197919.000 NA -16334.994
...
2023-03-16 14:52:00 394.6000 46728.0000 392.8636 28319.4691
2023-03-16 14:53:00 394.6500 64648.0000 392.8755 15137.6857
2023-03-16 14:54:00 394.6500 69900.0000 392.8873 -1167.9368
There are 4 spaces between the index and the 'close' column, 2 between 'close' and 'volume', 4 between 'volume' and 'ma', and 2 between 'ma' and 'bsi'. There should be a consistent number of spaces between the columns. Most other classes of objects print with 1 space between the columns.
The top rows have 3 decimals and the bottom rows have 4. These should also be the same. See #321.
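A mismatch in decimals like this is what base R's `format()` produces when two chunks are formatted separately, since each call picks the number of decimals its own values need. The sketch below only demonstrates that behaviour on made-up numbers; whether this is the actual cause in the xts print method is an assumption, not something stated in the report.

```r
# Made-up values loosely modelled on the 'close' and 'ma' columns above.
top    <- c(476.47, 476.7, 476.54)
bottom <- c(394.6, 394.65, 392.8636)

# Formatted separately: each chunk chooses its own number of decimals.
separate <- c(format(top), format(bottom))
print(separate, quote = FALSE)

# Formatted together: one common width and number of decimals.
joint <- format(c(top, bottom))
print(joint, quote = FALSE)
```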