-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(spdlog): impl JsonFormatter #69
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@NotEvenANeko Appreciate your work on this. Serialized logging is a pretty meaningful feature that makes it easier for third-party programs to parse the logs output by spdlog-rs
.
Most code looks great, just a few minor issues and nitpicks. Nice work! 👍
record.serialize_field::<str>( | ||
"time", | ||
local_time_cacher | ||
.get(self.0.time()) | ||
.full_second_str() | ||
.as_ref(), | ||
)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think for the machine format JSON, a timestamp number is more friendly. How about we rename the field to timestamp
and then get the unix timestamp from the cacher?
UPDATE: Seems that the cacher is not needed, just get the timestamp from field time
in Record
should be fine.
} | ||
} | ||
|
||
/// JSON logs formatter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// JSON logs formatter | |
#[rustfmt::skip] | |
/// JSON logs formatter |
The doc formatting seems to be broken by Rustfmt, try adding a #[rustfmt::skip]
on it.
Not a blocker for this PR, but in a future PR maybe we could generalize this idea to something like |
spdlog/Cargo.toml
Outdated
@@ -37,6 +37,7 @@ native = [] | |||
libsystemd = ["libsystemd-sys"] | |||
multi-thread = ["crossbeam"] | |||
runtime-pattern = ["spdlog-internal"] | |||
json-formatter = ["serde", "serde_json"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been thinking about this gate these days. Perhaps we can enable JsonFormatter
when feature serde
is enabled and feature serde
implies dependencies serde
and serde_json
, then feature json-formatter
is no longer needed.
The benefit of it is that we have ability to add other serialization crates in the future, so users can choose the "backend" of JsonFormatter
by enabling just one feature.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might add lots of dependencies for a single feature as new formatters are introduced.
Or we can just make a SerdeFormatter
as @Lancern said to decouple the serializer from the formatter. So user can choose any serializer instead of the serde_json
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might add lots of dependencies for a single feature as new formatters are introduced.
Hmm, makes sense. What about enable JsonFormatter
when feature serde_json
is enabled, so serde_json
implies serde
?
Or we can just make a
SerdeFormatter
as @Lancern said to decouple the serializer from the formatter. So user can choose any serializer instead of theserde_json
.
SerdeFormatter
is fine, but we are in no hurry to implement it in this PR.
What I meant by my comment was that JsonFormatter
should not be limited to only using serde
as a backend. Semantically, JSON
is a format and serde
is an implementation. It seems more correct that the implementation implies the format, rather than the format implying the implementation. So that if now or in the future, there are any json crates that aren't serde
-based, we could easily add it without breaking anything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
serde_json
implies serde
sounds more reasonable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So that if now or in the future, there are any json crates that aren't serde-based, we could easily add it without breaking anything.
Is this really implementable? serde
is the de-facto standard and we already rely on the critical abstractions it provides. The "critical abstractions" I'm talking about here is the Serialize
trait which is provided by serde
. I cannot work out a way to bypass it and still implement a usable JsonFormatter
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this really implementable?
serde
is the de-facto standard [...] I cannot work out a way to bypass it and still implement a usableJsonFormatter
.
It is. For now, there are crates rkyv
, miniserde
, nanoserde
, etc., they each have their own advantages and are not serde
-based. If we can harmlessly preserve the maximum possibilities, why not do so?
[...] and we already rely on the critical abstractions it provides. The "critical abstractions" I'm talking about here is the
Serialize
trait which is provided byserde
.
If you are talking about deriving serde::Serialize
macro for types of spdlog-rs
, this can be unforced. Considering the following psuode-code:
#[cfg(and(feature = "serde", feature = "rkyv"))
compile_error!("you can select only one serialization backend");
#[cfg_attr(feature = "serde", derive(serde::Serialize))]
#[cfg_attr(feature = "rkyv", derive(rkyv::Serialize))]
pub enum Level {
// ...
}
Then implement JsonFormatter
:
// In fact, we'll apply this condition on `mod json_formatter` and `pub use json_formatter::JsonFormatter`.
#[cfg(any(feature = "serde_json", feature = "rkyv"))
pub struct JsonFormatter(PhantomData);
impl Formatter for JsonFormatter {
fn format(&self, record: &Record, dest: &mut StringBuf) -> crate::Result<FmtExtraInfo> {
#[cfg(feature = "serde_json")]
{
// using `serde_json`
}
#[cfg(feature = "rkyv")]
{
// using `rkyv`
}
}
}
Also, we are not in a hurry to implement these other crate supports in this PR right now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@SpriteOvO OK this makes sense. I thought you mean we completely abandon any dependencies on serialization libraries and let the users choose what suits them. So we're still depending on a bunch of pre-selected serialization libraries (even if they're just optional) and let the user choose one among them.
record.serialize_field( | ||
"timestamp", | ||
&self | ||
.0 | ||
.time() | ||
.duration_since(SystemTime::UNIX_EPOCH) | ||
.expect("invalid timestamp") | ||
.as_secs() | ||
.to_string(), | ||
)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
record.serialize_field( | |
"timestamp", | |
&self | |
.0 | |
.time() | |
.duration_since(SystemTime::UNIX_EPOCH) | |
.expect("invalid timestamp") | |
.as_secs() | |
.to_string(), | |
)?; | |
record.serialize_field( | |
"timestamp", | |
self | |
.0 | |
.time() | |
.duration_since(SystemTime::UNIX_EPOCH) | |
.ok() | |
.and_then(|dur| u64::try_from(dur.as_millis()).ok()) | |
.expect("invalid timestamp"), | |
)?; |
I've discussed with @NotEvenANeko about this in private messages.
Initially I simply thought that UNIX timestamp would be OK, but then realized that it is in seconds, which is obviously not precise enough for logging. After some investigation, I think a millisecond timestamp might be more appropriate, based on the following clues:
-
I prefer not to format it to a human-readable string (e.g. ISO 8601 / RFC 3339) because JSON will eventually be parsed by a machine. And the performance of date-time formatting is quite bad for both serialization and deserialization.
-
in JavaScript,
Date::now
and the constructor ofDate
returns and accepts a number of milliseconds since January 1, 1970 00:00:00 UTC, with leap seconds ignored. -
most programming languages have time libraries that support parsing millisecond timestamps directly or indirectly.
-
nanoseconds are more precise, but it uses more significant integer digits, and perhaps this precision is unnecessary.
Since this integer will eventually be expressed in JSON, and Duration::as_millis
returns a u128
, we need to convert it to u64
to express it in JSON. Although the maximum safe type for JSON integers in JavaScript is i53
, I tested it with Rust chrono
crate, and the maximum millisecond timestamp it supports parsing doesn't even exceed i53::MAX
, and that maximum timestamp has reached the year 262142. So in my opinion, representing it as a string is not necessary for the foreseeable future.
extern crate chrono; // 0.4.38
use chrono::prelude::*;
fn main() {
const JS_MAX_SAFE_INTEGER: i64 = 9007199254740991;
let max_utc = DateTime::<Utc>::MAX_UTC;
let max_timestamp = max_utc.timestamp_millis();
println!("Max UTC supported by chrono : {}", max_utc);
println!("Max timestamp in millisecond : {}", max_timestamp);
println!("Max safe integer in JavaScript: {}", JS_MAX_SAFE_INTEGER);
println!("Max u64 : {}", u64::MAX);
assert!(max_timestamp <= JS_MAX_SAFE_INTEGER);
}
Max UTC supported by chrono : +262142-12-31 23:59:59.999999999 UTC
Max timestamp in millisecond : 8210266876799999
Max safe integer in JavaScript: 9007199254740991
Max u64 : 18446744073709551615
Using u64::try_from
rather than as u64
here so we can catch the overflow error instead of implicit truncation.
2b6358c
to
2fce767
Compare
// | ||
// The performance can be significantly optimized here if the issue can be | ||
// solved. | ||
dest.write_str(&serde_json::to_string(&json_record)?)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leaving this comment as a note.
test bench_1_full_formatter ... bench: 90.39 ns/iter (+/- 9.58)
test bench_1_json_formatter ... bench: 121.27 ns/iter (+/- 9.10)
I benchmarked JsonFormatter
and I was expecting it to be faster than that (close to FullFormatter
). After a quick check, the String
returned by serde::to_string
will do unnecessary allocation and should be optimized out.
Unfortunately, due to the serde_json
API limitation (serde-rs/json#863), this is currently not possible (at least not without using unsafe).
2fce767
to
e698a45
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, much appreciated your work on this @NotEvenANeko and @Lancern's reviews.
Implement
JsonFormatter
withserde
andserde_json
Add
serde_json
as a feature, addserde
andserde_json
as deps.