Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate Nokogiri::XSLT::Stylesheet to the TypedData API #2806

Merged

Conversation

casperisfine
Copy link

The old Data API is soft deprecated since a long time. The TypedData API is a bit more strict and opens the door for compaction, write barriers, memsize, etc (they have to be opted individually)

Here I just do a very simple translation of a single type, so it's easy to review.

@flavorjones
Copy link
Member

@casperisfine Thank you for starting this work! The C codebase definitely needs to be modernized and I've been putting this off for too long.

Do you think it's worth making all the TypedData changes on one branch to benchmark anything? If not I'm happy to merge this and we can just work our way through the classes one at a time.

@tenderlove
Copy link
Member

I don't think using TypedData would cause any performance regressions. Might be interesting to benchmark later, but IMO it's not worth our time to do it now.

btw, this PR LGTM

@flavorjones
Copy link
Member

flavorjones commented Mar 3, 2023

I don't think using TypedData would cause any performance regressions

Sorry, to be clear, I meant to suggest measuring some real-world benchmarks for performance or memory utilization improvements, not regressions -- just in case it was academically interesting to anybody.

@casperisfine
Copy link
Author

So TypedData by itself won't make any difference. (accessing a typeddata is one or two extra comparisons, not much else).

What would give performance eventually, would be to implement write barriers, and the performance impact would be on minor GC based on the amount of these objects are long lived.

But If Shopify's monolith's is anything to go by, Nokogiri objects aren't really long lived, when inspecting a production heap all I could find was 12 "DATA(Nokogiri/XMLNode)". Some different apps may have long lived Nokogiri objects perhaps, I don't know.

So I'm really not expecting much if any measurable performance improvements here, this PR (and the followups I was planning) are mostly about using the more strict and modern API, and allowing to implement the _memsize functions.

The old `Data` API is soft deprecated since a long time.
The `TypedData` API is a bit more strict and opens the door
for compaction, write barriers, memsize, etc (they have to be opted individually)
@flavorjones
Copy link
Member

👍 Will merge. Added a commit with some style changes (I'm trying to update the C files as I touch them). Will merge when it goes green.

@flavorjones flavorjones merged commit 5ec39e6 into sparklemotion:main Mar 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants