Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Redux: speed up JSON::Validator.validate #285

Merged
merged 4 commits into from
Jan 8, 2016

Conversation

iainbeeston
Copy link
Contributor

This is a rebase and update of #255

I've pulled out the change to the way uuids are calculated (in #255 the uuid of a Hash is calculated by taking a SHA of Hash.hash, whereas in the original the SHA of the string of the hash is used) because I feel that the risk is high and a better solution would be to use a faster hashing algorithm.

However, I've expanded the use of caching to get around the speed problems in Addressable.

Worst case this should be faster than the current implementation.

@iainbeeston
Copy link
Contributor Author

If I can get this reviewed I'd like to release as v2.6.0

end
temp_uri.fragment = "" if temp_uri.fragment.nil?
temp_uri = temp_uri.merge(:fragment => "") if temp_uri.fragment.nil? || temp_uri.fragment.empty?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If temp_uri.fragment.empty? is true, then will temp_uri.merge(fragment: "") actually change anything?

Also, can do temp_uri.merge!(fragment: "") instead of temp_uri = temp_uri.merge(fragment: "") to be consistent with preceding code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me answer those in turn:

  1. If temp_uri.fragment is "", then the URI will have a trailing "#" symbol. If temp_uri.fragment is nil or empty then no "#" is appended. So doing this does change the behaviour
  2. We can't use merge! here, because is temp_uri.relative? is false, then temp_uri has come directly from the cache in JSON::Util:URI, and uris stored in the cache are frozen. If you call merge! on a frozen uri it will raise an exception because you are modifying a frozen object. By using merge it will work with both branches of the code

@iainbeeston iainbeeston force-pushed the cache-normalized-uri branch 2 times, most recently from 19d49e1 to 95f9176 Compare December 24, 2015 07:27
@iainbeeston iainbeeston force-pushed the cache-normalized-uri branch 2 times, most recently from ef36232 to 3881475 Compare December 24, 2015 13:40
@iainbeeston
Copy link
Contributor Author

I've refactored slightly to duplicate cached URIs rather than freezing them

@iainbeeston
Copy link
Contributor Author

(Because in addressable dup is cheaper than freeze)

@@ -71,8 +70,6 @@ def read(location)
JSON::Schema.new(JSON::Validator.parse(body), uri)
end

# @param uri [Addressable::URI]
# @return [Boolean]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why delete the docstrings?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because in 6967d99 I've tried to abstract away the fact that we use Addressable::URI, however, I'm probably kidding myself. So I've restored this until I have time to make a more water-tight solution.

mjc and others added 4 commits December 24, 2015 15:31
Addressible's URI parsing shows up at the top of the profiler for our use case.

This patch adds a cache to JSON::Util::URI.normalized_uri, to trade CPU time for memory.

There are other expensive Addressible methods that should be cached elsewhere.

benchmark: https://github.com/mjc/json-schema-perf

MRI 2.2.2 before: 825.363  (± 6.2%) i/s
MRI 2.2.2 after:  1.436k (± 4.5%) i/s

1.74x faster

JRuby 1.7.20 before: 1.445k (± 2.5%) i/s
JRuby 1.7.20 after: 2.272k (± 6.1%) i/s

1.57x faster
Addressable::URI#fragment= triggers a revalidation of the URL, which is
super slow.  Don't do this when it isn't necessary.
Hopefully this will make it easier to change in future
@RST-J
Copy link
Contributor

RST-J commented Jan 7, 2016

Looks fine to me 👍

@iainbeeston
Copy link
Contributor Author

Thanks. I'll merge this and make a release

On 7 Jan 2016, at 8:33 pm, Jonas Peschla notifications@github.com wrote:

Looks fine to me


Reply to this email directly or view it on GitHub.

iainbeeston added a commit that referenced this pull request Jan 8, 2016
Redux: speed up JSON::Validator.validate
@iainbeeston iainbeeston merged commit 6a1620e into voxpupuli:master Jan 8, 2016
@iainbeeston iainbeeston deleted the cache-normalized-uri branch January 8, 2016 10:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants