Tweak getCurrentContext for SPEED #5959

Merged
merged 1 commit into from
Nov 5, 2019

headius (Member) commented Nov 5, 2019

This change reduces the overhead of calling Ruby.getCurrentContext, since we still call it in a number of places.

  • ThreadService now extends ThreadLocal rather than aggregating a ThreadLocal in a field. This eliminates one hop.
  • All hot-path methods in ThreadService are static.
  • Restore @kares's recursion logic, since it does appear faster than an explicit null-check loop.
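
For readers who have not looked at ThreadService, the three points above can be sketched roughly as follows. This is a minimal illustration and not the actual JRuby code: `Context`, `AggregatingService`, `SubclassedService`, and `adoptAndRetry` are hypothetical names, and wrapping the context in a `SoftReference` is an assumption made here so that the retry has something to guard against.

```java
import java.lang.ref.SoftReference;

public class ThreadLocalHopSketch {
    /** Hypothetical stand-in for the per-thread context object. */
    public static final class Context {
        public final String threadName;
        public Context(String threadName) { this.threadName = threadName; }
    }

    /** "Before" shape: the service aggregates a ThreadLocal in a field. */
    public static final class AggregatingService {
        private final ThreadLocal<SoftReference<Context>> local = new ThreadLocal<>();

        public Context getCurrentContext() {
            SoftReference<Context> ref;
            Context ctx;
            // Explicit null-check loop: load the field, call get(), and
            // retry until a live context is found for this thread.
            while ((ref = local.get()) == null || (ctx = ref.get()) == null) {
                local.set(new SoftReference<>(new Context(Thread.currentThread().getName())));
            }
            return ctx;
        }
    }

    /** "After" shape: the service is itself the ThreadLocal, so no field load. */
    public static final class SubclassedService extends ThreadLocal<SoftReference<Context>> {
        // Static hot path (assumed shape): a small method body that
        // returns as soon as a non-null context is found.
        public static Context getCurrentContext(SubclassedService service) {
            SoftReference<Context> ref = service.get();
            Context ctx = ref == null ? null : ref.get();
            if (ctx != null) return ctx;
            return adoptAndRetry(service);
        }

        // Slow path: install a context for this thread, then recurse
        // instead of looping.
        private static Context adoptAndRetry(SubclassedService service) {
            service.set(new SoftReference<>(new Context(Thread.currentThread().getName())));
            return getCurrentContext(service);
        }
    }

    public static void main(String[] args) {
        AggregatingService before = new AggregatingService();
        SubclassedService after = new SubclassedService();
        Context a = before.getCurrentContext();
        Context b = SubclassedService.getCurrentContext(after);
        // Repeated calls on the same thread should hand back the same object.
        System.out.println(a == before.getCurrentContext());
        System.out.println(b == SubclassedService.getCurrentContext(after));
    }
}
```

Keeping the slow path in its own method keeps the hot method small, which may help the JIT inline it; whether that explains the measured gain is not something this sketch demonstrates.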

On OpenJDK 8 C2 this reduces single-thread getCurrentContext time from around 4ns to around 3.2ns. Other VMs have similar gains.

Tested with a trivial benchmark:

```java
import org.jruby.Ruby;

public class ContextGetter {
  public static void main(String[] args) {
    Ruby runtime = Ruby.newInstance();
    while (true) {
      long nanos = System.nanoTime();
      for (int i = 0; i < 100_000_000; i++) {
        runtime.getCurrentContext();
      }
      System.out.println((System.nanoTime() - nanos) / 100_000_000.0);
    }
  }
}
```

headius added this to the JRuby 9.2.10.0 milestone Nov 5, 2019

headius (Member, Author) commented Nov 5, 2019

Assembly output from this benchmark and this branch on OpenJDK8 C2 is here: https://gist.github.com/headius/ac6ae20bcaa33c9795c3df4decb88dfd

Note that the hot path appears to have inlined completely down to line 167, returning the non-null result at that point. This is a Very Good Thing.
