Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Micro-optimize P6opaque's get_attribute() #1884

Merged

Conversation

MasterDuke17
Copy link
Contributor

Instead of checking attr_st inside each case of the switch(kind), have an if/else for attr_st with a switch in each branch.

On this laptop, it seemed to reduce the instructions and time for compiling CORE.d and CORE.e (I didn't try CORE.c since that would have taken way too long under valgrind). I don't have access to my x86 desktop right now, and since this optimization seems like it will very likely be compiler-dependent, maybe @timo you could test it?

@MasterDuke17
Copy link
Contributor Author

Huh, no CI?

@lizmat
Copy link
Contributor

lizmat commented Jan 6, 2025

"Workflow runs completed with no jobs" ?? @patrickbkr @timo ??

@timo
Copy link
Member

timo commented Jan 6, 2025

sorry my fault for the commit i accidentally pushed to the main branch yesterday

@MasterDuke17
Copy link
Contributor Author

MasterDuke17 commented Jan 6, 2025

Also, here's a little bit of logging of what happens during a CORE.c compile (8 is obj, 4 is int64, and 7 is str).

16452991 non attr_st, kind = 8
12195638 yes attr_st, kind = 4
 4250708 yes attr_st, kind = 7
  494754 yes attr_st, kind = 8
    9046 yes attr_st, kind = 20
     109 yes attr_st, kind = 6

I believe a lot of the calls to get_attribute() are coming from e.g., getattr_i in interp.c. I wonder if it would make sense to create get_attribute_[oisun]() functions? attr_funcs might have to be made a bit bigger though, I don't know all the repercussions of that.

@MasterDuke17
Copy link
Contributor Author

I know the jit can devirtualize these in some instances, but since I don't have the jit on this hardware, I'm now wondering if spesh can be made to do anything.

Instead of checking `attr_st` inside each case of the `switch(kind)`,
have an if/else for attr_st with a switch in each branch.
@MasterDuke17 MasterDuke17 force-pushed the micro-optimize_get_attribute_for_P6opaque branch from f04d3c1 to a7dcba0 Compare January 7, 2025 15:12
@timo
Copy link
Member

timo commented Jan 7, 2025

spesh can turn getattr_* into different kinds of sp_get* instructions, which don't require jit compilation to give a nice speedup.

we could look at spesh logs to find places where the getattr_* remains in the "After" result to see if we spot anything that sticks out that we could optimize better?

there's also the case of getattr with a string for the attribute name, in which case spesh can't do anything if the attribute name isn't known at spesh time, though in theory we could have strings that have been seen logged and code-gen a string comparison and "specialized" branch with an optimized getattr variant. Would need to do some measurements to see what the opportunities are

Edit: From a quick look at like 0.1% of the spesh log from core.c.setting compilation we're very often not sure what the type is of the object we're getting the attribute from. Maybe the thing above with the name not being known is a very small portion and we mostly have the name but aren't sure about the type.

Also, in the spesh log we can't see which of the cases we don't optimize run often or rarely, so we would also want to count them from the actual run

@MasterDuke17
Copy link
Contributor Author

we could look at spesh logs to find places where the getattr_* remains in the "After" result to see if we spot anything that sticks out that we could optimize better?

I have a 1.6gb spesh log of compiling CORE.c. There are ~2.3k getattr_* remaining in the 'After:' of >300 functions. Here's a random example from 'MATCH':

  BB 8 (0x340e0216ec8):
    line: 1399 (pc 984)
    Instructions:
      getattr_s        r6(13),  r25(4), r10(31), lits($!name), liti16(-1)
      isnull_s        r24(10),  r6(13)
      if_i            r24(10),  BB(12)
    Successors: 12, 9
    Predecessors: 7
    Dominance children: 9, 12

@MasterDuke17
Copy link
Contributor Author

The count of the types of getattr:

   1242 getattr_i
   1544 getattr_o
     48 getattr_s
      4 getattr_u

@lizmat
Copy link
Contributor

lizmat commented Jan 8, 2025

Is this PR good to go as is?

@MasterDuke17
Copy link
Contributor Author

Is this PR good to go as is?

I think so. I believe it'll also generate slightly faster code on x86, but we could always merge now and I'll confirm that when I have access to my desktop again in a couple days.

@lizmat lizmat merged commit df1715d into MoarVM:main Jan 8, 2025
24 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants