Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Backport LLVM patches to fix X86 partial register stall #20025

Merged
merged 4 commits into from
Feb 23, 2017

Conversation

yuyichao
Copy link
Contributor

Fix #19976

This improves the performance of pisum on one of my machine by ~5.5x and the assembly looks better.

@nanosoldier runbenchmarks(ALL, vs = ":master")

@mbauman @KristofferC you probably have better idea about the original/other case this comes up.

@vchuravy vchuravy added this to the 0.6.0 milestone Jan 14, 2017
@Keno
Copy link
Member

Keno commented Jan 14, 2017

Out of curiosity, which of these commits is the actual fix for our issue?

@yuyichao
Copy link
Contributor Author

The fourth patch is the one that nominally fixes it. I included the fifth one because it claims to have fixed a bug in the fourth one. With only this two commit the LLVM tests fails and the failure looks like real codegen regressions so I included a few more that touches the same tests and the LLVM tests are now passing. I'm not entirely sure which ones fixes the "regression" caused by the fix but from the commit message I suspect it's the third one.

@yuyichao
Copy link
Contributor Author

Counted by the order they are cherry picked (see the [PATCH */5] in the commit message) Apparently I got the order wrong in llvm.mk but there doesn't seem to be too many conflicts.....

So the one that should be fixing this issue is [PATCH 4/5] Avoid false dependencies of undef machine operands (also the biggest one though a big part of it is tests...). The fix for it is [PATCH 5/5] Fixing bug committed in rev. 278321. The one that I think fixes the regression is [PATCH 3/5] ExecutionDepsFix - Fix bug in clearance calculation

@tkelman tkelman added the upstream The issue is with an upstream dependency, e.g. LLVM label Jan 14, 2017
@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels

@yuyichao
Copy link
Contributor Author

@nanosoldier runbenchmarks(ALL, vs = ":master")

Just to check which ones are noise....

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels

@tkelman
Copy link
Contributor

tkelman commented Jan 14, 2017

looks like ["array","growth",("append!",2048)] was noise, the rest were not

@Keno
Copy link
Member

Keno commented Jan 15, 2017

I've looked into the regression and the problem is that LLVM fails to take into account the clearance from an incoming backedge when choosing which register to use. Should be fixable, I'll try for a patch.

@Keno
Copy link
Member

Keno commented Jan 16, 2017

Upstream patch is here: https://reviews.llvm.org/D28759

Will push an update to this PR that adds the upstream patch, so we can re-run nanosoldier.

@Keno Keno force-pushed the yyc/codegen/llvm-stall branch from c1bc99f to 840b10f Compare January 16, 2017 05:39
@Keno
Copy link
Member

Keno commented Jan 16, 2017

@nanosoldier runbenchmarks(ALL, vs = ":master")

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels

@Keno
Copy link
Member

Keno commented Jan 16, 2017

Looks like I did indeed fix the super bad regressions we saw. There's still comprehension_iteration left which I should probably look at. Also, there's some new ones which may or may not be real.
Let's run again to see: @nanosoldier runbenchmarks(ALL, vs = ":master")

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels

@Keno
Copy link
Member

Keno commented Jan 16, 2017

Ok, I'm gonna go with the comprehension one is the only remaining regression. Looking into it.

@tkelman
Copy link
Contributor

tkelman commented Jan 16, 2017

["scalar","arithmetic",("div","Complex{Float32}","Complex{Float32}")] also looks real and a bit larger?

@Keno
Copy link
Member

Keno commented Jan 16, 2017

Ah, yes. In any case, the cause seems to be what @yuyichao mentioned to me. Namely, that calls to non-inlined functions don't reset clearance information.

@Keno
Copy link
Member

Keno commented Jan 16, 2017

https://reviews.llvm.org/D28786 same procedure yesterday.

@Keno
Copy link
Member

Keno commented Jan 16, 2017

@nanosoldier runbenchmarks(ALL, vs = ":master")

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels

@vchuravy
Copy link
Member

Travis fails on 32bit:

julia: /home/travis/build/JuliaLang/julia/deps/srccache/llvm-3.9.1/include/llvm/CodeGen/MachineOperand.h:268: unsigned int llvm::MachineOperand::getReg() const: Assertion `isReg() && "This is not a register operand!"' failed.

@Keno Keno force-pushed the yyc/codegen/llvm-stall branch from d006cc4 to 780f9c5 Compare January 17, 2017 18:15
@Keno
Copy link
Member

Keno commented Jan 17, 2017

I was unable to reproduce this locally, but I had an off by one in the patch, which could have easily caused that. Fixed.

@Keno Keno force-pushed the yyc/codegen/llvm-stall branch from 780f9c5 to 6d094a3 Compare January 17, 2017 18:19
@Keno
Copy link
Member

Keno commented Jan 17, 2017

Also, I think the last nanosoldier run picked up some improvements on master, so @nanosoldier runbenchmarks(ALL, vs = ":master")

@Keno Keno force-pushed the yyc/codegen/llvm-stall branch from 5386ccb to f4cbc40 Compare January 19, 2017 01:42
@Keno
Copy link
Member

Keno commented Jan 19, 2017

@nanosoldier runbenchmarks(ALL, vs = ":master")

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - no performance regressions were detected. A full report can be found here. cc @jrevels

@vchuravy
Copy link
Member

Nanosoldier seems very happy with this 🎉. LGTM if travis is green

@martinholters
Copy link
Member

The travis failure looks unrelated. Any ideas what that might be?

@tkelman
Copy link
Contributor

tkelman commented Jan 19, 2017

it's the mac spawn test issue that #20073 is attempting to fix. these patches should be updated to match whatever gets committed upstream once it goes through review there, but lgtm for now

@Keno
Copy link
Member

Keno commented Jan 19, 2017

Let's wait a few days while upstream review is ongoing. The problem with replacing these patches later is that they won't apply cleanly to people who've had this version applied. Also, FYI, the second commit has been split up into
https://reviews.llvm.org/D28915.

@tkelman
Copy link
Contributor

tkelman commented Jan 24, 2017

Any news upstream on these?

@Keno
Copy link
Member

Keno commented Jan 24, 2017

No, but it's only been one business day since I posted that, and some of these folks might not work weekends ;).

@yuyichao
Copy link
Contributor Author

Any news a week later now?

@Keno
Copy link
Member

Keno commented Jan 30, 2017

Yes, for some reason I'm not getting email notifications on these anymore though.

@tkelman
Copy link
Contributor

tkelman commented Jan 30, 2017

I had the same happen on phabricator, seems flaky

@Keno
Copy link
Member

Keno commented Jan 31, 2017

1/3 is now nominally upstream. We'll see if it survives the buildbot onslaught.

@KristofferC
Copy link
Member

Bump, any news?

@tkelman
Copy link
Contributor

tkelman commented Feb 13, 2017

@Keno what is the status on these?

@Keno
Copy link
Member

Keno commented Feb 13, 2017

Making its way through review upstream.

@tkelman
Copy link
Contributor

tkelman commented Feb 13, 2017

Can we rebase this to reflect the latest state and get that in before alpha, then (later) add additional incremental patches on top of that as the upstream review continues?

@tkelman
Copy link
Contributor

tkelman commented Feb 14, 2017

getting the latest state of this in would help make the benchmark resilts much more predictable and useful

@tkelman
Copy link
Contributor

tkelman commented Feb 17, 2017

@Keno can we please just merge the latest state of this patch set now?

@Keno
Copy link
Member

Keno commented Feb 21, 2017

The latest state of this patch is still very much in flux. We can however, rebase this PR as is and merge it.

@tkelman
Copy link
Contributor

tkelman commented Feb 21, 2017

one of them was upstreamed, wasn't it? what's closer to what will eventually be upstream, what's in this PR now or the current state of the upstream reviews?

@StefanKarpinski
Copy link
Member

@tkelman
Copy link
Contributor

tkelman commented Feb 22, 2017

been a while since this last ran @nanosoldier runbenchmarks(ALL, vs = ":master")

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - no performance regressions were detected. A full report can be found here. cc @jrevels

@StefanKarpinski
Copy link
Member

Dang, that's a lot of performance improvements. Hard to believe LLVM let something this big regress. Oh wait, no it isn't.

@KristofferC
Copy link
Member

We got the best LLVM hackers don't we?

@tkelman tkelman merged commit f2b155a into master Feb 23, 2017
@tkelman tkelman deleted the yyc/codegen/llvm-stall branch February 23, 2017 17:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
upstream The issue is with an upstream dependency, e.g. LLVM
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants