RFC: Add inference->optimize analysis forwarding mechanism #36508
Conversation
Also cc @martinholters, who was complaining about the lack of such a mechanism over in #36169.

Should the …
# This is not currently called in the regular course, but may be needed
# if we ever want to re-run inlining again later in the pass pipeline after
# additional type information was discovered.
function recompute_method_matches(atype, sv)
function recompute_method_matches(atype, sv)
function recompute_method_matches(@nospecialize(atype), sv::OptimizationState)
Saving all the method matches at each call site seems to be pretty expensive; sysimg +4%. I think we should discard those by default when the optimizer finishes. This will need to update the …
Yes, agree. I think we should discard these once inlining has finished (by default, external optimizers can override) and have a special representation for vectors where all stmtinfo is `nothing`.
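For context, the compact representation being discussed could look something like the following minimal sketch; all names here are hypothetical illustrations, not actual compiler types:

struct EmptyStmtInfo end   # sentinel: every entry reads as `nothing`

getinfo(::EmptyStmtInfo, idx::Int) = nothing
getinfo(infos::Vector{Any}, idx::Int) = infos[idx]

# After inlining, an optimizer could (by default) collapse the info column
# down to the sentinel whenever no statement retained any info:
compact_infos(infos::Vector{Any}) =
    all(i -> i === nothing, infos) ? EmptyStmtInfo() : infos

The point of the sentinel is that the common all-`nothing` case costs a single immutable object instead of a length-n Vector{Any}.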
end

# call where the function is known exactly
function abstract_call_known(interp::AbstractInterpreter, @nospecialize(f),
Allocating all these Tuple objects seems unfortunate. We've tried to work pretty hard to remove and minimize that pattern, as it seems that it can sap performance.
Should we do a Pair{Any, Any} or a specialized struct instead? Seems like that may be easier on the compiler.
Reminder to self: Whatever we do here, should also apply to e.g. typeinf_edge.
@vtjnash does Pair{Any,Any} sound reasonable to you?
I thought about it, but I think at least for the moment, there needs to be a very clear consumer for this side channel information, since it can have semantic information. If we ever get to the point where we want to store multiple independent pieces of information here, we can re-evaluate what the best format for storing it is.
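For reference, the specialized-struct alternative to Pair{Any,Any} raised above might look like this sketch; the type and field names are assumptions for illustration only, not the PR's actual code:

# Bundle the inferred return type with the side-channel info so the
# compiler sees a single concrete type instead of many Tuple types.
struct CallResultInfo
    rt      # the inferred return type of the call
    info    # side-channel analysis info, or `nothing` when none was recorded
    CallResultInfo(@nospecialize(rt), @nospecialize(info)) = new(rt, info)
end

The appeal over plain tuples is that every call site returns the same concrete type, rather than a differently parameterized Tuple per call.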
I've had some feeling that the intent of "CodeInfo" corresponds to just the information we want to convey to codegen. While "IRCode" is that information plus the additional per-statement information computed by inference (with the intent that "InstructionStream" helps to manage it as a StructOfArrays view). And "InferenceState" is the additional state required to compute that. To that perception, would it make sense to add these new columns only to IRCode, but to find a way to preserve the inference result (including extra metadata like this) using that struct as the vehicle in the "InferenceResult" part of the internal caches in the AbstractInterpreter? (possibly also renaming IRCode to "AnnotatedCodeInfo," or something like that, to better hint at that distinction)
I think that's generally sensible, though for the AD case for example, I do also want to be able to cache the stmtinfo without ever running the optimizer. That's still fine for the moment, since I'm currently managing my own caches there, but it would be unfortunate to have a design where stmtinfo can never appear in the internal caches.
Alright, updated after some discussion with @JeffBezanson and @vtjnash earlier this week. Changes: …
Additionally, I added a fast path in inlining that short-circuits any additional work if the statement's `info` is `nothing`.
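Such a fast path could look roughly like this sketch (a hypothetical helper for illustration, not the PR's actual code):

function maybe_use_info(@nospecialize(info))
    info === nothing && return nothing   # fast path: inference recorded no info
    # slow path: consult `info` to reuse inference's analysis results
    return info
end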
Force-pushed from 4a5b393 to 6ca510c
Yes, this seems better to me. I still think it may be a complicated way to specify the info-lattice (this PR simply discards the info when it's inconvenient to merge), since it can't readily handle either of the existing cases: the expression type, which calls tmerge, or the statement's edges list, which calls union. Still, it would be conservatively correct for adding handling of other metadata columns (like pure), so it seems okay.
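The conservative join being described amounts to something like this one-line sketch (a hypothetical helper): unlike the expression type (merged via tmerge) or the edge list (merged via union), infos without a dedicated merge rule are simply dropped, which is always conservatively correct.

merge_stmt_info(@nospecialize(a), @nospecialize(b)) = a === b ? a : nothing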
base/compiler/ssair/inlining.jl (Outdated)
    splits = Any[sig.atype]
else
    splits = Any[]
    for union_sig in UnionSplitSignature(sig.atypes)
        push!(splits, argtypes_to_type(union_sig))
    end
    if !isa(info, UnionSplitInfo)
        infos = fill(nothing, length(splits))
wrong object type?
You mean it should be Vector{Any}? Yeah, let me do that.
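The fix being agreed on, as a sketch: `fill(nothing, n)` produces a Vector{Nothing}, into which a real info object could never later be stored, so an explicit Vector{Any} of `nothing` entries is needed instead. The `splits` value below is a stand-in, not the real data:

splits = Any[Tuple{typeof(sin),Float64}]        # stand-in for the union splits
infos = Any[nothing for _ in 1:length(splits)]  # Vector{Any}, not Vector{Nothing}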
function recompute_method_matches(atype, sv)
    # Regular case: Retrieve matching methods from cache (or compute them)
    # World age does not need to be taken into account in the cache
    # because it is forwarded from type inference through `sv.params`
This seems confusing to state. The world age is taken into account, because that's part of what `sv.params` means, and is partly why the cache is part of the params.
That comment was there before ;)
Though I should probably update it, because it's not the regular case anymore.
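The world-age argument above can be illustrated with a small sketch (all names hypothetical): because the cache and the world live on the same state object, which is constructed for a single fixed world, entries can never mix results from different worlds, so the key itself need not include the world.

struct MatchCache
    world::UInt
    cache::IdDict{Any,Any}   # atype => method matches, valid only for `world`
end

lookup!(f, mc::MatchCache, @nospecialize(atype)) =
    get!(() -> f(atype, mc.world), mc.cache, atype)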
This change attempts to be a solution to the generalized problem encountered in #36169. In short, we do a whole bunch of analysis during inference to figure out the final type of an expression, but sometimes we may need intermediate results that were computed along the way. So far, we don't really have a great place to put those results, so we end up having to re-compute them during the optimization phase. That's what #36169 did, but that is clearly not a scalable solution.

I encountered the exact same issue while working on a new AD compiler plugin that needs to do a whole bunch of work during inference to determine what to do (e.g. call a primitive, recurse, or increase the derivative level), and optimizations need to have access to this information.

This PR adds an additional `info` field to CodeInfo and IRCode that can be used to forward this kind of information. As a proof of concept, it forwards method match info from inference to inlining (we do already cache these, so there's little performance gain from this per se - it's more to exercise the infrastructure). The plan is to do an alternative fix to #36169 on top of this as the next step, but I figured I'd open it up for discussion first.
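As a toy sketch of the forwarding pattern this PR introduces (every name below is illustrative rather than actual compiler API): inference records a per-statement info object, and a later pass consumes it instead of recomputing the analysis.

struct MethodMatchInfo          # hypothetical info recorded by inference
    matches::Vector{Any}
end

function matches_for(stmtinfo::Vector{Any}, idx::Int)
    info = stmtinfo[idx]
    info isa MethodMatchInfo && return info.matches  # reuse inference's work
    return nothing                                   # fall back to recomputing
end

stmtinfo = Any[nothing, MethodMatchInfo(Any[])]
matches_for(stmtinfo, 2)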
In #36508 we decided after some consideration not to add the `stmtinfo` to the `CodeInfo` object, since this info would never be used for codegen. However, this also means that external AbstractInterpreters that would like to cache pre-optimization results cannot simply cache the unoptimized `CodeInfo` and then later feed it into the optimizer. Instead they would need to cache the whole OptimizationState object, or maybe convert to IRCode before caching. However, at the moment we eagerly drop the `OptimizationState` wrapper as soon as we decide not to run the optimizer. This refactors things to keep the OptimizationState around for unoptimized methods, only dropping it right before caching, in a way that can be overridden by an external AbstractInterpreter.

We run into the inverse problem during constant propagation, where inference would like to peek at the results of optimization in order to decide whether constant propagation is likely to be profitable. Of course, if optimization hasn't actually run yet for this AbstractInterpreter, this doesn't work. Factor out this logic such that an external interpreter can override this heuristic. E.g. for my AD interpreter, I'm thinking just looking at the vanilla function and checking its complexity would be a good heuristic (since the AD'd version is supposed to give the same result as the vanilla function, modulo capturing some additional state for the reverse pass).
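A hedged sketch of the two override points this commit message describes; the hook names below are assumptions for illustration, not the actual API:

struct ADInterpreter end   # stand-in for a custom AbstractInterpreter

# Keep the unoptimized OptimizationState around so it can be cached:
discard_optimization_state(::ADInterpreter) = false

# Judge const-prop profitability from the vanilla function's complexity:
const_prop_profitable(::ADInterpreter, vanilla_cost::Int) = vanilla_cost <= 100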