Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Speedy Tail Call Optimization (digital-asset#6003)
* Speedy Tail call optimization The goal of this PR is to achieve Tail call optimization digital-asset#5767 Tail call optimization means that tail-calls execute without consuming resources. In particular, they must not consume stack space. Speedy has two stacks: The `env`-stack and the `kontStack`. For an optimized tail call in Speedy, we must not extend either. In Speedy, all function calls are executed via the code `enterFullyAppliedFunction`. The behaviour of this code (prior to this PR) is as follows: (1) Push the values of all args and free-variables on the env-stack (because that's where the code expects to find them), and (2) Push a KPop continuation on the kontStack, which will restore the env-stack to its original size before returning to the code which made the function call. We must stop doing both these things. We achieve this as follows: (1) We address function args and free-vars via a new machine component: the current `frame`. (2) We make continuations responsible for restoring their own environment. As well as achieving proper tail calls, we also gain a performance improvement by (a) removing the many pushes to the env-stack, and (b) never having to push (and then later re-enter) a KPop continuation. The args array and the free-vars array already existed, so there is no additional cost associated with constructing these array. The only extra costs (which are smaller than the gains) are that we must manage the new `frame` component of the machine, and we must record frame/env-size information in continuations so they can restore their environment. To make use of the frame, we need to identify (at compile time) the run-time location for every variable in a speedy expression. This is done during the `closureConvert` phase. At run-time, an environment is now composed of both the existing env-stack and the frame. The only values which now live on the env-stack are those introduced by let-bindings and pattern-match-destructuring. All other are found in the frame. Changes to SEExpr: - Introduce a new expression form `SELoc`, with 3 sub classes: SELocS/SELocA/SELocF to represent the run-time location of a variable. - SELocS/A/F execute by calling corresponding lookup function in Speedy: getEnv(Stack,Arg,Free). - SEMakeClo takes a list of SELoc instead of list of int. - During closure conversion all SEVar are replaced by an SELocS/A/F. - SEVar are not allowed to exist at run-time (just as SEAbs may not exist). - We adapt the synthesised code for SEBuiltinRecursiveDefinition: FoldL, FoldR, EqualList It is worth noting the prior code also had the notion of before/after closureConvert, but SEVar was used for both meanings: Prior to closureConvert it meant the relative-index of the enclosing binder (lambda,let,..). After closureConvert it meant the relative-offset from the top of the env-stack where the value would be found at run-time. These are not quite the same! Now we have different sub-types (SEVar vs SELoc), this change of mode is made more explicit. Run-time changes: - Use the existing `KFun` continuation as the new `Frame` component. - `KFun` allows access to both the args of the current application, and the free-vars of the current closure. - A variable is looked up by it's run-time location (SELocS/A/F) - A function application is executed (`enterFullyAppliedFunction`), by setting the machine's `frame` component to the new current `KFun`. - When a continuation (KArg, KMatch, KPushTo, KCatch) is pushed, we record the current Frame and current stack depth within the continuation, so when it is entered, it can call `restoreEnv` to restore the environment to the state when the continuation was pushed. Changes to Compiler: - The required changes are to the `closureConvert` and `validate`. - `closureConvert` `remaps` is now a `Map` from the `SEVar`s relative-index to `SELoc` - `validate` now tracks 3-ints (maxS,masA,maxF) changelog_begin changelog_end * changes for Remy * Changes for Martin * test designed explicitly to blow if the free variables are captured incorrectly * address more comments * improve comment about shift in Compiler
- Loading branch information