Add internal use only virtual machine blocks #2147

mzgoddard · 2019-04-25T21:30:55Z

Resolves

Performance switching between blocks in Sequencer.stepThread and execute.

Proposed Changes

Depends on #2145.
Related to #2148.

Build a block operation set with all next blocks using a virtual machine block to step to the next one
Add for internal use only virtual machine blocks
Add Thread.STATUS_INTERRUPT
Add Thread.continuous
Store and call unbound block primitives with their context if available
Add a virtual machine block to handle the end of a stack frame for procedures, loop branches, non-loop branches, and threads
Add StackFrame.endBlockId
Skip profiling some opcodes (vm_*)
Use a virtual machine block to support the BROADCAST_INPUT block special case
Use a virtual machine block to report hat values
Use a virtual machine block to report stack click visual reports
Use a virtual machine block to report monitor results
Push vm_reenter_promise after the Promise resolves
Store reported values in execute.js:handlePromise
Collectively through other changes reduce the required branching around executing a block function

Reason for Changes

Build a block operation set with all next blocks using a virtual machine block to step to the next one

This is where this whole change set is heading so let's start here. Back in June/July we made a change to reporter execution by creating a set of all reporters and their final command block (optional for stack clicks) and executing that set in a for loop instead of the recursive walk through the reporters. This change is of the same idea, applied to command blocks.

With this, a command block and each following block build two sets: _ops, which we can use for promise thawing, and _allOps, which inlines all of the _ops of the block and of each next block. When the conditions are met, this set of block operations can be performed in one for loop. If the conditions fail, we safely fall back to the do-while loop containing the for loop. And if we still fail there, we can fall back to Sequencer.stepThread and ultimately Sequencer.stepThreads. In most cases we should be able to stay in execute and quickly execute a thread.
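As a rough illustration of the idea (not the code in this PR), the sketch below flattens a chain of command blocks into one array and runs it in a single for loop. The block shape, the `vm_next` opcode name, and the status value are assumptions made for the example.

```javascript
// Illustrative status value; the real Thread status constants may differ.
const STATUS_RUNNING = 0;

// Inline the ops of `block` and of every following block into one flat array.
// After each block's own ops, add an internal op that steps to the next block.
function buildAllOps(block) {
    const allOps = [];
    for (let b = block; b !== null; b = b.next) {
        allOps.push(...b.ops);
        const nextId = b.next === null ? null : b.next.id;
        allOps.push({
            opcode: 'vm_next', // hypothetical name for the "step to next" block
            fn: thread => { thread.pointer = nextId; }
        });
    }
    return allOps;
}

// Run the flat set in one for loop, stopping as soon as the status changes.
function runAllOps(thread, allOps) {
    let executed = 0;
    for (let i = 0; i < allOps.length; i++) {
        allOps[i].fn(thread);
        executed++;
        if (thread.status !== STATUS_RUNNING) break;
    }
    return executed;
}

// Hypothetical command block whose only op records its own id.
const makeBlock = (id, next) => ({
    id,
    next,
    ops: [{opcode: id, fn: thread => { thread.log.push(id); }}]
});
const blockC = makeBlock('c', null);
const blockB = makeBlock('b', blockC);
const blockA = makeBlock('a', blockB);

const thread = {status: STATUS_RUNNING, pointer: 'a', log: []};
const executed = runAllOps(thread, buildAllOps(blockA));
```

Here the three blocks and their three step operations run without leaving the loop; any op that changes `thread.status` would end the loop early and hand control back to the slower outer paths.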

Add for internal use only virtual machine blocks

To make this possible, some of the virtual machine behaviour needs to be blocks. We can determine the block functions to perform ahead of time and execute them without having to passively test when to do that virtual machine work.

Effectively, at this point execute uses these blocks and the block operation sets to compile the current block state for a target on demand (or just-in-time). Because the compiled sets live in the blocks cache, they will continue to be disposed of when a block is changed on the target.

Add Thread.STATUS_INTERRUPT

To support fall back from the inner execution loops, we make use of Thread.status and a new value Thread.STATUS_INTERRUPT. We already need to test status in case it becomes YIELD, YIELD_TICK, PROMISE_WAIT, or DONE. Using that fact we can use INTERRUPT in the internal use only blocks to signal the inner loop to break and return to outer code paths.

Using INTERRUPT, internal blocks can tell execute to reconsider which block operation set to execute if, for example, another block starts a new branch. The branch blocks are not in the current execution set, so execute should break out of the inner loop and retrieve the new block operation set to execute.
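The control flow described above could look roughly like the following sketch. The status values, the `opsByBlockId` cache, and the thread shape are assumptions for illustration; only the names STATUS_INTERRUPT and execute come from this description.

```javascript
// Illustrative values; only the names mirror the description.
const STATUS_RUNNING = 0;
const STATUS_INTERRUPT = 5;

// `opsByBlockId` is a hypothetical cache: block id -> operation set that
// starts at that block.
function execute(thread, opsByBlockId) {
    // Outer do-while: pick the operation set again after an interrupt.
    do {
        thread.status = STATUS_RUNNING;
        const ops = opsByBlockId.get(thread.pointer);
        if (!ops) break; // nothing left to run
        // Inner for loop: the fast path. Any status change leaves it.
        for (let i = 0; i < ops.length; i++) {
            ops[i](thread);
            if (thread.status !== STATUS_RUNNING) break;
        }
    } while (thread.status === STATUS_INTERRUPT);
}

const opsByBlockId = new Map([
    ['if', [
        // Entering the branch: its blocks are not in this set, so interrupt.
        t => { t.log.push('if'); t.pointer = 'branch'; t.status = STATUS_INTERRUPT; },
        // Never reached on this pass because the loop breaks first.
        t => { t.log.push('after-if'); }
    ]],
    ['branch', [
        t => { t.log.push('say'); },
        t => { t.pointer = null; }
    ]]
]);

const thread = {status: STATUS_RUNNING, pointer: 'if', log: []};
execute(thread, opsByBlockId);
```

A status such as YIELD or PROMISE_WAIT would leave both loops in the same way, which is why a single status test in the inner loop is enough.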

Add Thread.continuous

One more new member works with Thread.status to control the execute loops. Thread.continuous indicates whether a thread should continue to execute after a command block or operate one block at a time. The default is false, for one block at a time. Sequencer will set it temporarily to true to enable the fast execute loop.

Store and call unbound block primitives with their context if available

A bound function costs two function calls: the produced bound function calls the original function. A non-bound function called with Function.call is one function call. The difference of one function call can improve performance in the inner execution loop, since many reporters (operator_*) are fairly cheap.
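To make the two call styles concrete, here is a minimal sketch. The `Operators` class and the `op` record are hypothetical stand-ins for a block package and a cached operation; whether the unbound form is measurably faster depends on the JavaScript engine.

```javascript
// Hypothetical owner of a block primitive, standing in for a block package.
class Operators {
    constructor() {
        this.calls = 0;
    }
    add(args) {
        this.calls++;
        return Number(args.NUM1) + Number(args.NUM2);
    }
}
const operators = new Operators();

// Option A: bind once, then call the bound wrapper, which in turn calls
// the original function.
const boundAdd = operators.add.bind(operators);

// Option B: store the unbound function next to its context and invoke it
// with Function.prototype.call.
const op = {fn: Operators.prototype.add, context: operators};

const viaBound = boundAdd({NUM1: 1, NUM2: 2});
const viaCall = op.fn.call(op.context, {NUM1: 1, NUM2: 2});
```

Both forms produce the same result with the same `this`; the second keeps the function and its context as plain data on the cached operation.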

Add a virtual machine block to handle the end of a stack frame for procedures, loop branches, non-loop branches, and threads

Add StackFrame.endBlockId

We can move the branching for popping stacks from Sequencer and execute into virtual machine blocks. We can determine the blocks to be used ahead of time when a stack frame is pushed.

This lets us reduce the passive branch tests related to stack popping. I think popping with virtual machine blocks will follow easily as it mirrors the blocks that push the stack frames.
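A sketch of choosing the end block when the frame is pushed might look like this. The specific opcode names after the `vm_end_of_` prefix, the frame kinds, and the `returnTo` field are assumptions for the example, not the names used in the change set.

```javascript
// Hypothetical end-of-frame opcodes, one per kind of frame.
const END_BLOCKS = {
    procedure: 'vm_end_of_procedure',
    loop: 'vm_end_of_loop_branch',
    branch: 'vm_end_of_branch',
    thread: 'vm_end_of_thread'
};

class StackFrame {
    constructor(kind, returnTo) {
        // Decided once, when the frame is pushed.
        this.endBlockId = END_BLOCKS[kind];
        this.returnTo = returnTo;
    }
}

class Thread {
    constructor(topBlock) {
        this.pointer = topBlock;
        this.stack = [new StackFrame('thread', null)];
    }
    pushFrame(kind, firstBlock, returnTo) {
        this.stack.push(new StackFrame(kind, returnTo));
        this.pointer = firstBlock;
    }
    // When a block has no next block, jump to the planned end block instead
    // of testing what kind of frame is being left.
    reachEndOfStack() {
        this.pointer = this.stack[this.stack.length - 1].endBlockId;
    }
}

// Each end block knows how to pop its own kind of frame.
const endHandlers = {
    // A loop branch returns to the loop block so it can test its condition.
    vm_end_of_loop_branch: thread => {
        const frame = thread.stack.pop();
        thread.pointer = frame.returnTo;
    }
};

const thread = new Thread('repeat1');
thread.pushFrame('loop', 'move1', 'repeat1');
thread.reachEndOfStack();
const endBlock = thread.pointer;
endHandlers[endBlock](thread);
```

The frame that pushes also decides how it will be popped, which is the mirroring the paragraph above refers to.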

Skip profiling some opcodes (vm_*)

Profiling itself takes time away from executing more blocks. For internal behaviour I think skipping these blocks will help us make normal blocks perform better.
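One way to skip profiling without adding a string test to the hot path is to decide it once when the operation is cached. This is only a sketch: `cacheOp`, `runOp`, and the `profile` array are hypothetical, and the real profiler interface is not shown here.

```javascript
// Internal blocks are recognised by the vm_ prefix mentioned above.
const isInternalOpcode = opcode => opcode.startsWith('vm_');

// Decide once, at cache time, whether this operation is profiled.
function cacheOp(opcode, fn) {
    return {opcode, fn, profiled: !isInternalOpcode(opcode)};
}

// The run path only reads the precomputed flag.
function runOp(op, profile) {
    if (!op.profiled) return op.fn();
    const start = Date.now();
    const result = op.fn();
    profile.push({opcode: op.opcode, elapsed: Date.now() - start});
    return result;
}

const profile = [];
runOp(cacheOp('motion_movesteps', () => 'moved'), profile);
runOp(cacheOp('vm_reenter_promise', () => 'reentered'), profile);
```

Only the normal block leaves a profile record; the internal block runs with no timing overhead.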

Use a virtual machine block to support the BROADCAST_INPUT block special case

BROADCAST_INPUT block inputs behave differently when they store their value. This is currently supported by testing, for every block, whether its output will be stored as BROADCAST_INPUT. By making the needed behaviour (casting to a string) a block behaviour, we can decide ahead of time to run it right after the block that produces the value. As such we'll remove a branching point from our inner loop.

This also lets us remove BROADCAST_INPUT special case logic when freezing and thawing thread execution for promises.
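The following sketch shows the general shape of planning the cast at build time. It is simplified: the input descriptors and `buildInputOps` are hypothetical, and the real storage of broadcast values in the engine is more involved than a plain string assignment.

```javascript
// `inputs` is a hypothetical list of {name, reporter} descriptors.
function buildInputOps(inputs) {
    const ops = [];
    for (const input of inputs) {
        // Op that runs the reporter and stores its value.
        ops.push(args => { args[input.name] = input.reporter(); });
        if (input.name === 'BROADCAST_INPUT') {
            // Tested once here, at build time, instead of on every block
            // execution: add a follow-up op that casts to a string.
            ops.push(args => { args[input.name] = String(args[input.name]); });
        }
    }
    return ops;
}

const ops = buildInputOps([
    {name: 'BROADCAST_INPUT', reporter: () => 42},
    {name: 'NUM', reporter: () => 7}
]);

const args = {};
for (const op of ops) op(args);
```

The run loop itself contains no BROADCAST_INPUT test; the extra op is simply present or absent in the set.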

Use a virtual machine block to report hat values

Use a virtual machine block to report stack click visual reports

Use a virtual machine block to report monitor results

Like BROADCAST_INPUT, we currently test every reporter and command block to see whether it is the lastOperation, which may need to report to one of these behaviours. Then, for command blocks (for which lastOperation is always true) and for some reporters (for which lastOperation may be true if they are returning a stack click or monitor value), we test the possible ways they need to report.

We can determine when these behaviours are needed ahead of time and turn the passive branching into planned actions. Hats can be determined when the operation is cached. Stack click reports and monitors can be determined when the thread is created and be set as the end block for the thread.
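As a sketch of turning that test into a planned action, the thread below picks its end block at creation time. The opcode names, the `createThread` options, and the `out` sink are assumptions for the example.

```javascript
// Hypothetical end blocks for a thread; each one reports in its own way.
const threadEndHandlers = {
    vm_report_stack_click: (thread, out) => {
        out.visualReports.push({blockId: thread.topBlock, value: thread.lastValue});
    },
    vm_report_monitor: (thread, out) => {
        out.monitorUpdates.push({id: thread.topBlock, value: thread.lastValue});
    },
    vm_end_of_thread: () => {}
};

// The reporting behaviour is chosen once, when the thread is created.
function createThread(topBlock, {stackClick = false, updateMonitor = false} = {}) {
    let endBlockId = 'vm_end_of_thread';
    if (updateMonitor) {
        endBlockId = 'vm_report_monitor';
    } else if (stackClick) {
        endBlockId = 'vm_report_stack_click';
    }
    return {topBlock, endBlockId, lastValue: undefined};
}

// Finishing a thread just runs its planned end block.
function finishThread(thread, out) {
    threadEndHandlers[thread.endBlockId](thread, out);
}

const out = {visualReports: [], monitorUpdates: []};

const clicked = createThread('reporter1', {stackClick: true});
clicked.lastValue = 10;
finishThread(clicked, out);

const plain = createThread('script1');
finishThread(plain, out);
```

An ordinary thread pays nothing for the reporting cases because its end block does not mention them.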

Stack click is a bit weird in this, though. This version tries to report the topBlock whether or not the last block executed in the thread was the topBlock. This likely isn't an issue, as the case where we report is stack clicking a reporter, and that will still function here. The weird part is stack clicking on a command block or a series of command blocks. We'll currently try to report the topBlock's value (which could be wrong if there was any non-running status), but this probably won't be an issue either, as command blocks only return undefined directly or promises that resolve to undefined. And hats, which do return values, are explicitly excluded in this change set.

Push vm_reenter_promise after the Promise resolves

Most of the thread from-a-promise-reentry logic is moved into a block. Most of this is the existing thread thawing behaviour with an added twist that temporarily modifies the operations set and sets continuous to false.

If the promise was for a command block we'll have the same thread popping behaviour, just later, when the thread would normally be executed. This is thanks to the vm_end_of_ virtual blocks. Those let us represent the popping behaviour in one place, so promise reentry doesn't need to know about it.

Store reported values in execute.js:handlePromise

This is a small change. It's the same block of code we already have, but in handlePromise itself.

Collectively through other changes reduce the required branching around executing a block function

The inner execute loop at this point is very small, which lets us execute blocks faster.

The remaining bit is profiling ... I want to move that, but the best case would be to split the execute function and call the right one from a newly exported execute function, or call the right inner function from sequencer.

The engine supports blocks that do not have defined functions.

- Add pointer member to thread. This is the current executing block.
- Add stackFrame member to thread. This is the current frame information like procedure parameters. This is a step potentially towards stack-less threads. With further modifications we could have stack and stackFrames be null if a script does not call a procedure.

Define some engine behaviour as blocks inserted by the engine to pop stacks, restore promise, or report hats, stack clicks, and monitors.

- Support BROADCAST_INPUT input reporters with a vm block to cast the string. This removes a constant if test.
- Build operations for all following command blocks. If the stack is not changed this is very fast. If it does change we'll operate at the previous speed while switching to the new set of command blocks.

kchadha

@mzgoddard, I think we're not quite ready to pull in these changes at the moment. In particular, we have some concerns about how the ongoing extensionification process will affect these changes (e.g. will this become the only use case for the existing code-path for core blocks?). Since I'm closing this PR, I've filed #2177 to track it, so we can find a later time to figure out pulling in these changes or parts of them.

remove empty procedure_definetion block

fc561f8

The engine supports blocks that do not have defined functions.

mzgoddard added pr - needs review performance labels Apr 25, 2019

mzgoddard requested review from paulkaplan and kchadha April 25, 2019 21:30

This was referenced Apr 25, 2019

Raise params to the next frame when pushing #2145

Merged

Thread always done #2148

Open

thisandagain assigned kchadha Apr 29, 2019

thisandagain added this to the May 2019 milestone Apr 29, 2019

kchadha assigned cwillisf Apr 30, 2019

mzgoddard force-pushed the vm-blocks branch from 1693542 to aeb8fa6 Compare April 30, 2019 21:33

mzgoddard added 3 commits April 30, 2019 17:51

add for-virtual-machine-only blocks

431487e

Define some engine behaviour as blocks inserted by the engine to pop stacks, restore promise, or report hats, stack clicks, and monitors.

mzgoddard force-pushed the vm-blocks branch from aeb8fa6 to 9216f29 Compare April 30, 2019 21:51

kchadha mentioned this pull request May 21, 2019

Review changes from closed performance PR #2177

Open

kchadha reviewed May 21, 2019

kchadha closed this May 21, 2019
