Add internal use only virtual machine blocks #2147

Closed
wants to merge 4 commits into from

mzgoddard
Contributor

@mzgoddard mzgoddard commented Apr 25, 2019

Resolves

Performance switching between blocks in Sequencer.stepThread and execute.

Proposed Changes

Depends on #2145.
Related to #2148.

  • Build a block operation set with all next blocks using a virtual machine block to step to the next one
  • Add for internal use only virtual machine blocks
  • Add Thread.STATUS_INTERRUPT
  • Add Thread.continuous
  • Store and call unbound block primitives with their context if available
  • Add a virtual machine block to handle the end of a stack frame for procedures, loop branches, non-loop branches, and threads
  • Add StackFrame.endBlockId
  • Skip profiling some opcodes (vm_*)
  • Use a virtual machine block to support the BROADCAST_INPUT block special case
  • Use a virtual machine block to report hat values
  • Use a virtual machine block to report stack click visual reports
  • Use a virtual machine block to report monitor results
  • Push vm_reenter_promise after the Promise resolves
  • Store reported values in execute.js:handlePromise
  • Collectively through other changes reduce the required branching around executing a block function

Reason for Changes

  • Build a block operation set with all next blocks using a virtual machine block to step to the next one

This is where the whole change set is heading, so let's start here. Back in June/July we changed reporter execution by building a set of all reporters plus their final command block (optional for stack clicks) and executing that set in a for loop instead of walking the reporters recursively. This change applies the same idea to command blocks.

With this, a command block and each following block build a set of _ops, which we can use for promise thawing, and a set of _allOps, which inlines the _ops of the block and of each next block. When the conditions are met, this set of block operations can be performed in one for loop. If the conditions fail, we safely fall back to the do-while loop containing the for loop. If that fails too, we can fall back to Sequencer.stepThread and ultimately Sequencer.stepThreads. In most cases we should be able to stay in execute and quickly execute a thread.
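As a rough illustration of the inlining described above, here is a minimal sketch. It is not the code in this PR: the block table shape, the `buildAllOps` helper, and the `vm_next` opcode name are all assumptions made for the example.

```javascript
// Hypothetical block table: each command block lists its own ops (reporter
// inputs followed by the command itself) and the id of the next block.
const blocks = {
    a: {ops: ['a.input', 'a'], next: 'b'},
    b: {ops: ['b'], next: 'c'},
    c: {ops: ['c.input', 'c'], next: null}
};

// Inline the ops of a block and of every following block. Between blocks we
// insert a hypothetical 'vm_next' op that steps the thread to the next block.
const buildAllOps = function (table, startId) {
    const allOps = [];
    for (let id = startId; id !== null; id = table[id].next) {
        allOps.push(...table[id].ops);
        if (table[id].next !== null) allOps.push('vm_next');
    }
    return allOps;
};

const allOps = buildAllOps(blocks, 'a');
console.log(allOps.join(' '));
```

Once the list is flat, the common case is a single for loop over `allOps`, and the per-block lists remain available for resuming after a promise.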

  • Add for internal use only virtual machine blocks

To make this possible, some of the virtual machine behaviour needs to be blocks. We can determine the block functions to perform ahead of time and execute them without having to passively test for when to do that virtual machine work.

Effectively at this point execute uses this and the block operation sets to compile the current block state for a target on demand (or just-in-time). And with the compiled sets in the blocks cache, they will continue to be disposed of when a block is changed on the target.

  • Add Thread.STATUS_INTERRUPT

To support fall back from the inner execution loops, we make use of Thread.status and a new value Thread.STATUS_INTERRUPT. We already need to test status in case it becomes YIELD, YIELD_TICK, PROMISE_WAIT, or DONE. Using that fact we can use INTERRUPT in the internal use only blocks to signal the inner loop to break and return to outer code paths.

Using INTERRUPT, internal blocks can tell execute to reconsider which block operation set to run, for example when another block starts a new branch. The branch blocks are not in the current execution set, so execute should break out of the inner loop and retrieve the new block operation set to execute.

  • Add Thread.continuous

One more new member and value works with Thread.status to control the execute loops. Thread.continuous indicates if a thread should continue to execute after a command block or operate one block at a time. The default is false for one at a time. Sequencer will set it temporarily to true to enable the fast execute loop.
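The way Thread.status and Thread.continuous steer the inner loop could be sketched as follows. This is a simplified model, not the loop in execute.js: the numeric status values, the `runOps` helper, and the demo ops are assumptions, and the sketch checks `continuous` after every op rather than only after command blocks.

```javascript
// Placeholder status values; these are not the numbers scratch-vm uses.
const STATUS_RUNNING = 0;
const STATUS_INTERRUPT = 1;

// Run ops in a tight loop. Each op is a function that receives the thread.
// Returns the index of the next op that has not run yet.
const runOps = function (thread, ops, start) {
    let i = start;
    while (i < ops.length) {
        ops[i](thread);
        i++;
        // Any non-running status (yield, promise wait, done, interrupt)
        // hands control back to the outer code paths.
        if (thread.status !== STATUS_RUNNING) break;
        // One-at-a-time mode: stop after a single op.
        if (!thread.continuous) break;
    }
    return i;
};

const log = [];
const demoOps = [
    () => { log.push('move'); },
    thread => { log.push('branch'); thread.status = STATUS_INTERRUPT; },
    () => { log.push('not reached in this pass'); }
];

// Fast path: the sequencer has set continuous to true.
const fastThread = {status: STATUS_RUNNING, continuous: true};
const stoppedAt = runOps(fastThread, demoOps, 0);

// Default path: continuous is false, so only one op runs.
const singleStepAt = runOps({status: STATUS_RUNNING, continuous: false}, demoOps, 0);
```

The loop only ever tests the status for "not running", so adding INTERRUPT costs no extra branch on the fast path.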

  • Store and call unbound block primitives with their context if available

A bound function has to be two function calls. The produced bound function calls the original function. A non-bound function called with Function.call is one function call. The difference of one function call can improve performance in the inner execution loop since many reporters (operator_*) are fairly cheap.
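For illustration, the two ways of storing a primitive look like this. The `Operators` class and the `op` record are hypothetical stand-ins for a block package and a cached operation.

```javascript
// Hypothetical primitive host, standing in for a block package instance.
class Operators {
    constructor () {
        this.calls = 0;
    }
    add (args) {
        this.calls++;
        return args.A + args.B;
    }
}
const operators = new Operators();

// Option 1: bind once. Every call goes through the bound wrapper, which
// then calls the original function.
const boundAdd = operators.add.bind(operators);

// Option 2: store the unbound function and its context side by side and
// invoke it with Function.prototype.call.
const op = {fn: Operators.prototype.add, context: operators};

const viaBound = boundAdd({A: 1, B: 2});
const viaCall = op.fn.call(op.context, {A: 1, B: 2});
```

Both calls see the same `this` and return the same value; how much the second form saves depends on the JavaScript engine.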

  • Add a virtual machine block to handle the end of a stack frame for procedures, loop branches, non-loop branches, and threads
  • Add StackFrame.endBlockId

We can move the branching for popping stacks from Sequencer and execute into virtual machine blocks. We can determine the blocks to be used ahead of time when a stack frame is pushed.

This lets us reduce the passive branch tests related to stack popping. I think popping with virtual machine blocks will follow easily as it mirrors the blocks that push the stack frames.
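A minimal sketch of picking the end block when a frame is pushed is shown below. Only StackFrame.endBlockId and the vm_end_of_ prefix come from this description; the full opcode names, the frame kinds, and the `pushFrame` and `nextPointer` helpers are assumptions.

```javascript
// Hypothetical opcode names sharing the vm_end_of_ prefix.
const END_BLOCK_FOR = {
    procedure: 'vm_end_of_procedure',
    loop: 'vm_end_of_loop_branch',
    branch: 'vm_end_of_branch',
    thread: 'vm_end_of_thread'
};

// The kind of frame is known at push time, so the block that will pop it
// can be chosen here instead of being tested for at pop time.
const pushFrame = function (thread, kind, firstBlockId) {
    thread.stackFrames.push({kind, endBlockId: END_BLOCK_FOR[kind]});
    thread.pointer = firstBlockId;
};

// When a script has no next block, continue at the frame's end block.
const nextPointer = function (thread, nextBlockId) {
    const frame = thread.stackFrames[thread.stackFrames.length - 1];
    return nextBlockId === null ? frame.endBlockId : nextBlockId;
};

const demoThread = {stackFrames: [], pointer: null};
pushFrame(demoThread, 'thread', 'whenFlagClicked');
pushFrame(demoThread, 'loop', 'moveSteps');
const afterLastBlock = nextPointer(demoThread, null);
```

Because the pop behaviour is itself a block, the code that runs blocks does not need to distinguish between a loop, a procedure, and the end of the thread.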

  • Skip profiling some opcodes (vm_*)

Profiling itself takes time away from executing more blocks. For internal behaviour I think skipping these blocks will help us make normal blocks perform better.

  • Use a virtual machine block to support the BROADCAST_INPUT block special case

BROADCAST_INPUT block inputs behave differently when they store their value. This is currently supported by testing, for every block, whether its output will be stored as BROADCAST_INPUT. By making the needed behaviour (casting to a string) a block behaviour, we can decide ahead of time to run it right after the block that produces the value. As such we'll remove a branching point from our inner loop.

This also lets us remove BROADCAST_INPUT special case logic when freezing and thawing thread execution for promises.
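Moving the BROADCAST_INPUT check from run time to planning time might look like the sketch below. The op record shape, the `planInputOps` helper, and the `vm_cast_string` opcode name are hypothetical.

```javascript
// Hypothetical op records: `opcode` names the function to run and `input`
// names where the parent block expects the value.
const planInputOps = function (reporterOpcode, inputName) {
    const ops = [{opcode: reporterOpcode, input: inputName}];
    if (inputName === 'BROADCAST_INPUT') {
        // Decided once, when the ops are cached, not for every executed block.
        ops.push({opcode: 'vm_cast_string', input: inputName});
    }
    return ops;
};

// The cast block itself only converts the reported value to a string.
const vmCastString = value => String(value);

const broadcastOps = planInputOps('operator_join', 'BROADCAST_INPUT');
const numberOps = planInputOps('operator_add', 'NUM1');
```

Only reporters that feed a broadcast input pay for the cast; every other block runs without the test.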

  • Use a virtual machine block to report hat values
  • Use a virtual machine block to report stack click visual reports
  • Use a virtual machine block to report monitor results

Like BROADCAST_INPUT, we currently test every reporter and command block to see if it is the lastOperation, which may need to report to one of these behaviours. Then, for command blocks (for which lastOperation is always true) and some reporters (for which lastOperation may be true if they are returning a stack click or monitor value), we test the possible ways they need to report.

We can determine when these behaviours are needed ahead of time and turn the passive branching into planned actions. Hats can be determined when the operation is cached. Stack click reports and monitors can be determined when the thread is created and be set as the end block for the thread.

Stack click is a bit weird in this, though. This version tries to report the topBlock whether or not the last block executed in the thread was the topBlock. This likely isn't an issue, as the case where we report is stack clicking a reporter, and that will still function here. The weird part is stack clicking on a command block or a series of command blocks. We'll currently try to report the topBlock's value (which could be wrong if there was any non-running status), but this probably won't be an issue either, as command blocks only return promises that resolve to undefined, or undefined directly. And hats, which do return values, are explicitly excluded in this change set.

  • Push vm_reenter_promise after the Promise resolves

Most of the thread from-a-promise-reentry logic is moved into a block. Most of this is the existing thread thawing behaviour with an added twist that temporarily modifies the operations set and sets continuous to false.

If the promise was for a command block we'll have the same thread popping behaviour, just later, when the thread would normally be executed. This is thanks to the vm_end_of_ virtual blocks. Those let us represent the popping behaviour in one place, so promise reentry doesn't need to know about it.

  • Store reported values in execute.js:handlePromise

This is a small change. It's the same block we already have but in handlePromise itself.

  • Collectively through other changes reduce the required branching around executing a block function

The inner execute loop at this point is very small, which lets us execute blocks faster.

The remaining bit is profiling ... I want to move that, but the best case would be to split the execute function and call the right one from a newly exported execute function, or call the right inner function from sequencer.

The engine supports blocks that do not have defined functions.
- Add pointer member to thread. This is the current executing block.
- Add stackFrame member to thread. This is the current frame
  information like procedure parameters.

This is a step potentially towards stack-less threads. With further
modifications we could have stack and stackFrames be null if a script
does not call a procedure.

Define some engine behaviour as blocks inserted by the engine to pop
stacks, restore promise, or report hats, stack clicks, and monitors.
- Support BROADCAST_INPUT input reporters with a vm block to cast the
  string. This removes a constant if test.
- Build operations for all following command blocks. If the stack is
  not changed this is very fast. If it does change we'll operate at the
  previous speed while switching to the new set of command blocks.
Contributor

@kchadha kchadha left a comment

@mzgoddard, I think we're not quite ready to pull in these changes at the moment. In particular, we have some concerns about how the ongoing extensionification process will affect these changes (e.g. will this become the only use case for the existing code path for core blocks?). Since I'm closing this PR, I've filed #2177 to track it, so we can find a later time to figure out pulling in these changes or parts of them.

@kchadha kchadha closed this May 21, 2019

4 participants