Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SIL: add a utility which can manage per-block bitfields and flags efficiently. #35521

Merged
merged 3 commits into from
Jan 22, 2021

Conversation

eeckstein
Copy link
Contributor

@eeckstein eeckstein commented Jan 20, 2021

The BasicBlockBitfield utility adds a custom bitfield to a function's basic blocks.

It can be used by transforms to store temporary flags or tiny values per basic block.
It is very efficient: no memory allocation is needed, no hash set or map is needed for lookup and there is no initialization cost (in contrast to BasicBlockData which needs to iterate over all blocks at initialization).

@eeckstein eeckstein changed the title [DNM] SIL: add a utility which can manage per-block bitfields and flags efficiently. SIL: add a utility which can manage per-block bitfields and flags efficiently. Jan 21, 2021
@eeckstein eeckstein requested a review from atrick January 21, 2021 09:28
@eeckstein eeckstein force-pushed the sil-bitfields branch 3 times, most recently from b7e3b89 to 797f66e Compare January 21, 2021 16:02
Copy link
Contributor

@atrick atrick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This design looks great if we actually had free space in the block. What's the SILBasicBlock size in bytes? It's important that it fits in a 64-byte cache line (the block index may have already broken that expectation)! One of the most important SIL design goals is that blocks are cheap. We need to quickly iterate over blocks, split all critical edges, and clone entire functions.

If SILBasicBlocks no longer fit in the same allocation block size, then this will be much more expensive than using per-function bitvectors, which are simpler and more flexible. Note that we should seldom need to recompute the number of blocks and non-growable bitvectors can be effectively constant-space-time, stack allocated things. The only time I've noticed the cost of initializing bitvectors is when they are allocated per-block, which doesn't apply here.

Although it may not help directly I think you should only use a 32-bit ID and just put a release-build-assert in the SILFunction that it's never surpassed. At least then we'll reserve some extra place for future block flags.

If the ArgumentList can be made a TinyPtrVector then this change (and the block index) might "break even". If you can keep blocks within 64-bytes, then I won't object. Although worry a bit that in the future we may want to add another field to blocks and using 32-bits for a bitfield ID isn't the best use of space.

/// DD and EEE are uninitialized
///
/// See also BasicBlockBitfield::bitFieldIndex;
uint64_t lastInitializedBitfieldIndex = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought this was referring to the last initialized "bit index". This should be called the bitfield "ID" and point out that it is a monotonically increasing ID, not an index at all.

/// If a block's lastInitializedBitfieldIndex is less than this index, it
/// means that the bits of that block are not initialized yet.
/// See also SILBasicBlock::lastInitializedBitfieldIndex
uint64_t bitFieldIndex;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not an index, it is a monotonically increasing ID

@eeckstein
Copy link
Contributor Author

I don't agree with the concern of exceeding 64-bytes (for several reasons). But we can discuss this offline.
Nevertheless, using TinyPtrVector for the arguments is a great idea!

@eeckstein
Copy link
Contributor Author

@swift-ci test

@eeckstein
Copy link
Contributor Author

@atrick I pushed a new version.
I'll change the argument list to a TinyPtrVector in a separate PR.

Copy link
Contributor

@atrick atrick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok looks good other than keeping nodes within a cache line. Can you confirm that we'll be at 64bytes with TinyVector? I just quickly eyeballed it

@atrick
Copy link
Contributor

atrick commented Jan 21, 2021

So @eeckstein informed me that SILBasicBlock was already just over a cache line, and cache alignment doesn't matter because we're using the bump pointer allocator, and using the same allocator for other types (I was thinking of issues we had with LLVM IR, especially MachineInstr)

@eeckstein eeckstein merged commit a0884ba into swiftlang:main Jan 22, 2021
@eeckstein eeckstein deleted the sil-bitfields branch January 22, 2021 07:39
/// (or in) local variables in transformations. But it's e.g. not possible to
/// store a BasicBlockBitfield in an Analysis.
/// * The total number of bits which are alive at the same time must not exceed
/// 32.
Copy link
Contributor

@gottesmm gottesmm Jan 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eeckstein Post-commit review.

Can you add some more docs here? Specifically, you never describe where the actual bits are and why the bitfield needs to follow a strict bitfield discipline. I would also like a reference in the document

/// Utility to add a custom bitfield to a function's basic blocks.
///
/// This can be used by transforms to store temporary flags or tiny values per
/// basic block. The memory managed is a 32 bit field within each basic block (\see BasicBlock:: customBits) and thus 
/// is very efficient: no memory allocation is needed, no hash set or map is
/// needed for lookup and there is no initialization cost (in contrast to
/// BasicBlockData which needs to iterate over all blocks at initialization).
///
/// Invariants:
/// * Each BasicBlockBitfield instance takes ownership of a range of bits from the basic block's spare
///    bits until destruction and thus those bits can not be reused by any other BasicBlockBitfield while the
///    instance is live.
/// * BasicBlockBitfield instances must be allocated and deallocated
///   following a strict stack discipline. This means, it's fine to use them as
///   (or in) local variables in transformations. But it's e.g. not possible to
///   store a BasicBlockBitfield in an Analysis.
/// * The total number of bits which are alive at the same time must not exceed
///   32 (the size of \see BasicBlock::customBits)
///
/// Some examples of utilities that use this are: ValueLifetimeAnalysis, ReachableBlocks, 

This would have given me all of the information that I needed to understand the code without having to search around. I found that I had to do that while reading the code.

That being said, I really like this change and see how it will save a bunch of compile time!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I improved the comment in #35583


// Fake SILFunction and SILBasicBlock.

struct SILFunction {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eeckstein this is my favorite part of this commit! We need to do this more often!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants