-
Notifications
You must be signed in to change notification settings - Fork 10.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SIL: add a utility which can manage per-block bitfields and flags efficiently. #35521
Conversation
9a3715e
to
8ac34e2
Compare
b7e3b89
to
797f66e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This design looks great if we actually had free space in the block. What's the SILBasicBlock size in bytes? It's important that it fits in a 64-byte cache line (the block index may have already broken that expectation)! One of the most important SIL design goals is that blocks are cheap. We need to quickly iterate over blocks, split all critical edges, and clone entire functions.
If SILBasicBlocks no longer fit in the same allocation block size, then this will be much more expensive than using per-function bitvectors, which are simpler and more flexible. Note that we should seldom need to recompute the number of blocks and non-growable bitvectors can be effectively constant-space-time, stack allocated things. The only time I've noticed the cost of initializing bitvectors is when they are allocated per-block, which doesn't apply here.
Although it may not help directly I think you should only use a 32-bit ID and just put a release-build-assert in the SILFunction that it's never surpassed. At least then we'll reserve some extra place for future block flags.
If the ArgumentList can be made a TinyPtrVector
then this change (and the block index) might "break even". If you can keep blocks within 64-bytes, then I won't object. Although worry a bit that in the future we may want to add another field to blocks and using 32-bits for a bitfield ID isn't the best use of space.
include/swift/SIL/SILBasicBlock.h
Outdated
/// DD and EEE are uninitialized | ||
/// | ||
/// See also BasicBlockBitfield::bitFieldIndex; | ||
uint64_t lastInitializedBitfieldIndex = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought this was referring to the last initialized "bit index". This should be called the bitfield "ID" and point out that it is a monotonically increasing ID, not an index at all.
include/swift/SIL/SILBitfield.h
Outdated
/// If a block's lastInitializedBitfieldIndex is less than this index, it | ||
/// means that the bits of that block are not initialized yet. | ||
/// See also SILBasicBlock::lastInitializedBitfieldIndex | ||
uint64_t bitFieldIndex; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not an index, it is a monotonically increasing ID
I don't agree with the concern of exceeding 64-bytes (for several reasons). But we can discuss this offline. |
…iciently. It is very efficient: no memory allocation is needed an initialization is at zero cost.
797f66e
to
3e8612b
Compare
@swift-ci test |
@atrick I pushed a new version. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok looks good other than keeping nodes within a cache line. Can you confirm that we'll be at 64bytes with TinyVector? I just quickly eyeballed it
So @eeckstein informed me that SILBasicBlock was already just over a cache line, and cache alignment doesn't matter because we're using the bump pointer allocator, and using the same allocator for other types (I was thinking of issues we had with LLVM IR, especially MachineInstr) |
/// (or in) local variables in transformations. But it's e.g. not possible to | ||
/// store a BasicBlockBitfield in an Analysis. | ||
/// * The total number of bits which are alive at the same time must not exceed | ||
/// 32. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@eeckstein Post-commit review.
Can you add some more docs here? Specifically, you never describe where the actual bits are and why the bitfield needs to follow a strict bitfield discipline. I would also like a reference in the document
/// Utility to add a custom bitfield to a function's basic blocks.
///
/// This can be used by transforms to store temporary flags or tiny values per
/// basic block. The memory managed is a 32 bit field within each basic block (\see BasicBlock:: customBits) and thus
/// is very efficient: no memory allocation is needed, no hash set or map is
/// needed for lookup and there is no initialization cost (in contrast to
/// BasicBlockData which needs to iterate over all blocks at initialization).
///
/// Invariants:
/// * Each BasicBlockBitfield instance takes ownership of a range of bits from the basic block's spare
/// bits until destruction and thus those bits can not be reused by any other BasicBlockBitfield while the
/// instance is live.
/// * BasicBlockBitfield instances must be allocated and deallocated
/// following a strict stack discipline. This means, it's fine to use them as
/// (or in) local variables in transformations. But it's e.g. not possible to
/// store a BasicBlockBitfield in an Analysis.
/// * The total number of bits which are alive at the same time must not exceed
/// 32 (the size of \see BasicBlock::customBits)
///
/// Some examples of utilities that use this are: ValueLifetimeAnalysis, ReachableBlocks,
This would have given me all of the information that I needed to understand the code without having to search around. I found that I had to do that while reading the code.
That being said, I really like this change and see how it will save a bunch of compile time!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I improved the comment in #35583
|
||
// Fake SILFunction and SILBasicBlock. | ||
|
||
struct SILFunction { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@eeckstein this is my favorite part of this commit! We need to do this more often!
The BasicBlockBitfield utility adds a custom bitfield to a function's basic blocks.
It can be used by transforms to store temporary flags or tiny values per basic block.
It is very efficient: no memory allocation is needed, no hash set or map is needed for lookup and there is no initialization cost (in contrast to BasicBlockData which needs to iterate over all blocks at initialization).