Block availability data enum #6866
base: unstable
Conversation
@@ -204,46 +185,62 @@ impl<E: EthSpec> PendingComponents<E> {
///
/// Returns `true` if both the block exists and the number of received blobs / custody columns
/// matches the number of expected blobs / custody columns.
The docs above need an update.
let num_received_blobs = self.num_received_blobs();
let num_received_columns = self.num_received_data_columns();
if spec.is_peer_das_enabled_for_epoch(block.epoch()) {
    if self.verified_data_columns.len() >= custody_column_count {
We've discussed this before: we don't allow duplicates when adding columns to the available cache, so if `len()` is greater than the custody column count, we have likely added a non-custody column to the cache, and this would result in:
- inconsistent columns in the DB
- incorrectly identifying a block as available when it isn't (we got some non-custody columns, but custody columns may still be missing)

With the current approach, we must be able to detect whether there's a bug here.
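For example, a minimal sketch of a stricter check (the error variant follows the `AvailabilityCheckError::Unexpected` suggestion discussed below; exact names are illustrative):

```rust
// Sketch only: duplicates are rejected on insert, so exceeding the custody
// count means a non-custody column slipped into the cache.
let num_received_columns = self.verified_data_columns.len();
if num_received_columns > custody_column_count {
    return Err(AvailabilityCheckError::Unexpected);
}
let all_custody_columns_received = num_received_columns == custody_column_count;
```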
"received_columns" => num_received_columns, | ||
"expected_columns" => expected_columns_msg, | ||
); | ||
// TODO(das) get only the columns that we need, filter the rest |
Is this filter necessary with the current approach? Is this `TODO` meant to account for peer sampling? It also feels inconsistent in the context of subnet sampling, because we could have enough `verified_data_columns` but end up storing fewer than `custody_column_count`.
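For reference, a sketch of what such a filter might look like (treating `custody_columns` as a set of our custody indices is an assumption, not an actual field):

```rust
// Sketch: keep only columns whose index is in our custody set before storing.
let columns_to_store: Vec<_> = verified_data_columns
    .into_iter()
    .filter(|column| custody_columns.contains(&column.index))
    .collect();
```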
let all_columns_received = expected_columns_opt
    .is_some_and(|num_expected_columns| num_expected_columns == num_received_columns);
if let Some(data_columns_recv) = self.data_column_recv.take() {
Nice, the ordering is good: if we get enough gossip columns while computing, we still make the block available rather than waiting for the computation.
} else {
    // Before PeerDAS, blobs
    let num_received_blobs = self.verified_blobs.iter().flatten().count();
    if num_received_blobs >= num_expected_blobs {
It's definitely a bug if `>` happens; I'm not sure we should just make the block available and store the "uncommitted blobs".
I think the previous handling is good: if there's an unexpected blob, there should be an error, e.g. `AvailabilityCheckError::Unexpected`.
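i.e. a sketch of the strict check (field names taken from the surrounding diff; the early return is illustrative):

```rust
// Sketch: treat receiving more blobs than the block commits to as a bug
// rather than silently making the block available.
let num_received_blobs = self.verified_blobs.iter().flatten().count();
if num_received_blobs > num_expected_blobs {
    return Err(AvailabilityCheckError::Unexpected);
}
let all_blobs_received = num_received_blobs == num_expected_blobs;
```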
verify_kzg_for_blob_list(blob_list.iter(), &self.kzg)
    .map_err(AvailabilityCheckError::InvalidBlobs)?;
Ok(MaybeAvailableBlock::Available(AvailableBlock {
    block_root,
    block,
    blobs,
    data: AvailableBlockData::Blobs(blob_list),
I don't think it's an issue, but I just want to point out that we'll be using `AvailableBlockData::Blobs` here instead of `AvailableBlockData::NoData` when the list is empty.
Oh, I think we'd actually add a `StoreOp` in this scenario, so we might end up storing an empty value.
"block_root" => %block_root, | ||
"count" => blobs.len(), | ||
); | ||
Ok(Some(StoreOp::PutBlobs(block_root, blobs))) |
Should we keep the `is_empty` check for both blobs and columns so we don't store empty values?
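For example (a sketch; assuming the surrounding function returns `Result<Option<StoreOp<E>>, _>` as the diff above suggests):

```rust
// Sketch: skip the store op entirely when there is nothing to persist.
if blobs.is_empty() {
    return Ok(None);
}
Ok(Some(StoreOp::PutBlobs(block_root, blobs)))
```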
DataColumns(DataColumnSidecarList<E>),
/// An optional receiver for `DataColumnSidecarList`.
///
/// This field is `Some` when data columns are being computed asynchronously.
The docs need an update; this is no longer an `Option`.
pub fn has_blobs(&self) -> bool {
    match self.data {
        AvailableBlockData::NoData => false,
        AvailableBlockData::Blobs(_) => true,
As mentioned earlier in the RPC block path, we might end up using this variant for an empty list (a possible tweak is sketched after the excerpt below):
lighthouse/beacon_node/network/src/sync/block_sidecar_coupling.rs
Lines 117 to 134 in 0572729
let mut blobs_buffer = vec![None; self.max_blobs_per_block];
for blob in blob_list {
    let blob_index = blob.index as usize;
    let Some(blob_opt) = blobs_buffer.get_mut(blob_index) else {
        return Err("Invalid blob index".to_string());
    };
    if blob_opt.is_some() {
        return Err("Repeat blob index".to_string());
    } else {
        *blob_opt = Some(blob);
    }
}
let blobs = RuntimeVariableList::new(
    blobs_buffer.into_iter().flatten().collect::<Vec<_>>(),
    self.max_blobs_per_block,
)
.map_err(|_| "Blobs returned exceeds max length".to_string())?;
responses.push(RpcBlock::new(None, block, Some(blobs)).map_err(|e| format!("{e:?}"))?)
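One way to avoid the empty-list case would be to pass `None` when no blobs were returned (a sketch, and assuming `RuntimeVariableList` exposes `is_empty`; not necessarily the right fix):

```rust
// Sketch: map an empty blob list to `None` so downstream code can pick
// `NoData` instead of `Blobs` holding an empty list.
let blobs_opt = if blobs.is_empty() { None } else { Some(blobs) };
responses.push(RpcBlock::new(None, block, blobs_opt).map_err(|e| format!("{e:?}"))?)
```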
// TODO(das): Once the columns are received, they will not be available in
// the early attester cache. If someone does a query to us via RPC we
// will get downscored.
AvailableBlockData::DataColumnsRecv(_) => (None, None),
Good catch - this is an existing issue. I'll raise an issue tomorrow
I think there's only a small window where the downscoring can happen: once we finish computing, we'll be able to serve from the DB.
Nice, a very elegant solution, and much cleaner. I've added some comments; the main questions I have are about the DA checker `is_available` checks and the edge-case handling.
Issue Addressed
PeerDAS has undergone multiple refactors, and blending it with the `get_blobs` optimization has generated technical debt.
A function signature like this
lighthouse/beacon_node/beacon_chain/src/beacon_chain.rs
Lines 7171 to 7178 in f008b84
Allows at least 2^5 combinations of states. In reality, we don't have that many possible states, only:

- `NoData`: pre-Deneb, pre-PeerDAS with 0 blobs, or post-PeerDAS with 0 blobs
- `Blobs(BlobSidecarList<E>)`: post-Deneb, pre-PeerDAS, with > 0 blobs
- `DataColumns(DataColumnSidecarList<E>)`: post-PeerDAS with > 0 blobs
- `DataColumnsRecv(oneshot::Receiver<DataColumnSidecarList<E>>)`: post-PeerDAS with > 0 blobs, but we obtained the columns via reconstruction

^ these are the variants of the new `AvailableBlockData` enum.

So we go from 2^5 states to 4 well-defined ones. Downstream code benefits nicely from this clarity, and I think it makes the whole feature much more maintainable.
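For reference, a sketch of the enum per the variants above (doc comments condensed; derives and exact docs elided):

```rust
// Sketch of the new enum, matching the four variants listed above.
pub enum AvailableBlockData<E: EthSpec> {
    /// Pre-Deneb, pre-PeerDAS with 0 blobs, or post-PeerDAS with 0 blobs.
    NoData,
    /// Post-Deneb, pre-PeerDAS, with > 0 blobs.
    Blobs(BlobSidecarList<E>),
    /// Post-PeerDAS with > 0 blobs.
    DataColumns(DataColumnSidecarList<E>),
    /// Post-PeerDAS with > 0 blobs, columns obtained via reconstruction.
    DataColumnsRecv(oneshot::Receiver<DataColumnSidecarList<E>>),
}
```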
Currently `is_available` returns a bool, and then we construct the available block in `make_available`; in a way, the availability condition is duplicated across both functions. Instead, this PR constructs `AvailableBlockData` in `is_available`, so the availability conditions are written once.
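A minimal toy of the pattern (all types and fields are simplified stand-ins, not the actual Lighthouse structures):

```rust
// Toy sketch of the refactor's shape: the availability condition is written
// once and yields the data directly, instead of a bool that a separate
// `make_available` step has to re-derive.
enum AvailableBlockData {
    NoData,
    Blobs(Vec<u8>),
}

struct PendingComponents {
    block_received: bool,
    blobs: Vec<u8>,
    num_expected_blobs: usize,
}

impl PendingComponents {
    fn is_available(&self) -> Option<AvailableBlockData> {
        if !self.block_received {
            return None;
        }
        if self.num_expected_blobs == 0 {
            Some(AvailableBlockData::NoData)
        } else if self.blobs.len() == self.num_expected_blobs {
            Some(AvailableBlockData::Blobs(self.blobs.clone()))
        } else {
            None
        }
    }
}
```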