
Block availability data enum #6866

Open · wants to merge 6 commits into unstable
Conversation

dapplion (Collaborator) commented:

Issue Addressed

PeerDAS has undergone multiple refactors, and blending it with the `get_blobs` optimization has generated technical debt.

A function signature like this

```rust
fn get_blobs_or_columns_store_op(
    &self,
    block_root: Hash256,
    block_epoch: Epoch,
    blobs: Option<BlobSidecarList<T::EthSpec>>,
    data_columns: Option<DataColumnSidecarList<T::EthSpec>>,
    data_column_recv: Option<oneshot::Receiver<DataColumnSidecarList<T::EthSpec>>>,
) -> Result<Option<StoreOp<T::EthSpec>>, String> {
```

allows at least the following combinations of states:

  • blobs: Some / None
  • data_columns: Some / None
  • data_column_recv: Some / None
  • Block has data? Yes / No
  • Block post-PeerDAS? Yes / No

In reality, we don't have that many possible states, only:

  • NoData: pre-Deneb, post-Deneb pre-PeerDAS with 0 blobs, or post-PeerDAS with 0 blobs
  • Blobs(BlobSidecarList<E>): post-Deneb pre-PeerDAS with > 0 blobs
  • DataColumns(DataColumnSidecarList<E>): post-PeerDAS with > 0 blobs
  • DataColumnsRecv(oneshot::Receiver<DataColumnSidecarList<E>>): post-PeerDAS with > 0 blobs, but we obtained the columns via reconstruction

These are the variants of the new `AvailableBlockData` enum, sketched below.
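
For reference, a minimal sketch of the enum assembled from the variant list above; the variant names and payload types come from the PR description, while derives, visibility, and exact doc comments are assumptions:

```rust
// Sketch only: variant names and payload types are taken from the PR
// description; placement and attributes in the actual PR may differ.
pub enum AvailableBlockData<E: EthSpec> {
    /// Pre-Deneb, or post-Deneb / post-PeerDAS with 0 blobs.
    NoData,
    /// Post-Deneb, pre-PeerDAS, with > 0 blobs.
    Blobs(BlobSidecarList<E>),
    /// Post-PeerDAS with > 0 blobs.
    DataColumns(DataColumnSidecarList<E>),
    /// Post-PeerDAS with > 0 blobs, columns obtained via reconstruction.
    DataColumnsRecv(oneshot::Receiver<DataColumnSidecarList<E>>),
}
```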

So we go from 2^5 possible states to 4 well-defined ones. Downstream code benefits nicely from this clarity, and I think it makes the whole feature much more maintainable.

Currently `is_available` returns a bool, and then we construct the available block in `make_available`. In a way, the availability condition is duplicated in both functions. Instead, this PR constructs `AvailableBlockData` in `is_available`, so the availability conditions are written only once:

```rust
if let Some(block_data) = is_available(..) {
    let available_block = make_available(block_data);
}
```
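
For illustration, a hypothetical downstream consumer (not code from this PR) shows why four well-defined variants beat three independent `Option`s:

```rust
// Hypothetical consumer: every legal state is one match arm, and illegal
// combinations (e.g. blobs AND columns together) cannot be expressed at all.
fn describe<E: EthSpec>(data: &AvailableBlockData<E>) -> &'static str {
    match data {
        AvailableBlockData::NoData => "no data to store",
        AvailableBlockData::Blobs(_) => "store blob sidecars",
        AvailableBlockData::DataColumns(_) => "store custody columns",
        AvailableBlockData::DataColumnsRecv(_) => "store columns once computed",
    }
}
```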

@dapplion dapplion added the das Data Availability Sampling label Jan 25, 2025
@dapplion dapplion added the ready-for-review The code is ready for review label Jan 25, 2025
@@ -204,46 +185,62 @@ impl<E: EthSpec> PendingComponents<E> {
///
/// Returns `true` if both the block exists and the number of received blobs / custody columns
/// matches the number of expected blobs / custody columns.
Member:

docs above need update

let num_received_blobs = self.num_received_blobs();
let num_received_columns = self.num_received_data_columns();
if spec.is_peer_das_enabled_for_epoch(block.epoch()) {
    if self.verified_data_columns.len() >= custody_column_count {
Member:

We've discussed this before - we don't allow duplicates when adding columns to the availability cache, so if len() is greater than the custody column count, we have likely added a non-custody column to the cache, and this would result in:

  1. inconsistent columns in the DB
  2. incorrectly identifying a block as available when it isn't (we got some non-custody columns, but custody columns may still be missing)

With the current approach, we must be able to detect if there's a bug here.
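
A minimal sketch of the stricter check this comment argues for; the helper name is hypothetical, and `AvailabilityCheckError::Unexpected` is borrowed from a later comment in this thread, so treat the exact error as an assumption:

```rust
use std::cmp::Ordering;

// Since duplicates are rejected on insert, holding more columns than
// expected can only mean a non-custody column leaked into the cache.
fn check_custody_columns(
    received: usize,
    expected: usize,
) -> Result<bool, AvailabilityCheckError> {
    match received.cmp(&expected) {
        Ordering::Less => Ok(false), // still waiting on columns
        Ordering::Equal => Ok(true), // all custody columns received
        Ordering::Greater => Err(AvailabilityCheckError::Unexpected), // bug
    }
}
```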

"received_columns" => num_received_columns,
"expected_columns" => expected_columns_msg,
);
// TODO(das) get only the columns that we need, filter the rest
Member:

Is this filter necessary with the current approach?
Is this TODO to account for peer sampling?
It feels inconsistent in the context of subnet sampling, though - because we could have enough verified_data_columns but end up storing fewer than custody_column_count?
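
One possible shape for that TODO, filtering to the custody set before storing; `custody_column_indices` is a hypothetical name (e.g. a `HashSet<u64>`), not an identifier from this PR:

```rust
// Keep only columns in our custody set; anything else is dropped before
// the store op, so at most `custody_column_count` columns are persisted.
let custody_columns: Vec<_> = verified_data_columns
    .into_iter()
    .filter(|column| custody_column_indices.contains(&column.index))
    .collect();
```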


let all_columns_received = expected_columns_opt
    .is_some_and(|num_expected_columns| num_expected_columns == num_received_columns);
if let Some(data_columns_recv) = self.data_column_recv.take() {
Member:

Nice, the ordering is good - so if we get enough gossip columns while computing, we still make the block available and don't wait for the computation.

} else {
    // Before PeerDAS, blobs
    let num_received_blobs = self.verified_blobs.iter().flatten().count();
    if num_received_blobs >= num_expected_blobs {
Member:

It's definitely a bug if > happens; I'm not sure if we should just make it available and store the "uncommitted blobs".

Member:

I think the previous handling is good - and if there's an unexpected blob there should be an error, AvailabilityCheckError::Unexpected.
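
A sketch of that suggestion; the error variant name comes from this comment, while the surrounding shape (variable names from the fragment above) is assumed:

```rust
// Treat a surplus of blobs as a bug rather than silently making the
// block available with "uncommitted blobs".
if num_received_blobs > num_expected_blobs {
    return Err(AvailabilityCheckError::Unexpected);
}
let all_blobs_received = num_received_blobs == num_expected_blobs;
```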

verify_kzg_for_blob_list(blob_list.iter(), &self.kzg)
    .map_err(AvailabilityCheckError::InvalidBlobs)?;
Ok(MaybeAvailableBlock::Available(AvailableBlock {
    block_root,
    block,
    blobs,
    data: AvailableBlockData::Blobs(blob_list),
Member:

I don't think it's an issue, but I just want to point out that we'll be using the AvailableBlockData::Blobs variant here instead of AvailableBlockData::NoData when the list is empty.

Member:

Oh, I think we'd actually add a StoreOp in this scenario, so we might end up storing an empty value.
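
One way to avoid the empty-list variant both comments point at is to normalize at construction time; a sketch, not the PR's code:

```rust
// Map an empty list to NoData so `Blobs` always means > 0 blobs,
// matching the variant's documented meaning.
let data = if blob_list.is_empty() {
    AvailableBlockData::NoData
} else {
    AvailableBlockData::Blobs(blob_list)
};
```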

"block_root" => %block_root,
"count" => blobs.len(),
);
Ok(Some(StoreOp::PutBlobs(block_root, blobs)))
Member:

Should we keep the is_empty check for both blobs and columns so we don't store empty values?
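
A sketch of the guard being asked for here; the placement inside `get_blobs_or_columns_store_op` is assumed:

```rust
// Skip the store op entirely for an empty list so no empty value is
// ever written to the database.
if blobs.is_empty() {
    return Ok(None);
}
Ok(Some(StoreOp::PutBlobs(block_root, blobs)))
```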

DataColumns(DataColumnSidecarList<E>),
/// An optional receiver for `DataColumnSidecarList`.
///
/// This field is `Some` when data columns are being computed asynchronously.
Member:

Docs need an update; this is no longer an Option.

pub fn has_blobs(&self) -> bool {
    match self.data {
        AvailableBlockData::NoData => false,
        AvailableBlockData::Blobs(_) => true,
Member:

As mentioned earlier in the RPC block path, we might end up using this variant for an empty list.

// Place each blob at its index in a fixed-size buffer, rejecting
// out-of-range and duplicate indices.
let mut blobs_buffer = vec![None; self.max_blobs_per_block];
for blob in blob_list {
    let blob_index = blob.index as usize;
    let Some(blob_opt) = blobs_buffer.get_mut(blob_index) else {
        return Err("Invalid blob index".to_string());
    };
    if blob_opt.is_some() {
        return Err("Repeat blob index".to_string());
    } else {
        *blob_opt = Some(blob);
    }
}
// Collapse the buffer into a dense, length-checked list.
let blobs = RuntimeVariableList::new(
    blobs_buffer.into_iter().flatten().collect::<Vec<_>>(),
    self.max_blobs_per_block,
)
.map_err(|_| "Blobs returned exceeds max length".to_string())?;
responses.push(RpcBlock::new(None, block, Some(blobs)).map_err(|e| format!("{e:?}"))?)

// TODO(das): Once the columns are received, they will not be available in
// the early attester cache. If someone does a query to us via RPC we
// will get downscored.
AvailableBlockData::DataColumnsRecv(_) => (None, None),
Member:

Good catch - this is an existing issue. I'll raise an issue tomorrow.

Member:

I think there's only a little window where the downscoring can happen - once we finish computing, we'll be able to serve from the DB.

@jimmygchen jimmygchen added waiting-on-author The reviewer has suggested changes and awaits their implementation. and removed ready-for-review The code is ready for review labels Jan 30, 2025
@jimmygchen (Member) left a comment:

Nice, very elegant solution, and much cleaner.
I've added some comments; the main questions I have are about the DA checker's is_available checks and edge-case handling.

Labels: das (Data Availability Sampling), waiting-on-author (The reviewer has suggested changes and awaits their implementation.)