Set f as a configuration parameter #547

Open
@hesusruiz

Description

First, congratulations on a fantastic library (which for some reason flew under my radar until recently).

I would like to open a PR for some functionality, but first I would like your opinion on whether it makes sense and whether it would affect safety or liveness. Thanks in advance.

All PBFT-based implementations that I know of derive f (the maximum number of faulty replicas) from n (the total number of replicas), and this library is no different (in computeQuorum(n)). This is fine if you want to maximise resiliency against failures for a given cost of operating replicas (a given n), or alternatively minimise the cost of operating replicas for a given resiliency against Byzantine failures.
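
For reference, the derivation I have in mind is sketched below. This is only my reading of the classic PBFT bound n >= 3f + 1, not a copy of the library's computeQuorum(n); the name max_faulty is made up for illustration.

```python
def max_faulty(n: int) -> int:
    """Largest f that satisfies n >= 3f + 1 (the classic PBFT bound).

    Hypothetical helper for illustration; not the library's code.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return (n - 1) // 3


# Show how f follows mechanically from n for a few cluster sizes.
for n in (4, 7, 10, 21):
    print(f"n={n} -> f={max_faulty(n)}")
```

The point is that f is a pure function of n here, so there is no way to ask for a smaller f while keeping n large.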

But in our use case, we have many volunteer organisations (both private companies and public administrations), each willing to operate a replica, and the cost of operation is not a problem. We would like to set the desired resiliency f independently of n (subject only to the constraint n >= 3f + 1).

So, for example, if we have a total of 21 replicas, instead of tolerating 6 Byzantine failures we would like to tolerate "only" 3, while still operating 21 replicas. In our configuration tolerating 3 Byzantine failures, a quorum would require 7 replicas (2f+1) to ensure that two quorums overlap in at least one correct replica. By the way, this configuration would tolerate 10 crash (CFT) failures instead of just the 6 tolerated by the current library.
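
To make these numbers easy to check, here is a small sketch of the quorum-intersection arithmetic as I understand it. The helper names are hypothetical and nothing here is taken from the library. It assumes the standard argument that two quorums of size q out of n replicas share at least 2q - n replicas, so the overlap is guaranteed to contain a correct replica when 2q - n >= f + 1. Under that assumption, 2f + 1 is sufficient only when n = 3f + 1; for n = 21 and f = 3 the bound comes out at 13 rather than 7, so the figures above should be read with that caveat.

```python
def overlap_lower_bound(n: int, q: int) -> int:
    """Minimum number of replicas shared by any two quorums of size q."""
    return max(0, 2 * q - n)


def min_safe_quorum(n: int, f: int) -> int:
    """Smallest q with 2q - n >= f + 1, i.e. ceil((n + f + 1) / 2)."""
    return (n + f + 2) // 2


n = 21
for f in (3, 6):
    q = min_safe_quorum(n, f)
    naive = 2 * f + 1
    print(
        f"n={n} f={f}: intersection-safe quorum={q}, "
        # n - q replicas can be unreachable while a quorum can still form.
        f"replicas that may be down={n - q}, "
        f"overlap guaranteed by 2f+1={naive} is {overlap_lower_bound(n, naive)}"
    )
```

If this reasoning is right, lowering f still reduces the quorum size and increases the number of replicas that may be down, but by less than the 2f+1 rule would suggest.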

The changes needed to make f configurable and to modify computeQuorum(n) accordingly seem easy, but the important question is whether this would break the implementation in some way.
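
As a rough illustration of the kind of change I mean, here is a sketch written in Python for brevity. The signature, the optional f parameter and the quorum formula ceil((n + f + 1) / 2) are all my assumptions, not the library's actual API or behaviour.

```python
from typing import Optional, Tuple


def compute_quorum(n: int, f: Optional[int] = None) -> Tuple[int, int]:
    """Return (quorum, f) for n replicas.

    Hypothetical sketch: if f is None it is derived from n (assumed to be
    the existing behaviour); otherwise the configured f is validated
    against n >= 3f + 1 and used as given.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    max_f = (n - 1) // 3
    if f is None:
        f = max_f
    elif f < 0 or f > max_f:
        raise ValueError(f"f={f} violates n >= 3f + 1 for n={n}")
    # Integer form of ceil((n + f + 1) / 2): any two quorums then share
    # at least f + 1 replicas, hence at least one correct replica.
    quorum = (n + f + 2) // 2
    return quorum, f


print(compute_quorum(21))     # f derived from n
print(compute_quorum(21, 3))  # f configured explicitly
```

Keeping the derived value as the default would leave existing deployments unchanged, and rejecting any f above (n - 1) / 3 keeps the constraint n >= 3f + 1 in one place.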

If you are curious about why on earth we would want to "reduce" the resiliency of the network when we are already bearing the cost of operating a large number of replicas, I can elaborate. In summary, we have been operating a decentralised BFT network in production for 3 years, and it may soon become much bigger (national scale). We want reasonable BFT resiliency, high CFT resiliency (crashes being the most common failure we have experienced), and a large number of different entities operating the network collaboratively (there should not be any operator at all).

So our limit for n would be determined by the protocol overhead, not by the desired level of BFT resiliency.
