In a distributed system, all the network participants must agree on a common state of blockchain. State of a blockchain at any time T is the account balances of all the addresses. Agreement of majority of Validators on the state of blockchain is referred to as Consensus.
The consensus process must be as decentralized as possible. In this article we will discuss the consensus mechanism of Polkadot in detail. For the sake of better understanding, the article is divided into 3 parts. So let’s get started.
PART- 1 : Selection of Validators
Polkadot uses nominated- Proof of Stake to select a set of Validators from a group. In n-PoS, there are 2 network participants: Nominator and Validator.
- Nominators are the DOT holders who nominate a validator by backing them with DOT tokens to participate in the consensus process.
- Validators run their own full nodes and participate in consensus. They run hardware and are backed by Nominators’ DOT tokens.
Few terminologies before continuing:
- Era: 24 hours time period for which a set of validators are selected.
- Active Set: The set of Validators who are actively participating in consensus and receiving staking rewards.
- Waiting Set: Set of Validators who couldn’t enter the active set maybe because of low nominator backing than threshold value.
In traditional PoS system, the Validators receive rewards in proportion to their stakes. A Validator with 2 Mn staked tokens will receive twice the rewards as compared to the Validator with 1 Mn tokens staked.
Whereas in Polkadot’s n-PoS, all the validators of Active Set receive equal rewards irrespective of their staking share. The validator with 2 million tokens staked and the validator with 1 million tokens staked will get equal staking rewards.
In Polkadot, all the 297 Validators in Active Set receive EQUAL rewards during the era. To maximize rewards, Nominator has to ensure 3 conditions:
- back high staked validator so that enters Active Set; so nominator will also receive rewards
- back Validator with least stakes in Active Set; to maximize rewards
- make sure he is not backing a Validator with already 256 nominators, as only top 256 nominators are rewarded per Validator.
This ensures all Validators of Active Set have nearly equal DOT tokens nominated to them. Thus their network share is also almost equal. The 10 richest Validators have only 5.7% network share. And it takes the top 82 Validators to make up one-third of network share. This power distribution is what makes Polkadot decentralized; as the amount of validators required to corrupt the network will be very high.
The minimum number of validators required to make up 33.3% of network staking share. NC of a network is X, means it requires top X validators to acquire one-third of network share. So higher value means higher decentralization.
On acquiring 33.3% network share, a bad validator can manipulate transactions. On acquiring 51% share, he can pass a governance proposal to drain the community pool or any other malicious proposal in his favor.
NC of Solana is 27, Cosmos is 7, ETH 2.0 is 2, and Polkadot is a whopping 82. The reason being even distribution of network power.
PART- 2 : Block Production
Now that we have an Active Set of Validators, what the heck do they perform?
The validators perform 2 responsibilities: Block Production and Block Finalization. Let’s begin with block production.
Block Production simply means adding new blocks to the chain. If block production stops, the chain will come to halt. For this, Polkadot uses the BABE protocol that ensures continuous block production.
BABE (Blind Assignment for Blockchain Extension) as the name suggests randomly assigns a Validator (using a lottery) the role to extend the blockchain (produce new blocks).
How are Validators selected for Block Production?
To begin, keep in mind that a Slot has 6 seconds and Epoch has 2400 slots (4 hours). In each slot, all Validators roll a die with a secret hash; and execute a function known as Verifiable Random Function. This VRF returns 2 things: a RESULT ( random value) and a PROOF (that the result is calculated correctly).
VRF = f ( Secret key , (n-2)th epoch Randomness value , Slot number);
secret key: a key specifically made for these die roll
epoch randomness value: hash of VRF values from the blocks in the epoch before last (N-2), so past randomness affects the current pending randomness (N)
slot number: out of 2400 slots which one is this
All the Validators will get a unique RESULT and PROOF for their calculation. The RESULT is then compared with a Threshold Value determined by system.
- If RESULT < THRESHOLD, that particular Validator is a block production candidate, i.e. he will produce a new block in that slot. This Validator will submit the blocks to the network along with the PROOF that is generated in execution of the VRF function.
- If RESULT > THRESHOLD, the Validator will collect blocks and verify signature and check if Validator is the actual Slot Leader by checking the PROOF submitted by the Validator.
Now, there are 3 possibilities.
- The slot has only 1 block production candidate (for only 1 Validator, RESULT < THRESHOLD)
- The slot has multiple block production candidate ( for multiple Validators, RESULT < THRESHOLD)
- The slot has zero block production candidate ( for no Validator, RESULT < THRESHOLD)
Case 1: Only 1 block production Candidate
This is the ideal case, the block submitted by Validator will be verified by other Validators and if valid, this gets added to the blockchain. The block produced by a VRF selected Validator is called Primary Block.
Case 2: Multiple Candidates
All these candidates will submit their blocks to the network. The first valid block to reach and be verified by the majority network wins. But this might cause forking; and can take long time to get finalized as BABE offers probabilistic finality; thus GRANDPA — the finality gadget kicks in and instantly finalizes the best fork with it’s Fork Selection Rule (discussed later)
Case 3: No Candidates
Generally if a slot has no block producer, it tends to remain block-less, resulting in network downtime. But Polkadot fixes this problem by running a secondary “Robin Style” Validator selection algorithm in the background.
Validators selected using this algorithm always produce a block in background; know as Secondary Block. This secondary block is overshadowed if a Primary Block is produced for that slot by a VRF selected validator. In this way, the slots might have a Primary block or a Secondary block but can not remain block-less.
The block produced by a VRF selected Validator is called Primary Block.
The block produced by Robin Style algorithm is called Secondary Block.
PART- 3 : Block Finalization
To avoid forking and to provide Provable finality to the blocks, Polkadot uses the finality gadget GRANDPA.
GRANDPA stands for GHOST-based Recursive Ancestor Deriving Prefix Agreement. This allows every Validator to agree on a chain by a simple Fork Rule Choice. Goal is to agree on many blocks at once and finalize a valid chain.
In probabilistic finality block production may continue for a long time without GRANDPA, so it needs to catch up to ensure forking is prevented asap.
We always need two-third of Validators’ agreement for finalization. Out of 297 Validators, (⅔) valdators i.e. 197 should always be online. Suppose we have only 150 Validators online, in this case the blocks will still be produced with BABE’s probabilistic finality, multiple forks might be possible.
But as soon as two-third (= 197) Validators again come online, the most appropriate Chain will be selected by Simple Fork Rule and get finalized with Provable finality.
The chain must follow 2 conditions to be selected and finalized.
- It should start from the most recent GRANDPA finalized block.
- It should have the most number of Primary Blocks.
Let’s understand this better with an example.
In the above image, the Black blocks are the ones finalized by Grandpa. F2 being the most recent finalized block. A,B,C and D are the different chains or forks; with Primary blocks as 1 and Secondary blocks as 2. Applying the Fork Rule on each one of these one by one.
- Chains starts from most recent finalized block ? YES. Starts from F2.
- Chain has most primary blocks? NO. Chain is longest but only 1 primary block.
- Chains starts from most recent finalized block ? YES
- Chain has most primary blocks? YES. Not the longest chain but has most Primary blocks.
- Chains starts from most recent finalized block ? YES
- Chain has most primary blocks? NO. Only 2 primary blocks.
- Chains starts from most recent finalized block ? NO
- Chain has most primary blocks? Doesn’t matter as Condition 1 is false.
Thus, GRANDPA will finalize the chain B as it satisfied both conditions of Fork Rule. This is the final part of the Consensus process on how blocks are finalized in Polkadot.
The consensus mechanism in a distributed system should be as decentralized as possible. With n-PoS Polkadot is able to achieve a Nakamoto coefficient of 82; that makes it very decentralized.
In each ERA of 24 hours, 297 validators are selected into Active Set for participating in the consensus process. Now, in every SLOT of 6 seconds these validators will roll a die or participate in a lottery to get selected as a Block Production Candidate. They will execute VRF, produce blocks and submit them to the network. Rest of the Validators will check for authenticity and finalize the blocks using the GRANADA finality gadget. To avoid any fork, GRANDPA uses the simple Fork Rule. This sums up the Polkadot’s Consensus Mechanism.