A consensus mechanism on a blockchain mainly answers two questions: which node gets to construct the next block, and how all nodes maintain a single, consistent view of the chain.
1.1.3.1 PoW (Proof of Work)
Representative project: BTC
Because different nodes receive data at different times and in different orders, each block's data can only be recorded by a single node in order to keep the data consistent. BTC selects that bookkeeping node through Proof of Work (PoW). A node that wants to generate a new block and write it to the blockchain must first solve the PoW puzzle set by the Bitcoin network. The key elements are the proof-of-work function, the block data and the difficulty value: the proof-of-work function defines how the puzzle is computed, the block determines the puzzle's input data, and the difficulty value determines how much computation the puzzle requires. It can be understood simply as trying different nonce values as input to a SHA256 hash and looking for a hash value with a given number of leading zeros; the more leading zeros required, the higher the difficulty.
The steps a Bitcoin node takes to solve the proof-of-work puzzle can be summarized roughly as follows:
- Generate the coinbase (minting) transaction, combine it with all other transactions to be packed into the block to form the transaction list, and compute the Merkle root hash using the Merkle tree algorithm;
- Assemble the Merkle root hash and the other related fields into the block header, and use the 80 bytes of header data as the input to the proof of work;
- Keep changing the nonce in the block header and apply double SHA256 (i.e. SHA256(SHA256(Block_Header))) to each new header, comparing the result against the network's current target value. If the result is smaller than the target, the puzzle is solved and the proof of work is complete (a minimal sketch of this loop follows the list).
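The search loop described above can be illustrated with a short sketch. This is a minimal illustration, not Bitcoin's actual implementation: the target value, the dummy header and helper names such as `double_sha256` and `mine` are assumptions made for demonstration.

```python
import hashlib
import struct

# Assumed target: hashes below this value count as valid proof of work.
# Real Bitcoin derives the target from the compact "bits" field in the header;
# this value only requires roughly 16 leading zero bits so the demo finishes quickly.
TARGET = 1 << 240

def double_sha256(data: bytes) -> bytes:
    """SHA256(SHA256(data)), as used for Bitcoin block hashes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def mine(header_without_nonce: bytes) -> int:
    """Try nonces until the double SHA256 of the header is below the target."""
    nonce = 0
    while True:
        # Real headers place the 4-byte little-endian nonce at a fixed offset;
        # here we simply append it for illustration, giving an 80-byte header.
        header = header_without_nonce + struct.pack("<I", nonce)
        digest = double_sha256(header)
        # Bitcoin compares the hash, interpreted as a little-endian integer,
        # against the current target.
        if int.from_bytes(digest, "little") < TARGET:
            return nonce
        nonce += 1

if __name__ == "__main__":
    # 76 bytes of dummy header data (version, prev hash, merkle root, time, bits).
    dummy_header = b"\x00" * 76
    print("found nonce:", mine(dummy_header))
```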
1.1.3.2 PoS (Proof of Stake)
Representative project: Ethereum, whose consensus mechanism is currently iterating toward PoS
Because PoW consumes an enormous amount of hash power, it drags everyone into a hash-power race; that power then concentrates in large mining farms, which creates security risks, and the mining computation itself produces nothing of real value. The PoS mechanism emerged in response to these problems.
PoS still has open problems and is still evolving. The most common variant at present is the first one below, the approach adopted by Peercoin.
PoW+PoS hybrid consensus, represented by Peercoin: in Peercoin's proof-of-stake model there is a notion of coin age. Each coin accumulates one coin-day for every day it is held, and in the hash computation the difficulty is made inversely proportional to the coin age of the transaction inputs. Once a block is produced, the coin age is reset to zero. Nodes taking part in block production still have to perform a certain amount of hashing, i.e. blocks are still produced in a PoW-like fashion, except that the probability of a node finding a valid block is tied to the stake it holds: producers are selected according to stake, and the incentives are also stake-based. This only partially solves the massive energy waste caused by PoW.
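A rough sketch of the coin-age idea just described, under simplified assumptions: the base target, the `coin_age` and `stake_target` helpers and the hashing of an arbitrary kernel are illustrative, not Peercoin's real kernel-hash rules.

```python
import hashlib
import time

BASE_TARGET = 1 << 224          # assumed base difficulty target, not Peercoin's real value
COIN_DAY = 24 * 60 * 60         # one coin held for one day earns one coin-day

def coin_age(amount: float, held_seconds: int) -> float:
    """Coin-days accumulated by a stake input: amount * days held."""
    return amount * (held_seconds / COIN_DAY)

def stake_target(amount: float, held_seconds: int) -> int:
    """Difficulty is inversely proportional to coin age: an older or larger
    stake gets a proportionally larger (easier) target."""
    return int(BASE_TARGET * max(coin_age(amount, held_seconds), 1))

def try_stake(kernel: bytes, amount: float, held_seconds: int) -> bool:
    """One proof-of-stake attempt: hash the kernel data and compare against
    the stake-weighted target. After a successful block, coin age resets to 0."""
    digest = hashlib.sha256(hashlib.sha256(kernel).digest()).digest()
    return int.from_bytes(digest, "big") < stake_target(amount, held_seconds)

# Example: a 1000-coin output held for 30 days has a much easier target
# than a 10-coin output held for 1 day.
print(try_stake(str(time.time()).encode(), 1000.0, 30 * COIN_DAY))
```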
A newer style of PoS consensus, represented by Ethereum's Casper:
- Validators lock up a certain proportion of the ether they own as a security deposit.
- They then validate every candidate block at every block height (blocks proposed by validators who have posted a deposit). That is, when they find a block they believe can be added to the chain, they validate it by placing a bet on it.
- Through the bets of many validators, a single winning block is eventually selected at each height.
- If that block is added to the chain, the validators receive a reward proportional to their bets.
- However, if a validator behaves maliciously and tries to exploit "nothing at stake" (for example by betting repeatedly or on conflicting blocks), it is punished immediately (a simplified sketch of this deposit/bet/slash cycle follows the list).
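The deposit, betting and punishment cycle in the list above can be sketched very roughly as follows. The `Validator` class, `REWARD_RATE` and `SLASH_RATE` are illustrative assumptions, not Casper's actual parameters or rules.

```python
from dataclasses import dataclass, field

REWARD_RATE = 0.01   # assumed reward rate, proportional to the bet
SLASH_RATE = 1.0     # assumed penalty: the conflicting bet is forfeited entirely

@dataclass
class Validator:
    name: str
    deposit: float
    bets: dict = field(default_factory=dict)   # height -> block hash this validator bet on

    def bet(self, height: int, block_hash: str, amount: float) -> None:
        # Betting on two different blocks at the same height is the
        # "nothing at stake" behaviour that gets punished immediately.
        if height in self.bets and self.bets[height] != block_hash:
            self.deposit -= amount * SLASH_RATE
            print(f"{self.name} slashed for conflicting bets at height {height}")
            return
        self.bets[height] = block_hash

    def settle(self, height: int, winning_hash: str, amount: float) -> None:
        # Validators whose bet matches the finalized block earn a reward
        # proportional to the amount they staked on it.
        if self.bets.get(height) == winning_hash:
            self.deposit += amount * REWARD_RATE

v = Validator("alice", deposit=32.0)
v.bet(1, "0xabc", amount=1.0)
v.settle(1, "0xabc", amount=1.0)
print(v.deposit)  # 32.01
```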
1.1.3.3 DPoS (Delegated Proof of Stake)
Representative project: EOS
Under a plain PoS mechanism, every node can create blocks. With DPoS, blocks are created by trusted accounts (super nodes) elected by the community. The DPoS mechanism is similar to a joint-stock company: ordinary shareholders cannot sit on the board, so they vote for representatives (delegates) to make decisions on their behalf.
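A minimal sketch of the stake-weighted delegate election just described. The producer count of 21 matches EOS, but the vote format, helper names and round-robin schedule are illustrative assumptions.

```python
from collections import Counter
from itertools import cycle, islice

NUM_PRODUCERS = 21  # EOS elects 21 block producers

def elect_producers(votes: dict, n: int = NUM_PRODUCERS) -> list:
    """votes maps a voter to (candidate, stake). Candidates are ranked by
    total stake-weighted votes and the top n become block producers."""
    tally = Counter()
    for candidate, stake in votes.values():
        tally[candidate] += stake
    return [c for c, _ in tally.most_common(n)]

def production_schedule(producers: list, slots: int) -> list:
    """Elected producers take turns producing blocks in a round-robin schedule."""
    return list(islice(cycle(producers), slots))

votes = {
    "alice": ("bp1", 500.0),
    "bob":   ("bp2", 300.0),
    "carol": ("bp1", 200.0),
    "dave":  ("bp3", 100.0),
}
producers = elect_producers(votes, n=3)
print(producers)                          # ['bp1', 'bp2', 'bp3']
print(production_schedule(producers, 6))  # round-robin over the elected producers
```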
1.1.3.4 PBFT (Practical Byzantine Fault Tolerance)
The design ideas of PBFT have been borrowed by many consensus mechanisms, and PBFT itself is adopted by many consortium chains.
Besides tolerating crashed (faulty) nodes, PBFT also tolerates malicious (Byzantine) nodes. Suppose the cluster has n nodes, of which f are problem nodes. The maximum number of problem nodes the PBFT algorithm can tolerate is (n-1)/3. A problem node may be both crashed and malicious, or only crashed, or only malicious.
Assume the crashed nodes and the malicious nodes are disjoint. Then there are f malicious nodes and f crashed nodes. Once a node is found to be malicious it is excluded from the cluster, leaving the f crashed nodes. By the majority principle, the correct nodes only need to outnumber those f nodes by one, i.e. f+1 correct nodes, so that correct nodes are always more numerous than crashed nodes and the cluster can reach consensus. Adding up all node types gives f+1 correct nodes, f crashed nodes and f malicious nodes, i.e. 3f+1 = n.
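The 3f+1 arithmetic can be checked with a couple of lines, using n and f as in the text (a sketch, nothing more):

```python
def max_faulty(n: int) -> int:
    """Maximum number of Byzantine nodes a PBFT cluster of n nodes tolerates."""
    return (n - 1) // 3

for n in (4, 7, 10, 13):
    f = max_faulty(n)
    # f malicious + f crashed + (f + 1) correct = 3f + 1 <= n
    assert 3 * f + 1 <= n
    print(f"n={n}: tolerates f={f} faulty nodes, needs f+1={f+1} matching replies")
```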
The basic flow of the PBFT algorithm consists of the following four steps:
- The client sends a request to the primary node.
- The primary broadcasts the request to the other nodes, which run the three-phase consensus flow of the PBFT algorithm.
- After completing the three-phase flow, each node returns a reply to the client.
- Once the client has received the same reply from f+1 nodes, the consensus has completed correctly. Why is receiving identical replies from f+1 nodes enough? From the derivation in the previous subsection, in both the best and the worst case, if the client receives the same reply from f+1 nodes, then enough correct nodes have all reached consensus and finished processing (a sketch of this reply-counting rule follows the list).
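A small sketch of the client-side rule in the last step: count identical replies and accept the result once f+1 distinct nodes agree. The reply format and the `accept_result` helper are assumptions for illustration.

```python
from typing import Optional

def accept_result(replies: list, f: int) -> Optional[str]:
    """replies is a list of (node_id, result) pairs. The client accepts a
    result as soon as f + 1 distinct nodes have returned the same value."""
    per_result = {}
    for node_id, result in replies:
        per_result.setdefault(result, set()).add(node_id)
        if len(per_result[result]) >= f + 1:
            return result
    return None  # not enough matching replies yet

# A cluster of n = 4 nodes tolerates f = 1, so 2 matching replies suffice.
replies = [("n1", "ok: balance=42"), ("n2", "err"), ("n3", "ok: balance=42")]
print(accept_result(replies, f=1))  # ok: balance=42
```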
1.1.3.5 RAFT
Fabric: the Raft algorithm has already been implemented. Unlike PBFT, Raft does not tolerate malicious nodes, so it is mostly used in private chains. The Raft algorithm defines three roles: follower, candidate and leader. At any given moment a node in the cluster can be in only one of these three states, and the roles can change into one another as time and conditions change.
The Raft algorithm mainly consists of two processes: leader election and log replication, where log replication is further divided into two phases, appending log entries and committing data. The maximum number of crashed nodes Raft can tolerate is (N-1)/2, where N is the total number of nodes in the cluster.
In Raft, each node is in one of the following three states:
- follower: all nodes start in the follower state. If a follower receives no message from the leader, it becomes a candidate.
- candidate: a candidate solicits votes from the other nodes; if it receives a majority of the votes, it becomes the leader. This process is called leader election.
- leader: every modification to the system goes through the leader first, and each modification is written as a log entry. After the leader receives a modification request, the following process, called log replication, takes place (a minimal sketch of the majority-commit rule follows the list):
  - replicate the log entry to all follower nodes;
  - commit the entry only once a majority of the nodes have responded;
  - notify all followers that the entry has been committed;
  - all followers commit the entry as well;
  - the whole system is now in a consistent state.
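A minimal sketch of the majority-commit rule from the log replication steps above: the leader appends an entry, replicates it, and commits only once a majority of the cluster (including itself) has acknowledged. The class and method names are illustrative, not taken from any specific Raft library.

```python
from dataclasses import dataclass, field

@dataclass
class LogEntry:
    term: int
    command: str
    committed: bool = False

@dataclass
class Leader:
    node_id: str
    followers: list
    log: list = field(default_factory=list)

    def replicate(self, term: int, command: str, acks_from: set) -> bool:
        """Append an entry, then commit it if a majority of the cluster
        (leader + followers) has acknowledged the replication."""
        entry = LogEntry(term, command)
        self.log.append(entry)
        cluster_size = len(self.followers) + 1
        votes = len(acks_from & set(self.followers)) + 1  # +1 for the leader itself
        if votes > cluster_size // 2:
            entry.committed = True      # the leader commits...
            # ...and would now notify the followers so they commit too
        return entry.committed

leader = Leader("n1", followers=["n2", "n3", "n4", "n5"])
print(leader.replicate(term=1, command="set x=1", acks_from={"n2", "n3"}))  # True: 3 of 5
print(leader.replicate(term=1, command="set y=2", acks_from={"n2"}))        # False: only 2 of 5
```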
Raft algorithm animation: https://link.zhihu.com/?target=http%3A//thesecretlivesofdata.com/raft/
1.1.4.1 PBFT VS RAFT
1.1.4.2 Comparison of mainstream algorithms
ZAB achieves ordered operations through a strong-leader model ("everything defers to the leader") and by processing and committing proposals strictly in order. The leader broadcasts messages over TCP, which guarantees that messages are received in order. ZAB appeared relatively early.
The Raft protocol likewise takes an "everything defers to the leader" approach to reach consensus on a series of values and keep the logs of all nodes consistent. The continuity of the log is what guarantees the ordering and integrity of the messages, i.e. the data. In Raft the log is not only the carrier of the data; its completeness also affects the outcome of leader elections (note: elections are not decided by log completeness alone, but also by other factors such as the term number).
Raft is currently the most widely used strongly consistent, decentralized and highly available distributed consensus protocol in engineering practice. The emphasis is on engineering practice because, in academic circles, the most celebrated protocol is still the famous Paxos.
Similarities and differences:
- Leader election: ZAB uses Fast Leader Election, a "look up to the worthy, recommend one another" style: nodes compete in pairs (their "capital" being the information they hold) to decide which is better suited to be leader; after a round of comparison a node broadcasts its vote, and in the end the node with the most complete data among the majority is elected. Raft uses its own "one vote, first come first served" scheme (note: it includes a randomized election timeout so that the election completes within at most a few rounds): put simply, a node that notices the leader has failed elects itself leader and notifies the other nodes, and each of those nodes gives its vote to the first node that contacts it. (Note: a comparison is still involved here; based on log completeness, term number and similar information, if the requesting node's data is less complete than the current node's, the current node will not grant it its vote.) Overall, Raft's leader election needs fewer messages and finishes faster (a tiny sketch of the randomized timeout follows this list).
- Log replication: Raft and ZAB are alike here: both make the logs consistent by taking the leader's log as authoritative, and the log must be continuous and committed in order. ZAB relies on TCP to guarantee the ordering of operations, while Raft guarantees log continuity through its log entries plus its own consistency checks. See the blog posts referenced above.
- Read operations and consistency: ZAB's design goal is ordering of operations; ZooKeeper by default provides eventual consistency, and reads can be served by any node. (Note: many sources say ZooKeeper is CP, which is correct, but it does not mean that reads and writes in ZooKeeper are strongly consistent; it means that when a partition P occurs, ZooKeeper chooses C. If that is confusing, see the blog above and look carefully at the CAP diagram, which clearly shows that A and C can coexist when no partition occurs.) Raft's design goal, by contrast, is strong consistency (that is, linearizability), so Raft is more flexible (it can be configured): a Raft system can provide either strong consistency or eventual consistency, although for performance reasons the default is usually eventual consistency.
- Write operations: Raft and ZAB are the same: writes must be handled by the leader node.
- Membership changes: both Raft and ZAB support membership changes; ZAB implements them via dynamic configuration. This means that when membership changes, machines do not need to be restarted: the cluster keeps running and the service is not interrupted.
- Design phases: compared with ZAB, Raft's design is simpler. For example, Raft does not introduce ZAB-style member discovery and data synchronization phases; instead, when a node starts an election it increments its term number, and after the election it broadcasts heartbeats to establish its leadership directly and then replicates its log to the other nodes to bring the data replicas into consistency. During ZAB's data synchronization phase, the ZAB cluster cannot serve external requests. In fact, ZAB's member discovery could be folded into leader election, as in Raft, establishing leadership directly after the election instead of introducing an extra phase; and the data synchronization phase is a redundant design that could be removed, because ZAB does not have to make the data replicas consistent before it can handle write requests, so the design adds no extra meaning or value.
- Design independence: ZAB is tightly coupled with ZooKeeper and cannot be used on its own in a real system, whereas Raft implementations (for example Hashicorp Raft) can be used standalone and are programmer-friendly.
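The randomized election timeout mentioned in the "Leader election" point above can be illustrated with a tiny sketch. The 150-300 ms range follows the Raft paper's suggestion; the node names and helper function are made up for the example.

```python
import random

ELECTION_TIMEOUT_MS = (150, 300)  # range suggested by the Raft paper

def draw_timeouts(nodes: list) -> dict:
    """Each follower independently draws a random election timeout; the node
    whose timer expires first becomes a candidate and requests votes."""
    return {n: random.uniform(*ELECTION_TIMEOUT_MS) for n in nodes}

timeouts = draw_timeouts(["n1", "n2", "n3", "n4", "n5"])
first = min(timeouts, key=timeouts.get)
# The first node to time out starts the election; each other node grants its
# vote to the first candidate that contacts it, provided that candidate's log
# is at least as up to date as its own.
print(f"{first} times out first ({timeouts[first]:.0f} ms) and starts an election")
```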