Parallel transactions on ABEYCHAIN

May 3, 2022

As of today, the multi-core processor architecture has become a major trend in the industry, as parallelization technology can fully utilize the potential of CPUs. ABEYCHAIN’s Parallelizing Transaction Execution mechanism uses the advantages of multi-core processors to the fullest by enabling transactions in blocks to be executed in parallel as much as possible.

The traditional transaction execution mechanism works mostly as such: transactions are read one by one from the block.  After each transaction is executed, the state machine will move to the next state until all transactions are executed sequentially, as shown in the following figure:

It is obvious that this method of transaction execution is not very performance optimal. Even if the two transactions do not intersect, they can only be executed sequentially. In practical applications, splitting unrelated transactions and executing them in parallel will greatly optimize execution efficiency.

Issue 1: how to ensure that all nodes can reach the same state after execution in the same block?

In the traditional transaction execution mechanism, each time a transaction is executed, the state root will change once. After all transactions are executed, the final state root represents the current state of the blockchain.

When transactions are executed in parallel, the traditional method of calculating state root will no longer be applicable. The transactions need to be executed first, and the history of the state change of each transaction is recorded afterwards.

After all transactions are executed, a state root is calculated based on these historical records. This ensures that even if the transaction is executed in parallel, the final consensus node can still reach consensus.

Issue 2: how to recognize if there is dependency between two transactions?

If two transactions have no actual dependency between each other but are deemed so, this will cause unnecessary loss of performance efficiency.  On the contrary, if the two transactions rewrite the state of the same account but are executed in parallel, the final state of the account may be uncertain. Therefore, the determination of dependency is an important issue that affects performance and even indicates whether the blockchain can work normally.

We can easily tell whether two transactions are dependent by observing the address of the sender and receiver, such as the following example transactions: A→B, C→D, D→E.  Here, you can see that transaction D→E depends on the result of transaction C→D, but transaction A→B has little relationship with the other two transactions, therefore can be executed in parallel.

This analysis is correct in a blockchain that only supports simple transactions, but it may not be as accurate once it is placed in a Turing-complete blockchain that runs smart contracts – because it is impossible to know exactly what is in the transaction contract written by the user. Transaction A->B seems to have nothing to do with the account status of C and D.  However, in reality party A is a special account.  When each transaction is transferred through A’s account, transaction fees must be deducted from party C’s account first. In this scenario, all three transactions are actually related, so they cannot be executed in parallel.

In order to resolve this type of situation, we adopt a trial and error method.  Specific steps are as such:

  1. Preparing: convert all transactions to “message” type;
  2. Grouping: group transactions according to associated address; 
  3. Executing: execute each transaction group in parallel;
  4. Check for conflicts: check whether the transaction groups are in conflict according to returned results. If there is a conflict, roll back the conflicting transactions, then regroup and re-execute.
  5. Collect results: collect the execution results of each group, update the tree, and generate the state root

With the above method, transaction-related issues in parallel transactions can be resolved. Of course, inaccurate initial grouping, transaction rollback may still cause performance inefficiency at times.  But in most cases, transactions are irrelevant.  It is feasible to accurately group and execute transactions in parallel to improve transaction execution efficiency.

ABEYChain并行交易技术简介

在多核处理器架构已经成为主流的今天,利用并行化技术充分挖掘CPU潜力是行之有效的方案。ABEY Chain设计并实现了Parallelizing Transaction Execution ,它能充分发挥多核处理器优势,使区块中的交易能够尽可能并行执行。

传统交易执行方案是:从待共识的区块逐条读出交易,执行完每一笔交易后,状态机都会迁移至下一个状态,直到所有交易都被串行执行完成,如下图所示:

显而易见,这种交易执行方式对性能并不友好。即使两笔交易没有交集,也只能按照先后顺序依次执行。在实际应用中,将每笔不相关的交易拆开,并行执行,会大幅度优化执行效率。

对于同一个区块,如何确保所有节点执行完后能够达到同一状态

在传统的交易执行方案中,每执行一笔交易,state root便发生一次变迁,所有交易执行完后,最终的state root就代表了当前区块链的状态。

当交易被并行且乱序执行后,传统计算state root的方式显然不再适用。需要先执行交易,将每笔交易对状态的改变历史记录下来,待所有交易执行完后,再根据这些历史记录再算出一个state root,由此就可以保证即使并行执行交易,最后共识节点仍然能够达成一致。

如何判断两笔交易之间是否存在依赖关系?

若两笔交易本来无依赖关系但被判定为有,则会导致不必要的性能损失;反之,如果这两笔交易会改写同一个账户的状态却被并行执行了,则该账户最后的状态可能是不确定的。因此,依赖关系的判定是影响性能甚至能决定区块链能否正常工作的重要问题。

在简单的转账交易中,我们可以根据转账的发送者和接受者的地址,来判断两笔交易是否有依赖关系,比如如下3笔转账交易:A→BC→DD→E,可以很容易看出,交易D→E依赖于交易C→D的结果,但是交易A→B和其他两笔交易没有什么关系,因此可以并行执行。

这种分析在只支持简单转账的区块链中是正确的,但是一旦放到图灵完备、运行智能合约的区块链中,则可能不那么准确,因为无法准确知道用户编写的转账合约中到底有什么操作,可能出现的情况是:A->B的交易看似与CD的账户状态无关,但是在用户的底层实现中,A是特殊账户,通过A账户每转出每一笔钱必须要先从C账户中扣除一定手续费。在这种场景下,3笔交易均有关联,则它们之间无法使用并行的方式执行,若还按照先前的依赖分析方法对交易进行划分,则必定会出现问题。

为了解决这个问题,我们采用试错的方式,具体执行的步骤:

  1. prepare,把所有交易转成message类型
  2. group根据关联地址分组;
  3. execute,并行执行各个交易组
  4. check conflict,根据返回的结果检查交易组是否冲突,如果冲突,则就滚冲突的交易并重新分组并行执行,
  5. collect result,收集各组执行结果,更新trie,并生成state root

通过这个过程,就能够解决可能遇到的交易关联的问题。当然可能会由于初始分组不准确造成执行冲突后回滚重新分组的损耗,但是在绝大部分情况下,交易都是不相关的,那么可以准确进行分组并并行执行交易,提高交易执行效率。