Content area
Popular blockchains such as Ethereum and several others execute complex transactions in the block through user-defined scripts known as smart contracts. Serial execution of smart contract transactions/atomic units (AUs) fails to harness the multiprocessing power offered by the prevalence of multi-core processors. By adding concurrency to the execution of AUs, we can achieve better efficiency and higher throughput. In this paper, we develop a concurrent miner that proposes a block by executing AUs concurrently using optimistic Software Transactional Memory systems (STMs). It efficiently captures independent AUs in the concurrent bin and dependent AUs in the block graph (BG). Later, we propose a concurrent validator that re-executes the same AUs concurrently and deterministically using the concurrent bin followed by the BG given by the miner to verify the block. We rigorously prove the correctness of concurrent execution of AUs. The performance benchmark shows that the average speedup for the optimized concurrent miner is
Introduction
Blockchain is commonly believed to be a revolutionary technology for doing business over the Internet. It is a decentralized, distributed database or ledger of records that stores the information in cryptographically linked blocks. Cryptocurrencies such as Bitcoin [37] and Ethereum [16] were the first to popularize blockchain technology. Blockchains are now considered for automating and securely storing user records such as healthcare, financial services, real estate, etc. A blockchain network consists of multiple peers (or nodes) that do not necessarily trust each other. Each node maintains a copy of the distributed ledger. Clients, users of the blockchain, send requests or transactions to the nodes of the blockchain, called miners. The miners collect multiple transactions from clients or other peers, form blocks, and propose new blocks to be added to the blockchain.
Transactions sent by clients to miners are part of a larger code called smart contracts that provide several complex services, such as managing the system state, ensuring rules, or credential checking of the parties involved [13]. Smart contracts are like classes in object-oriented programming languages that encapsulate data and methods. The data represents the smart contract state (and the blockchain). Methods (or functions) are transactions invoked by clients that operate on the data and change the contract state. For example, Ethereum uses Solidity [44], while Hyperledger Fabric [3] supports Java, Go, Node.js, etc., for implementing smart contracts.
Motivation for concurrent execution of smart contracts Dickerson et al. [13] observed that transactions are executed in two different contexts in the Ethereum blockchain. First, transactions are executed by miners while forming a block. A miner selects a sequence of client requests (transactions) and executes the smart contract code of these transactions in sequence, transforming the state of the associated contracts in the process. The miner then stores the sequence of transactions, the resulting final state of the contracts, and the previous block hash in the block. After creating a block, the miner proposes it to be added to the blockchain through the consensus protocol. The other peers in the system, called validators in this context, validate the block proposed by the miner. The validators re-execute the transactions in the block serially to verify the block's final state. If the final state computed by a validator matches the final state recorded by the miner in the block, the block is accepted and the miner who proposed it is rewarded; otherwise, the block is rejected. It has been observed that a block is validated several times more often than it is mined [13]; that is, far more participants play the validator role than the miner role when new blocks are added to the blockchain.
This design of smart contract execution is not efficient because it does not allow any concurrency. In today's world of multi-core systems, serial execution leaves most cores idle, resulting in lower throughput. This limitation is not specific to the Ethereum blockchain; it applies to other popular blockchains as well. Higher throughput means more transactions executed per unit time, which both miners and validators clearly desire.
However, the concurrent execution of smart contract transactions is not straightforward. The transactions could consist of conflicting accesses to shared data objects. Two contract transactions conflict if they both access the same shared data object and one of them performs a write operation. Arbitrary execution of contract transactions by miners might result in an inconsistent final state of the blockchain. The common solution to the correct concurrent execution of the transactions is to ensure that the execution is serializable [38]. A usual correctness criterion in databases, serializability, ensures that the concurrent execution is equivalent to some serial execution of the same transactions. Thus, miners must ensure that their execution is serializable or one of its variants, as described later.
The concurrent execution of contract transactions in a block by validators, although highly desirable, can further complicate the situation. Suppose a miner ensures that the concurrent execution of transactions in a block is serializable. Later, a validator re-executes the same transactions concurrently. The validator may concurrently execute two conflicting transactions in an order different from that of the miner. Thus, the serialization order of the miner is different from the validator. This can result in the validator obtaining a final state different from the miner. Consequently, the validator may incorrectly reject a valid block, as depicted in Fig. 1.
[See PDF for image]
Fig. 1
a Shows two concurrent conflicting transactions, which are part of a block, working on the same shared data object x. b Represents the miner's concurrent execution, with an equivalent serial schedule and a final state (FS) of 20 reached from the initial state (IS) 0. c Shows the concurrent execution by a validator, with a different equivalent serial schedule and a final state of 10 reached from IS 0, which differs from the final state proposed by the miner. Such a situation leads to the rejection of a valid block by the validator, which is undesirable
Dickerson et al. [13] identified these issues and proposed a solution for concurrent execution by both miners and validators. The miner concurrently executes block transactions using abstract locks and inverse logs to generate a serializable execution. Then, to enable correct concurrent execution by validators, the miner provides a happen-before graph in the block. The happen-before graph is a directed acyclic graph over the block transactions. If there is a path from a transaction T_i to a transaction T_j, the validator must execute T_i before T_j. Transactions with no path between them can be executed concurrently. The validator uses the happen-before graph to execute transactions concurrently using a fork-join approach. This approach ensures that the miner's and validators' final states of all valid blocks match and that no valid block is rejected due to concurrent execution. Including a happen-before graph in the block lets validators execute such blocks considerably faster through parallelization than blocks without it, which encourages miners to provide such aids in the block for concurrent execution by validators.
Proposed approach—optimistic concurrent execution It is well known that locks are pessimistic in nature. So, we propose a novel and efficient framework for the concurrent miner using optimistic Software Transactional Memory Systems (STMs). STMs offer parallel programming paradigms for concurrent execution without worrying about consistency issues.
The miner needs to execute smart contract transactions concurrently and output a graph capturing the dependencies among transactions (e.g., in the form of a happen-before graph). We call this graph the Block Graph (BG). The proposed concurrent miner uses an optimistic STM system to execute contract transactions concurrently. Since STMs also work with transactions, we differentiate between smart contract transactions and STM transactions. An STM transaction is a piece of code that the STM system tries to execute atomically with respect to other concurrent transactions. If the STM system cannot execute it atomically, then the transaction is aborted. A smart contract transaction is expected to behave as if executed serially; as a result, when we run it concurrently, we require it to execute atomically. In the rest of the document, we refer to smart contract transactions as Atomic Units (or AUs) and STM transactions simply as transactions. The miner uses the STM system to initiate a transaction for each AU. If a transaction is aborted, the STM continually initiates new transactions for the same AU until an invocation commits.
To ensure correct concurrent execution, a popular correctness guarantee provided by STM systems is opacity [20], which is stronger than serializability. Opacity, like serializability, requires that a concurrent execution of transactions, including those that abort, be equivalent to some serial execution. This ensures that even aborted transactions read consistent values until they are aborted. As a result, a miner employing an STM does not encounter undesirable side effects such as crash failures, infinite loops, or division by zero. This guarantee is provided by STMs, which execute transactions optimistically and provide atomic (opaque) reads and writes on transactional objects (or t-objects).
We have chosen two timestamp-based STMs in our design: (1) Basic Timestamp Ordering (BTO) STM [48, Chap. 4], which maintains only one version of each t-object, and (2) Multi-Version Timestamp Ordering (MVTO) STM [29], which maintains multiple versions of each t-object, further reducing the number of aborts and improving throughput.
The advantage of using timestamp-based STMs is that the equivalent serial history is ordered by the transactions' timestamps. Thus, the miner can generate the BG of AUs using timestamps. We call this the STM approach. Dickerson et al. [13] developed the BG in a serial manner. Saraph and Herlihy [40] proposed a simple bin-based two-phase speculative approach to execute AUs concurrently in the Ethereum blockchain. Though the bin-based approach reduces the size of the block, it fails to exploit the maximum available concurrency. We name this approach the Speculative Bin (Spec Bin) approach. In the proposed approach, we combine the bin-based approach [40] with the STM approach [5] to store the BG in the block optimally while exploiting the concurrency. As a result, the concurrent miner generates an efficient BG in a concurrent and lock-free manner [26].
The concurrent miner applies the STM approach to generate two bins while executing AUs concurrently: a concurrent bin and a sequential bin. AUs that do not have any conflicts (and can therefore be executed concurrently without synchronization) are stored in the concurrent bin, while AUs that have conflicts are stored in the sequential bin, in the form of the BG, to record their dependencies. This combined technique reduces the size of the BG compared with Anjana et al. [5] by storing the graph only for sequential-bin AUs instead of all AUs. The STM approach of [5] does not distinguish between independent and dependent transactions; hence, each AU in the block has a corresponding vertex node in the BG. In contrast, in our approach, independent transactions are stored in the concurrent bin, which requires less space than adding vertex nodes to the BG, whereas dependent transactions in the sequential bin have a corresponding vertex node in the BG.
We propose a concurrent validator that creates multiple threads. Each of these threads processes the concurrent bin, followed by the efficient BG provided by the concurrent miner, and re-executes the AUs for validation. The BG consists only of dependent AUs. A validator thread claims a node that has no remaining dependencies in the BG, i.e., a node without incoming edges, by marking it. After that, it executes the corresponding AU deterministically. Since threads only execute nodes without incoming edges, the concurrently executing AUs are independent; therefore, the validator threads need not worry about synchronization issues. We call the validator's approach decentralized, since multiple threads operate on the BG concurrently without a master thread. The validator approach adopted by Dickerson et al. [13] works on a fork-join technique in which a master thread allocates tasks to slave threads: the master thread identifies AUs that have no incoming dependencies in the BG and allocates them to different slave threads. We compare the performance of these approaches with the serial validator.
The significant contributions of the paper are as follows:
The proposed solution is unique in that it combines the advantages of the bin-based approach [40] and the STM approach [5] to overcome their limitations and attain higher throughput. We implement the concurrent miner using BTO and MVTO optimistic STMs, but it is generic to any STM protocol. It can be integrated with any order-execute-based1 permissioned or permissionless blockchain platform such as Sawtooth [41], Tezos [19], and EOS [15] (see Sect. 4).
To make the OptSmart approach block space optimal and efficient, we have optimized the BG size by combining the speculative bin approach with the STM approach. The non-conflicting AUs are stored in the concurrent bin, while conflicting AUs are stored in the BG to record the conflicts by the concurrent miner. A lock-free and concurrent graph library [9] is used to generate an efficient BG that contains only dependent AUs and optimizes the size of the BG when compared with Anjana et al. [5] (see Sect. 4.3).
We propose a concurrent two-phase decentralized validator that re-executes AUs deterministically and efficiently in two phases: first, a concurrent bin phase to execute all independent AUs concurrently without any synchronization between threads, followed by an STM phase using the efficient BG provided by the concurrent miner to execute dependent AUs concurrently (see Sect. 4.4).
We rigorously prove that the concurrent miner and validator satisfy the opacity correctness criterion (see Sect. 5).
We achieve an average speedup of and for optimized concurrent miners using BTO and MVTO STM protocols. Optimized decentralized BTO and MVTO concurrent validators outperform an average of and compared to the serial validators, respectively. The optimized miner outperforms to on average over state-of-the-art concurrent miners, while the optimized validator outperforms to over state-of-the-art concurrent validators. Moreover, the proposed efficient BG saves more average block space when compared with the state-of-the-art (see Sect. 6).
Related work
This section presents various techniques that have been proposed in the literature to improve the performance and scalability of blockchain platforms. The techniques for concurrent processing of block transactions are discussed in Sect. 2.1, blockchain network sharding in Sect. 2.2, and improved consensus protocols in Sect. 2.3.
Blockchain was introduced by Satoshi Nakamoto in 2009 as Bitcoin [37] to perform electronic transactions without third-party interference. To expand blockchain functionality beyond financial transactions (cryptocurrencies), Ethereum adopted smart contracts, first introduced by Nick Szabo [47] in 1997. Smart contracts are client-to-blockchain interfaces that reduce computational transaction costs and securely store the contracts on a distributed network. Concurrent execution of smart contracts may raise various security issues that warrant independent research and hence are not discussed in this work. The work in [12, 32, 34] focuses on the safety and security of smart contracts. We mainly focus on the concurrent execution of smart contracts. Table 1 provides a concise summary of closely related works on the concurrent execution of AUs.
Research on concurrent execution of AUs
Dickerson et al. [13] proposed a technique to execute AUs in the blockchain concurrently. They proposed a pessimistic ScalaSTM and abstract lock-based technique at the miner and a fork-join technique at the validator using a happen-before graph added by the miner in the block. They observed that miners and validators could concurrently execute AUs to exploit the concurrency offered by ubiquitous multi-core processors and achieve higher throughput (This approach is discussed earlier in Sect. 1).
Zhang and Zhang [50] proposed a concurrent miner that employs a pessimistic concurrency control protocol that delays the read until the corresponding write(s) commits and ensures a conflict-serializable schedule. They proposed a concurrent validator that uses the MVTO protocol to execute transactions concurrently using write sets provided by the concurrent miner in the block.
Table 1. Summary of related work on concurrent execution of AUs
Work | Miner approach | Locks | Require block graph | Validator approach |
|---|---|---|---|---|
Dickerson et al. [13] | Pessimistic ScalaSTM | Yes | Yes | Fork-join |
Zhang and Zhang [50] | – | – | Read, Write Set | MVTO approach |
Anjana et al. [5] | Optimistic RWSTM | No | Yes | Decentralized |
Amiri et al. [1] | Static analysis | – | Yes | – |
Saraph and Herlihy [40] | Bin-based approach | Yes | No | Bin-based |
Anjana et al. [4] | Optimistic ObjectSTM | No | Yes | Decentralized |
Proposed approach | Bin + Optimistic RWSTM | No | No (if no dependencies) / Yes | Decentralized |
Anjana et al. [5] proposed an optimistic Read-Write STM (RWSTM) based concurrent miner approach using BTO and MVTO timestamp protocols. Timestamps are used to identify conflicts between AUs. The miner executes AUs using RWSTM and constructs the BG at runtime using timestamps. Later, a concurrent decentralized validator (dec-validator) executes AUs in the block in a decentralized manner. The dec-validator is more efficient than the fork-join validator since there is no master validator thread to allocate AUs to the slave validator threads to execute them. Instead, all validator threads independently identify the source vertex (a vertex with an indegree of 0) in the BG and claim the source node to execute the corresponding AU.
Amiri et al. [1] proposed ParBlockchain, a technique for concurrently executing transactions in a permissioned blockchain block. They developed an OXII paradigm2 to support distributed applications. The OXII paradigm orders the block transactions based on the agreement between the orderer nodes using static analysis or speculative execution to obtain the read-set and write-set of each transaction, then generates the BG and constructs the block. The executors from the respective applications execute transactions concurrently and then validate them by re-executing the block transactions. So, the nodes of the ParBlockchain execute transactions in two phases using the OXII paradigm. Based on the transaction conflicts, a block with the BG is generated in the first phase, known as the ordering phase. The second phase, known as the execution phase, executes the block transactions concurrently using the BG appended to the block.
Saraph and Herlihy [40] proposed a simple bin-based two-phase speculative approach to execute AUs concurrently. They empirically validated the possible benefits of their approach by evaluating it on historical transactions from the Ethereum blockchain. In the first phase, the miner employs locks and attempts to execute AUs in the block concurrently by rolling back those that cause the conflict(s). All aborted AUs are kept in the sequential bin and sequentially executed in the second phase. The miner gives concurrent and sequential bin hints in the block to validators to execute the same schedule for conflicting AUs as the miner. The validator executes the concurrent bin AUs concurrently and the sequential bin AUs sequentially. Giving hints about bins takes less space than the BG. However, it does not harness the maximum concurrency available within the block.
Anjana et al. [4] proposed an approach that uses optimistic single-version and multi-version Object-based STMs (OSTMs) for the concurrent execution of AUs by the miner. The OSTMs operate at a higher (object) level rather than at the page (read-write) level and construct the BG. However, the BG in these existing approaches is still significantly large and requires higher network bandwidth to broadcast such a large block for validation.
Apart from the concurrent execution of block transactions to improve the efficiency of blockchain platforms, various other attempts have been made to identify bottlenecks (such as the consensus protocol and network bandwidth) and propose techniques to mitigate them. We classify them into network sharding-based techniques (discussed in Sect. 2.2) and improved consensus protocol-based techniques (discussed in Sect. 2.3).
Network sharding-based techniques
Sharding-based techniques have recently been extensively studied to improve the throughput of blockchain platforms. These techniques divide the blockchain network into several sub-networks and work on the assumption that the majority of sub-network nodes are reliable; for example, a sub-network of s nodes contains no more than t byzantine nodes. Fabric [3], Tendermint [31], Elastico [33], and RedBelly [10] are among the practical sharding-based blockchains that are based on the Byzantine Fault Tolerance (BFT) consensus protocol.
Elastico [33] is an open sharding-based blockchain protocol that allows participants to join a random cluster. A leader-based committee verifies cluster transactions and adds a block to the global chain using the practical BFT (pBFT) consensus protocol. However, executing pBFT on a large number of nodes degrades performance [46] and increases the probability of a few nodes failing. RedBelly [10] uses a partially synchronous consensus algorithm in which permissioned nodes add new blocks while permissionless clients submit transactions. Typically, all nodes in a sub-network of a sharded blockchain validate a transaction; in RedBelly, however, only a bounded subset of the sub-network's nodes (determined by t, the maximum number of byzantine nodes) performs this validation, increasing performance by processing more transactions per consensus instance.
Hyperledger Fabric [3], the most extensively used permissioned blockchain platform, introduced the notion of channels, allowing transactions to be processed independently and in parallel in different channels to increase throughput. Users send transactions to their designated channels, and peers within each channel process these transactions and maintain the system's partitioned state. Moreover, Fabric introduced the execute-order-validate paradigm, in which transactions are first simulated by endorsing peers and then sent to the ordering service to order them and create the block, which is later validated by the endorsing and committing peers. It has been observed that the ordering service is a major performance bottleneck in the Fabric blockchain. Further, the ordering of transactions at the ordering service is completely arbitrary. Such arbitrary ordering results in unnecessary serialization conflicts and causes the commit phase to abort many valid transactions. Sharma et al. [42] observed this problem and proposed a serialization-conflict-aware block transaction ordering that dramatically reduces aborts of legitimate transactions. Further, based on a fine-grained concurrency control mechanism, they proposed early aborts of conflicting transactions that would otherwise throttle the network and fail to commit later in the pipeline. The serialization-conflict-aware ordering and early aborts enable many subsequent legitimate transactions to commit, hence increasing overall throughput.
Another sharding-based blockchain platform is Rapidchain [49], which achieves high throughput via block pipelining and a gossiping protocol for inter-shard transaction processing. OmniLedger [28], on the other hand, proposes a mechanism for allocating nodes to different clusters and processes intra-shard transactions using a lock-based atomic consensus protocol under a partial-synchrony assumption. AHL [11] presents a trusted-hardware-assisted technique for decreasing the number of nodes in a sub-network; due to the random assignment of nodes to clusters, it can achieve good security with fewer nodes per sub-network than OmniLedger. SharPer [2] uses a flattened consensus protocol and stores the chain as a directed acyclic graph; each cluster stores only a partial view of the chain and supports intra-shard and inter-shard transactions. Hellings et al. [25] proposed ByShard, a framework for sharded resilient systems, and an Orchestrate-Execute Model (OEM) in ByShard for processing cross-shard transactions in no more than two consensus steps between the participating shards with minimal use of expensive byzantine-resilient primitives. Additionally, they demonstrated that OEM can efficiently process cross-shard transactions based on two-phase commit and two-phase locking protocols.
To summarize, OmniLedger, RedBelly, and RapidChain employ the Unspent Transaction Output (UTXO) model to ensure cross-shard transaction atomicity without using a distributed commit protocol. RapidChain, on the other hand, does not provide isolation, while OmniLedger imposes blocking on cross-shard transactions. Though some approaches support cross-shard transactions, others do not. In comparison to AHL, SharPer employs an account-based paradigm and enables cross-chain transactions by implementing a global consensus mechanism based on the pBFT. On the other hand, Fabric supports the account-based paradigm and requires a trusted channel among participants to execute cross-shard transactions. Additionally, most of the existing sharding-based blockchains split the network into multiple shards and process transactions independently by utilizing variants of the BFT consensus or atomic commit protocols to increase transaction throughput.
Consensus-based techniques
Consensus protocols are identified as the major bottleneck in many blockchain platforms. For example, in permissionless blockchains such as Bitcoin and Ethereum, PoW computation dominates the block creation time. Most existing permissioned blockchain platforms rely on BFT consensus protocols such as pBFT, delegated BFT, federated BFT, etc., to achieve higher throughput than permissionless blockchains. However, these platforms still suffer from scalability and performance issues compared to traditional centralized systems [14]. Various solutions have been proposed in the literature to improve blockchain performance by introducing improved versions of the consensus protocol. In this section, we discuss several such solutions.
Gupta et al. [23] introduced the Resilient Concurrent Consensus (RCC) paradigm for implementing concurrent consensus in practice by running different consensus instances concurrently with minimal coordination, in order to design high-throughput and scalable systems. They achieved transaction throughput that was previously unattainable by primary-backup consensus systems. Primary-backup consensus (fully replicated) systems consist of a primary replica and numerous backup replicas; the primary is the only one that processes client requests or transactions. Hence, the throughput of these systems is limited to the capabilities of a single replica. The RCC paradigm enables any primary-backup consensus protocol to be made concurrent, consequently increasing the resilience of consensus-based systems to failures by significantly lowering the impact of faulty replicas on the system's throughput and operations. The RCC paradigm was evaluated by integrating it into ResilientDB, a high-performance resilient blockchain platform [24].
Stathakopoulou et al. [45] proposed a similar evolutionary technique for parallel consensus called the Mir-BFT. It is a robust BFT total order broadcast protocol that may significantly increase the throughput of permissioned and PoS permissionless blockchain systems by running concurrent pBFT instances. The Mir-BFT consensus system enables several leaders (block creators or miners) to propose blocks independently. Further, to address the duplication attacks by malicious clients, they proposed a rotational assignment of partitioned request hash space to leaders to ensure that each request is eventually assigned to the correct leader. Dang et al. [11] proposed a hardware-assisted sharding-based high-performance consensus technique. They proposed enhancing the performance of individual shards by enriching BFT consensus protocols and implementing an efficient shard formation protocol that safely assigns nodes into shards by relying on trusted hardware, the Intel Software Guard Extensions (Intel SGX) [35]. The SGX provides a trusted execution environment in the presence of malicious actors in the network. They also developed a generic distributed transaction protocol that ensures transaction safety and liveness even when network peers are malicious.
To conclude, similar to RCC, Mir-BFT eliminates the bandwidth bottleneck associated with a single replica. However, both systems deal with failures differently. In a Mir-BFT epoch, a global primary replica determines the system’s running instances via rotating assignments. Nevertheless, in the event of a failure, the system halts all running instances and restarts with a new epoch using a view change protocol. As a result, Mir-BFT throughput drops to zero during failure recovery, whereas this is not the case with RCC, showing that RCC outperforms Mir-BFT. A hardware-assisted sharding-based high-performance consensus technique relying on the Intel SGX trusted execution environment is proposed in [11]. These techniques can be implemented in each shard of the sharding-based blockchains to address performance and scalability issues. A comprehensive discussion of the theoretical limitations and practical application of consensus protocols for permissioned and permissionless blockchain platforms is provided in [22].
In contrast to these approaches, we propose a framework for concurrent execution of AUs. Our framework optimizes the block transaction execution to increase performance, which could be advantageous even when the blockchain platform is upgraded via network sharding or consensus protocols.
System model
In this section, we present the notions related to STMs and the execution model used in the proposed approach.
Following [21, 30], we assume a system of n processes/threads that access a collection of transactional objects (or t-objects) via atomic transactions. Each transaction has a unique identifier. Within a transaction, processes can perform the following transactional operations or methods:
STM.begin() – initiates an STM transaction and returns a unique id (timestamp).
STM.write(x, v) (or w_i(x, v)) – updates a t-object x with value v in the local write buffer (memory) of transaction T_i.
STM.read(x) (or r_i(x, v)) – attempts to read a t-object x and returns a value v to transaction T_i.
STM.tryC() – tries to commit transaction T_i and returns commit (or C) if it succeeds.
STM.tryA() – aborts transaction T_i and returns abort (or A).
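For concreteness, the transactional interface above can be summarized by the following C++ sketch; the class, type, and method names (Stm, Transaction, Status) are illustrative placeholders rather than the API of the STM library actually used in this work.

```cpp
#include <cstdint>

// Illustrative sketch of the transactional interface described above.
// Names and signatures are placeholders, not the actual library API.
enum class Status { OK, COMMIT, ABORT };

struct Transaction {
    uint64_t id;   // unique timestamp returned by STM.begin()
    // read/write sets are maintained internally by the STM
};

class Stm {
public:
    Transaction* begin();                              // STM.begin(): new transaction with a unique timestamp
    Status write(Transaction* t, int x, int v);        // STM.write(x, v): buffer v for t-object x in t's write set
    Status read(Transaction* t, int x, int* v);        // STM.read(x): read t-object x; may return ABORT
    Status tryCommit(Transaction* t);                  // STM.tryC(): validate and commit; COMMIT or ABORT
    void   tryAbort(Transaction* t);                   // STM.tryA(): abort the transaction
};
```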
History A history is a sequence of events, i.e., a sequence of invocations and responses of transactional operations. The collection of events of a history H is denoted as evts(H). For simplicity, we consider sequential histories, i.e., the invocation of each transactional operation is immediately followed by a matching response. Therefore, we treat each transactional operation as one atomic event and let <_H denote the total order on the transactional operations incurred by H. We identify a history H with the tuple ⟨evts(H), <_H⟩.
We assume that transactions are deterministic, which means that they always produce the same results when given the same inputs. Further, we consider well-formed histories, i.e., no transaction begins before the previous transaction invocation has completed (either committed or aborted). We also assume that every history has an initial committed transaction that initializes the t-objects with value 0. The set of transactions that appear in H is denoted by txns(H). The set of committed (resp., aborted) transactions in H is denoted by committed(H) (resp., aborted(H)). The set of incomplete or live transactions in H is denoted by live(H).
We construct the complete history of H by inserting tryA_k(A) immediately after the last event of every live transaction T_k. But for a tryC_i of transaction T_i, if it is allowed by the STM system to update the first t-object successfully, that means the updates made by T_i are consistent, so tryC_i will immediately return commit.
Transaction real-time and conflict order For two transactions T_i and T_j in txns(H), we say that T_i precedes T_j in the real-time order of H if T_i is complete in H and the last event of T_i precedes the first event of T_j in H. If neither T_i precedes T_j nor T_j precedes T_i in the real-time order, then T_i and T_j overlap in H. We say that a history is serial (or t-sequential) if its transactions are totally ordered by the real-time order. We say that T_i and T_j are in conflict in H if
(i) tryC_i precedes tryC_j in H and both T_i and T_j write to a common t-object;
(ii) tryC_i precedes r_j(x, v) in H and T_i writes to a t-object x that T_j reads; or
(iii) r_i(x, v) precedes tryC_j in H and T_j writes to a t-object x that T_i reads.
Valid and legal histories A successful read r_i(x, v) in a history H is said to be valid if there exists a transaction T_j that wrote v to x and committed before r_i(x, v). History H is valid if all its successful read operations are valid.
We define r_i(x, v)'s lastWrite as the latest commit event preceding r_i(x, v) in H whose transaction wrote to x (this transaction can also be T_i itself). A successful read operation r_i(x, v) is said to be legal if the transaction containing r_i(x, v)'s lastWrite also wrote the value v to x. The history H is legal if all its successful read operations are legal. From these definitions, we get that if H is legal, then it is also valid.
Notions of equivalence Two histories H and H′ are equivalent if they have the same set of events. We say two histories H and H′ are multi-version view equivalent [48, Chap. 5] or MVVE if
(1) H and H′ are valid histories and
(2) H is equivalent to H′.
We say two histories H and H′ are view equivalent or VE if
(1) H and H′ are legal histories and
(2) H is equivalent to H′. By restricting to legal histories, view equivalence does not use multiple versions.
We say two histories H and H′ are conflict equivalent or CE if
(1) H and H′ are legal histories and
(2) the conflicts in H and H′ are the same, i.e., every pair of transactions that conflicts in H also conflicts, in the same order, in H′.
VSR, MVSR, and CSR A history H is said to be VSR (or View Serializable) [48, Chap. 3] if there exists a serial history S such that S is view equivalent to H. However, this notion considers only a single version of each t-object.
MVSR (or Multi-Version View Serializability) maintains multiple versions of each t-object. A history H is said to be MVSR [48, Chap. 5] if there exists a serial history S such that S is multi-version view equivalent to H. It can be proved that verifying membership in VSR as well as MVSR in databases is NP-complete [38]. To circumvent this issue, researchers in databases have identified an efficient subclass of VSR, called CSR, based on the notion of conflicts. Membership in CSR can be verified in polynomial time using conflict graph characterization.
A history H is said to be CSR (or Conflict Serializable) [48, Chap. 3] if there exists a serial history S such that S is conflict equivalent to H.
Serializability and opacity Serializability [38] is a commonly used correctness criterion in databases. However, it is not suitable for STMs because it does not consider the correctness of aborted transactions, as shown by Guerraoui and Kapalka [20]. Opacity, on the other hand, considers the correctness of aborted transactions as well.
A history H is said to be opaque [20, 21] if it is valid and there exists a t-sequential legal history S such that
(1) S is equivalent to the complete history of H and
(2) S respects the real-time order of H, i.e., if T_i precedes T_j in the real-time order of H, then T_i precedes T_j in S.
Co-opacity A history H is said to be co-opaque [30] if it is valid and there exists a t-sequential legal history S such that
(1) S is equivalent to the complete history of H;
(2) S respects the real-time order of H, i.e., if T_i precedes T_j in the real-time order of H, then T_i precedes T_j in S; and
(3) S preserves the conflicts of H, i.e., if T_i and T_j conflict in H, then they conflict in the same order in S.
Linearizability A concurrent history is linearizable if
(1) the invocation and response events can be reordered to obtain a valid sequential history;
(2) the generated sequential history satisfies the object's sequential specification; and
(3) if a response event precedes an invocation event in the original history, then this order is preserved in the sequential reordering.
Proposed mechanism
In this section, we first present a high-level overview of how conflicts are identified and stored in OptSmart (Sect. 4.1). Next, in Sect. 4.2, we provide the lock-free concurrent block graph library methods, followed by concurrent execution of AUs by miners (Sect. 4.3) and validators (Sect. 4.4). Finally, we present the optimizations used by OptSmart that improve storage optimality and efficiency when compared with the state-of-the-art in Sect. 4.5.
Transaction conflicts in OptSmart
In OptSmart, conflicts, aborts, and re-execution of AUs depend on the underlying STM protocol, i.e., BTO or MVTO. A unique timestamp is allocated to each AU when the corresponding STM transaction begins via STM.begin().
[See PDF for image]
Fig. 2
Let’s consider three STM transactions T1, T2, and T3 executing AU1, AU2, and AU3, respectively. a Shows the structure of a shared data item in BTO STM. b Depicts the timeline of T1, T2, and T3 performing read and write operations on shared data items for their respective AUs. A transaction that writes to a data item and commits before a later transaction accesses it conflicts with that later transaction; with each shared data item, the ids (timestamps) of all committed writer transactions are stored. Similarly, a transaction that reads a data item written earlier by committed transactions conflicts with each of them. These conflicts are stored in the corresponding conf field of each transaction in the confList, as shown in c
The underlying STM system is modified to record the read-write set of each transaction at runtime for this purpose. The modified data structure of BTO STM to maintain conflicts is shown in Fig. 2a. As shown there, a shared data item consists of five fields: a unique account id, its value, the maximum read timestamp, the list of committed writer timestamps, and a next pointer. Account is the unique id of the shared data item (A), and val stores the value of the data item. The maximum read timestamp field stores the largest transaction timestamp that has performed a read operation on A, while the writer timestamp list stores the timestamps (ids) of committed STM transactions that have performed a write operation on A. The next field stores the pointer to the next shared data item in the list.
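For illustration, the shared data item of Fig. 2a can be sketched as the following C++ structure; the field names (maxReadTS, writerTS, etc.) are our placeholders for the fields described above, not the exact identifiers of the implementation.

```cpp
#include <cstdint>
#include <vector>

// Illustrative layout of a shared data item (t-object) in BTO STM.
// Field names are placeholders chosen to match the description in the text.
struct DataItem {
    int                   account;     // unique id of the shared data item (e.g., account A)
    int                   val;         // current value of the data item
    uint64_t              maxReadTS;   // largest timestamp that has read this item
    std::vector<uint64_t> writerTS;    // timestamps of committed transactions that wrote this item
    DataItem*             next;        // next shared data item in the list
};
```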
The STM system detects conflicts for each committed transaction (AU) and stores them in a list called the confList (as shown in Fig. 2c) and in the BG. In BTO, two transactions T_i and T_j conflict if both access a common data item k and at least one of them writes to k. The conflict rules for BTO STM are given in Eq. (1).
conflict(T_i, T_j): (r_i(k), w_j(k)) or (w_i(k), r_j(k)) or (w_i(k), w_j(k)), where T_i and T_j access a common data item k and T_i commits before T_j    (1)
When a transaction T_i performs a read operation on A, it updates the maximum read timestamp field with its own timestamp if its timestamp (T_i.ts) is greater than the current value of that field. The writer timestamp list stores the timestamps (ids) of all committed STM transactions that performed a write operation on A. These fields are used to derive conflicts between transactions, and a method of the STM system populates the confList for each committed transaction T_i. The confList for a transaction T_i (or AU) can then be obtained from the STM library at commit time.
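A minimal sketch of how such a confList could be populated from these timestamp fields at commit time is given below; the function and type names are hypothetical, and the real STM performs this bookkeeping inside its commit-time validation (aborted readers and other protocol details are ignored here).

```cpp
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical per-AU conflict record; the real STM populates an
// equivalent confList as part of commit-time validation.
struct ConfEntry {
    uint64_t              auId;   // id of the committed AU
    std::vector<uint64_t> conf;   // timestamps of earlier conflicting, committed AUs
};

// Sketch: when a transaction with timestamp ts commits a write to a data item,
// it conflicts (per the BTO rules of Eq. (1)) with the item's earlier readers
// and committed writers.
void recordWriteConflicts(uint64_t ts, uint64_t maxReadTS,
                          const std::vector<uint64_t>& committedWriterTS,
                          std::unordered_map<uint64_t, ConfEntry>& confList) {
    ConfEntry& entry = confList[ts];
    entry.auId = ts;
    if (maxReadTS != 0 && maxReadTS < ts)          // dependency on an earlier reader
        entry.conf.push_back(maxReadTS);
    for (uint64_t w : committedWriterTS)           // dependencies on earlier writers
        if (w < ts) entry.conf.push_back(w);
}
```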
The block graph
This subsection presents the data structure of the BG and the concurrent graph library methods accessed by the concurrent miner and validator.
Data structure of the lock-free concurrent block graph As illustrated in Fig. 3, the concurrent miner constructs the concurrent bin and the block graph to assist validators in concurrent execution while avoiding false block rejection (FBR) errors. A vector is used to store non-conflicting AUs in the concurrent bin; the concurrent bin shown in Fig. 3a is made up of the AUs with ids 3, 4, 5, 7, and 8. We use an adjacency list to maintain the block graph BG(V, E), as shown in Fig. 3b, where V is a set of vertices (or vNodes) stored in the vertex list (or vList) in increasing order of timestamp between two sentinel nodes vHead and vTail (not shown in the figure for brevity). Each vertex node (or vNode) stores the AU id, an indegree counter inCnt, and two pointers vNext and eNext, where the AU id is the id of the atomic unit executed by the STM transaction T_i. To maintain the indegree count of each vNode, inCnt is initialized to 0, while vNext and eNext are initialized to nil.
[See PDF for image]
Fig. 3
Pictorial representation of the Block Graph
As shown in Fig. 3b, E is a set of edges that maintains the conflicts of each vNode in its edge list (or eList). The eList stores eNodes (conflicting transaction nodes, say for T_j) in increasing order of AU id between two sentinel nodes eHead and eTail (again not shown in the figure for brevity). Each edge node (or eNode) stores the conflicting AU id, a vertex reference vref, and a pointer eNext initialized to nil. Here, the AU id is the unique id of the conflicting AU whose STM transaction T_j conflicts with T_i (the STM transaction of the vertex's own AU) such that T_i's timestamp is smaller than T_j's. We add conflict edges from lower-timestamp (lower AU id) to higher-timestamp (higher AU id) transactions to maintain acyclicity in the BG, i.e., the conflict edge goes from T_i (or AU_i) to T_j (or AU_j). Figure 3c illustrates this with three AUs with ids 1, 2, and 6, which maintain acyclicity by adding edges from lower to higher AU ids. To make the graph search efficient, the vertex node reference (vref) keeps a reference to the conflicting AU's own vertex in the vList.
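The vertex and edge nodes described above can be sketched in C++ as follows; the field names mirror the description (auId, inCnt, vNext, eNext, vref), while the sentinel nodes and most of the lock-free machinery are omitted.

```cpp
#include <atomic>
#include <cstdint>

struct ENode;                           // forward declaration

// Vertex node: one per dependent (conflicting) AU, kept in vList in timestamp order.
struct VNode {
    uint64_t         auId;              // id of the atomic unit executed by transaction T_i
    std::atomic<int> inCnt{0};          // indegree count, initialized to 0
    VNode*           vNext{nullptr};    // next vertex in vList (nil initially)
    ENode*           eNext{nullptr};    // head of this vertex's edge list (nil initially)
};

// Edge node: one per conflicting successor, kept in eList in increasing AU id order.
struct ENode {
    uint64_t auId;                      // id of the conflicting AU with the higher timestamp
    VNode*   vref;                      // reference to that AU's own vertex in vList
    ENode*   eNext{nullptr};            // next edge node (nil initially)
};
```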
The concurrent bin and the BG generated by the concurrent miner help validators execute the AUs in the block concurrently and deterministically in two phases: first using the concurrent bin and then, in the second phase, through the lock-free graph library methods3. The lock-free graph library exports five methods: addVert(), addEdge(), searchLocal(), searchGlobal(), and decInCount(), described as follows:
Lock-free graph library methods accessed by the concurrent miner The concurrent miner calls the addVert() method to add a vertex (vNode) for a conflicting AU into the BG if it is not already present.
After successfully adding the vNode in the BG, the concurrent miner calls addEdge() to add the corresponding conflict edges (eNodes) into the edge lists of the conflicting vertices and to increment the indegree count (inCnt) of the target vertices.
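A simplified sketch of how a miner thread could turn a committed AU's conflict list into BG vertices and edges is shown below; addVert and addEdge are the library methods named above, but their exact signatures and the surrounding driver code are illustrative assumptions.

```cpp
#include <cstdint>
#include <vector>

struct VNode;                                  // vertex type of the BG (see the sketch above)

// Assumed signatures for the lock-free graph library methods described in the text.
VNode* addVert(uint64_t auId);                 // insert (or find) the vertex for auId in vList
void   addEdge(VNode* from, VNode* to);        // insert a conflict edge and increment to's inCnt

// Sketch: after AU `auId` commits with conflicts `conf` (timestamps of earlier
// conflicting AUs), the miner thread adds it to the BG with edges from the
// lower-timestamp AUs to this one, preserving acyclicity.
void addToBlockGraph(uint64_t auId, const std::vector<uint64_t>& conf) {
    VNode* me = addVert(auId);
    for (uint64_t earlier : conf) {
        VNode* from = addVert(earlier);        // ensure the earlier AU has a vertex
        addEdge(from, me);                     // edge from lower to higher timestamp
    }
}
```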
Lock-free graph library methods accessed by the concurrent validator Concurrent validator threads use the searchLocal() method to find and claim a source vertex (a vertex with inCnt 0) in their thread-local cacheList.
If a source node does not exist in the local cacheList, then the concurrent validator thread calls searchGlobal() to claim a source vertex directly from the BG. After executing the AU of a claimed vertex, the thread calls decInCount() to decrement the inCnt of the vertices reached by its outgoing edges, which may expose new source vertices.
The concurrent miner
Smart contracts in blockchain are executed in two different contexts. First, the miner proposes a new block. Second, multiple validators re-execute to verify and validate the block proposed by the miner. In this subsection, we describe how the miner executes the smart contracts concurrently.
The concurrent miner collects client transactions from the network and stores them in the pending transaction pool. These transactions are associated with the methods (atomic units) of smart contracts. To run the smart contracts concurrently, we face the challenge of identifying the conflicting transactions at run-time, because smart contract languages are Turing-complete. Two transactions conflict if they access the same shared data object and at least one of them performs a write operation.
At the concurrent miner, conflicts are identified at run-time using the efficient framework provided by optimistic Software Transactional Memory systems (STMs). STMs access shared data objects called t-objects. Each shared t-object is initialized to an initial state (IS). The atomic units may modify the IS to some other valid state, and the t-objects eventually reach the final state (FS) at the end of block creation. As shown in Algorithm 8, the concurrent miner first adds the AUs to the concurrent bin at Line 88. Each transaction T_i gets a unique timestamp i from STM.begin().
If curStep is write(x) at Line 102, then it calls the write operation of the STM library to update x in the transaction's local write buffer.
If the conflict list is nil (Line 114), there is no need to create a node in the BG. Otherwise, the miner creates a node with the respective dependencies in the BG and removes those AUs from the concurrent bin (Line 117). It does so by calling the lock-free graph library methods addVert() and addEdge().
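The per-thread miner flow described above can be summarized by the following illustrative sketch; the STM interface, helper names, and AU representation are assumptions, and the actual Algorithm 8 handles further bookkeeping omitted here.

```cpp
#include <cstdint>
#include <vector>

struct Transaction;
struct AtomicUnit { uint64_t id; /* contract call and arguments */ };

// Assumed interfaces (see the earlier sketches).
Transaction* stmBegin();                                           // new STM transaction with a unique timestamp
bool executeSteps(Transaction* t, const AtomicUnit& au);           // issues STM reads/writes for each step of the AU
bool stmTryCommit(Transaction* t, std::vector<uint64_t>* conf);    // on commit, fills the AU's conflict list
void addToConcurrentBin(uint64_t auId);
void removeFromConcurrentBin(uint64_t auId);
void addToBlockGraph(uint64_t auId, const std::vector<uint64_t>& conf);

// Sketch of one miner thread executing one AU.
void mineAU(const AtomicUnit& au) {
    addToConcurrentBin(au.id);                      // optimistically assume the AU is independent
    while (true) {                                  // re-issue a new transaction until it commits
        Transaction* t = stmBegin();
        std::vector<uint64_t> conf;
        if (executeSteps(t, au) && stmTryCommit(t, &conf)) {
            if (!conf.empty()) {
                removeFromConcurrentBin(au.id);     // dependent AU: move it into the BG
                addToBlockGraph(au.id, conf);
            }
            return;
        }
        // aborted: retry with a fresh timestamp
    }
}
```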
The concurrent validator
The concurrent validator validates the block proposed by the concurrent miner. It executes the block transactions concurrently and deterministically in two phases, using the concurrent bin and the BG provided by the concurrent miner. In the first phase, validator threads execute the independent AUs of the concurrent bin concurrently (Line 127 to Line 131). Then, in the second phase, the validator uses the BG to execute the dependent AUs by repeatedly claiming source vertices (vertices with indegree 0) through searchLocal()/searchGlobal(), executing the corresponding AUs, and decrementing the indegree of their successors via decInCount().
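A simplified sketch of this two-phase decentralized validation is given below; searchGlobal and decInCount are the library methods named earlier (assumed here to claim a source vertex atomically and to release its successors), while the thread scaffolding and termination handling are illustrative.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <vector>

struct VNode { uint64_t auId; /* inCnt, vNext, eNext omitted in this sketch */ };

void   executeAU(uint64_t auId);     // deterministically re-execute the AU
VNode* searchGlobal();               // assumed: atomically claim a source vertex (inCnt == 0), or nullptr
void   decInCount(VNode* v);         // assumed: decrement inCnt of all of v's successors

// Phase 1: every validator thread picks independent AUs from the concurrent bin.
void runConcurrentBin(const std::vector<uint64_t>& bin, std::atomic<size_t>& next) {
    for (size_t i = next.fetch_add(1); i < bin.size(); i = next.fetch_add(1))
        executeAU(bin[i]);           // no synchronization needed: these AUs are independent
}

// Phase 2: threads repeatedly claim BG source vertices and release their successors.
void runBlockGraph(std::atomic<size_t>& remaining) {
    while (remaining.load() > 0) {
        if (VNode* v = searchGlobal()) {
            executeAU(v->auId);
            decInCount(v);           // may expose new source vertices
            remaining.fetch_sub(1);
        }
    }
}
```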
After the successful execution of atomic units, the concurrent validator compares its computed final state with the final states given by the concurrent miner. If the final state of shared data-objects matches, then the block proposed by the concurrent miner is valid. Finally, based on consensus between network peers, the block is appended to the blockchain, and the respective concurrent miner is rewarded.
Optimizations
To make the proposed approach storage optimal and efficient, this subsection explains the key change performed on top of the solution proposed by Anjana et al. [5].
In Anjana et al. [5], there is a corresponding vertex node in the block graph (BG) for every AU in the block. We observed that not all AUs in a block have dependencies, and adding a vertex node for such AUs takes additional space in the block. This is the first optimization our approach provides. In OptSmart, only the dependent AUs have a vertex in the BG, while the independent AUs are stored in the concurrent bin, which does not need any additional space. During execution, a concurrent miner thread does not add a vertex to the BG if it identifies that the currently executed AU does not depend on the AUs already executed. However, if another miner thread detects a dependency while executing the remaining AUs, that thread adds the corresponding dependent AU vertices to the BG.
For example, if a block has n AUs and a vertex node takes K bytes to store in the BG, then the approach of Anjana et al. [5] needs a total of n × K bytes of vertex space. If only n_d of the n AUs have dependencies, then only n_d × K bytes of vertex space are needed in the BG. In the proposed approach, the space optimization can be 100% in the best case, when all AUs in a block are independent, and 0% in the worst case, when all AUs are dependent. In practice, only a few AUs in a block have dependencies [13, 40]. Our experimental setup is more conservative and operates on a few benchmark contracts that produce more dependencies than existing blockchain workloads. The space-optimized BG helps improve network bandwidth utilization and reduce network congestion.
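As a purely hypothetical worked example (the numbers are illustrative, not measured values), suppose a block has n = 300 AUs, of which only n_d = 90 are dependent, and each vertex node takes K bytes:

```latex
% Hypothetical numbers for illustration only: n = 300, n_d = 90.
\[
  \text{full BG (Anjana et al. [5])}: \; n \cdot K = 300K \text{ bytes}, \qquad
  \text{OptSmart BG}: \; n_d \cdot K = 90K \text{ bytes},
\]
\[
  \text{vertex-space saving} = \Bigl(1 - \frac{n_d}{n}\Bigr) \times 100\%
                             = \Bigl(1 - \frac{90}{300}\Bigr) \times 100\% = 70\%.
\]
```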
Further, our approach combines the benefits of both the Speculative Bin-based approach [40] and the STM-based approach [5] to yield the maximum speedup that can be achieved by validators to execute AUs. So, another optimization is at the validators’ side; due to the concurrent bin in the block, the time taken to traverse the BG will decrease; hence, speedup increases. Concurrent validator execution is modified and divided into two phases. First, it concurrently executes AUs of the concurrent bin using multiple threads, since AUs in the concurrent bin will be independent. In the second phase, dependent AUs are stored in the BG and concurrently executed using BG to preserve the transaction execution order as executed by the miner.
Correctness
The correctness of concurrent BG, miner, and validator is described in this section. We first list the linearization points (LPs) of methods of the BG library [9] as follows:
addVert(vNode): the CAS (vPred.vNext.CAS(vCurr, vNode)) at Line 19 is the LP of the addVert() method if vNode does not already exist in the BG (a minimal sketch of this CAS-based insertion is given after this list). If vNode already exists in the BG, then the read of vCurr.vNode at Line 17 is the LP.
addEdge(fromNode, toNode): the CAS (ePred.eNext.CAS(eCurr, eNode)) at Line 31 is the LP of the addEdge() method if eNode does not already exist in the BG. If eNode already exists in the BG, then the read of eCurr.toNode at Line 29 is the LP.
searchLocal(cacheVer, AU): the CAS (cacheVer.inCnt.CAS(0, -1)) at Line 41 is the LP of the searchLocal() method.
searchGlobal(BG, AU): the CAS (vNode.inCnt.CAS(0, -1)) at Line 52 is the LP of the searchGlobal() method.
decInCount(remNode): Line 63 is the LP of the decInCount() method.
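For concreteness, a minimal lock-free insertion corresponding to the addVert LP (the CAS on vPred.vNext) could look like the sketch below; it omits the re-traversal and helping logic of the actual library [9].

```cpp
#include <atomic>
#include <cstdint>

struct VNode {
    uint64_t            auId;
    std::atomic<VNode*> vNext{nullptr};
};

// Sketch: insert vNode between vPred and vCurr in timestamp order.
// The successful CAS on vPred->vNext is the linearization point of addVert().
bool tryAddVert(VNode* vPred, VNode* vCurr, VNode* vNode) {
    vNode->vNext.store(vCurr);
    return vPred->vNext.compare_exchange_strong(vCurr, vNode);  // on failure, re-traverse and retry
}
```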
Corollary 1
A history H_m generated by the concurrent miner using the BTO protocol satisfies co-opacity.
Let H_m be a concurrent history generated by the concurrent miner while executing AUs concurrently using the MVTO protocol [29]. Any history generated by the MVTO protocol is proven to satisfy opacity [20].
Corollary 2
A history H_m generated by the concurrent miner using the MVTO protocol satisfies opacity.
While executing the history using the BTO or MVTO protocol, the concurrent miner generates the block graph (BG) in a lock-free manner, similar to [9].
Corollary 3
The BG generated by the concurrent miner is lock-free.
Theorem 1
All dependencies between the conflicting nodes by BTO and MVTO are captured in BG.
Proof
Dependencies between the conflicting nodes are captured in the BG using the LP points of the lock-free graph library methods defined above. A concurrent miner constructs the lock-free BG while executing AUs under the BTO or MVTO protocol, as described in Sect. 4.2. The BG consists of vertices and edges, where each committed dependent AU acts as a vertex and the edges (or dependencies) represent the conflicts of the respective STM protocol (BTO or MVTO). The graph generated by BTO [48, Chap. 4] consists of read-write, write-read, and write-write conflicts, which are exactly the dependencies captured in the BG. Similarly, the graph generated by MVTO [29] consists of read-from and multi-version conflicts, which are exactly the dependencies captured in the BG. Since the BTO and MVTO protocols used by the miner for concurrent execution are correct, they capture all dependencies between the conflicting nodes correctly. Hence, the dependencies between the conflicting AUs under BTO and MVTO are properly captured in the BG, while the independent AUs are stored in the concurrent bin and can be executed in any order.
Theorem 2
A history H_m generated by the concurrent miner using the BTO protocol and the corresponding history H_v generated by the concurrent validator are view equivalent.
Proof
The concurrent miner executes the conflicting AUs of H_m concurrently using the BTO protocol and captures the dependencies of H_m in the BG, while the independent AUs are stored in the concurrent bin and can be executed in any order. The miner then creates block B and broadcasts it, along with the concurrent bin and the BG, to the concurrent validators to verify block B. A concurrent validator applies a topological sort on the BG to obtain an equivalent execution order for the conflicting AUs and hence an equivalent serial schedule; the resulting execution is the validator history H_v. Since the BG is constructed from H_m, it captures all the conflicts of H_m, and H_v respects the order obtained from the topological sort of the BG; therefore, H_v is equivalent to H_m. Similarly, H_v follows the reads-from relation of H_m, so H_v is legal. Since H_m and H_v are equivalent to each other and H_v is legal, H_m and H_v are view equivalent.
Theorem 3
A history H_m generated by the concurrent miner using the MVTO protocol and the corresponding history H_v generated by the concurrent validator are multi-version view equivalent.
Proof
Similar to the proof of Theorem 2, the concurrent miner executes the AUs of H_m concurrently using the MVTO protocol, captures the dependencies in the BG, stores the independent AUs in the concurrent bin, proposes a block B, and broadcasts it to the concurrent validators for verification. MVTO maintains multiple versions corresponding to each shared object. The concurrent validator later obtains H_v by applying a topological sort on the BG to get an equivalent execution order for the conflicting AUs. Since H_v is obtained from a topological sort of the BG, H_v is equivalent to H_m. Similarly, the BG maintains the reads-from relations of H_m: from the MVTO protocol, if a transaction T_j reads the value of a shared object k written by T_i in H_m, then T_i commits before T_j's read in H_v. Therefore, H_v is valid. Since H_m and H_v are equivalent to each other and H_v is valid, H_m and H_v are multi-version view equivalent.
Experimental evaluation
This section presents a detailed experimental evaluation of the proposed approach; it provides a high-level overview of the benchmark contracts, experimental setup, and workloads. We analyze the proposed approach on different metrics such as speedup, block graph size, throughput, performance overhead in gas used, STM transaction aborts, and single-threaded execution. Finally, the section ends with a discussion.
By leveraging concurrent execution of AUs, we aim to improve the efficiency of miners and validators while optimizing block space. We ran simulations on a set of benchmark experiments using Ethereum smart contracts from the Solidity documentation [44]. Since the Ethereum Virtual Machine (EVM) does not support multi-threading [13, 16], we converted the Ethereum smart contracts into C++. We compared the proposed approach with state-of-the-art approaches [5, 13, 40] over baseline serial execution on three different workloads by varying the number of AUs, threads, and shared objects. Our benchmark experiments are conservative and consist of one or more smart contracts' AUs in a block, leading to more conflicts than occur in practice (a real block consists of AUs drawn from over 1.5 million deployed contracts [36]). Saraph and Herlihy [40] conducted an empirical study using historical data from the Ethereum blockchain and proposed a speculative bin-based technique that outperforms serial execution. OptSmart combines the bin-based and STM approaches and is expected to outperform the bin-based technique. The block transaction execution time is significantly less than the PoW consensus time in permissionless blockchains such as Bitcoin and Ethereum. However, due to the lower conflict rates in actual blockchains, the proposed approach should provide even greater concurrency in permissioned and permissionless Proof-of-Stake (PoS) based blockchain platforms.
Our experimental evaluation is designed to answer the following questions: (1) How much speedup is achieved by concurrent miners and validators with varying AUs and fixed threads and shared objects? As conflicts increase with increasing AUs, a decrease in speedup is expected. (2) How does speedup change when increasing the number of threads with a fixed number of AUs and shared objects? We expect speedup to increase with increasing threads, bounded by the logical threads available in the system. (3) How does speedup change with varying shared objects and fixed AUs and threads? We expect an increase in speedup because conflicts decrease as the number of shared objects increases. So, we anticipate that concurrent miners and validators will outperform serial miners and validators when conflicts are fewer.
Contract selection and benchmarking
This section provides a comprehensive overview of benchmark contracts, coin, ballot, and simple auction, from the Solidity documentation [44]. We chose these contracts to demonstrate the wide variety of use-cases, such as a financial application using a coin contract, a collaborative use-case using a ballot, and a bidding application using an auction contract. The AUs in a block for the coin, ballot, and auction benchmark operate on the same contract, i.e., the transaction calls for one or more methods of the same contract. In practice, a block consists of AUs from different contracts; hence, we designed another benchmark called the mix contract consisting of AUs from the coin, ballot, and auction contracts in equal proportions in a block. The benchmark contracts are as follows:
Coin contract The coin contract is the simplest form of a sub-currency. The users involved in the contract have accounts, and the accounts are shared objects. It implements methods such as mint(), which creates coins for an account, and send(), which transfers coins from one account to another.
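As an illustration of how such a contract AU maps onto the STM interface, a simplified C++ sketch of the send() method (the names and signatures here are ours, not the benchmark's actual code) is shown below:

```cpp
#include <cstdint>

// Assumed minimal STM read/write interface (see Sect. 3); account ids index shared t-objects.
struct Transaction;
bool stmRead(Transaction* t, int account, int* balance);
bool stmWrite(Transaction* t, int account, int balance);

// Illustrative conversion of the coin contract's send() into an AU executed on the STM.
// Returns false if any transactional step aborts, so the miner can re-issue the AU.
bool sendAU(Transaction* t, int from, int to, int amount) {
    int fromBal, toBal;
    if (!stmRead(t, from, &fromBal) || !stmRead(t, to, &toBal)) return false;
    if (fromBal < amount) return true;              // insufficient balance: AU completes with no state change
    if (!stmWrite(t, from, fromBal - amount)) return false;
    if (!stmWrite(t, to,   toBal  + amount)) return false;
    return true;
}
```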
Ballot contract The ballot contract is an electronic voting contract in which voters and proposals are shared objects. The contract provides methods for a voter to cast a vote for a proposal or to delegate their vote to another voter, and a method to compute the winning proposal.
Simple auction contract This is an online auction contract in which bidders bid for a commodity online. When the auction ends, the highest bidder wins the commodity and the highest bid amount is granted to the owner. The bidders, the maximum bid, and the maximum bidder are the shared objects. In our experiments, the initial contract state is a fixed number of bidders with a fixed initial account balance and a fixed auction period. In the beginning, the maximum bidder and bid are set to null (the base price and the owner can be set accordingly). A bidder uses the contract's bid method to place bids during the auction period.
Mix contract In this contract, we combine the AUs in equal proportion from the above three contracts (coin, ballot, and auction). Therefore, our experiment block consists of an equal number of corresponding contract transactions with the same initial state.
Experimental setup and workloads
We ran our experiments on a large-scale 2-socket Intel(R) Xeon(R) CPU E5-2690 V4 @ 2.60 GHz with a total of 56 hyper-threads (14 cores per socket and two threads per core) with 32 GB of RAM running Ubuntu 18.04.
It has been observed that speedup varies from contract to contract on different workloads. The speedup on various contracts is not intended for comparison between contracts; instead, it demonstrates the proposed approach's efficiency on several blockchain use-cases. We have considered the following three workloads (see Table 2) for performance evaluation:
Table 2. Workloads

| Workload | AUs | Threads | Shared objects |
|---|---|---|---|
| Workload 1 (W1) | 100–600 | 50 | 2K |
| Workload 2 (W2) | 300 | 10–60 | 2K |
| Workload 3 (W3) | 300 | 50 | 1K–6K |
In workload 1 (W1), a block consists of AUs varying from 100 to 600, with 50 threads and 2K shared objects fixed. The AUs per block in the Ethereum blockchain average around 100, while the actual number can exceed 200 [13], with a theoretical maximum of [43] after a recent increase in the gas limit. Over time, the number of AUs per block keeps increasing. In practice, a block can contain fewer AUs than the theoretical cap, depending on the block's gas limit and the gas price of its transactions. We will see that the percentage of data conflicts in a block increases with increasing AUs. We found that the conflict rate varies from contract to contract and has a diverse effect on the speedup.
In workload 2 (W2), we varied the number of threads from 10 to 60 while fixing the AUs to 300 and the shared objects to 2K. Our experiment system consists of a maximum of 56 hardware threads, so we experimented with a maximum of 60 threads. We observed that the speedup of the proposed approach increases with an increasing number of threads limited by logical threads.
In workload 3 (W3), the number of AUs and threads is fixed at 300 and 50, respectively, while the shared objects range from 1K to 6K. This workload is used with each contract to measure the impact of the number of participants involved and the conflicts per block. Data conflicts are expected to decrease with an increasing number of shared objects; however, the search time may increase. The speedup depends on the contract's execution, but it generally increases with an increasing number of shared objects.4
Analysis
For the analysis, blocks of AUs were generated for each benchmark on the three workloads: W1 (varying AUs), W2 (varying threads), and W3 (varying shared objects). Miners and validators then execute the blocks concurrently. The corresponding serial block execution is taken as the baseline to compute the speedup of the proposed miners and validators. The running time is collected over 15 iterations, with 10 blocks per iteration and 10 validators validating each block. The first block of each iteration serves as a warm-up run. A total of 150 blocks is created for each data point in the plots, and each iteration's execution time is averaged over the remaining 9 blocks. The per-iteration averages are then averaged over the number of iterations for each data point; Eq. (2) is used to compute a reading.
$$T_{avg} = \frac{1}{n}\sum_{i=1}^{n}\left(\frac{1}{m-1}\sum_{j=2}^{m} t_{ij}\right) \qquad (2)$$

where $T_{avg}$ is the average time for a reading, n is the number of iterations, m is the number of blocks per iteration (the first block being a warm-up), and $t_{ij}$ is the execution time of block j in iteration i.

The proposed approach is compared with the state-of-the-art speculative bin (SpecBin) approach [40], the fork-join approach (fj-validator) [13], and the STM approach proposed in [5] (which we call the default or Def approach). The proposed approach combines the benefits of both the bin-based and STM approaches to get the maximum benefit for concurrent miners and validators (we call it the optimized or Opt approach).5 The proposed OptSmart miner and validator are expected to produce an optimal BG that minimizes space overhead and to outperform the state-of-the-art approaches.
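As a concrete reading of Eq. (2), the following C++ sketch computes one data point from measured block times, skipping the warm-up block of each iteration; the function and variable names are illustrative, not taken from the benchmark code.

```cpp
#include <vector>

// Compute one data point: average block execution time over n iterations of
// m blocks each, discarding the first (warm-up) block of every iteration.
// times[i][j] is the measured execution time of block j in iteration i;
// each iteration is assumed to contain at least two blocks.
double averageReading(const std::vector<std::vector<double>>& times) {
  double iterationSum = 0.0;
  for (const auto& iter : times) {                  // n iterations
    double blockSum = 0.0;
    for (std::size_t j = 1; j < iter.size(); ++j)   // skip warm-up block 0
      blockSum += iter[j];
    iterationSum += blockSum / (iter.size() - 1);   // average over m-1 blocks
  }
  return iterationSum / times.size();               // average over iterations
}
```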
In all plots, subfigures (a), (b), and (c) correspond to W1, W2, and W3, respectively. Figures 4, 5, 6 and 7 show the speedup achieved by proposed and state-of-the-art concurrent miners over serial miners for all benchmarks and workloads. Figures 8, 9, 10 and 11 show the speedup achieved by proposed and state-of-the-art concurrent decentralized validators over serial validators for all benchmarks and workloads. Figure 12 shows the speedup achieved by proposed and state-of-the-art concurrent fork-join validators over serial validators for the mix contract. Figure 13 shows the average number of edges (dependencies) and vertices (AUs) in the block graph for the mix contract on all workloads, while Fig. 14 shows the percentage of additional space required to store the block graph in the Ethereum block. The amount of gas used per block and the average number of STM transaction aborts are given in Figs. 17 and 18, respectively. The execution overhead of the concurrent approaches for resource-constrained peers with single-threaded execution is given in Fig. 19 for the mix benchmark. Figures 15 and 16 depict the miners’ and validators’ throughput on the mix benchmark.6 The overall average speedup by concurrent miners and validators over the serial miners and validators on all benchmark contracts is given in Appendix A Table 4.
Speedup analysis
Figures 4a to 7a show the speedup for concurrent miners on W1. As shown in Figs. 4a and 7a, for read-intensive contracts such as coin and mix, the speedup increases with the number of AUs. In write-intensive contracts such as ballot and auction, the speedup does not increase with the number of AUs; instead, it may drop, as shown in Figs. 5a and 6a, respectively, because contention increases with the number of AUs.
[See PDF for image]
Fig. 4
Concurrent miner speedup over serial miner for coin contract
[See PDF for image]
Fig. 5
Concurrent miner speedup over serial miner for ballot contract
[See PDF for image]
Fig. 6
Concurrent miner speedup over serial miner for auction contract
[See PDF for image]
Fig. 7
Concurrent miner speedup over serial miner for mix contract
[See PDF for image]
Fig. 8
Concurrent decentralized validator speedup over serial validator for coin contract
[See PDF for image]
Fig. 9
Concurrent decentralized validator speedup over serial validator for ballot contract
[See PDF for image]
Fig. 10
Concurrent decentralized validator speedup over serial validator for auction contract
[See PDF for image]
Fig. 11
Concurrent decentralized validator speedup over serial validator for mix contract
Figures 8a, 9a, 10a, 11a and 12a show the speedup for concurrent validators over serial validators on W1. The speedup for concurrent validators (decentralized and fork-join) increases with the number of AUs. Figures 8a, 9a, 10a and 11a show the speedup achieved by the decentralized validators. It can be observed that for read-intensive benchmarks, the optimized MVTO decentralized validator (Opt-MVTO dec-validator) outperforms the other validators. In contrast, in write-intensive benchmarks, the default MVTO decentralized validator (Def-MVTO dec-validator) achieves better speedup than the others, due to the overhead of multi-threading over the concurrent bin when it contains very few AUs. We observed that conflicts increase with increasing AUs in the blocks; as a result, the number of transactions in the concurrent bin decreases. The speedup of the speculative bin decentralized validator (SpecBin dec-validator) is relatively lower than that of the concurrent STM dec-validators: since the STM miner precisely determines the dependencies between the AUs of a block, it harnesses the maximum concurrency and has an edge over the bin-based miner. However, if the block consists of AUs with very few dependencies, the SpecBin dec-validator outperforms the other validators, as shown in Fig. 8a.
Figure 12a shows the speedup for fork-join validators on W1 for the mix contract. The proposed optimized MVTO fork-join validator (Opt-MVTO fj-validator) outperforms the other fork-join and serial validators because of the lower overhead at the master validator thread in allocating independent AUs to slave validator threads. We noticed that the decentralized concurrent validators' speedup is relatively higher than that of the fork-join concurrent validators because the decentralized approach has no bottleneck in allocating AUs: all threads work independently. It can also be observed that with fewer AUs in the mix contract, the speedup of the fork-join validators drops below serial because the overhead of thread creation dominates the speedup achieved, as shown in Fig. 12a.
In W1, concurrent miners achieve a minimum of and up to speedup over serial miners across the contracts. The concurrent STM dec-validators achieve a minimum of and a maximum of up to speedup, while the SpecBin dec-validator speedup ranges from to over the serial validator across the contracts. The fork-join validators achieve a maximum speedup of over the serial validator. The proposed Opt-MVTO miner achieves a significant average speedup of over the serial miner. In addition, it achieves an effective average speedup of , , , and over the Def-BTO, Opt-BTO, Def-MVTO, and SpecBin miners, respectively. Similarly, the proposed Opt-MVTO dec-validator attains an average speedup of over the serial validator, while a maximum of is achieved by the Def-MVTO dec-validator; the average speedup of the SpecBin dec-validator is , remarkably lower than the STM validators, across the benchmarks. The Opt-MVTO dec-validator achieves an effective average speedup of , , , and over the Def-BTO, Opt-BTO, Def-MVTO, and SpecBin dec-validators, respectively. The fork-join validator speedup remains as low as and as high as over the baseline serial validator across all benchmarks.
Figures 4b, 5b, 6b, 7b, 8b, 9b, 10b, 11b and 12b show the speedup on W2. The speedup increases with the number of threads; however, it is limited by the maximum number of logical threads in the experimental system. Thus, a slight drop in speedup can be seen from 50 to 60 threads because the experimental system has a maximum of 56 logical threads. The rest of the concurrent miner observations are similar to those on workload W1, based on read-intensive and write-intensive benchmarks.
As shown in Figs. 8b, 9b, 10b and 11b, the concurrent decentralized validators' speedup increases with the number of threads, while, as shown in Fig. 12b, the concurrent fork-join validators' speedup drops with increasing threads because the master validator in the fork-join approach becomes a bottleneck. The decentralized validator results show that for the read-intensive benchmarks, the Opt-MVTO dec-validator outperforms the other validators, while for the write-intensive benchmarks, the Def-MVTO dec-validator outperforms the others, as shown in Fig. 9b. However, among the fork-join validators, the proposed Opt-MVTO fj-validator outperforms all others due to the optimizations.
In W2, concurrent miners obtain a minimum of and a maximum of speedup over serial miners across the benchmarks. The concurrent STM dec-validators achieve a minimum speedup of and a maximum of up to , while the SpecBin dec-validator speedup ranges from to over the serial validator across the benchmarks. The fork-join validators achieve a maximum speedup of over the serial validator. The proposed Opt-MVTO miner achieves an impressive average speedup of over the serial miner. In addition, it achieves an effective average speedup of , , , and over the Def-BTO, Opt-BTO, Def-MVTO, and SpecBin miners, respectively. The proposed Opt-MVTO dec-validator attains an average speedup of over the serial validator, while a maximum speedup of is achieved by the Def-MVTO dec-validator. The average speedup of the SpecBin dec-validator is , remarkably lower than the STM validators, across the benchmarks. The Opt-MVTO dec-validator achieves an effective average speedup of , , , and over the Def-BTO, Opt-BTO, Def-MVTO, and SpecBin dec-validators, respectively. The fork-join validator speedup remains as low as and as high as over the baseline serial validator across all benchmarks.
The plots in Figs. 4c, 5c, 6c, 7c, 8c, 9c, 10c, 11c and 12c show the concurrent miner and validator speedup on W3. As the number of shared objects increases, the concurrent miner speedup increases because conflicts decrease due to lower contention. Additionally, when contention is very low, more AUs are added to the concurrent bin. However, this also depends on the contract: for a write-intensive contract, fewer AUs are added to the concurrent bin, while for read-intensive contracts, more AUs are added.
As shown in Figs. 4c and 7c, the speculative bin miners surpass the STM miners for the read-intensive contracts, while in Figs. 5c and 6c, the Def-MVTO miner outperforms the other miners as shared objects increase. In contrast, the Def-BTO miner performs better than the other miners when AUs are few, because the search time needed to determine the respective versions in write-intensive contracts is much higher in the MVTO miner than in the BTO miner. Nonetheless, all concurrent miners perform better than the serial miner. In W3, concurrent miners start at around and achieve a maximum of speedup over serial miners across all contracts.
The speedup of the validators (decentralized and fork-join) increases with shared objects. In Figs. 8c, 10c, and 11c, the proposed Opt-STM dec-validator performs better than the other validators. However, for write-intensive contracts, the number of AUs in the concurrent bin is lower; therefore, the speedup of the Def-STM dec-validators is greater than that of the Opt-STM dec-validators, as shown in Fig. 9c. The SpecBin dec-validator speedup is considerably lower than that of the concurrent STM dec-validators because the STM miner determines the dependencies between the AUs more precisely than the bin-based miner.
[See PDF for image]
Fig. 12
Concurrent fork-join validator speedup over serial validator for mix contract
Among the fork-join validators, the proposed Opt-STM fj-validators outperform all other fj-validators, as shown in Fig. 12c, because of lower contention at the master validator thread in the proposed approach when allocating independent AUs to slave validator threads. We noticed that the decentralized concurrent validators' speedup is relatively higher than that of the fork-join concurrent validators, for reasons similar to those explained above.
In W3, concurrent miners obtain a maximum speedup of over serial miners, the maximum across the workloads and benchmarks for concurrent miners. The STM decentralized validators start at a minimum speedup of and achieve a maximum of . In comparison, the SpecBin dec-validator speedup ranges from to over the serial validator across the benchmarks. The Opt-MVTO fj-validator achieves a maximum speedup of over the serial validator across the benchmarks. This indicates that when the dependencies (conflicts) in the block graph decrease, the proposed optimized fork-join validator starts performing well because more AUs are added to the concurrent bin. The concurrent validators benefit from the work of the concurrent miners and outperform the serial validators. The proposed Opt-MVTO miner achieves a notable average speedup of over the serial miner. In addition, it achieves an effective average speedup of , , , and over the Def-BTO, Opt-BTO, Def-MVTO, and SpecBin miners, respectively.
The Opt-MVTO and Def-MVTO dec-validators attain average speedups of and over the serial validator. The average speedup of the SpecBin dec-validator is across the benchmarks, thanks to significantly fewer conflicts and more AUs in the concurrent bin, which leads to high performance gains on read-intensive workloads. The Opt-MVTO dec-validator achieves an effective average speedup of , , , and over the Def-BTO, Opt-BTO, Def-MVTO, and SpecBin dec-validators, respectively. The fork-join validator speedup remains as low as and as high as over the baseline serial validator across all benchmarks.
Block graph analysis
Figure 13 shows the average number of edges (dependencies, as histograms) and vertices (AUs, as a line chart) in the BG for the mix contract on all workloads.7 The average number of edges in the BG is the same for the default and optimized variants of each STM protocol; hence, only two histograms are plotted for simplicity. As shown in Fig. 13a, with increasing AUs in W1, the BG edges and vertices also increase, showing that contention grows with the number of AUs in a block. As shown in Fig. 13b, in W2 the number of vertices and edges does not change much, while in W3 it decreases, as shown in Fig. 13c.
In our proposed approach, the BG contains vertices only for the conflicting AUs, while the non-conflicting AUs are stored in the concurrent bin. In the approach of Anjana et al. [5], every AU has a corresponding vertex in the BG, as reflected in Fig. 13. So, in W1, the BG will have 100 vertices if the block consists of 100 AUs and 200 vertices if it consists of 200 AUs; in W2 and W3, it will have 300 vertices. Keeping vertices only for the conflicting AUs saves considerable space because each vertex node takes 2 bytes of storage.
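A minimal C++ sketch of this block metadata, assuming 2-byte AU ids and an adjacency-list BG as described above (names and layout are illustrative, not the paper's exact data structures):

```cpp
#include <cstdint>
#include <vector>

// Sketch of the metadata shipped to validators: non-conflicting AUs go into
// the concurrent bin, while conflicting AUs get a 2-byte vertex id in a
// sparse adjacency-list block graph.
struct BlockGraph {
  std::vector<uint16_t> vertices;               // AU ids of conflicting AUs (2 bytes each)
  std::vector<std::vector<uint16_t>> adjacency; // adjacency[i]: AU ids that depend on vertices[i]
};

struct BlockMetadata {
  std::vector<uint16_t> concurrentBin;  // non-conflicting AUs, executable in any order
  BlockGraph bg;                        // dependencies among conflicting AUs only
};
```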
[See PDF for image]
Fig. 13
Average number of edges (dependencies) and vertices (AUs) in block graph for mix contract
The average block size in the Bitcoin and Ethereum blockchains is KB [8] and KB [18], respectively, measured over the interval from 1 January 2019 to 31 December 2020. The block size keeps increasing over time, and so does the number of transactions in a block. The average number of transactions in an Ethereum block is [18]. Therefore, in the Ethereum blockchain, each transaction has an average size of KB ( bytes). We computed the block size from these simple calculations as the number of AUs in the block varies for W1. Equation (3) is used to compute the block size (B) for the experiments.
$$B = n_{AU} \times 200 \qquad (3)$$

where B is the block size in bytes, $n_{AU}$ is the number of AUs in the block, and 200 is the average size of an AU in bytes.

To store the block graph BG(V, E) in the block, we use a sparse representation of the graph based on adjacency lists. In the BG, a vertex node takes 2 bytes of storage, representing the AU id. An edge node requires the same storage as in the state-of-the-art approach, since the number of edges is equivalent in both; hence, we do not consider the storage cost of edges in our comparison. Equation (4) is used to compute the size of the BG in bytes, while Eq. (5) gives the additional space (as a percentage of the block size) needed to store the BG in the block.

$$S_{BG} = |V| \times S_v \qquad (4)$$

where $S_{BG}$ is the size of the BG in bytes, $|V|$ is the number of vertices in the BG, and $S_v$ is the size of a vertex node in bytes.

$$S_{\%} = \frac{S_{BG}}{B} \times 100 \qquad (5)$$
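The following small C++ helpers apply Eqs. (3) to (5) with the constants stated above (200 bytes per AU, 2 bytes per BG vertex); the function names are illustrative.

```cpp
#include <cstddef>

// Eq. (3): block size from the number of AUs, at an average of 200 bytes per AU.
inline std::size_t blockSizeBytes(std::size_t numAUs) { return numAUs * 200; }

// Eq. (4): BG size from the number of conflicting-AU vertices, at 2 bytes each.
inline std::size_t bgSizeBytes(std::size_t numVertices) { return numVertices * 2; }

// Eq. (5): additional space for the BG as a percentage of the block size.
inline double extraSpacePercent(std::size_t numVertices, std::size_t numAUs) {
  return 100.0 * bgSizeBytes(numVertices) / blockSizeBytes(numAUs);
}

// Example: a 300-AU block in which only 60 AUs conflict needs
// extraSpacePercent(60, 300) = 100 * 120 / 60000 = 0.2% extra space for the BG.
```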
[See PDF for image]
Fig. 14
Percentage of additional space to store block graph in Ethereum block for mix contract
Figure 14 shows the average percentage of additional storage space required to store the BG in the Ethereum block for the mix contract on all workloads. We can observe that the space requirement increases with the number of dependencies and vertices in the BG. However, the space requirement of our proposed approach is smaller than that of the existing default approach: the BGs produced by the Opt-BTO and Opt-MVTO miners are smaller than those of the Def-BTO and Def-MVTO miners, respectively. The average additional space required by the STM approaches to store the BG in an Ethereum block is given in Appendix A, Table 5.
The proposed approach significantly reduces the BG size for the mix contract across all workloads, as shown in Fig. 14. The space advantage originates from combining the bin-based approach with the STM approach: the concurrent bin information added to the block requires less space than having a corresponding vertex in the BG for every AU of the block. Thus, we combine the advantages of both approaches (STM and bin) to get the maximum speedup with a storage-optimal BG. The average space required for the BG, as a percentage of the block size across all workloads and contracts, is , , , and for the Def-BTO, Def-MVTO, Opt-BTO, and Opt-MVTO approaches, respectively. The proposed Opt-BTO and Opt-MVTO BGs are (or ) and (or ) more space-efficient than the Def-BTO and Def-MVTO BGs, respectively.
Throughput analysis
The throughput analysis of the proposed concurrent miners and validators, along with state-of-the-art approaches, is discussed here. Figure 15 shows the miners’ throughput, while Fig. 16 shows validators’ throughput for the mix contract.
Figure 15a shows the throughput trend for miners. The throughput increases with the number of AUs per block in W1, and it also increases with the number of threads in W2 (Fig. 15b). However, to our surprise, it decreases with an increasing number of shared objects in W3, as shown in Fig. 15c. This decline in throughput is due to the increased search time for shared objects.
In Fig. 15a, the throughput increases with increasing AUs up to 500 AUs per block and reaches a maximum of 795 tps for the proposed Opt-MVTO miner; after that, it drops. The throughput remains relatively steady for the serial miner, while it increases for all concurrent miners, confirming that OptSmart improves the throughput. The proposed Opt-MVTO miner on W1 obtains a maximum throughput of 795 tps at 500 AUs per block, which is higher than that of serial execution; at the same time, it is , , , and higher than that of the Def-BTO, Opt-BTO, Def-MVTO, and SpecBin concurrent miners, implying that the proposed optimization gives the maximum benefit over the state-of-the-art. Similar trends can be observed for miners in W2 (Fig. 15b); however, the maximum gain in throughput occurs between 10 and 20 threads. This indicates that the sweet spot for maximum concurrency benefits for miners lies between 10 and 20 threads, where the Opt-MVTO miner gains a improvement over serial and a gain at 20 threads relative to 10 threads, which is significant. Nevertheless, this difference narrows as we further increase the number of threads.
[See PDF for image]
Fig. 15
Miner throughput (transactions per second) for mix contract
[See PDF for image]
Fig. 16
Validator throughput (transactions per second) for mix contract
The throughput trends for miners on W3 (Fig. 15c) are surprisingly different: when we increase the shared objects, the speedup increases, as observed in Fig. 7c, and conflicts between AUs decrease, as observed in Fig. 13c. Hence, the throughput is expected to increase; however, it decreases, the reason being the increased search time for shared objects. The maximum throughput of 686 tps is achieved by the Opt-MVTO miner at 1K shared objects, while the serial throughput remains the lowest, decreasing from 347 to 80 tps as the shared objects increase from 1K to 6K. Another interesting observation is that when shared objects increase, conflicts decrease and more AUs are added to the concurrent bin; therefore, SpecBin achieves 489 tps, higher than the 468 tps of Opt-MVTO and the other miners.
As shown in Figs. 15 and 16, the gain in throughput for validators is considerably larger than that for miners; nevertheless, the trends remain the same for the respective workloads. This gain implies that concurrent execution of AUs benefits the validators significantly more than the miners, thanks to their deterministic execution.
There are two critical observations. First, the Def-MVTO dec-validator starts outperforming the proposed Opt-MVTO dec-validator when the AUs per block increase from 400 to 600 in W1 (Fig. 16a); this happens because conflicts increase and fewer transactions end up in the concurrent bin, so the overhead of multi-threaded concurrent bin execution dominates. The Opt-MVTO dec-validator attains a maximum of 1789 tps, while the Def-MVTO dec-validator achieves 2135 tps at 600 AUs in W1. The serial validator remains steady at 197 to 254 tps, which is very close to the serial miner's throughput. The second observation is that the Opt-MVTO dec-validator outperforms the Def-MVTO dec-validator at almost all data points in W2 (Fig. 16b), where the Def-MVTO achieves a maximum throughput of 1499 tps.
As shown in Fig. 16c, the maximum throughput of 2145 tps is achieved by the Opt-MVTO dec-validator at 1K shared objects, which is higher than the serial validator and higher than the Def-MVTO dec-validator. Overall, the Opt-MVTO miner and validator beat almost every other miner and validator, respectively. The maximum throughput of 2145 tps by the Opt-MVTO dec-validator is higher than the maximum throughput of 786 tps by the Opt-MVTO miner.
Performance overhead analysis
In Ethereum, the transaction fee depends on the gas price (in Wei, the smallest unit of Ether) and the gas consumed by the transaction. Gas refers to the amount of computational work required to complete a transaction; to some extent, it can be compared to the fuel required to run a car. The transaction gas limit is specified when the transaction is initiated. During execution, if the gas consumed exceeds the limit, the transaction is canceled and any update to the system state by the miner is rolled back. However, the transaction is still included in the block, and the initiator is charged the fee.
Table 3. Computational operation gas cost
| S. No. | Value | Mnemonic | Gas used | Subset | Notes |
|---|---|---|---|---|---|
| 1 | 0x01 | ADD | 3 | Verylow | Addition operation |
| 2 | 0x02 | MUL | 5 | Low | Multiplication operation |
| 3 | 0x03 | SUB | 3 | Verylow | Subtraction operation |
| 4 | 0x04 | DIV | 5 | Low | Integer division operation |
| 5 | 0x0a | EXP | 10 | – | Exponential operation |
| 6 | 0x10 | LT | 3 | Verylow | Less-than comparison |
| 7 | 0x11 | GT | 3 | Verylow | Greater-than comparison |
| 8 | 0x12 | SLT | 3 | Verylow | Signed less-than comparison |
| 9 | 0x13 | SGT | 3 | Verylow | Signed greater-than comparison |
| 10 | 0x14 | EQ | 3 | Verylow | Equality comparison |
| 11 | 0x15 | ISZERO | 3 | Verylow | Simple not operator |
| 12 | 0x30 | ADDRESS | 2 | Base | Get address of currently executing account |
| 13 | 0x31 | BALANCE | 400 | – | Get balance of the given account |
Unused gas from a normal execution is refunded to the initiator. The concurrent execution of conflicting STM transactions may lead to aborts and re-executions until a transaction commits. We analyzed the total gas used per block to capture the additional gas required due to aborts and retries by the concurrent miners; this additional gas per block can be considered a concurrency overhead for miners. In our experiments, the overall gas cost does not represent the actual cost charged to clients for executing transactions (e.g., extra overhead paid by clients for more efficient execution). Instead, we use the total gas cost as an evaluation metric of overhead, counting the additional operations performed by miners due to concurrent execution. The gas cost varies by operation (as detailed in Appendix G of the Ethereum yellow paper); in our experiments, we considered the 13 operations listed in Table 3. The line chart in Fig. 17 shows the amount of gas used per block, while Fig. 18 shows the average number of aborts by the miners for the mix contract.
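A hedged sketch of how such a per-block gas metric can be accumulated under aborts and retries is shown below; the retry loop and gas charges are illustrative and not the benchmark's exact implementation.

```cpp
#include <functional>

// Sketch of how the per-block gas metric can be accumulated: every
// (re-)execution of an AU adds the gas of the operations it performed, so
// aborted and retried STM transactions contribute extra gas that shows up as
// concurrency overhead for the miner.
struct GasMeter {
  long totalGas = 0;
  void charge(long opCost) { totalGas += opCost; }  // e.g., ADD = 3, MUL = 5, BALANCE = 400
};

// Execute one AU under STM; 'tryExecute' charges gas for each operation it
// runs and returns false if the STM transaction aborted and must be retried.
void executeWithRetries(GasMeter& meter,
                        const std::function<bool(GasMeter&)>& tryExecute) {
  while (!tryExecute(meter)) {
    // Aborted: the state changes are rolled back, but the gas already charged
    // stays counted as part of the block's concurrency overhead.
  }
}
```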
[See PDF for image]
Fig. 17
The amount of gas used per block for mix contract
[See PDF for image]
Fig. 18
The average number of STM aborts by the concurrent miners for mix contract
From Fig. 17a, we can observe that the gas used per block increases with increasing AUs, as expected, and remains almost steady with a fixed number of AUs in W2 and W3, as shown in Fig. 17b, c. Nevertheless, the gas used per block is relatively high for the SpecBin approach. In the SpecBin approach, the miner first tries to execute the AUs speculatively in parallel by acquiring locks; if a transaction fails to acquire a lock, its effects are rolled back and it is stored in a sequential bin, which is executed serially in a second phase. In this process, all transactions in the sequential bin are partially executed in the first phase and consequently consume gas, hence the increase in gas used per block for SpecBin. There are no STM aborts and retries in the SpecBin and serial miners; therefore, their abort count is 0, as shown in Fig. 18. Also, the gas used per block for all validators remains the same as for the serial miner due to deterministic execution.
The maximum gas used per block is for SpecBin and for the Opt-MVTO miner. The gas used per block for the Opt-MVTO miner is considerably lower than for the SpecBin miner, but slightly higher than for all other miners. The reason is the overlapped execution of AUs in the Opt-MVTO miner, which leads to more STM aborts and retries, as shown in Fig. 18; the abort and retry count for the Opt-MVTO miner is the highest among all miners. However, due to its greater concurrency, the Opt-MVTO miner gives the maximum performance benefit. Aborts increase with increasing AUs and threads in W1 and W2, respectively. In W3, a maximum of 9 aborts occurs at 3K shared objects; afterwards, the count falls due to fewer conflicts.
[See PDF for image]
Fig. 19
Execution overhead—concurrent approach execution (using single thread) over serial
Execution overhead analysis Due to the heterogeneity of computation power in a blockchain network, we can assume that a peer may have limited computation resources and would prefer to execute the transactions serially. We therefore performed this experiment to answer how severely the concurrent execution approaches affect resource-constrained miners.
Figure 19 shows the execution overhead of the concurrent approaches at resource-constrained peers with single-threaded execution. As shown in Fig. 19a1, the execution time increases with the number of AUs per block because a single thread must execute more transactions. Figure 19a2 shows that the execution overhead is highest for the Def-BTO miner and lowest for the SpecBin miner relative to the serial miner. The overhead of the STM miners ranges from to over the serial miner, while it is for the SpecBin miner. Although the execution overhead is significantly lower for the SpecBin miner, it cannot exploit the maximum concurrency available within the block. This indicates that the STM-based approaches have a higher overhead than serial execution; however, they significantly benefit peers with decent computation power using multi-threaded execution.
Additional results are presented in the appendices: results for the coin, ballot, and auction contracts on W1 to W3 are given in Appendix A, and results for Workload 4, in which the number of threads varies from 2 to 10, are given in Appendix B.
Discussion
We observed that the speedup for all benchmark contracts follows roughly the same pattern. The speedup tends to increase across the workloads for the read-intensive benchmarks (coin and mix) due to fewer conflicts, while it drops due to high contention and conflict rates in the write-intensive benchmarks (ballot and auction). We also observed that there may not be much speedup for concurrent miners when a block has few AUs (less than 100), conceivably due to multi-threading overhead. However, the speedup for concurrent validators generally increases across benchmarks and workloads. The fork-join concurrent validators on W2 are an exception, where the speedup drops with an increasing number of threads since the fork-join approach follows a master-slave strategy in which the master thread becomes a performance bottleneck. We also observed that the concurrent validators achieve a higher speedup than the concurrent miners because the concurrent miner executes the AUs non-deterministically, finds the conflicting AUs, and creates the concurrent bin and an efficient BG so that the validators can execute the AUs deterministically.
The concurrent miner stores the BG in the block to avoid false block rejection by the concurrent validators. Storing a BG in a block consumes space, and every extra byte in the block increases the network bandwidth requirement, leading to space overhead. However, various optimization techniques can be used to reduce the size of the BG so that it can be stored in the block efficiently. It has been noticed that only a few transactions in a block have conflicts; thus, the BG can be stored efficiently, and concurrency helps obtain higher throughput.
The malicious miner problem is another important issue that needs discussion. A malicious miner can cause the blockchain state to become inconsistent by providing an incorrect BG in which some edges (dependencies) are misplaced or removed; similarly, in the bin-based approach, conflicting transactions can be kept in the concurrent bin, resulting in inconsistent states in the blockchain due to double-spending. For the STM and bin-based approaches, these problems are referred to as Edge Missing Block Graph (EMB) and Faulty Bin (FBin) errors, respectively. The problem arises when concurrent validators execute dependent transactions concurrently and the majority of them (> 51%) accept a flawed block. To address this, validators must detect and reject faulty blocks proposed by malicious miners. Dickerson et al. [13] proposed a lock-based profiling technique to handle EMB errors. Anjana et al. [4] proposed a counter-based Smart Multi-threaded Validator (SMV) that detects EMB and FBin errors and rejects the corresponding blocks: the SMV threads monitor transactions that simultaneously perform conflicting accesses to the same data items and reject the block if a conflict is detected. These extra checks for malicious miners may result in a negligible reduction in the validators' speedup.
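The following is a rough, simplified C++ sketch of a counter-based conflict check in the spirit of the SMV described above; it is not the exact scheme of Anjana et al. [4], and the class and method names are assumptions made for illustration.

```cpp
#include <atomic>
#include <cstddef>
#include <vector>

// Simplified counter-based conflict check: each shared object keeps atomic
// access counters. If two AUs that the miner claimed were independent touch
// the same object concurrently with at least one writer, the validator flags
// the block as faulty (EMB/FBin error). Not the exact scheme of [4].
class ConflictDetector {
  struct Counters { std::atomic<int> readers{0}, writers{0}; };
  std::vector<Counters> counters;     // one entry per shared object
  std::atomic<bool> faulty{false};
public:
  explicit ConflictDetector(std::size_t numObjects) : counters(numObjects) {}

  void beginWrite(std::size_t obj) {
    int priorWriters = counters[obj].writers.fetch_add(1);
    if (priorWriters > 0 || counters[obj].readers.load() > 0) faulty = true;
  }
  void endWrite(std::size_t obj) { counters[obj].writers.fetch_sub(1); }

  void beginRead(std::size_t obj) {
    counters[obj].readers.fetch_add(1);
    if (counters[obj].writers.load() > 0) faulty = true;
  }
  void endRead(std::size_t obj) { counters[obj].readers.fetch_sub(1); }

  bool blockIsFaulty() const { return faulty.load(); }  // reject block if true
};
```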
Further, we observe that the maximum gain in throughput for the miner occurs between 10 and 20 threads, indicating that the sweet spot for maximum concurrency benefits for miners lies in that range. There could be two reasons for this: the first is limited hyper-threading, and the second is that the maximum speedup achievable with the given workload is reached at around 20 threads; that is, if we increase the number of threads for a given number of transactions in the block, the additional threads (> 20) become idle due to transaction dependencies. We tested OptSmart on a maximum of 60 threads to see how performance is affected when the thread count exceeds the limit supported by the experimental system (56 hyper-threads: 14 cores per socket and 2 threads per core) and found that the speedup decreases after 56 threads. Further, due to deterministic execution at the validator, the concurrency benefits are larger than at the miner, implying that concurrent execution of AUs benefits the validators significantly more than the miners.
The suggested method is generic and can be used with any permissioned or permissionless blockchain platform based on the order-execute model, such as Sawtooth [41], EOS [15], Tezos [19], and Ethereum [16]. Another observation is that in Proof-of-Work (PoW) based blockchains, the miner's computation time is dominated by PoW, while block latency is restricted by the block acceptance rate. Hence, transaction execution time is not on the critical path; however, concurrent execution can still benefit the validators. Additionally, the number of transactions per block grows over time; for example, in Ethereum, the maximum number of transactions per block has recently increased from 400 [43] to 1400 (block #13002929 in the Ethereum blockchain contains 1431 transactions). In addition, the Ethereum community is considering switching from Proof-of-Work to Proof-of-Stake (PoS) [17]. This demonstrates that transaction execution time might not be an essential performance factor for existing PoW blockchains, but it might become important in the future. The proposed approach will benefit both permissioned and PoS-based permissionless blockchain systems considerably.
Conclusion and future directions
Conclusion
To exploit multi-core processors, we have proposed the concurrent execution of smart contracts by miners and validators, which improves throughput. Initially, the miner executes the smart contracts concurrently using an optimistic STM protocol, BTO. To reduce the number of aborts and further improve efficiency, the concurrent miner uses the MVTO protocol, which maintains multiple versions of each data object. The concurrent miner proposes a block that consists of a set of transactions, the concurrent bin, the BG, the previous block hash, and the final state of each shared data object. Later, the validators re-execute the same smart contract transactions concurrently and deterministically in two phases, using the concurrent bin followed by the BG given by the miner, which captures the conflicting relations among the transactions, to verify the final state. Overall, the proposed Opt-BTO and Opt-MVTO BGs are space optimal. The proposed Opt-BTO and Opt-MVTO concurrent miners and the decentralized Opt-BTO and Opt-MVTO concurrent validators achieve notable performance gains over state-of-the-art miners and validators, respectively.
Future directions
Some solutions have been proposed in the literature to prevent malicious miners from omitting critical dependencies in the BG or adding dependent transactions to the concurrent bin, which cause the EMB and FBin errors; however, a malicious miner can also append additional edges to the BG or add independent transactions to the sequential bin to delay other miners. Analyzing the impact of such delays and preventing such malevolent miners would be very interesting.
OptSmart reduces the BG size; yet, it is reasonable to ask whether the BG size can still become a substantial overhead. In Ethereum, the average number of AUs in a block is about 300, so storing the BG within the block does not consume much space. However, the number of AUs in a block can grow over time, requiring additional space to store the BG; consequently, developing techniques to execute AUs concurrently and deterministically without incurring additional storage overhead is of interest. Developing a deterministic scheduler that always provides the same ordering while concurrently executing block transactions for both miners and validators, with no additional information in the block, would be a fascinating challenge. One such solution is a static analysis-based technique that constructs the BG by evaluating the read-write set attached to each block transaction: both miners and validators can then generate the BG through independent static analysis of the block transactions. However, static analysis may introduce analysis overhead; additionally, most existing smart contract languages are Turing complete, so generating the read-write set before contract transaction execution is challenging.
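As an illustration of such a read-write-set based construction, the following hedged C++ sketch derives dependency edges from per-transaction read and write sets; it is a simplification under the stated assumptions, not the paper's implementation, and the names are illustrative.

```cpp
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// Read and write sets attached to one block transaction (object ids).
struct RWSet {
  std::set<int> reads;
  std::set<int> writes;
};

// True if the two sets share at least one object id.
static bool intersects(const std::set<int>& a, const std::set<int>& b) {
  for (int x : a) if (b.count(x)) return true;
  return false;
}

// Deterministically derive BG edges: an edge (i, j) whenever an earlier
// transaction i and a later transaction j access a common object and at
// least one of the accesses is a write. Both miners and validators running
// this on the same input would obtain the same edge set.
std::vector<std::pair<int, int>> buildEdges(const std::vector<RWSet>& aus) {
  std::vector<std::pair<int, int>> edges;
  for (std::size_t i = 0; i < aus.size(); ++i)
    for (std::size_t j = i + 1; j < aus.size(); ++j)
      if (intersects(aus[i].writes, aus[j].writes) ||
          intersects(aus[i].writes, aus[j].reads) ||
          intersects(aus[i].reads, aus[j].writes))
        edges.emplace_back(static_cast<int>(i), static_cast<int>(j));
  return edges;
}
```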
Another fascinating line of research is analyzing power usage and proposing power-efficient concurrent execution of block transactions, because multi-threading on multi-core systems consumes more power than single-core execution. We have also not explored nested contract transaction execution in the proposed solution; using a nested STM to handle intra-transaction concurrency would be an interesting topic to pursue. Finally, since the EVM does not support multi-threading, the proposed solution cannot be tested on the Ethereum blockchain, so designing a multi-threaded EVM is another research direction. We are planning to integrate and test the OptSmart approach on existing blockchain systems that follow the order-execute architecture and support multi-threading, such as Sawtooth, Tezos, and EOS.
Acknowledgements
We want to thank the anonymous reviewers for their comprehensive review of our manuscript and many insightful comments and suggestions. The project was partially supported by a research grant from the Ministry of Electronics and Information Technology (MEITY) Grant/Award Number: 4(20)/2019-ITEA and 4(4)/2021-ITEA. We would also like to thank the Ministry of Electronics and Information Technology, India, for their support.
1A model in which transactions are first ordered and then executed by the peers [1].
2A paradigm in which transactions are first ordered for concurrent execution and then executed by both miners and validators [1].
3For the correctness of lock-free graph operations, please refer [9].
4The experimental system supports a maximum of 56 hyper-threads; hence we fixed the number of threads to 50 in W1 and W3.
5In the figures, legend items are shown in bold.
6Detailed analysis on other benchmark contracts can be found in the technical report [7] (https://arxiv.org/abs/2102.04875).
7We used histograms and line chart on log scale to differentiate vertices and edges to avoid confusion in comparing the edges and vertices.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
1. Amiri, M.J., Agrawal, D., El Abbadi, A.: Parblockchain: Leveraging transaction parallelism in permissioned blockchain systems. In: 2019 IEEE 39th International Conference on Distributed Computing Systems (ICDCS), pp. 1337–1347. IEEE (2019)
2. Amiri, M.J., Agrawal, D., El Abbadi, A.: Sharper: Sharding permissioned blockchains over network clusters. In: Proceedings of the 2021 International Conference on Management of Data, pp. 76–88. SIGMOD/PODS ’21, ACM, New York, NY, USA (2021). https://doi.org/10.1145/3448016.3452807
3. Androulaki, E., Barger, A., Bortnikov, V., Cachin, C., Christidis, K., De Caro, A., Enyeart, D., Ferris, C., Laventman, G., Manevich, Y., Muralidharan, S., Murthy, C., Nguyen, B., Sethi, M., Singh, G., Smith, K., Sorniotti, A., Stathakopoulou, C., Vukolić, M., Cocco, S.W., Yellick, J.: Hyperledger fabric: A distributed operating system for permissioned blockchains. In: Proceedings of the thirteenth EuroSys conference, EuroSys (2018). https://doi.org/10.1145/3190508.3190538
4. Anjana, P.S., Attiya, H., Kumari, S., Peri, S., Somani, A.: Efficient concurrent execution of smart contracts in blockchains using object-based transactional memory. In: Networked Systems, pp. 77–93. Springer International Publishing, Cham (2021). https://doi.org/10.1007/978-3-030-67087-0_6
5. Anjana, P.S., Kumari, S., Peri, S., Rathor, S., Somani, A.: An efficient framework for optimistic concurrent execution of smart contracts. In: 2019 27th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP), pp. 83–92. IEEE (2019). https://doi.org/10.1109/EMPDP.2019.8671637
6. Anjana, P.S., Kumari, S., Peri, S., Rathor, S., Somani, A.: Entitling concurrency to smart contracts using optimistic transactional memory. In: Proceedings of the 20th International Conference on Distributed Computing and Networking, ICDCN ’19, p. 508. ACM, New York, NY, USA (2019). https://doi.org/10.1145/3288599.3299723
7. Anjana, P.S., Kumari, S., Peri, S., Rathor, S., Somani, A.: Optsmart: A space efficient optimistic concurrent execution of smart contracts. CoRR abs/2102.04875 (2021). https://arxiv.org/abs/2102.04875
8. Bitcoin Block Size. https://www.blockchain.com/en/charts. [Online, Accessed 15-09-2020]
9. Chatterjee, B., Peri, S., Sa, M., Singhal, N.: A simple and practical concurrent non-blocking unbounded graph with linearizable reachability queries. In: Proceedings of the 20th International Conference on Distributed Computing and Networking, ICDCN ’19, pp. 168–177. ACM, New York, NY, USA (2019). https://doi.org/10.1145/3288599.3288617
10. Crain, T., Natoli, C., Gramoli, V.: Evaluating the red belly blockchain. CoRR abs/1812.11747 (2018). http://arxiv.org/abs/1812.11747
11. Dang, H., Dinh, T.T.A., Loghin, D., Chang, E.C., Lin, Q., Ooi, B.C.: Towards scaling blockchain systems via sharding. In: Proceedings of the 2019 International Conference on Management of Data, pp. 123–140. SIGMOD ’19, ACM, New York, NY, USA (2019). https://doi.org/10.1145/3299869.3319889
12. Delmolino, K., Arnett, M., Kosba, A., Miller, A., Shi, E.: Step by step towards creating a safe smart contract: Lessons and insights from a cryptocurrency lab. In: International conference on financial cryptography and data security, pp. 79–94. Springer Berlin Heidelberg, Berlin, Heidelberg (2016). https://doi.org/10.1007/978-3-662-53357-4_6
13. Dickerson, T., Gazzillo, P., Herlihy, M., Koskinen, E.: Adding concurrency to smart contracts. In: Proceedings of the ACM Symposium on Principles of Distributed Computing, PODC ’17, pp. 303–312. ACM, New York, NY, USA (2017). https://doi.org/10.1145/3087801.3087835
14. Dinh, T.T.A., Wang, J., Chen, G., Liu, R., Ooi, B.C., Tan, K.L.: Blockbench: A framework for analyzing private blockchains. In: Proceedings of the 2017 ACM International Conference on Management of Data, SIGMOD ’17, pp. 1085–1100. Association for Computing Machinery, New York, NY, USA (2017). https://doi.org/10.1145/3035918.3064033
15. EOS. https://eos.io/. [Online, Accessed 17-04-2022]
16. Ethereum. http://github.com/ethereum. [Online, Accessed 17-04-2022]
17. Ethereum: Proof-of-Stake (POS). https://ethereum.org/en/developers/docs/consensus-mechanisms/pos/ (2021). [Online, Accessed 17-04-2022]
18. Ethereum Stats. https://etherscan.io/chart/blocksize. [Online, Accessed 25-01-2022]
19. Goodman, L.: Tezos–a self-amending crypto-ledger white paper. https://www.tezos.com/static/papers/white_paper.pdf (2014)
20. Guerraoui, R., Kapalka, M.: On the correctness of transactional memory. In: Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP ’08, pp. 175–184. Association for Computing Machinery, New York, NY, USA (2008). https://doi.org/10.1145/1345206.1345233
21. Guerraoui, R., Kapalka, M.: Principles of Transactional Memory, Synthesis Lectures on Distributed Computing Theory. Morgan & Claypool Publishers (2010). https://doi.org/10.2200/S00253ED1V01Y201009DCT004
22. Gupta, S., Hellings, J., Sadoghi, M.: Fault-Tolerant Distributed Transactions on Blockchain. Synth. Lect. Data Manag. 16 (2021)
23. Gupta, S., Hellings, J., Sadoghi, M.: Rcc: Resilient concurrent consensus for high-throughput secure transaction processing. In: 2021 IEEE 37th International Conference on Data Engineering (ICDE), pp. 1392–1403. IEEE (2021). https://doi.org/10.1109/ICDE51399.2021.00124
24. Gupta, S., Rahnama, S., Hellings, J., Sadoghi, M.: ResilientDB: Global scale resilient blockchain fabric. Proc. VLDB Endow. 13 (2020)
25. Hellings, J., Sadoghi, M.: ByShard: Sharding in a Byzantine environment. Proc. VLDB Endow. 14 (2021)
26. Herlihy, M., Shavit, N.: On the nature of progress. In: International Conference On Principles Of Distributed Systems, OPODIS 2011, pp. 313–328. Springer, Berlin, Heidelberg (2011). https://doi.org/10.1007/978-3-642-25873-2_22
27. Herlihy, M.P., Wing, J.M.: Linearizability: A correctness condition for concurrent objects. ACM Trans. Program. Lang. Syst., 1990 (1990). https://doi.org/10.1145/78969.78972
28. Kokoris-Kogias, E., Jovanovic, P., Gasser, L., Gailly, N., Syta, E., Ford, B.: Omniledger: A secure, scale-out, decentralized ledger via sharding. In: 2018 IEEE Symposium on Security and Privacy (SP), pp. 583–598. SP ’18, IEEE (2018). https://doi.org/10.1109/SP.2018.000-5
29. Kumar, P., Peri, S., Vidyasankar, K.: A timestamp based multi-version stm algorithm. In: Distributed Computing and Networking, pp. 212–226. Springer Berlin Heidelberg, Berlin, Heidelberg (2014). https://doi.org/10.1007/978-3-642-45249-9_14
30. Kuznetsov, P., Peri, S.: Non-interference and local correctness in transactional memory. Theor. Comput. Sci. 688, 103–116 (2017). https://doi.org/10.1016/j.tcs.2016.06.021
31. Kwon, J.: Tendermint: Consensus without mining. Draft v. 0.6, fall 1(11) (2014)
32. Luu, L., Chu, D.H., Olickel, H., Saxena, P., Hobor, A.: Making smart contracts smarter. In: CCS ’16, pp. 254–269. Association for Computing Machinery, New York, NY, USA (2016). https://doi.org/10.1145/2976749.2978309
33. Luu, L., Narayanan, V., Zheng, C., Baweja, K., Gilbert, S., Saxena, P.: A secure sharding protocol for open blockchains. In: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, pp. 17–30. CCS ’16, ACM, New York, USA (2016). https://doi.org/10.1145/2976749.2978389
34. Luu, L., Teutsch, J., Kulkarni, R., Saxena, P.: Demystifying incentives in the consensus computer. In: Proceedings of the 22Nd ACM SIGSAC Conference on Computer and Communications Security, CCS ’15, pp. 706–719. ACM, New York, NY, USA (2015). https://doi.org/10.1145/2810103.2813659
35. McKeen, F., Alexandrovich, I., Berenzon, A., Rozas, C.V., Shafi, H., Shanbhogue, V., Savagaonkar, U.R.: Innovative instructions and software model for isolated execution. Hasp@ isca 10(1) (2013). https://doi.org/10.1145/2487726.2488368
36. Muzzy, E., Rizvi, S., Sui, D.: Ethereum by the numbers. https://media.consensys.net/ethereum-by-the-numbers-3520f44565a9 (2018). [Online, Accessed 15-09-2020]
37. Nakamoto, S.: Bitcoin: a peer-to-peer electronic cash system. https://bitcoin.org/bitcoin.pdf (2009)
38. Papadimitriou, C.H.: The serializability of concurrent database updates. J. ACM 26 (1979)
39. Peri, S., Singh, A., Somani, A.: Efficient means of achieving composability using object based semantics in transactional memory systems. In: International Conference on Networked Systems, pp. 157–174. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-05529-5_11
40. Saraph, V., Herlihy, M.: An empirical study of speculative concurrency in ethereum smart contracts. In: International Conference on Blockchain Economics, Security and Protocols (Tokenomics 2019), OpenAccess Series in Informatics (OASIcs), vol. 71, pp. 4:1–4:15. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, Dagstuhl, Germany (2020). https://doi.org/10.4230/OASIcs.Tokenomics.2019.4
41. Hyperledger Sawtooth. https://sawtooth.hyperledger.org/. [Online, Accessed 17-04-2022]
42. Sharma, A., Schuhknecht, F.M., Agrawal, D., Dittrich, J.: Blurring the lines between blockchains and database systems: The case of hyperledger fabric. In: Proceedings of the 2019 International Conference on Management of Data, SIGMOD ’19, pp. 105–122. Association for Computing Machinery, New York, NY, USA (2019). https://doi.org/10.1145/3299869.3319883
43. Shukla, N.: Ethereum’s increased gas limit enables network to hit 44 tps. https://eng.ambcrypto.com/ethereums-increased-gas-limit-enables-network-to-hit-44-tps/amp/. [Online, Accessed 17-04-2022]
44. Solidity Documentation. https://solidity.readthedocs.io/. [Online, Accessed 15-09-2020]
45. Stathakopoulou, C., David, T., Vukolic, M.: Mir-bft: High-throughput BFT for blockchains. CoRR abs/1906.05552 (2019). http://arxiv.org/abs/1906.05552
46. Sukhwani, H., Martínez, J.M., Chang, X., Trivedi, K.S., Rindos, A.: Performance modeling of pbft consensus process for permissioned blockchain network (hyperledger fabric). In: 2017 IEEE 36th Symposium on Reliable Distributed Systems (SRDS), pp. 253–255 (2017). https://doi.org/10.1109/SRDS.2017.36
47. Szabo, N.: Formalizing and securing relationships on public networks. First Monday 2(9) (1997). https://doi.org/10.5210/fm.v2i9.548
48. Weikum, G., Vossen, G.: Transactional Information Systems: Theory, Algorithms, and the Practice of Concurrency Control and Recovery. Elsevier (2002). https://doi.org/10.1016/C2009-0-27891-3
49. Zamani, M., Movahedi, M., Raykova, M.: Rapidchain: Scaling blockchain via full sharding. In: Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, pp. 931–948. CCS ’18, ACM, New York, NY, USA (2018). https://doi.org/10.1145/3243734.3243853
50. Zhang, A., Zhang, K.: Enabling concurrency on smart contracts using multiversion ordering. In: Web and Big Data, pp. 425–439. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-96893-3_32
© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022.