Anshuman Kalla
Today, the world of information systems and business processes is experiencing a shift in its status quo. Systems designed and operated in a centralized manner are now being decentralized, either completely or partly. Blockchain technology and, more broadly, distributed ledger technology (DLT) are enabling this paradigm shift.
Blockchain technology came into existence through the inception of the Bitcoin cryptocurrency in 2008. As a matter of fact, it was after the success of Bitcoin that researchers and practitioners around the globe identified blockchain technology as the underlying technology driving Bitcoin. As time passed, technocrats started realizing that the potential of blockchain technology goes beyond the fintech sector. Notably, blockchain can resolve many issues associated with the use of centralized systems in various sectors, such as health care, manufacturing, industrial supply chain, file sharing, identity management, and telecommunications.
For instance, ensuring privacy and tight access control of electronic health-care records (EHRs) in the health-care ecosystem is of utmost importance. In other words, only authorized entities should be enabled to share and access EHRs. Moreover, irrespective of security attacks or accidental faults, the availability of health-care services should be guaranteed to various stakeholders, such as patients, doctors, laboratories, hospitals, and medical institutes. While all such stringent service requirements are improbable with a centralized health-care management system, blockchain-enabled decentralized and secure health care has the potential to fulfill them efficiently. Before we dive into DLT and blockchain technology, perhaps it would be interesting to look at the benefits and, more importantly, the issues associated with centralized information systems.
Traditionally, the design and the modus operandi of information systems have been centralized in nature—the reason being that centralized systems offer numerous advantages, such as ease of monitoring and maintenance, higher controllability, increased protection from physical attacks, and no need for multiple infrastructure facilities since all of the resources and database(s) are housed at a predefined and fixed location. Moreover, centralized systems have matured over the years, and well-established legal frameworks support them. Figure 1 depicts advantages and disadvantages of a centralized system.
Fig 1 The centralized system–pros and cons.
Nevertheless, the recent past has revealed many issues that underlie a centralized system. For instance, a centralized system may invoke numerous third-party services to perform its functions. Each of these third-party intermediaries performs its dedicated set of tasks, which increases the overall delay and the cost that users have to bear. Moreover, users are generally unaware of the involved third-party intermediaries. Another critical issue with centralized systems is the single point of failure. Given the intrinsic centrality of such systems, any accidental or attack-driven failure can lead to a complete breakdown or compromised services. Some of the security-related issues with centralized information systems are as follows:
DLT aims at secure storage of a distributed ledger and decentralized governance. A ledger is a book of records (i.e., a database), which, if built and maintained by a digital system, is called a digital ledger. Organizations and businesses that provide services involving (or incurring) economic values and perform transactions with digital assets need to maintain a digital ledger. When a digital ledger is shared with multiple entities (also known as nodes) in a system, it is referred to as a shared ledger. A distributed ledger is a type of shared ledger. In the distributed ledger paradigm, a digital ledger is shared with multiple nodes such that each node stores an identical copy of the digital ledger. Nodes are connected to form a network of nodes. The peculiarities of such a network of nodes are as follows: 1) any node in the network can add new transactions to the distributed ledger, and 2) the network is expected to have malicious or compromised nodes.
To ensure the security of a distributed ledger in such a network, DLT makes extensive use of cryptographic techniques. If a new transaction added by a node is legitimate, all of the other nodes in the network must also update their local copies of the ledger with the same transaction. This process ensures global synchronization of the distributed ledger and decentralized governance. On the other hand, if the new transaction is illegitimate, DLT needs a mechanism that prevents it from being recorded at any of the other nodes. This mechanism is called a consensus mechanism; using it, the nodes reach an agreement before adding a new transaction to the distributed ledger. Once a transaction is written to a distributed ledger, it is cryptographically sealed, which means that committed transactions cannot be deleted or modified. Therefore, a distributed ledger is immutable (or tamper resistant), and its size keeps growing with time. Figure 2 shows the fundamental components of DLT and its types.
Fig 2 DLT components and types.
The last decade has witnessed significant growth in the applicability of blockchain technology in various domains. Despite its popularity, there is no single standardized definition of blockchain to date. Therefore, this section aims to answer the fundamental question: What is blockchain? To develop a firm understanding, in what follows, we explain blockchain from three perspectives (see Fig. 3).
Fig 3 Viewing blockchain from three perspectives.
First, blockchain is the most popular type of DLT, and it has received widespread attention from industry and academia. At the core of blockchain lies a distributed ledger that is shared and synchronized with all of the nodes in the system. It is important to note that, although the terms DLT and blockchain are sometimes used interchangeably, they are not the same. DLT is a broad class, and blockchain is one prominent member of this class. Examples of other members are directed acyclic graphs (DAGs), hashgraph, holochain, and tangle (see Fig. 2).
Second, from the technological perspective, blockchain is a unique, powerful, and profitable amalgamation of underlying principles and technologies. It makes extensive use of cryptographic concepts and principles, such as asymmetric cryptography, hashing, digital signatures, and Merkle trees. Peer-to-peer (P2P) technology is used to connect nodes to create a blockchain network. Moreover, blockchain is governed by a consensus mechanism that allows nodes in the blockchain network to agree on the current state of the distributed ledger. In some use cases, such as the Bitcoin cryptocurrency, economic models are used to incentivize the participating nodes, thereby encouraging maximum decentralization. In essence, blockchain is a technology that is driven by a gamut of underlying technologies.
Third, from the data structure viewpoint, blockchain describes how transactions are grouped and how the overall database is structured. Blockchain follows a specific data structure to create and update the distributed ledger. A set of transactions appearing in a given time window is validated and grouped into a data unit called a block. Each block contains a finite number of legitimate transactions. Every block is timestamped, and its cryptographic hash value is computed. A newly created block is logically chained to the most recent block in the ledger. The chaining happens by inserting the hash value of the most recent block’s header into the newly created block (explained in detail in the next section). As the blocks are created, they are chronologically and cryptographically connected. In essence, the data structure used for the ledger is a chain of blocks, hence the name blockchain.
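The chaining idea can be illustrated with a minimal Python sketch. It is not Bitcoin's actual block format: the `Block` class, the JSON encoding, and the helper names `append_block` and `is_chain_intact` are assumptions made for illustration, and the hash here covers the whole block rather than a separate header.

```python
import hashlib
import json
import time
from dataclasses import dataclass, field


@dataclass
class Block:
    prev_hash: str            # hash of the previous block (hypothetical layout)
    transactions: list        # finite list of validated transactions
    timestamp: float = field(default_factory=time.time)

    def header_hash(self) -> str:
        # Hash a canonical encoding of the block's contents (illustrative only).
        payload = json.dumps(
            {"prev": self.prev_hash, "txs": self.transactions, "ts": self.timestamp},
            sort_keys=True,
        ).encode()
        return hashlib.sha256(payload).hexdigest()


def append_block(chain: list, transactions: list) -> Block:
    # Chain the new block to the most recent one by storing that block's hash.
    prev_hash = chain[-1].header_hash() if chain else "0" * 64
    block = Block(prev_hash=prev_hash, transactions=transactions)
    chain.append(block)
    return block


def is_chain_intact(chain: list) -> bool:
    # Every block must reference the current hash of the block before it.
    return all(
        chain[i].prev_hash == chain[i - 1].header_hash() for i in range(1, len(chain))
    )


chain = []
append_block(chain, ["Alice pays Bob 1"])
append_block(chain, ["Bob pays Carol 2"])
append_block(chain, ["Carol pays Dave 3"])
intact_before = is_chain_intact(chain)

chain[1].transactions[0] = "Bob pays Mallory 2"   # tamper with an old block
intact_after = is_chain_intact(chain)             # the link from block 2 no longer matches
```

Because each block stores the hash of its predecessor, changing any earlier block breaks the link held by the block that follows it, which is what makes tampering detectable.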
Now that we know blockchain follows a typical data structure to build a distributed ledger, it is time to ponder the following questions:
Note that the pertinence of these questions, their precise answers, and the actual block mining process depend on numerous factors, such as the type of blockchain and the consensus mechanism used. For a first-hand understanding of the process of creating, mining, disseminating, verifying, and adding a new block, we consider the Bitcoin blockchain, which uses the proof-of-work (PoW) consensus mechanism. A typical block structure used for the Bitcoin blockchain is shown in Fig. 4. A block comprises two parts: the header and the body. The block header contains a set of fields, whereas the body consists of transactions and optionally smart contract(s).
Fig 4 The block structure. B represents size of the field in bytes.
Users of a blockchain-based system are connected to a blockchain network, which comprises nodes connected in a P2P fashion. Depending on the type of blockchain, the platform used, and the specific use case, a node may refer to a miner, validator, orderer, or relay node. To explain the mining process, in this section the term node refers to a miner, and the two terms are used interchangeably. As the name suggests, the job of a miner is to mine a new block. When mining is computationally intensive, as is typically the case with the PoW consensus mechanism, miners are equipped with high-end computing and storage facilities.
The entire process of block mining is roughly divided into six steps, as depicted in Fig. 5. These are explained in the following sections.
Fig 5 Various steps for creating, mining, verifying, and appending a new block.
Let us assume that a user Alice wants to send x bitcoin to a user Bob. Step 1 is to create a transaction that reflects Alice’s intention. Alice performs this by using an appropriate wallet and logging in to her account. Blockchain employs public key cryptography to assign each user a pair of public and private keys. An account identifier (or number) is usually derived from a user’s public key. To send x bitcoin to Bob, Alice selects Bob’s account and enters the value x to transfer. Alice triggers the transaction by clicking the pay or finish button, and the transaction is digitally signed using her private key.
In this step, the transaction created in step 1 is broadcast to all of the nodes in the blockchain P2P network. To do so, Alice’s wallet sends the transaction to the nodes to which it is directly connected. At any given time, a large number of nodes can participate in a blockchain network. However, any given user is directly connected to very few nodes.
Every node receiving the transaction performs a set of tasks. First, it validates the sender (i.e., Alice) by checking her digital signature. Second, it confirms the existence of the recipient (i.e., Bob) by looking up his address. Third, it verifies the feasibility of the transaction by ensuring that Alice holds at least x bitcoin. Fourth, it adds the verified transaction to the pool of unconfirmed transactions. The transaction pool or memory pool (mempool) at each node holds all of the unconfirmed transactions that occur in a given time window. Finally, each node sends the transaction to all of the other nodes to which it is connected. Eventually, the transaction propagates through the network, and all of the nodes receive it.
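The validate-then-relay behavior described above can be sketched as a simple flooding simulation. Everything here is a simplifying assumption: the `Node` class, the dictionary-based transaction, and the `signature_ok` flag that stands in for real digital signature verification are hypothetical, and balances are modeled as a plain lookup table rather than as Bitcoin's actual bookkeeping.

```python
from collections import deque


class Node:
    def __init__(self, name, balances):
        self.name = name
        self.peers = []                 # directly connected nodes
        self.balances = dict(balances)  # this node's local view of balances
        self.mempool = {}               # unconfirmed transactions, keyed by id

    def is_valid(self, tx):
        # Signature verification is stubbed out as a boolean flag (assumption).
        return (
            tx["signature_ok"]
            and tx["recipient"] in self.balances
            and self.balances.get(tx["sender"], 0) >= tx["amount"]
        )


def broadcast(origin, tx):
    """Flood tx starting at origin; return the names of nodes that accepted it."""
    queue = deque([origin])
    accepted = []
    while queue:
        node = queue.popleft()
        if tx["id"] in node.mempool or not node.is_valid(tx):
            continue                    # already seen, or rejected by this node
        node.mempool[tx["id"]] = tx     # add to the pool of unconfirmed transactions
        accepted.append(node.name)
        queue.extend(node.peers)        # relay to directly connected nodes
    return accepted


balances = {"alice": 5, "bob": 0}
nodes = [Node(f"n{i}", balances) for i in range(5)]
for a, b in zip(nodes, nodes[1:]):      # line topology: n0 - n1 - n2 - n3 - n4
    a.peers.append(b)
    b.peers.append(a)

good = {"id": "t1", "sender": "alice", "recipient": "bob", "amount": 3, "signature_ok": True}
bad = {"id": "t2", "sender": "alice", "recipient": "bob", "amount": 9, "signature_ok": True}

reached_good = broadcast(nodes[0], good)   # relayed hop by hop to every node
reached_bad = broadcast(nodes[0], bad)     # rejected at the first node: insufficient funds
```

Even though the wallet hands the transaction to a single node in this toy topology, hop-by-hop relaying delivers it to every mempool, while an infeasible transaction is dropped at the first node that checks it.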
The next step is the mining of the new block. Mining is a process in which miners compete to create a new block. The mining process starts with the active miners independently selecting a set of unconfirmed transactions from their respective mempools. The selected transactions form the body of a new block (under creation). Moreover, these transactions are hashed to create a Merkle tree.
In simple words, a Merkle tree is an inverted tree of cryptographic hash values. The leaf nodes of the Merkle tree are labeled with the hash values of the selected transactions. To compute a label for a node at any higher level, the hash values of the child nodes are concatenated and again hashed. This process of concatenating hash values of child nodes and computing a new hash continues until we get one hash value at the root level, called the Merkle root hash. The Merkle root hash forms one of the fields in the block header (see Fig. 4). The advantage of using the Merkle tree is that it provides a compact and secure representation of all of the selected transactions.
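A minimal sketch of the Merkle root computation follows. It assumes Bitcoin-style conventions (double SHA-256 and pairing an odd node with a copy of itself) but treats transactions as raw byte strings and ignores Bitcoin's serialization and byte-order details; the function names are illustrative.

```python
import hashlib


def sha256d(data: bytes) -> bytes:
    # Double SHA-256, the hash function Bitcoin applies to transactions and headers.
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(transactions: list) -> bytes:
    if not transactions:
        raise ValueError("a block needs at least one transaction")
    level = [sha256d(tx) for tx in transactions]      # leaf labels
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])                   # pair the odd node with itself
        level = [
            sha256d(level[i] + level[i + 1])          # concatenate children, hash again
            for i in range(0, len(level), 2)
        ]
    return level[0]


txs = [b"alice->bob:1", b"bob->carol:2", b"carol->dave:3"]
root = merkle_root(txs)
tampered_root = merkle_root([b"alice->bob:100", b"bob->carol:2", b"carol->dave:3"])
```

The root is always a single 32-byte value regardless of how many transactions are selected, and changing any one transaction yields a different root, which is why one header field can commit to the whole block body.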
Next, the hash value of the most recent block in the ledger is inserted in the previous block hash field of the new block under creation. The timestamp field is provided with the Unix epoch time when the mining starts.
Assuming the PoW consensus algorithm, the competitive challenge for miners is to solve a computationally intensive puzzle. The aim of this puzzle is to find a hash for the new block’s header such that the hash value is smaller than a given value, called the target. Interestingly, the value of the target is dynamically determined from the current difficulty. The difficulty is the relative level of computation required to solve the puzzle to mine a new block. A difficulty equal to one indicates the level of computation needed to mine the first block (known as the genesis block). In that sense, a difficulty of five means that the computation required to mine a new block is five times that of the genesis block.
The higher the difficulty level, the smaller the target value. A smaller target value means a higher number of zeros for leading bits in the target value. Furthermore, the nbits field holds a compact representation of the target value used for mining a block.
In summary, the current difficulty level determines the target value, which determines the value for the nbits field. The difficulty level is adjusted according to the active number of miners and the block generation rate. The adjustment keeps the block generation time almost the same irrespective of the change in the collective mining power (known as the hash rate) of the blockchain network.
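The relationship among difficulty, target, and nbits can be sketched as follows. The sketch assumes Bitcoin's compact encoding, in which the top byte of nbits is a byte-length exponent and the lower three bytes are a mantissa, and it takes the genesis block's nbits value 0x1d00ffff as the difficulty-one reference. The function names are illustrative, and the sign bit of the compact format is ignored.

```python
MAX_TARGET = 0xFFFF << 208              # target at difficulty 1 (nbits 0x1d00ffff)


def nbits_to_target(nbits: int) -> int:
    exponent = nbits >> 24              # length of the target in bytes
    mantissa = nbits & 0x007FFFFF       # three most significant bytes of the target
    if exponent <= 3:
        return mantissa >> (8 * (3 - exponent))
    return mantissa << (8 * (exponent - 3))


def target_to_difficulty(target: int) -> float:
    # Difficulty is the ratio of the difficulty-1 target to the current target.
    return MAX_TARGET / target


def difficulty_to_target(difficulty: float) -> int:
    # A higher difficulty gives a proportionally smaller target.
    return int(MAX_TARGET / difficulty)


genesis_target = nbits_to_target(0x1D00FFFF)
target_at_5 = difficulty_to_target(5)
```

Here `genesis_target` equals `MAX_TARGET`, and `target_at_5` is one fifth of it, so a valid block hash at difficulty five must fall in a range five times narrower.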
To achieve a block hash value smaller than the target value, miners vary the nonce field. That is, a miner tries a number in the nonce field and computes the hash value of the block’s header. If the computed hash value is smaller than the target, the mining process is finished. Otherwise, the process continues with a different number assigned to the nonce field. The process is computationally intensive because it uses brute force to search for a nonce value that produces the desired block hash. The miner that solves this puzzle earliest is the winning miner.
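The brute-force search can be sketched in a few lines. This is a toy under stated assumptions: the header contains only three of the fields discussed (previous block hash, Merkle root, timestamp) plus the nonce, nonces are tried sequentially rather than at random so the run is repeatable, the digest is read as a big-endian integer, and the target is made artificially easy so the loop ends quickly.

```python
import hashlib
import struct


def header_hash(prev_hash: bytes, merkle_root: bytes, timestamp: int, nonce: int) -> int:
    # Simplified header: real Bitcoin headers also carry version and nbits fields.
    header = prev_hash + merkle_root + struct.pack("<II", timestamp, nonce)
    digest = hashlib.sha256(hashlib.sha256(header).digest()).digest()
    return int.from_bytes(digest, "big")


def mine(prev_hash: bytes, merkle_root: bytes, timestamp: int, target: int) -> int:
    nonce = 0
    while header_hash(prev_hash, merkle_root, timestamp, nonce) >= target:
        nonce += 1                      # hash too large: try the next candidate
    return nonce


easy_target = 1 << 244                  # requires about 12 leading zero bits
prev_hash = bytes(32)                   # placeholder previous block hash
merkle_root = hashlib.sha256(b"demo transactions").digest()
nonce = mine(prev_hash, merkle_root, 1_700_000_000, easy_target)
```

With this target, roughly one candidate in 4,096 succeeds on average. Each additional leading zero bit in the target doubles the expected number of attempts, which is how a smaller target translates into more computation.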
It is worth noting that heavy computation and huge energy consumption are specific requirements of the Bitcoin blockchain because it uses the PoW consensus mechanism. Examples of some of the lightweight consensus algorithms that are energy efficient are proof of stake (PoS), proof of authority, and practical Byzantine fault tolerance (PBFT).
This step is about broadcasting the newly mined block to all of the other nodes in the network. Every node performs a set of validations upon receiving a new block. These include validating the transactions in the block; checking the correctness of the header fields, such as the previous block hash, nbits, and Merkle root hash; and recomputing the block header’s hash value with the provided nonce value (to ensure that it meets the current difficulty level).
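A receiving node's header checks can be sketched as below. The dictionary-based block, the reduced header (previous block hash, Merkle root, and nonce only), and the function `validate_block` are assumptions for illustration; validation of the individual transactions and of the nbits field is omitted.

```python
import hashlib


def sha256d(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(txs):
    level = [sha256d(tx) for tx in txs]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])     # pair the odd node with itself
        level = [sha256d(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def header_bytes(block):
    return block["prev_hash"] + block["merkle_root"] + block["nonce"].to_bytes(4, "little")


def validate_block(block, tip_hash, target):
    """Return the list of failed checks; an empty list means the block is accepted."""
    failures = []
    if block["prev_hash"] != tip_hash:
        failures.append("previous block hash")
    if not block["txs"] or block["merkle_root"] != merkle_root(block["txs"]):
        failures.append("Merkle root hash")
    if int.from_bytes(sha256d(header_bytes(block)), "big") >= target:
        failures.append("proof of work")
    return failures


tip_hash = bytes(32)                    # hash of the receiver's most recent block
target = 1 << 248                       # easy target: about 8 leading zero bits
txs = [b"alice->bob:1", b"bob->carol:2"]
block = {"prev_hash": tip_hash, "merkle_root": merkle_root(txs), "txs": txs, "nonce": 0}
while int.from_bytes(sha256d(header_bytes(block)), "big") >= target:
    block["nonce"] += 1                 # stand-in for the miner's search

ok = validate_block(block, tip_hash, target)
forged = dict(block, txs=[b"alice->mallory:1", b"bob->carol:2"])
rejected = validate_block(forged, tip_hash, target)
```

Verification costs a handful of hash computations, whereas finding the nonce required a search. The forged block reuses a valid header, so it passes the proof-of-work check but fails the Merkle root comparison.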
Once all (or at least the majority of) the nodes validate and append the newly received block to their local copy of the ledger, the blockchain is said to have reached a consensus. At this point, the transactions in the newly appended block are declared to be confirmed.
Logically, the last step is minting new bitcoins and disbursing rewards to the winning miner. As soon as a new block is committed, the blockchain system experiences a supply of new bitcoins. These coins and the transaction fees (received from the users of the confirmed transactions in the new block) get credited to the account of the winning miner. Currently (i.e., in 2022), the Bitcoin system creates 6.25 new bitcoins when a new block is mined.
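The 6.25-bitcoin figure follows from Bitcoin's halving schedule: the reward started at 50 bitcoins and halves every 210,000 blocks. The sketch below works in satoshis (one bitcoin is 100,000,000 satoshis); the function names and the example block height are illustrative assumptions.

```python
HALVING_INTERVAL = 210_000              # blocks between halvings
INITIAL_SUBSIDY = 50 * 100_000_000      # 50 bitcoins, expressed in satoshis


def block_subsidy(height: int) -> int:
    halvings = height // HALVING_INTERVAL
    if halvings >= 64:
        return 0                        # the subsidy has been halved away entirely
    return INITIAL_SUBSIDY >> halvings  # integer halving per completed interval


def miner_reward(height: int, fees: int) -> int:
    # The winning miner receives the newly minted coins plus the transaction fees.
    return block_subsidy(height) + fees


subsidy_2022 = block_subsidy(740_000)   # a block height reached during 2022
```

At height 740,000, three halvings have occurred, so the subsidy is 50 / 2^3 = 6.25 bitcoins, matching the value quoted above.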
Although this explanation is based on the Bitcoin blockchain, the stepwise illustration gives a decent understanding of how transactions are generally processed and block mining happens. Today, there exist several consensus algorithms other than PoW, such as PoS, delegated PoS (dPoS), proof of activity, proof of elapsed time, PBFT, federated Byzantine fault tolerance, stellar consensus protocol, Reliable, Replicated, Redundant, and Fault-Tolerant (RAFT), and ripple.
Moreover, various classifications have been proposed to categorize these consensus algorithms. One such classification divides them into proof-based and voting-based consensus algorithms. Proof-based consensus is (generally) used when a large number of nodes participate in the blockchain P2P network, and one of the nodes has to prove itself worthy of adding a new block. Such consensus algorithms are better suited for public blockchains; examples are PoW, PoS, dPoS, and proof of activity.
Voting-based consensus requires nodes to communicate to cast their votes (i.e., opinions about a proposed new block and transactions within it), and then a majority of nodes should collectively decide to add the new block. Such a consensus is viable when a limited number of nodes participate. It is therefore suitable for private or consortium types of blockchains. Examples of voting-based algorithms are PBFT, RAFT, and ripple.
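The core decision rule of voting-based consensus can be sketched as a strict-majority count. This is only the final tally: the function name and the vote dictionary are hypothetical, and practical protocols add multiple communication rounds and stricter quorums (PBFT, for example, requires agreement from more than two-thirds of the nodes).

```python
def majority_accepts(votes: dict) -> bool:
    """votes maps a node name to True (approve) or False (reject)."""
    approvals = sum(1 for approved in votes.values() if approved)
    return approvals > len(votes) / 2   # strict majority of all participating nodes


votes = {"n1": True, "n2": True, "n3": False, "n4": True}
decision = majority_accepts(votes)
tie = majority_accepts({"n1": True, "n2": False})
```

Because every node's vote must be collected, the communication cost grows with the number of participants, which is one reason this style of consensus suits networks with a limited number of nodes.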
Furthermore, it is a myth that all of the consensus algorithms in blockchain are computation intensive and energy hungry. Authentication-based consensus is yet another class of consensus algorithms; these are lightweight and suitable for resource-constrained devices, such as those in the Internet of Things. PUFchain and AEchain are examples of this type of consensus algorithm. (For more details, check the “Read More About It” list at the end.)
In this section, some of the questions that will be answered are as follows: What are the different types of blockchain? What are the distinctive characteristics of different blockchains? Blockchains have been classified in different ways depending on numerous factors, such as read, write, and commit permission settings; server hosting; conditions to join a blockchain network; and storing data (on or off the blockchain).
Blockchains are classified into four basic types: public, private, permissionless, and permissioned. These are distinguished based on the permissions to perform read, write, and commit operations as well as the type of servers used. An entity with read permission can see all of the transactions stored in the distributed ledger. Write permission allows an entity to create and broadcast transactions to the other nodes in the blockchain network. The commit right allows a node to append a new block and confirm all of the transactions within that block. Commit and write are often used interchangeably because the write operation includes the sense of commit.
In a public blockchain, any entity can join the network and perform read operations. Moreover, public servers are usually used for hosting public blockchains. In contrast, entities in private blockchains need permission to join and perform read operations. Accordingly, private servers are used to host private blockchains and to enforce restrictions. An entity could be a user or a miner willing to join a network. In essence, the read permission and the type of server hosting govern the difference between public and private blockchains.
The distinction between permissionless and permissioned blockchains is based on the write and commit permissions. Permissionless blockchain denotes a setting where all of the participating entities (by default) can perform write and commit operations. However, write and commit permissions are granted to only a few of the total nodes participating in a permissioned blockchain.
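The two distinctions can be captured as two independent flags. The sketch below is a hypothetical encoding (the `Permissions` class and `classify` function are not from any blockchain platform): one flag records whether joining and reading are open to anyone, and the other whether every participant may write and commit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Permissions:
    open_read: bool      # anyone may join the network and read the ledger
    open_write: bool     # every participant may write and commit


def classify(p: Permissions) -> str:
    access = "public" if p.open_read else "private"
    authority = "permissionless" if p.open_write else "permissioned"
    return f"{access} {authority}"


label = classify(Permissions(open_read=True, open_write=False))
```

Since the two flags vary independently, they yield four possible combinations of the basic types.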
Blockchains can be further classified by combining these four basic types of blockchain. These types are depicted in Fig. 6 and discussed as follows:
Fig 6 Types of blockchains and their characteristics.
Blockchain has emerged as a promising technology. It is envisioned to lead the future world of decentralization and disintermediation. Its salient features are immutability, disintermediation of third parties, enhanced security with nonrepudiation, transparency with pseudonymity, provenance, and auditability (Fig. 7). In recent years, blockchain has been evolving toward an extensive ecosystem. Some of the exciting roles offered by such an ecosystem are the blockchain infrastructure provider, platform provider (i.e., Blockchain as a service), decentralized application provider, and mining solutions like cloud mining and mining as a service. Continuous growth in the blockchain landscape has proven to be beneficial for many existing use cases and applications, such as smart health care, education, public governance, and banking. Figure 7 provides a panoramic view of the existing real-life use cases and futuristic intriguing applications.
Fig 7 Blockchain- and smart contract-enabled existing and futuristic use cases and challenges. AI: artificial intelligence; BaaS: blockchain as a service; ID: identification; MaaS: mining as a service.
Nevertheless, as with any other evolving technology, blockchain also suffers from various issues. These include low scalability in terms of throughput (transactions per unit time) and storage capacity; the possibility of privacy leakage; the nonexistence of legal frameworks and standard bodies; interoperability among disparate blockchains; and vulnerability to security attacks targeting different aspects, like the consensus mechanism, P2P network, and smart contracts (especially given the rise of quantum computing). Use case-specific efforts are required to effectively mitigate these challenges to leverage the maximum potential of blockchain technology.
• S. Nakamoto, “Bitcoin: A peer-to-peer electronic cash system,” Decentralized Bus. Rev., p. 21,260, Oct. 2008. [Online]. Available: https://assets.pubpub.org/d8wct41f/31611263538139.pdf
• G. Hileman and M. Rauchs, “Global cryptocurrency benchmarking study,” Cambridge Centre Alternative Finance, vol. 33, pp. 33–113, Sep. 2017.
• B. Carson, G. Romanelli, P. Walsh, and A. Zhumaev, “Blockchain beyond the hype: What is the strategic business value,” McKinsey Company, vol. 1, pp. 1–13, Jun. 2018. [Online]. Available: https://www.mckinsey.com/~/media/McKinsey/Business%20Functions/McKinsey%20Digital/Our%20Insights/Blockchain%20beyond%20the%20hype%20What%20is%20the%20strategic%20business%20value/Blockchain-beyond-the-hype-What-is-the-strategic-business-value.pdf
• T. Riasanow, F. Burckhardt, D. Soto Setzke, M. Böhm, and H. Krcmar, “The generic blockchain ecosystem and its strategic implications,” 2018. [Online]. Available: https://www.researchgate.net/profile/Tobias-Riasanow/publication/325677489_The_Generic_Blockchain_Ecosystem_and_its_Strategic_Implications/links/5b1d66350f7e9b68b42bfd60/The-Generic-Blockchain-Ecosystem-and-its-Strategic-Implications.pdf
• H. Vranken, “Sustainability of bitcoin and blockchains,” Current Opinion Environmental Sustainability, vol. 28, pp. 1–9, Oct. 2017, doi: 10.1016/j.cosust.2017.04.011.
• N. E. Ioini and C. Pahl, “A review of distributed ledger technologies,” in Proc. OTM Confederated Int. Conf. ‘Move Meaningful Internet Syst.’, Cham: Springer, 2018, pp. 277–288.
• A. Verma et al., “Blockchain for Industry 5.0: Vision, opportunities, key enablers, and future directions,” IEEE Access, vol. 10, pp. 69,160–69,199, Jun. 2022, doi: 10.1109/ACCESS.2022.3186892.
• A. Upadhyay, S. Mukhuty, V. Kumar, and Y. Kazancoglu, “Blockchain technology and the circular economy: Implications for sustainability and social responsibility,” J. Cleaner Prod., vol. 293, Apr. 2021, Art. no. 126130, doi: 10.1016/j.jclepro.2021.126130.
• G. d S. R. Rocha, L. de Oliveira, and E. Talamini, “Blockchain applications in agribusiness: A systematic review,” Future Internet, vol. 13, no. 4, Apr. 2021, Art. no. 95, doi: 10.3390/fi13040095.
• A. Kalla, C. De Alwis, P. Porambage, G. Gür, and M. Liyanage, “A survey on the use of blockchain for future 6G: Technical aspects, use cases, challenges and research directions,” J. Ind. Inf. Integr., vol. 30, Nov. 2022, Art. no. 100404, doi: 10.1016/j.jii.2022.100404.
• S. P. Mohanty, V. P. Yanambaka, E. Kougianos, and D. Puthal, “PUFchain: A hardware-assisted blockchain for sustainable simultaneous device and data security in the internet of everything (IoE),” IEEE Consum. Electron. Mag., vol. 9, no. 2, pp. 8–16, Mar. 2020, doi: 10.1109/MCE.2019.2953758.
• S. Khan, W.-K. Lee, and S. O. Hwang, “AEchain: A lightweight blockchain for IoT applications,” IEEE Consum. Electron. Mag., vol. 11, no. 2, pp. 64–76, Mar. 2022, doi: 10.1109/MCE.2021.3060373.
Anshuman Kalla (anshuman.kalla@ieee.org) is a professor in the Department of Computer Engineering, Chhotubhai Gopalbhai Patel Institute of Technology, Uka Tarsadia University, Gujarat 394350, India. He is a Senior Member of IEEE. For more information, please visit https://sites.google.com/site/kallanshuman/.
Digital Object Identifier 10.1109/MPOT.2023.3246230