Danielle Mallare, Nachiket Dani, Tamara Bonaci
Today, most of us have more accounts with corresponding usernames and passwords than we can count. Online banking, shopping, streaming, and social media, among other things, require the user to authenticate, or prove, their identity to utilize the provided service. Keeping track of a multitude of accounts becomes burdensome, and many of us turn toward the usage of weak or recycled credentials that pose a security risk. On the other hand, to avoid said security risks, many of us trade our privacy for convenience and use social identities to log in, giving the companies that manage these identities the ability to track our behavior.
Whether online or in person, we are often required to prove claims about ourselves and end up in a cumbersome quest to find the proof needed, such as when one needs to prove residency of a town and must produce various forms of evidence showcasing that the town is, in fact, their legal place of residence. We oftentimes end up revealing more personal information than necessary, such as showing our driver’s license to prove we are of a certain age or uploading a student ID to demonstrate student status.
A self-sovereign identity (SSI) solution—a digital identity that is owned and controlled by an individual, is not tied to any central authority, and allows an individual to easily prove claims about themselves—has the potential to solve many of these issues. This article explores decentralized identifiers (DIDs) and their use with distributed ledger technology, cryptography, and verifiable credentials to create a modern SSI that is both secure and privacy preserving. We share examples of DID methods and discuss their uses for authentication and proving claims about ourselves.
The need to prove our identity is ubiquitous. We find ourselves authenticating numerous times a day, whether it be to log in to an account to pay a bill, to view our bank balance, or even to view an article on a website. With so many accounts to keep track of, many of us turn to using weak or recycled credentials, putting our accounts and the information contained within those accounts at risk. On the other hand, many of us use social identities, trading our privacy for a more secure and convenient solution and giving the entity controlling these accounts the ability to track our behavior or even revoke the identity at any time. We also frequently associate our identity with various credentials and facts about ourselves. For example, we may be required to prove residency in a certain town to receive benefits or prove to a future employer that we have earned a degree. When proving facts about ourselves, we often end up on a cumbersome hunt for evidence, revealing additional—and unnecessary—information to the verifier. As a case in point, to prove residency in a specific town, the prover needs to assemble a stack of documents including utility bills and present their driver’s license as proof. The driver’s license alone has more than just a person’s legal address; it contains their height, eye color, and other facts unrelated to the need to prove they reside in a specific town.
DIDs, according to Avellaneda et al. (2019), provide a mechanism for identity management that is as convenient as logging in with a social identity or presenting a physical credential, without having to sacrifice privacy and security. They leverage distributed ledger technology and cryptographic primitives to offer a solution for implementing an SSI, giving the individual control over what they share and, most importantly, control and ownership of their own identity.
In contrast to social identities, such as the social media accounts that many of us use today, an SSI is a digital identity that an individual has full control of. It is independent of any organization or centralized authority. The individual dictates all elements of the identity, including what information is divulged and with whom it can be shared. One crucial benefit of SSIs, according to the Sovrin Foundation (2018), is that they grant individuals “the same freedoms and personal autonomy to the Internet in a safe and trustworthy system of identity management.”
The following are considered the main technologies used to implement an SSI:
Each of these plays a role in ensuring that individuals maintain full ownership over their identity and can choose what, to whom, and when to disclose any part of that identity:
DIDs, as defined by Allen et al. (2021), are globally unique identifiers that are “designed to enable individuals and organizations to generate their own identifiers using systems they trust.” Each entity (person or organization) can create as many DIDs as they want, creating separate personas for various purposes, such as one for online forums or another for authentication into an online utilities account. DIDs allow us to define ourselves within different contexts and limit our disclosure of personal information to what is necessary for the context at hand. For example, an account with an online forum does not need to know our full name and date of birth. Perhaps the only requirement to post on the forum is that we are at least 18 years of age. One could create a DID simply for use when logging into the online forum and associate with this DID a verifiable presentation containing a verifiable credential with the proof that we are at least 18 years of age.
Some of the main design goals for DIDs, as stated by the World Wide Web Consortium (W3C) (Allen et al., 2021), are as follows:
Distributed ledger technology (DLT) is a main component of DIDs. With DLTs, data are replicated across multiple computing nodes. Typically, DLTs use consensus protocols to ensure that a single source of truth is maintained. There is no central authority responsible for the maintenance of data and, consequently, no single point of failure. Blockchains are one type of DLT. In a blockchain, records are linked to one another: each record maintains a reference to the hash of the previous record, so altering an earlier record invalidates every later reference. Many blockchains, including Bitcoin, use a proof-of-work mechanism as their consensus algorithm to uphold data integrity and maintain consistency across all participating nodes. Since DLTs give us the ability to store and maintain information without the need for a single controlling entity, they are a main pillar for use with DIDs in implementing an SSI solution.
DIDs themselves are “simple text strings consisting of three parts” (Allen et al., 2021), as shown in Fig. 1.
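The hash linking described above can be illustrated with a minimal sketch. The record layout (`index`, `data`, `prev_hash`) and the helper names are assumptions made for illustration only; consensus, replication across nodes, and proof of work are deliberately left out.

```python
import hashlib
import json


def record_hash(record: dict) -> str:
    """Hash a record deterministically by serializing it with sorted keys."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_record(chain: list, data: str) -> None:
    """Append a record that references the hash of the previous record."""
    prev_hash = record_hash(chain[-1]) if chain else "0" * 64
    chain.append({"index": len(chain), "data": data, "prev_hash": prev_hash})


def is_consistent(chain: list) -> bool:
    """Check that every record still points at the hash of its predecessor."""
    return all(
        chain[i]["prev_hash"] == record_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


# Hypothetical ledger entries, used only to exercise the helpers.
chain = []
for entry in ("did created", "key rotated", "did deactivated"):
    append_record(chain, entry)
```

Changing the `data` field of any earlier record changes its hash, so the reference stored in the following record no longer matches and `is_consistent` returns `False`.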
Fig 1 A simple example of a DID from the W3C’s August 2021 proposed recommendation (Allen et al., 2021).
The scheme identifier for a DID is always the string “did.” The method-specific identifier is generated as prescribed by the DID method. We will discuss DID methods and method-specific identifiers in subsequent sections.
A DID document, as defined by the W3C, is a “set of data describing the DID subject, including mechanisms, such as cryptographic public keys, that the DID subject can use to authenticate itself and prove its association with a DID” (Allen et al., 2021). All DIDs must resolve to a DID document through a process called DID resolution. All DID implementations must define a resolve function that takes in a DID as input and outputs a DID document and optional metadata pertaining to the document. The resolve function makes use of the DID method’s read operation. DID documents themselves can be serialized into a JavaScript Object Notation (JSON) representation, for example, that can be parsed by applications seeking to utilize the document for identification purposes.
Two important concepts with regard to DIDs are DID subjects and their controllers. A DID subject is the entity, whether it be an individual or an organization, that the DID is serving as an identifier for. A DID controller is the entity that has permission to update and make changes to a DID document. In many cases, the DID subject and controller are the same entity. One instance where this may not be the case is when a DID subject is a minor. In this case, the DID controller may be the minor’s legal guardian. The DID subject is specified in a DID document with the “id” property. The DID controller for the DID document is given in the document’s “controller” property.
There are many organizations creating their own DID methods. DID methods are, essentially, definitions of DID implementations that adhere to the specifications for a DID as set forth by the W3C. A DID method must define a method name that is utilized within the DID string itself to identify the method. A method must also prescribe how to create, update, read, and delete a DID. The “id” component of the DID must be unique across all DIDs within the method, and methods must also be designed to ensure interoperability among various implementations (Allen et al., 2021).
To maintain the requirement that DIDs are globally unique, a DID method should be registered with the W3C in the DID Specification Registries. There are many such DID methods in the Specification Registries currently. Most methods are based on the blockchain technology that will store and maintain the generated DIDs. The DID method-specific identifier is often derived from the address or other account information from the blockchain that the DID’s associated public key is published on. For example, the BTCR DID method, “did:btcr,” is a method that supports DIDs on the Bitcoin blockchain, and the method-specific identifier is derived from the TxRef encoding, or a reference to a transaction’s position within the Bitcoin blockchain (Allen et al., 2021). The SelfKey method, “did:selfkey,” supports DIDs on the Ethereum blockchain. The method-specific identifier is a concatenation of the Ethereum address and a nonce (SelfKey Foundation, 2019).
There are also ledger-agnostic DID methods, such as the Pkh method, “did:pkh,” which generates DIDs based on a chain-agnostic improvement proposal 10 (CAIP-10) keypair (Chang et al., 2021). CAIPs describe standards for those wishing to work with a variety of blockchains. CAIP-10 is an account ID specification created in March 2020 to identify accounts in any blockchain specified in CAIP-2, the corresponding blockchain ID specification. The method-specific identifier is a string containing a blockchain ID, which identifies the blockchain used, followed by the account address (ChainAgnostic, 2021).
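As a rough illustration of the three-part structure, a DID string can be split at its first two colons. The function and type names below are hypothetical, and the sketch does not validate the full DID syntax defined by the W3C; it only separates the scheme, the method name, and the method-specific identifier.

```python
from typing import NamedTuple


class ParsedDid(NamedTuple):
    scheme: str
    method: str
    method_specific_id: str


def parse_did(did: str) -> ParsedDid:
    """Split a DID into its three parts at the first two colons."""
    scheme, first_sep, rest = did.partition(":")
    method, second_sep, method_specific_id = rest.partition(":")
    if scheme != "did" or not first_sep or not second_sep:
        raise ValueError(f"not a DID: {did!r}")
    if not method or not method_specific_id:
        raise ValueError(f"DID is missing a method or identifier: {did!r}")
    return ParsedDid(scheme, method, method_specific_id)


# The method-specific identifier may itself contain colons,
# so only the first two separators are significant.
parsed = parse_did("did:sam:123456789abcdefghi")
```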
To explain the concepts further, we introduce an example DID method, which we will call sample, that utilizes the Bitcoin blockchain. The method-specific identifier will simply be a Bitcoin address generated from a private key. A DID created with the sample method will have the following structure:
did:sam:123456789abcdefghi
The “sam” represents the identifier for the DID method, and the final part corresponding to the method-specific identifier is a valid Bitcoin address. In this example, we only describe the create and read operations.
To create a new DID implementing the sample method, one must first generate a private key. This private key will be utilized to derive a corresponding public key and Bitcoin address. Bitcoin uses the “secp256k1” elliptic curve, the curve underlying its elliptic curve digital signature algorithm (ECDSA). Therefore, to start the process of creating a Bitcoin address, we first derive the public key by multiplying the curve’s generator point by the private key. This yields a pair of 32-byte integer values that represent the x- and y-coordinates of a point on the elliptic curve. This point is the corresponding public key. After creating the public key, the key is hashed first with SHA-256, and then we apply RIPEMD-160 to the result. SHA-256 and RIPEMD-160 are two types of hash functions, or functions that take an input of arbitrary length and produce a fixed-length output. Cryptographic hash functions are also “one-way functions”: once we apply the hash function to an input and obtain the output, it is computationally infeasible to reverse the process and recover the original input, making these types of functions excellent candidates for securing data.
After hashing, to indicate that the address will be used on Bitcoin’s mainnet, as opposed to its testnet, the version byte 0x00 is prepended to the hashed public key. To create the address, we concatenate this versioned hash with a checksum, which is the first 4 bytes of a double SHA-256 hash of the versioned hash. Finally, the resulting bytes are encoded as a Base58 string. This result is our Bitcoin address generated with our initial private key. After the address has been generated, we use our Bitcoin address to publish the corresponding public key as a transaction on the Bitcoin blockchain (Badretdinov, 2018).
Reading a DID involves locating the specific DID on the Bitcoin blockchain and resolving the DID to a DID document. In our method, to resolve a DID to its DID document, after the transaction corresponding to the DID is retrieved from the blockchain, a DID document is initialized as a JSON document with the “id” property set to the DID itself. We add the “verificationMethod” property to the JSON document and construct the verification method object as follows:
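The key-derivation step can be sketched in plain Python. This is a textbook illustration under stated assumptions, not production cryptography: the function names are hypothetical, the arithmetic is not constant time, and `hashlib.new("ripemd160")` depends on the local OpenSSL build and may be unavailable on some systems. Only the curve constants are fixed by secp256k1 itself.

```python
import hashlib

# secp256k1 parameters: the curve is y^2 = x^3 + 7 over the field of size P.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (
    0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
    0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8,
)


def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # a point plus its negation
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return (x3, y3)


def scalar_mult(k: int, point):
    """Compute k * point with the double-and-add method."""
    result, addend = None, point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result


def derive_public_key(private_key: int):
    """Public key = private key times the generator point G."""
    if not 1 <= private_key < N:
        raise ValueError("private key must lie in [1, N - 1]")
    return scalar_mult(private_key, G)


def serialize_uncompressed(public_key) -> bytes:
    """0x04 prefix followed by the 32-byte x- and y-coordinates."""
    x, y = public_key
    return b"\x04" + x.to_bytes(32, "big") + y.to_bytes(32, "big")


def hash160(data: bytes) -> bytes:
    """SHA-256 followed by RIPEMD-160 (requires OpenSSL support for RIPEMD-160)."""
    return hashlib.new("ripemd160", hashlib.sha256(data).digest()).digest()
```

Calling `hash160(serialize_uncompressed(derive_public_key(k)))` for a private key `k` would give the 20-byte hash from which the address is built.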
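The version byte, checksum, and Base58 steps can be sketched as follows. The function names are hypothetical, and the input is assumed to be the 20-byte RIPEMD-160 output from the previous step; the all-zero hash used at the end is a placeholder, not a real key hash.

```python
import hashlib

BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def base58_encode(data: bytes) -> str:
    """Encode bytes as Base58; each leading zero byte becomes a leading '1'."""
    number = int.from_bytes(data, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = BASE58_ALPHABET[remainder] + encoded
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


def base58_decode(text: str) -> bytes:
    """Inverse of base58_encode."""
    number = 0
    for char in text:
        number = number * 58 + BASE58_ALPHABET.index(char)
    leading_ones = len(text) - len(text.lstrip("1"))
    body = number.to_bytes((number.bit_length() + 7) // 8, "big")
    return b"\x00" * leading_ones + body


def address_from_hash160(key_hash: bytes, version: bytes = b"\x00") -> str:
    """Prepend the version byte, append a 4-byte checksum, and Base58-encode."""
    if len(key_hash) != 20:
        raise ValueError("expected a 20-byte RIPEMD-160 hash")
    payload = version + key_hash
    checksum = double_sha256(payload)[:4]
    return base58_encode(payload + checksum)


def has_valid_checksum(address: str) -> bool:
    """Recompute the checksum over the decoded payload and compare."""
    raw = base58_decode(address)
    payload, checksum = raw[:-4], raw[-4:]
    return len(raw) == 25 and double_sha256(payload)[:4] == checksum


# Placeholder input: an all-zero hash stands in for a real key hash.
example_address = address_from_hash160(bytes(20))
```

Because the mainnet version byte is 0x00, every address produced this way begins with the character "1".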
An example of DID resolution to a DID document for the following sample DID, did:sam:123456789abcdefghi, is given in Fig. 2.
Fig 2 An example DID with the verification method and authentication properties.
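A resolve function for the sample method might look roughly like the sketch below. Everything specific here is an assumption made for illustration: the in-memory `LEDGER` dictionary stands in for the blockchain lookup, the coordinate bytes are dummy values rather than a real key, the `#key-1` fragment is arbitrary, and the use of a JSON web key as verification material is one possible choice rather than the layout the sample method prescribes.

```python
import base64
import json

# Hypothetical stand-in for the blockchain: address -> (x, y) key coordinates.
# The coordinate bytes are dummy values, not a real secp256k1 public key.
LEDGER = {"123456789abcdefghi": (bytes(range(32)), bytes(range(32, 64)))}


def b64url(data: bytes) -> str:
    """Base64url without padding, as used in JSON web keys."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def resolve(did: str, ledger: dict) -> tuple:
    """Return (DID document, metadata) for a did:sam identifier."""
    prefix = "did:sam:"
    if not did.startswith(prefix):
        raise ValueError("not a did:sam identifier")
    address = did[len(prefix):]
    if address not in ledger:
        raise KeyError(f"no record found for {did}")
    x, y = ledger[address]  # the "read" operation of the method
    key_id = did + "#key-1"
    document = {
        "id": did,
        "verificationMethod": [
            {
                "id": key_id,
                "type": "JsonWebKey2020",
                "controller": did,
                "publicKeyJwk": {
                    "kty": "EC",
                    "crv": "secp256k1",
                    "x": b64url(x),
                    "y": b64url(y),
                },
            }
        ],
        # Reference the key above instead of embedding it a second time.
        "authentication": [key_id],
    }
    metadata = {"contentType": "application/did+json"}
    return document, metadata


document, metadata = resolve("did:sam:123456789abcdefghi", LEDGER)
serialized = json.dumps(document, indent=2)
```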
In a world where we find ourselves logging in to multiple accounts per day, DIDs provide a secure, private, and convenient solution to prove our identity. Instead of creating a different username and password for each account, many people find it easier to use social identities. However, this ease of use comes with a tradeoff: our privacy. Using a social identity to create an online retailer account, for example, may inadvertently give the administrator of that identity insight into our spending habits. DIDs give a convenient solution without the need to sacrifice one’s privacy.
Within a DID document, one or more verification methods can be specified. These can include cryptographic public keys that, along with a digital signature created with the individual’s private key, can be used to verify an individual’s identity. Each verification method in a DID document is required to have a value for the “id,” “type,” and “controller” properties. The value of the “id” property must follow the DID URL syntax, which is similar to the syntax seen with web URLs. The type must specify exactly one verification method type. For example, “JsonWebKey2020” refers to the JSON web key signature suite created in 2020. Finally, the controller value should be the URI of the entity with control over the verification method. This is often set as the DID of the individual in control over the DID document itself. Depending on the verification method type, the verification method will contain additional information needed for any process wishing to use it. This additional information is contained in a verification material property, such as “publicKeyJwk,” for example, which is a map representing the information associated with a JSON web key. This map contains various key-value pairs that include information regarding the curve, key type, and other parameters associated with the key.
The example in Fig. 3 illustrates the “JsonWebKey2020” verification method utilizing an elliptic curve of type “secp256k1” and coordinates specified by the x- and y-values. This information is utilized to identify the public key, the point (x, y) on the elliptic curve, that can be used for verification. Notice that the “controller” property value is a DID and that the “id” value for this verification method contains a fragment (the part following the hash symbol, #). DID documents can contain multiple verification methods, and fragments are used to index them within the document (Allen et al., 2021; Steele, 2021).
Fig 3 An example verification method. Note that this example only contains the verificationMethod property and is not depicting a full DID document.
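The required properties of a verification method can be checked with a few lines of code. The sketch below is an illustration only: the helper names are hypothetical, the `x` and `y` values are placeholders rather than real key coordinates, and the checks cover just the rules stated in the text, not the full W3C data model.

```python
REQUIRED_PROPERTIES = ("id", "type", "controller")

# Placeholder verification method; the coordinate strings are not a real key.
example_method = {
    "id": "did:sam:123456789abcdefghi#key-1",
    "type": "JsonWebKey2020",
    "controller": "did:sam:123456789abcdefghi",
    "publicKeyJwk": {
        "kty": "EC",
        "crv": "secp256k1",
        "x": "<base64url x-coordinate>",
        "y": "<base64url y-coordinate>",
    },
}


def check_verification_method(method: dict) -> list:
    """Return a list of problems; an empty list means the checks passed."""
    problems = [
        f"missing required property: {name}"
        for name in REQUIRED_PROPERTIES
        if name not in method
    ]
    method_id = method.get("id", "")
    if method_id and not method_id.startswith("did:"):
        problems.append("id does not look like a DID URL")
    if method.get("type") == "JsonWebKey2020" and "publicKeyJwk" not in method:
        problems.append("JsonWebKey2020 expects a publicKeyJwk property")
    return problems


def fragment_of(did_url: str) -> str:
    """Return the part of a DID URL after the hash symbol, if any."""
    return did_url.partition("#")[2]
```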
Authentication is a type of verification relationship. Verification relationships are defined by the W3C to “express the relationship between the DID subject and a verification method” (Allen et al., 2021). Authentication describes how the DID subject can authenticate themselves, including, for example, how the subject can log in to a banking website and view their balance or log in to their account with a utility provider to pay a bill. If the authentication property is present within a DID document, its value must be a list containing at least one verification method. The verification method may be embedded within the authentication property, or it can be a reference to a verification method defined elsewhere within the DID document. Figure 4 shows an authentication property containing a reference to the verification method defined in Fig. 3.
Fig 4 An example of the authentication property containing a referenced verification method. Note that this example only contains the authentication property and is not depicting a full DID document.
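Since an entry in the authentication property can be either an embedded verification method or a reference to one, software reading a DID document has to handle both cases. A minimal sketch, with hypothetical function and key names and placeholder documents, could look like this:

```python
def authentication_methods(document: dict) -> list:
    """Return the verification methods usable for authentication.

    Entries that are dicts are treated as embedded methods; entries that are
    strings are looked up among the document's verificationMethod entries.
    """
    by_id = {m["id"]: m for m in document.get("verificationMethod", [])}
    methods = []
    for entry in document.get("authentication", []):
        if isinstance(entry, dict):
            methods.append(entry)  # embedded method
            continue
        # A reference may be relative ("#key-1"); make it absolute first.
        reference = document["id"] + entry if entry.startswith("#") else entry
        if reference not in by_id:
            raise KeyError(f"unresolved reference: {entry}")
        methods.append(by_id[reference])
    return methods


# Placeholder document with one referenced and one embedded method.
did = "did:sam:123456789abcdefghi"
example_document = {
    "id": did,
    "verificationMethod": [
        {"id": did + "#key-1", "type": "JsonWebKey2020", "controller": did}
    ],
    "authentication": [
        did + "#key-1",
        {"id": did + "#key-2", "type": "JsonWebKey2020", "controller": did},
    ],
}
found = authentication_methods(example_document)
```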
Suppose we want to use this verification method within the authentication property to log in to a utilities account and pay a bill. To do so, we first disclose our DID to the utility’s website. This would be done in a secure manner utilizing software that could be, for example, an app on our mobile device. The website would then use the DID resolution process to resolve the DID we gave to its corresponding DID document. The website then finds the “authentication” property within the DID document and sends a challenge as prescribed by the verification method found within the value of the authentication property. For example, if the verification method was prescribed as in Fig. 3, the website will create a challenge whose response can be verified with the elliptic curve public key. The challenge will then be completed by our application utilizing the corresponding private key. This challenge may involve asking our application to sign a value chosen by the website with the private key, after which the website verifies the signature with the public key contained within the verification method. This process is visualized in Fig. 5. Only the holder of the corresponding private key can complete the challenge and authenticate. Because the private key never leaves our application and a valid signature cannot feasibly be produced without it, this is a secure mechanism for authentication (Microsoft Corporation, 2018).
Fig 5 The authentication process with DIDs. Auth: authentication. (Source: Hamilton-Duffy, 2021.)
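The challenge and response exchange can be sketched with textbook ECDSA over secp256k1 in plain Python. This is a simplified illustration, not the protocol of any particular DID method: the variable names are hypothetical, DID resolution is skipped (the website is assumed to have already obtained the public key from the DID document), and the arithmetic is neither constant time nor hardened for production use.

```python
import hashlib
import secrets

# secp256k1 parameters: the curve is y^2 = x^3 + 7 over the field of size P.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (
    0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
    0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8,
)


def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    return (x3, (slope * (x1 - x3) - y1) % P)


def scalar_mult(k, point):
    """Compute k * point with the double-and-add method."""
    result, addend = None, point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result


def sign(private_key: int, message: bytes) -> tuple:
    """Produce an ECDSA signature (r, s) over SHA-256(message)."""
    digest = int.from_bytes(hashlib.sha256(message).digest(), "big")
    while True:
        nonce = secrets.randbelow(N - 1) + 1  # fresh secret for each signature
        r = scalar_mult(nonce, G)[0] % N
        s = pow(nonce, -1, N) * (digest + r * private_key) % N
        if r and s:
            return r, s


def verify(public_key, message: bytes, signature: tuple) -> bool:
    """Check an ECDSA signature against a public key."""
    r, s = signature
    if not (1 <= r < N and 1 <= s < N):
        return False
    digest = int.from_bytes(hashlib.sha256(message).digest(), "big")
    s_inv = pow(s, -1, N)
    point = point_add(
        scalar_mult(digest * s_inv % N, G),
        scalar_mult(r * s_inv % N, public_key),
    )
    return point is not None and point[0] % N == r


# Our application holds the private key; the public key is in the DID document.
holder_private_key = secrets.randbelow(N - 1) + 1
holder_public_key = scalar_mult(holder_private_key, G)

challenge = secrets.token_bytes(32)            # website: random challenge
response = sign(holder_private_key, challenge)  # application: signed response
authenticated = verify(holder_public_key, challenge, response)  # website: check
```

Using a fresh random challenge for every login attempt is what prevents a recorded response from being replayed later.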
Verifiable credentials are digital counterparts to the physical credentials we hold in everyday life, such as a driver’s license, student ID card, or diploma. They provide a secure and private mechanism for proving claims about us (Tykn, 2021; Affinidi, 2021). Cryptographic methods, such as digital signatures, ensure the security and trustworthiness of the information. The ability to generate a verifiable presentation from a verifiable credential allows an individual to selectively disclose personal information and maintain their privacy. For example, when an individual needs to prove they are a resident of a town to receive certain benefits, instead of producing a stack of utility bills and their license, which includes their birth date and other unnecessary personal information, the individual can create a verifiable presentation from a verifiable credential issued by the town government that indicates they are, in fact, a legal resident.
There are two entities associated with verifiable credentials: the issuer, which is the entity issuing the credential, and the subject, which is the entity to whom the credential is being issued. Verifiable credentials can be issued from an organization, such as a university or a government’s department of licensing, to an individual or organization identified with a DID (Tykn, 2021; Affinidi, 2021). The “credentialSubject” property is an object containing information about the entity to whom the verifiable credential is issued. This object often includes an “id” property that corresponds to the DID of the subject. Additionally, the credential contains a “proof” property describing the cryptographic method that a verifier can use to verify that the credential has truly been issued to the subject (Chadwick et al., 2021). A simple example of a verifiable credential is shown in the next section. For further examples, we recommend exploring the W3C specification for verifiable credentials found in Chadwick et al. (2021).
Suppose, for example, that, to obtain a student discount when shopping online, we need to prove our student status. The university may have already issued us verifiable credentials indicating we are students at the institution along with other information. The verifiable credential is shown in Fig. 6.
Fig 6 Example verifiable credential using an RSA signature for proof. The credential is issued by the entity with DID “did:sam:987654321ihgfedcba” to the entity with DID “did:sam:123456689abcdefghi.”
This credential is tied to our identity by including our DID in the “id” property of the “credentialSubject.” Note that the credential contains additional information that is not needed by the website to grant us the student discount. To minimize the amount of personal information we disclose about ourselves, we use our identity application, which could be, for example, an app on our mobile device, to create a verifiable presentation that solely consists of a derived credential containing proof that we are, in fact, students at Brilliant University. The verifiable presentation, like the verifiable credential, contains a “proof” property with information about how a verifier can use cryptographic methods to confirm that the claims made are truly about the proving entity (Microsoft Corporation, 2018; Chadwick et al., 2021).
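The overall shape of such a credential can be sketched as a Python dictionary. The two DIDs come from the caption above; every other value is a placeholder or an assumption made for illustration, including the `StudentCredential` type, the claim names, the dates, the `#key-1` fragment, and the omitted signature. The list of required properties follows the W3C verifiable credentials data model v1.1.

```python
REQUIRED_PROPERTIES = (
    "@context",
    "type",
    "issuer",
    "issuanceDate",
    "credentialSubject",
)

credential = {
    "@context": ["https://www.w3.org/2018/credentials/v1"],
    # "StudentCredential" is a hypothetical type used only in this sketch.
    "type": ["VerifiableCredential", "StudentCredential"],
    "issuer": "did:sam:987654321ihgfedcba",
    "issuanceDate": "2021-09-01T00:00:00Z",  # placeholder date
    "credentialSubject": {
        "id": "did:sam:123456689abcdefghi",
        # Hypothetical claim names.
        "studentOf": "Brilliant University",
        "enrollmentStatus": "full-time",
    },
    "proof": {
        "type": "RsaSignature2018",
        "created": "2021-09-01T00:00:00Z",  # placeholder date
        "proofPurpose": "assertionMethod",
        "verificationMethod": "did:sam:987654321ihgfedcba#key-1",
        "jws": "<signature omitted>",
    },
}


def missing_properties(vc: dict) -> list:
    """List the required top-level properties that are absent."""
    return [name for name in REQUIRED_PROPERTIES if name not in vc]


def issuer_did(vc: dict) -> str:
    """The issuer may be given as a string or as an object with an id."""
    issuer = vc["issuer"]
    return issuer["id"] if isinstance(issuer, dict) else issuer


def subject_did(vc: dict) -> str:
    return vc["credentialSubject"]["id"]
```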
Verifiers utilizing a verifiable credential or verifiable presentation to determine if a claim made is true will do three things:
Verifying the identities of both the prover and the issuer of the credential follows a similar protocol to the one followed when an entity wants to authenticate with their DID. Since the DIDs of both the issuer and the subject are contained within the verifiable presentation, the verifier can easily extract these DIDs and utilize the read operation to obtain the mechanism for verification and leverage that mechanism to verify the identity (Microsoft Corporation, 2018; Chadwick et al., 2021).
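The first step a verifier takes, extracting the DIDs it needs to resolve, can be sketched as below. The presentation layout follows the general shape of the W3C data model, but the helper name is hypothetical, proofs are omitted for brevity, and resolving the DIDs and checking signatures are left out.

```python
def dids_to_resolve(presentation: dict) -> dict:
    """Collect the issuer and subject DIDs found in a presentation."""
    issuers, subjects = set(), set()
    for vc in presentation.get("verifiableCredential", []):
        issuer = vc["issuer"]
        # The issuer may be a plain DID string or an object with an id.
        issuers.add(issuer["id"] if isinstance(issuer, dict) else issuer)
        subject = vc.get("credentialSubject", {})
        if "id" in subject:
            subjects.add(subject["id"])
    return {"issuers": sorted(issuers), "subjects": sorted(subjects)}


# Placeholder presentation; the claim name is hypothetical.
presentation = {
    "@context": ["https://www.w3.org/2018/credentials/v1"],
    "type": ["VerifiablePresentation"],
    "holder": "did:sam:123456689abcdefghi",
    "verifiableCredential": [
        {
            "type": ["VerifiableCredential"],
            "issuer": "did:sam:987654321ihgfedcba",
            "credentialSubject": {
                "id": "did:sam:123456689abcdefghi",
                "studentOf": "Brilliant University",
            },
        }
    ],
}
to_resolve = dids_to_resolve(presentation)
```

Each DID returned here would then be passed to the read operation of its DID method to obtain the verification method used to check the corresponding proof.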
Other privacy-preserving mechanisms that are supported by verifiable credentials include the ability to use zero-knowledge proofs (ZKPs) to make claims about oneself without revealing any personally identifiable information. Additionally, verifiable presentations can be derived from multiple verifiable credentials, giving an individual full freedom and flexibility to choose exactly what is shared with a verifier (Chadwick et al., 2021).
In this digital age, our rights to privacy and control over our identities are becoming crucial as companies are collecting ample amounts of data and are learning more and more about us from these data. DIDs open a door to the new concept of an SSI, an identity that is not controlled by a third party but owned by the individual or organization whom the identity is for. Verifiable credentials used in conjunction with DIDs give us the ability to selectively disclose what we want about ourselves in a manner that is secured with cryptography, giving us the power to safely reveal only what is necessary when proving claims about ourselves.
• C. Allen, D. Longley, D. Reed, M. Sabadello, M. Sporny, and O. Steele, “Decentralized identifiers (DIDs) v1.0 core architecture, data model, and representations,” World Wide Web Consortium, W3C Proposed Recommendation, Cambridge, MA, USA, Aug. 2021. [Online]. Available: https://w3c.github.io/did-core
• O. Avellaneda et al., “Decentralized identity: Where did it come from and where is it going?” IEEE Commun. Standards Mag., vol. 3, no. 4, pp. 10–13, Dec. 2019, doi: 10.1109/MCOMSTD.2019.9031542.
• “What is self-sovereign identity?” Sovrin Foundation, Provo, Utah, Dec. 2018. [Online]. Available: https://sovrin.org/faq/what-is-self-sovereign-identity
• “SelfKey DID method,” SelfKey Foundation, 2019. [Online]. Available: https://github.com/SelfKeyFoundation/selfkey-did-ledger/blob/develop/DIDMethodSpecs.md
• T. Badretdinov, How to Create a Bitcoin Wallet Address From a Private Key. FreeCodeCamp, Jul. 2018. [Online]. Available: https://www.freecodecamp.org/news/how-to-create-a-bitcoin-wallet-address-from-a-private-key-eca3ddd9c05f
• O. Steele, “JSON web signature 2020,” World Wide Web Consortium, Cambridge, MA, USA, Jul. 2021. [Online]. Available: https://w3c-ccg.github.io/lds-jws2020
• “Decentralized identity, own and control your identity [White Paper],” Microsoft Corp., Redmond, WA, USA, 2018. [Online]. Available: https://query.prod.cms.rt.microsoft.com/cms/api/am/binary/RE2DjfY
• “Verifiable credentials, the ultimate beginner’s guide,” Tykn, 2021. [Online]. Available: https://www.dock.io/post/verifiable-credentials
• “What are verifiable credentials (VCs), demystified,” Medium, Mar. 2021. [Online]. Available: https://academy.affinidi.com/what-are-verifiable-credentials-79f1846a7b9
• D. Chadwick, D. Longley, and M. Sporny, “Verifiable credentials data model v1.1,” World Wide Web Consortium, Cambridge, MA, USA, Nov. 2021. [Online]. Available: https://www.w3.org/TR/vc-data-model
• W. Chang, C. Lehner, J. Caballero, and J. Thorstensson. “did:pkh method specification.” GitHub. Accessed: May 15, 2023. [Online]. Available: https://github.com/w3c-ccg/did-pkh/blob/main/did-pkh-method-draft.md
• “Chain agnostic improvement proposals.” GitHub. Accessed: May 15, 2023. [Online]. Available: https://github.com/ChainAgnostic/CAIPs
• K. Hamilton-Duffy, R. Grant, and A. Gropper, “Use cases and requirements for decentralized identifiers,” World Wide Web Consortium, Mar. 17, 2021. [Online]. Available: https://www.w3.org/TR/did-use-cases
Danielle Mallare (dmallare7@gmail.com) recently graduated from Northeastern University, earning a master’s degree in computer science. Previously, she was a college mathematics instructor for several years. She is currently employed as a software development engineer at Amazon Web Services CloudHSM.
Nachiket Dani (nachiketdani4587@gmail.com) recently graduated from Northeastern University, earning a master’s degree in computer science. Previously, he was an engineer at Johnson Controls and Capgemini. He is currently a software development engineer at Amazon Selection and Catalog Systems.
Tamara Bonaci (t.bonaci@northeastern.edu) earned her B.S. degree from the University of Zagreb and her M.S. and Ph.D. degrees from the University of Washington. She is an assistant teaching professor in the Khoury College of Computer Sciences, Northeastern University, Seattle, WA 98109 USA, and an affiliate assistant professor with the Department of Electrical and Computer Engineering, University of Washington, Seattle, WA 98195 USA. Her research interests include security, privacy, and the societal impact of emerging technologies, with a special interest in biomedical technologies.
Digital Object Identifier 10.1109/MPOT.2023.3272774