Rethinking Blockchain HSM Security Through a Confidential Computing Approach

Published on Mar 8, 2023

“I love the smell of a data breach in the morning,” said no one ever. While a cheap joke may provide a momentary smile, the reality is that data breaches and the loss of high-value digital assets are no laughing matter, especially when they could have been prevented.

Blockchain ecosystems are built on advanced cryptographic processes that must operate with confidentiality, integrity, and high assurance at all times. When keys, code, and sensitive data are exposed and subject to theft or manipulation, the consequences can be severe, including data loss, reputational damage, fraud, and disruption. The theft of keys can go undetected for a long time, resulting in significant damage and value extraction by attackers.

In both traditional and newer distributed financial networks, thousands of keys can be in use across applications, gateways, signing nodes, transaction services, and smart contract systems. Together they form a vast attack surface and a top target for attackers, and breaches involving stolen keys can result in massive losses. For example, Vulcan Forged suffered a breach in December 2021 in which the private keys to 96 wallets were stolen, leading to a loss of roughly $140 million. Similarly, the Ronin network was attacked in March 2022, resulting in the theft of validator keys and reported losses of over $500 million.

Even in more traditional payment systems, key security is paramount. For example, at Postbank in South Africa, insiders stole an HSM master key, enabling them to create fake transactions. This theft led to the re-issuance of over 12 million customer debit cards at a cost of over 1 billion Rand (US$57 million). These incidents underscore the critical importance of properly securing keys: the cost of doing so is small compared with the losses it prevents.

One strategy organizations often use to protect their encryption keys is to assign key custodians: employees tasked with managing, handling, and safeguarding access to the keys. However, this creates complex and cumbersome manual processes, as multiple custodians need to coordinate to retrieve keys from secure rooms or safes and present them when unlocking a system. The process is further complicated by remote or distributed operations, which became more common during the pandemic. Waiting days for custodians to fly in, open a safe, apply a key, and then leave is not an option for many organizations, so they end up accepting more risk in order to keep keys readily available.

When a key is stolen, the consequences depend on its purpose and scope. An attacker can use a stolen key to decrypt encrypted data, authenticate as a legitimate user, sign fraudulent transactions, or manipulate processes like consensus to their advantage. Even more concerning, a stolen key can be used undetected, for example to decrypt encrypted backups or TLS traffic or to impersonate someone else. The result can be significant financial loss or the compromise of sensitive information.
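To make the "undetected" point concrete, here is a minimal Python sketch. It uses HMAC-SHA256 from the standard library as a stand-in for a transaction signature scheme; real blockchains use asymmetric signatures such as ECDSA or EdDSA, and the key, messages, and function names below are illustrative assumptions, not any network's actual API.

```python
import hashlib
import hmac
import secrets


def sign(key: bytes, message: bytes) -> bytes:
    """Produce an authentication tag over the message (stand-in for a signature)."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Check the tag in constant time."""
    return hmac.compare_digest(sign(key, message), tag)


# The legitimate signing node's key (hypothetical).
signing_key = secrets.token_bytes(32)

# An attacker who has copied the key holds exactly the same bytes.
stolen_key = bytes(signing_key)

legit_tx = b"transfer 10 to alice"
fraud_tx = b"transfer 10000 to mallory"

legit_tag = sign(signing_key, legit_tx)
fraud_tag = sign(stolen_key, fraud_tx)

# The verifier cannot tell who produced a tag, only that the key was used.
legit_ok = verify(signing_key, legit_tx, legit_tag)
fraud_ok = verify(signing_key, fraud_tx, fraud_tag)
```

Both `legit_ok` and `fraud_ok` come out true in this sketch. Verification proves only that the key was used, not who used it, which is why the key itself has to be protected.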

So, how can keys be protected while still being accessible? There is no straightforward answer. If a software-based approach is used to encrypt a key, what key protects that encryption key? Requiring individuals to provide it is not a scalable solution. Additionally, if a key is held in memory, how can it be guaranteed that the code using it isn't being observed or tampered with? Can we be confident that no malware has been injected somewhere in the build-to-runtime DevOps process and is observing the key or its operations, potentially leading to theft and compromise?

The proposed solution to protect keys while keeping them available is to use secure, isolated hardware devices. However, this solution raises several concerns that need to be addressed. For instance, the hardware device needs to meet the specific needs of the application and be able to scale effectively. It should also be elastic and easy to deploy near the application to keep latency low. Additionally, there needs to be a clear plan for managing the hardware device and its necessary algorithms, as well as a backup plan in case of data loss due to tampering. Finally, the hardware device should be capable of protecting both code and data, not just keys.

Hardware Security Modules (HSMs) have been the traditional approach for protecting keys and internal firmware in a physically isolated box, often filled with epoxy and ready to trigger memory erasure if anyone attacks it with a drill. For example, a payment application might call a Payment HSM, which retrieves an encrypted key from a database, decrypts it inside the HSM, and uses it to validate a debit card PIN against the associated verification value for the given payment scheme. Most payment networks require HSMs validated to FIPS 140-2 or PCI PTS. However, this is a relatively static, old-school, centralized environment that harks back to the late 1980s and 1990s, well before cloud and modern applications. The algorithms used in payments are mostly AES with 128-bit keys and Triple-DES, the latter of which is outdated and not suitable for modern applications. In addition, HSMs may not be easily scalable or elastic, and their deployment may require significant resources and management.

In the blockchain world, agility and decentralization are critical, which poses a problem for traditional HSM solutions. HSM platforms are designed to meet centralized, traditional requirements and algorithms, and are often slow to update. Blockchains may use algorithms and protocols that aren't always available in HSMs, and they require automation so that data operations and transactions can happen wherever they're needed. Additionally, securing protocols and processing code in a proprietary HSM can be extremely difficult or even impossible, and manual key processes such as backing up and restoring keys can be a burden for distributed blockchain businesses. While hardware-based protection is the right approach, the limitations of HSMs can make them impractical or uneconomical for blockchain ecosystems.

Fortunately, confidential computing offers a modern, hardware-based approach to security that can address these limitations. It provides strong CPU-level protection for entire workloads, along with hardware-based roots of trust that avoid manual key entry (the "secret zero" problem) and so enable fully automated deployments. Trust in the computing environment is established first, through hardware attestation, before encrypted code and secrets are delivered to it. Once that trust is proven, data is kept encrypted in memory by the CPU and decrypted only inside the confidential processor, automatically.
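The "prove trust first, then release secrets" flow can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how any particular platform implements attestation. The hardware root of trust is modeled as a shared HMAC key, whereas real platforms sign attestation reports with asymmetric keys chained to a vendor certificate. `hardware_quote`, `KeyBroker`, and the workload strings are hypothetical names introduced for the example.

```python
import hashlib
import hmac
import secrets
from typing import Optional, Tuple

# Assumption: a symmetric key standing in for the CPU's hardware root of trust.
HARDWARE_ROOT_KEY = secrets.token_bytes(32)


def _report_mac(measurement: str, nonce: bytes) -> bytes:
    """Authenticate a measurement and a freshness nonce with the root key."""
    return hmac.new(HARDWARE_ROOT_KEY, measurement.encode() + nonce, hashlib.sha256).digest()


def hardware_quote(workload: bytes, nonce: bytes) -> Tuple[str, bytes]:
    """Stand-in for the CPU measuring the loaded workload and signing a report."""
    measurement = hashlib.sha256(workload).hexdigest()
    return measurement, _report_mac(measurement, nonce)


class KeyBroker:
    """Releases a secret only to a workload whose attested measurement is expected."""

    def __init__(self, trusted_workload: bytes, secret: bytes) -> None:
        self._expected = hashlib.sha256(trusted_workload).hexdigest()
        self._secret = secret
        self._nonce: Optional[bytes] = None

    def challenge(self) -> bytes:
        """Issue a single-use nonce so old reports cannot be replayed."""
        self._nonce = secrets.token_bytes(16)
        return self._nonce

    def release(self, measurement: str, signature: bytes) -> Optional[bytes]:
        nonce, self._nonce = self._nonce, None  # consume the nonce
        if nonce is None:
            return None  # no outstanding challenge
        if not hmac.compare_digest(signature, _report_mac(measurement, nonce)):
            return None  # report was not produced by the trusted hardware
        if not hmac.compare_digest(measurement, self._expected):
            return None  # hardware is genuine, but the code is not the expected code
        return self._secret


trusted = b"signing-service v1"
signing_key = secrets.token_bytes(32)
broker = KeyBroker(trusted, signing_key)

# The expected workload attests and receives the key with no human in the loop.
nonce = broker.challenge()
released = broker.release(*hardware_quote(trusted, nonce))

# A tampered workload produces a different measurement and is refused.
nonce = broker.challenge()
denied = broker.release(*hardware_quote(b"signing-service v1 + malware", nonce))
```

In this sketch `released` equals `signing_key` and `denied` is `None`. No custodian has to supply a first secret, because the decision to release the key rests on what the hardware reports about the running code.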

Confidential computing technologies such as Intel SGX, AMD SEV, and AWS Nitro Enclaves are now widely available in the major clouds. They isolate code, data, and keys and are purpose-built to resist attack, providing much of the protection offered by an HSM, but on an elastic basis wherever sensitive data or code needs to be processed.

In some blockchain operations, applications must securely commit to the blockchain with integrity. Often, this requires integrity over the entire application, file, data, keys, and process. This is not always realistic for an HSM, and would require re-architecting the application to break operations into smaller chunks to pass to an HSM API. Refactoring and redesigning proven applications and dealing with arcane HSM interfaces can be a huge burden, and it may also leave gaps in end-to-end integrity, which can significantly impact trust. Plus, there's the problem of securing the HSM access credentials in memory too.

However, confidential computing fundamentally changes the game for sensitive data processing. It protects the code as well as the keys and data, with the entire traditional application, microservice, or cloud-native Kubernetes application running protected in hardware. With a fully trusted compute ecosystem, we have a true end-to-end security model - in hardware. Such "enclaves" of trust, privacy, and security make entire workloads invisible to attackers, insiders, privileged admins, etc. For the first time, code and data are protected in use, running isolated and encrypted without re-architecting the application - even from the clouds themselves, by design.

Confidential computing has been made possible by advancements in hardware, but a software layer is also necessary to make it accessible and widely used. Anjuna Confidential Computing Platform unlocks confidential computing CPU technology across various clouds and chipsets, delivering unmatched security and performance. By using Anjuna, organizations can run sensitive workloads in high-trust environments, where data is always encrypted and code is verified for authenticity, without the need to modify their applications. This approach provides flexibility and accelerates the time to realize value.

Now those sensitive processes, smart contract systems, key managers, encryption, signing, and other systems can run secure and isolated - anywhere. That's a remarkable capability. Perhaps one day, the traditional payment HSM world will catch on too. Then all we'll need to worry about is enjoying the aroma of freshly brewed coffee purchased using modern payment blockchain mechanisms, with back-end systems operating free from the risks of cyber attacks, ensuring privacy over our data, and with agility and trust that bring unprecedented opportunities to do new things - in the cloud, at the edge, or anywhere else.
