Runtime Protection for Secrets Management

HashiCorp Vault is one of the most popular secrets-management solutions. It centrally manages secrets, cryptographic keys, authentication tokens and credentials, providing visibility and control over access policies. While it solves an operational nightmare for DevOps teams, it is also a high-value target for attackers: instead of moving laterally through the organization and taking control of hosts one by one, gaining access to the host that runs the secrets-management solution could provide access to the whole infrastructure. Organizations invest significant effort and resources in securing the hosts that run secrets-management solutions like Vault, yet this still does not solve the runtime security problem. Hardening the host by traditional means does not protect the application from zero-day privilege-escalation attacks, from insider threats, or from an untrusted infrastructure provider.

In-Memory Protection of the Master Secret Key

Vault encrypts the contents of its storage backend (whether it is memory, the filesystem, Consul, etc.) using a Master Secret Key, which is not present on the host until Vault is unsealed. Unsealing is the process of providing this Master Secret Key to Vault, whether it is reconstructed from Shamir secret-shares, retrieved from a KMS service, or obtained from an HSM (Hardware Security Module); after unsealing, Vault becomes operational and can serve client requests. This Master Key is essentially the entry ticket to the goldmine of secrets, cryptographic keys and authentication credentials managed by Vault.
However, from this moment on, the Master Secret Key resides in memory and is vulnerable to memory-scanning attacks (similar to what Mimikatz does with Windows passwords). Once the Master Key is retrieved from memory, it can be used to decrypt the stored secrets. This is not specific to Vault; it is a general runtime-security problem for any secrets-management solution or sensitive application.
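For reference, here is roughly what the manual, Shamir-based flow looks like with the standard Vault CLI (a minimal sketch; the share and threshold counts are illustrative):

# Initialize Vault: the Master Key is split into 5 Shamir shares,
# any 3 of which are required to reconstruct it.
vault operator init -key-shares=5 -key-threshold=3

# Each invocation supplies one share; after the third share,
# the Master Key is reconstructed in memory and Vault becomes unsealed.
vault operator unseal

# Confirm the seal status.
vault status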

Anjuna closes this gap with its Runtime Protection for Vault. By executing the Vault server inside a secure enclave, it eliminates the possibility of scraping Vault’s memory even when the attacker has root access or physical access to the machine’s memory.

[Image: vault-config.png]

Protecting the Auto-Unseal Process

Vault starts in a sealed state, where the storage backend is protected by encryption and secrets are inaccessible. In order to become operational, Vault needs to be unsealed. This can be done either manually, by entering the unseal tokens, which are essentially Shamir secret-shares of the Master Key used to encrypt the key-value store, or automatically, by connecting to an HSM (Hardware Security Module) or a cloud KMS (Key Management System). This automatic unsealing of Vault is called the auto-unseal feature, and it is useful in automated environments where manual unsealing is impractical. In addition, manual input of the unseal tokens can be subject to key-logging attacks, or to memory scraping such as the one described in the previous section.
However, to initiate auto-unsealing, Vault needs to authenticate itself to the HSM using a secret such as a PIN code. That PIN code becomes the secret zero on which the whole security scheme relies. It is often stored in plaintext in a Vault configuration file (.hcl), so we end up with the Master Key secured by a costly HSM infrastructure while access to that HSM itself is unprotected.
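As an illustration, HSM-based auto-unseal (a Vault Enterprise feature) is typically configured with a seal stanza like the following in the server's .hcl file; paths, labels and the PIN below are placeholders, and the pin value is exactly the secret zero discussed above:

seal "pkcs11" {
  lib            = "/usr/lib/softhsm/libsofthsm2.so"
  slot           = "0"
  pin            = "1234"
  key_label      = "vault-hsm-key"
  hmac_key_label = "vault-hsm-hmac-key"
}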

Anjuna solves this secret-zero problem by sealing the Vault configuration so that it is only accessible to an instance of Vault running in a secure enclave, preventing even an attacker with root access from obtaining the authentication credentials.

We released the Secure Unseal solution for Vault as a free tool; it is available for download at https://docs.anjunasecurity.com/vault-unseal.

[Image: vault-config-tls.png]

Application Identity Protection

Application identity traditionally relies on the security of the host, the VM or the container where the application is running. Privileged (admin/root) access to the machine allows an attacker to assume the application's identity and interact with clients on its behalf, posing as the authentic service. For instance, Vault can (and should) be configured to use TLS. To that end, it uses a private key (.key) and a certificate (.cert) stored on disk in order to authenticate itself to connecting clients.
An attacker who is able to access files on the storage can obtain the private key file (.key) and spin up a malicious Vault server instance. Unsuspecting clients will interact with it and send the attacker sensitive data such as unseal tokens, secrets, cryptographic keys and authentication tokens.
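For reference, a typical TLS listener stanza in the Vault configuration (.hcl) looks like this, with illustrative paths; whoever can read the file referenced by tls_key_file can impersonate the server:

listener "tcp" {
  address       = "0.0.0.0:8200"
  tls_cert_file = "/etc/vault/tls/vault.crt"
  tls_key_file  = "/etc/vault/tls/vault.key"
}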

Anjuna solves this by encrypting the key files with a seal key that is only accessible to a Vault instance running inside a secure enclave. The private key is protected at rest and at runtime, eliminating the possibility of stealing it and maliciously assuming Vault's identity.

Runtime Protection for Vault and Consul

Anjuna presented at the first HashiTalks online event, a 24-hour continuous series of presentations from the worldwide HashiCorp User Group (HUG) community as well as from HashiCorp engineers. The event took place February 21-22, 2019.

The session explored our Runtime Security solution based on secure enclaves, such as Intel Software Guard Extensions (SGX). While there is tremendous promise in Intel SGX, adoption so far has been limited to very specific products where development teams were able to put in significant engineering effort to secure small (and sensitive) parts of their applications. Moreover, the lack of straightforward interoperability with modern high-level languages like Go further limits the usability of Secure Enclaves.

In this talk, we demonstrated a way to secure HashiCorp Vault from attackers that have complete control of the host server, by loading the application into a secure enclave. The user experience remains unhindered, since all APIs and interaction with the Vault server remain as they were. Lastly, the talk explained how to establish trust between the protected Vault instance and remote Vault clients using an attestation mechanism that is elegantly integrated into HTTPS.

Container Security Hole: RunC Breakout to Root

We can't think of a better way to put it than the ZDNet article [1]:

One of the great security fears about containers is that an attacker could infect a container with a malicious program, which could escape and attack the host system. Well, we now have a security hole that could be used by such an attack: RunC container breakout, CVE-2019-5736.

Researchers Adam Iwaniuk and Borys Popławski have discovered a vulnerability that applies to multiple container frameworks, including Docker (Swarm and Kubernetes clusters are affected), LXC and Apache Mesos. It enables an attacker to break out of a container and gain root privileges on the host. That, in turn, could enable an attacker to attack the rest of the infrastructure, or to move on to other containers on the host and compromise the security and privacy of business-critical applications.

To understand the industry impact of this vulnerability, consider the assessment of Scott McCarty, technical product manager for containers at Red Hat:

The disclosure of a security flaw (CVE-2019-5736) in runc and docker illustrates a bad scenario for many IT administrators, managers, and CxOs. Containers represent a move back toward shared systems where applications from many different users all run on the same Linux host. Exploiting this vulnerability means that malicious code could potentially break containment, impacting not just a single container, but the entire container host, ultimately compromising the hundreds-to-thousands of other containers running on it. While there are very few incidents that could qualify as a doomsday scenario for enterprise IT, a cascading set of exploits affecting a wide range of interconnected production systems qualifies...and that's exactly what this vulnerability represents.

It means that managed container services, where your application is likely running alongside applications of other users, may not provide appropriate confidentiality and privacy guarantees until the vulnerability is patched.

RunC maintainer Aleksa Sarai responded to the disclosure and provided a patch that mitigates the vulnerability. LXC and Mesos containers might still be vulnerable. More importantly, though, the general problem underlying this particular case, a lack of adequate isolation between containers on the same host, remains a huge concern.
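As a practical first step, you can check which container runtime versions are deployed on your hosts and compare them against your vendor's advisory for CVE-2019-5736 (the exact patched version depends on the distribution and backport):

# Report the Docker engine version running on this host.
docker version --format '{{.Server.Version}}'

# Report the installed runc version.
runc --version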

This disclosure follows shortly after a major Kubernetes vulnerability disclosed in December [2], and adds to a series of sobering realizations regarding container security.

Mitigation with Anjuna

Using Anjuna, one does not have to throw out the baby (the ease of deploying applications with containers) with the bathwater (container security issues). Anjuna makes it easy to run your containerized applications inside a secure enclave that provides hardware-grade isolation from anything else on the host, regardless of the security of the infrastructure or of the container framework. The secure-enclave technologies leveraged by Anjuna ensure that even an attacker who successfully escapes the container and obtains root-level access to the host cannot access your application's data at runtime or at rest.

References

[1] https://www.zdnet.com/article/doomsday-docker-security-hole-uncovered/
[2] https://thenewstack.io/critical-vulnerability-allows-kubernetes-node-hacking/

Why protection against root users is important

Security company Qualys has recently disclosed vulnerabilities in Linux's systemd, the default service-manager daemon for many Linux distributions [1]. They effectively enable a non-privileged user to obtain root privileges. This follows another disclosure, from about seven months ago, related to a different systemd vulnerability. An attacker would thus be able to access any sensitive workload on the host by leveraging these vulnerabilities. The disclosures were assigned CVE-2018-15688, CVE-2018-15687 and CVE-2018-15686. Rather than discussing the specifics of these vulnerabilities, we want to talk here about the more general problem of relying solely on the OS to secure your sensitive applications.

The underlying problem is the lack of isolation between an application and privileged accounts or the operating system. While the root account needs to be able to configure the host, there is no reason for it to be able to peek into an application's data. To that end, we advise running your applications inside secure enclaves while also sealing the persistent state of the application. Secure enclaves, such as Intel SGX, can guarantee that the application's memory and persistent state are accessible only to the application itself, so that not even an administrator can access them.
Anjuna provides an easy way to run an entire application inside a secure enclave without the need to rearchitect it. This approach decouples the application's security from the security of the host it runs on, and tightens the security perimeter around the application itself, rather than leaving a much larger perimeter that is hard to defend.

If you are interested in learning more about how Anjuna can help protect against similar kinds of threats, you are welcome to reach out to us through our website.

References

  1. New Linux Systemd security holes uncovered: https://www-zdnet-com.cdn.ampproject.org/v/s/www.zdnet.com/google-amp/article/new-linux-systemd-security-holes-uncovered/?amp_js_v=0.1

SGX, or how I stopped worrying about the microchip hack

Bloomberg Businessweek recently reported on a hardware supply chain allegedly infiltrated by government-sponsored hackers who, according to the report, installed a malicious microchip on motherboards to spy on US-based companies.
The accused government has denied any involvement in the episode, and the companies that, according to the report, were targeted have denied finding evidence of it. No one has yet come forward to share the technical details of the implanted microchip. But even without these details, we can discuss the potential implications of such an attack, and how the technology developed by Anjuna can help customers regain confidence in their application security in light of similar threats.

First, let us examine a few possibilities for hardware-assisted security compromises, and what such an implant could potentially do. Disclaimer: we do not discuss the physical properties of such a chip, or whether it can be as small as illustrated in the Bloomberg article.
We assume the chip can read data passing from the processor (CPU) to the memory (DRAM), and in addition tamper with that data. This provides the capability to exfiltrate data, or to redirect the execution flow of applications or even the operating system, making the implant easy to exploit. For instance, if the piece of code that checks the permissions of an account is circumvented, a regular user may gain the capabilities of a superuser or administrator. Another avenue would be to exfiltrate sensitive data that is only supposed to reside in memory (and not in persistent storage), using a hardware implant that can access both the memory bus and the I/O to the hard drive or SSD. Applications that store data encrypted at rest and decrypt it in memory in order to operate on it (secrets-management applications, for example) are particularly exposed to this scenario.

Compromising a supply chain and an assembly line is no easy operation, but it is feasible, especially for advanced attackers. In a world where supply-chain hardware attacks are possible, we need to consider a new security paradigm in which we accept that hardware may be compromised, and move toward a zero-trust model.
Our goal should be minimizing trust in the various hardware components and tightening the security perimeter around our applications. At the hardware level, the best we can do (while still allowing general-purpose computation) is to reduce trust to the processor chip, the CPU. At the software level, trusting only the application itself, and removing trust in the operating system and the storage, is a great trade-off between usability and security: it lets applications focus on their business logic without worrying about anything running alongside them on the host.

Anjuna does just this with its Runtime Security solution based on Intel® Software Guard Extensions (SGX). It creates a software perimeter around the application that eliminates the need to trust the operating system, the hypervisor or the host machine, protecting it from zero-days, physical access and privileged users, be they hackers or insiders.
Intel® SGX essentially enables a completely new and unprecedented security model. Any data that leaves the CPU boundary and is written to memory (and could potentially be intercepted by a hardware implant) is encrypted using a key that is generated inside the processor. For readers who are familiar with the cold-boot and Rowhammer attacks: the Intel® SGX technology can prevent those as well.
Anjuna makes it easy to take advantage of the security guarantees of Intel® SGX, with a runtime that integrates transparently into the user's infrastructure, without the need to rearchitect or modify the protected applications. Using the solution to protect critical applications can restore confidence that the data is protected, regardless of the origin of your hardware appliances.

Foreshadow

Security researchers from KU Leuven, the Technion, the University of Michigan and the University of Adelaide have recently published their findings on an attack called Foreshadow, which compromises the security of Intel SGX enclaves. Once again, speculative execution was exploited, in a manner similar to the Meltdown attack.
In our previous post, we explained how Intel Software Guard Extensions (SGX) protects against a Meltdown attack attempted from the untrusted part of a process hosting an enclave. However, the Foreshadow team showed how one can use a Meltdown-like technique to read the victim enclave's memory, conditioned on the sensitive data already being in the L1 cache. The official disclosure of the vulnerabilities, named L1 Terminal Fault by Intel, took place on August 14th, 2018, and they were filed as CVE-2018-3615, CVE-2018-3620 and CVE-2018-3646.

In this post, we explain the implications of the attack and its mitigations via microcode updates, and how Anjuna helps ensure that a backend application is running on updated hardware secured against this attack.

The Foreshadow attack on Intel® SGX

SGX enables the creation of "secure enclaves", which allow an application to protect its sensitive data even from the kernel or from a user with root access to the machine, effectively limiting the attack surface to the secure enclave's interfaces rather than everything else on the host.
The Foreshadow attack, discovered by Van Bulck et al. [2], leverages speculative execution and a cache-timing side-channel to extract information from enclaves.

MITIGATIONS

Intel mitigates this attack by flushing the L1 cache upon enclave exit events, which removes the precondition Foreshadow relies on: for the attacker to circumvent the abort-page semantics, the secrets must already be present in the L1 cache.
This update to enclave behavior came with an increment of the CPU Security Version Number (CPUSVN). In addition, to prevent attestation forgery, the old EPID group was invalidated and is no longer authorized by the Intel Attestation Service (IAS).
Intel released a security advisory for the issue, which it calls L1 Terminal Fault.

To make sure previously stored data cannot be made accessible again by reverting the microcode update, we recommend re-sealing and re-signing the data with new keys that can only be accessed by an up-to-date enclave running on a platform with the microcode updates deployed.
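The following is a minimal sketch of what re-sealing could look like inside an enclave built with the Intel SGX SDK (the function name and error handling are illustrative). Because sgx_seal_data derives the sealing key from the platform's current security version numbers, a platform rolled back to vulnerable microcode cannot derive the key for the newly sealed blob:

#include <stdint.h>
#include "sgx_tseal.h"

/* Trusted (in-enclave) code: re-seal a freshly unsealed secret so that the
 * sealing key is bound to the current, post-update CPUSVN. */
sgx_status_t reseal_secret(const uint8_t *plaintext, uint32_t len,
                           uint8_t *out_blob, uint32_t out_blob_size)
{
    uint32_t needed = sgx_calc_sealed_data_size(0, len);
    if (needed == UINT32_MAX || needed > out_blob_size)
        return SGX_ERROR_INVALID_PARAMETER;

    /* The sealing key is derived inside the CPU from (among other inputs)
     * the platform's CPUSVN and the enclave's ISVSVN. */
    return sgx_seal_data(0, NULL, len, plaintext,
                         needed, (sgx_sealed_data_t *)out_blob);
}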

IMPLICATIONS FOR ATTESTATION

Attestation is a cornerstone capability of TEEs (Trusted Execution Environments). It enables a server application to provide a cryptographic proof to a communicating client that it indeed runs inside a secure enclave, assuring the client that it is safe to send sensitive data to it and to trust that its outputs are authentic. There are multiple ways to perform attestation, but all of them must rely on secure primitives to achieve their guarantees.
The most practical implication of Foreshadow is on the EPID-based remote attestation of enclaves. EPID attestation is based on a group signature scheme that signs an enclave report, attesting to the initial state of the enclave and cryptographically proving to a remote verifier that the report was generated by an authentic SGX enclave. The signing is performed using a key that is embedded in the processor and can only be accessed by the Quoting Enclave (QE). Extraction of this key enables an attacker to forge an attestation for any enclave identity.

Anjuna does not use EPID-based attestation when anonymity and unlinkability are not required. Instead, we use standard PKI for most of our customers' scenarios. It is therefore sufficient to regenerate certificates and provision the enclaves with the corresponding private keys. The increment in the CPUSVN guarantees that those new private keys are inaccessible on platforms where the security updates were not deployed. As such, the attestation mechanism is a perfect way to ensure that client applications are communicating with backend instances on properly patched servers.
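If the backend identity is certificate-based, rotating the keys is routine; for example, a fresh key pair and certificate can be generated with standard tooling and then provisioned only to patched, attested enclaves (self-signed here for brevity; the subject name and validity are placeholders):

# Generate a new RSA key and a self-signed certificate for the backend service.
openssl req -x509 -newkey rsa:2048 -nodes \
    -keyout backend.key -out backend.crt \
    -days 365 -subj "/CN=backend.example.com"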

BENEFITS OF SGX IN LIGHT OF FORESHADOW

Somewhat counterintuitively, we believe that some of the implications of Foreshadow actually strengthen the case for Intel® SGX, and specifically for hardware-based remote attestation. As mentioned in the original Foreshadow paper [2] and in Foreshadow-NG by Weisse et al. [3], the L1 Terminal Fault has implications reaching far beyond enclaves: inter-VM scenarios, hypervisor memory inspection, separate processes running on the same physical core with Hyper-Threading, and so on.
While the mitigations provided by Intel address the issues, users must ensure that the updates were actually deployed on the infrastructure they are running on. In this sense, Intel® SGX can be extremely helpful by providing a signed report that tells a remote client whether the updates have been deployed and which security features are active on the host (for instance, whether Hyper-Threading has been turned off). Like previous microarchitectural attacks such as Spectre, Foreshadow actually strengthens the need for such attestation, even in scenarios where the operating system is trusted.

Architecture and Microarchitecture

We can reason about what goes on in a processor at various levels. One of them is the architectural level, which defines the semantics of executing programs; the other is the micro-architectural level, which is how these semantics are actually implemented underneath.
At the architectural level, things have been pretty solid: the semantics of Intel® Software Guard Extensions were formally verified by Subramanyan et al. [1]. At the micro-architectural level, however, things have turned out to be shakier over the past year and a half. After several years during which cache-timing side-channel attacks were studied and perfected by researchers from academia and industry, the big blow came with the discovery of the Meltdown and Spectre attacks, concurrently and independently by Jann Horn from Google's Project Zero and by Paul Kocher (co-founder of Cryptography Research) together with other researchers from academia and industry.
These attacks have shown that the assumptions about memory isolation, whether between different protection rings or between different processes, are violated by micro-architectural side-channels that leak information across the architectural boundaries. For instance, Meltdown uses the fact that kernel memory is accessed speculatively from an unprivileged protection ring, leaving measurable effects in the processor's cache that leak content that should only be accessible to the privileged protection ring.

A silver lining in this story is the decoupling between the architecture and the microarchitecture. The Foreshadow paper explicitly states:

“We want to emphasize that Foreshadow exploits a microarchitectural implementation bug, and does not in any way undermine the architectural design of Intel SGX and TEEs in general. We strongly believe that the non-hierarchical protection model supported by these architectures is still as valuable as it was before.”

— Foreshadow: Extracting the Keys to the Intel SGX Kingdom with Transient Out-of-Order Execution

Dealing with the complexity of Intel® SGX

Anjuna helps its users handle the complexities of SGX, and provides the tools to take the necessary steps to restore security and privacy following such vulnerabilities: migrating data to new enclave versions and making sure new attestation keys are re-provisioned.

We are excited to see that the brightest minds in the security field from academia are researching Intel's SGX technology, over time helping make it one of the most secure solutions for application security and privacy.

REFERENCES

  1. Subramanyan, Pramod, et al. "A formal foundation for secure remote execution of enclaves." Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. ACM, 2017.

  2. Van Bulck, J., Minkin, M., Weisse, O., Genkin, D., Kasikci, B., Piessens, F., Silberstein, M., Wenisch, T., Yarom, Y., Strackx, R. (2018). Foreshadow: Extracting the Keys to the Intel SGX Kingdom with Transient Out-of-Order Execution. In 27th USENIX Security Symposium (USENIX Security 18).

  3. Weisse, O., Van Bulck, J., Minkin, M., Genkin, D., Kasikci, B., Piessens, F., Silberstein, M., Wenisch, T., Yarom, Y., Strackx, R. (2018). Foreshadow-NG: Breaking the Virtual Memory Abstraction with Transient Out-of-Order Execution. Technical report.

Meltdown & Spectre and what it means for Intel SGX

We are occasionally asked about the implications of the recently disclosed Meltdown and Spectre CPU vulnerabilities on the security of Intel® SGX and secure enclaves. This post answers the questions we encounter and addresses the concerns regarding application security in light of these vulnerabilities.

Meltdown & Spectre 101

What Meltdown and Spectre have in common is that both techniques exploit speculative execution. Speculative execution is intended to speed up programs by executing multiple instructions in parallel, often before the program can know for sure that an instruction is indeed needed, or valid in the current execution context (to be precise, Meltdown exploits out-of-order execution, but we consider it speculative since in that context the instruction is actually invalid). The objective of both attack techniques is to infer the content at a physical memory location that should not be accessible to the attacker. For instance, the attacker could try to peek into the memory space of the operating system's kernel, or into the memory of another process.

MELTDOWN

Meltdown exploits the fact that the CPU executes instructions in the pipeline out of order, including potentially prefetching a memory address that should not be accessible in the attacker's execution context. When the CPU gets to the stage of committing the result, it checks whether the memory access was legal and discards the instruction if it wasn't. At that point, however, it is too late: certain subtle information, such as the time it took to access the memory, is already available to the attacker. The Meltdown paper [Lipp et al.] illustrates this with the following toy example:

raise_exception();
access(probe_array[data * 4096]);

The result of the memory read in the second line is not supposed to be committed, since an exception is raised right before it. However, due to out-of-order execution, the memory address is nevertheless accessed. This has the side effect that the content is read into the cache, which is observable by an attacker using a cache side-channel.
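To make the side-channel concrete, here is a hedged sketch (in C, using x86 intrinsics) of the Flush+Reload step an attacker could run after the speculative access above; the probe_array layout follows the toy example, and the names are illustrative:

#include <stdint.h>
#include <x86intrin.h>

extern uint8_t probe_array[256 * 4096];   /* shared with the code above */

/* Time a reload of every candidate slot; the slot that was pulled into the
 * cache by the speculative access reloads fastest and reveals the byte. */
int recover_byte(void)
{
    int best = -1;
    uint64_t best_time = UINT64_MAX;

    for (int guess = 0; guess < 256; guess++) {
        unsigned int aux;
        volatile uint8_t *slot = &probe_array[guess * 4096];

        uint64_t start = __rdtscp(&aux);
        (void)*slot;                            /* timed reload */
        uint64_t delta = __rdtscp(&aux) - start;

        if (delta < best_time) {
            best_time = delta;
            best = guess;
        }
        _mm_clflush((void *)slot);              /* reset for the next round */
    }
    return best;
}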

SPECTRE

Spectre exploits conditional branch prediction to execute instructions that would not be executed normally, and pull otherwise inaccessible memory content into the cache. The CPU is "trained" to expect a certain sequence of instructions to happen. For instance, in the following example, as long as the if condition is fulfilled, the assignment to y is executed.

if (x < array1_size) {
    y = array2[array1[x] * 256];
}

By executing the block multiple times, the attacker trains the branch predictor to assume that the body will be executed (regardless of the value of x). The attacker then triggers the execution of this block in the context of the victim process, or of the kernel, with values of x larger than array1_size, so that array1 is read out of bounds and the resulting value is used to index array2. Combining this with a cache side-channel, the attacker is able to infer the content seen by the victim process.

An important observation, made by the researchers who disclosed Spectre, is that speculative execution was only observed when the destination address was accessible to the victim thread. It implies that, unlike in Meltdown, exploitable code has to actually be present in the executable memory of the victim. That makes it non-trivial for attackers to execute a Spectre attack, since they first need to identify vulnerable gadgets within the target.

So what does it mean for secure enclaves?

While leveraging similar techniques, and sharing the common theme of bypassing virtual-memory protection, Meltdown and Spectre are, in fact, somewhat different. Spectre exploits vulnerable code (gadgets) in another process, or in the kernel. Meltdown, on the other hand, attempts to access normally inaccessible virtual addresses within the attacker's process itself, i.e. within the same "execution context". Thus, Meltdown and Spectre have different implications for enclaves.

ANJUNA'S SECURE-RUNTIME CAN PROTECT CRITICAL APPLICATIONS AGAINST THE MELTDOWN ATTACK USING ENCLAVES

An enclave has a different execution context in the sense that it gets access to decrypted content within the Enclave Page Cache (EPC), whereas any other execution context only sees that content in its encrypted form. An attacker that attempts to read from such an inaccessible memory location using Meltdown is redirected to an abort page, which yields no useful value. In this Git repository, we provide sample code that demonstrates how SGX protects against Meltdown.
Eventually, Intel is likely to address Meltdown with a micro-architectural solution, but until that happens the recommendation is to deploy OS-level protections such as KAISER. In zero-trust environments, however, one cannot count on OS-level protections against the Meltdown exploit, since we need to take into account powerful attackers who might disable KAISER and KASLR and map kernel pages into the attacker's process memory.

Spectre is currently outside the threat model of Intel® SGX. However, Spectre can be mitigated by making sure that the code running inside the enclave does not provide easily exploitable gadgets, and by applying protective measures at the application level. Intel has recently released guidelines that can help harden code running within enclaves against Spectre. In addition, it is important to monitor public announcements that point to potentially exploitable gadgets within known applications, much as we security practitioners are used to monitoring CVEs that disclose traditional vulnerabilities. Working closely with our customers, we carefully inspect the applications that we help secure, to make sure they are hardened against side-channel attacks.
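As an example of that kind of application-level hardening (a common pattern from Intel's published guidance, sketched here with illustrative names), a speculation barrier can be placed between a bounds check and the dependent access from the earlier Spectre snippet:

#include <stddef.h>
#include <stdint.h>
#include <emmintrin.h>   /* _mm_lfence */

/* Hardened variant of the Spectre gadget: the LFENCE prevents the load
 * from executing speculatively with an out-of-bounds index. */
uint8_t safe_read(const uint8_t *array1, size_t array1_size,
                  const uint8_t *array2, size_t x)
{
    if (x < array1_size) {
        _mm_lfence();                /* stop speculation past the bounds check */
        return array2[array1[x] * 256];
    }
    return 0;
}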