An in-depth look at these vulnerabilities and how they affect you
Recently, several groups of researchers working in parallel announced the discovery of two vulnerabilities in modern CPUs from Intel, AMD, and manufacturers of ARM-based chips that threaten the performance and security of modern computer systems. The Meltdown vulnerability has received the most headlines, but the other, named Spectre, is more universal and pernicious. Both vulnerabilities result from performance optimizations in modern CPU design that have become commonplace since the 1990s. Their discovery calls into question many commonly held assumptions about data security in our computing environment, and it raises the possibility that similar vulnerabilities exist that have not yet been discovered or publicized. Unfortunately, current fixes for Meltdown degrade the levels of CPU performance we have come to expect, while Spectre will adversely affect the multi-tenant, cloud-based systems that have become prevalent through providers like AWS, Google, and Microsoft Azure.
Both of these vulnerabilities result from architectural features introduced into CPU chips to improve the performance of modern systems. These CPUs are the very heart of the tech industry today; much of the world has benefited greatly from the steady, long-term improvements in their performance, which have enabled many capabilities we rely upon in everyday life. It is now apparent, however, that these improvements have come at the expense of security flaws, and that more attention must be paid to ensuring these systems are secure at their foundation.
Modern CPU Architecture
To best understand these vulnerabilities, we must first understand some of the architecture that underlies modern CPUs, as well as some of the core security features and assumptions that underlie modern computing.
The traditional simple model of stored program CPUs is that of an engine that serially executes instructions stored in memory, and that operates on data also stored in memory. Computer memory can be accessed randomly by the CPU through an address scheme that associates physical addresses with storage locations in memory. The CPU also has available to it a set of registers that are accessible faster than computer memory. These registers provide a quicker set of sources and destinations for computations. Instructions can operate on a combination of these register and memory locations, allowing immediate and longer-term storage of results.
The contents of the registers and of the memory are referred to as the architectural state of the computer. In addition, the architecture defines a set of bits, or flags, used for memory protection, operating system services, and error handling. One of these bits, for example, marks operating system memory, or kernel memory, as off-limits to non-privileged programs. If a non-privileged program accesses kernel memory, an exception (an error) is raised, protecting the kernel from unauthorized access. These features are also part of the architectural state that is fully visible to the software running on the CPU.
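This protection scheme can be sketched in a few lines of Python. The sketch is purely illustrative: the class and exception names are invented for this example, and real hardware enforces the check per memory access in silicon, not in software.

```python
# Hypothetical sketch: a per-page "kernel-only" flag guards memory accesses.
# Names (Memory, PrivilegeError) are illustrative, not any real API.

class PrivilegeError(Exception):
    """Raised when unprivileged code touches a kernel-only page."""

class Memory:
    def __init__(self):
        self.pages = {}  # page number -> (value, kernel_only flag)

    def write(self, page, value, kernel_only=False):
        self.pages[page] = (value, kernel_only)

    def read(self, page, privileged=False):
        value, kernel_only = self.pages[page]
        if kernel_only and not privileged:
            raise PrivilegeError(f"page {page} is kernel-only")
        return value

mem = Memory()
mem.write(0, "user data")
mem.write(1, "kernel secret", kernel_only=True)

print(mem.read(0))                   # user code may read its own page
print(mem.read(1, privileged=True))  # the kernel may read any page
try:
    mem.read(1)                      # user access to kernel memory faults
except PrivilegeError as e:
    print("exception:", e)
```

The key point for what follows is that the check happens on every access, and a failed check is supposed to stop the access before any of its effects become visible.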
Over time, there have been a number of optimizations applied to this basic model of computing, and three types are relevant to understanding Meltdown and Spectre: virtual memory, cache, and speculative execution.
Virtual memory refers to a division between the address space a software program sees and the physical memory of the computer. Because fast random-access memory (RAM) was historically very expensive, facilities were designed into the hardware to allow the addressable memory space available to a program to be significantly larger than the actual physical RAM on the computer. To make up the difference, slower storage, such as rotating disk, was used to hold the overflow of programs and data. In the late 1960s, computers began to be equipped with hardware that maps the addresses a program sees (the virtual address space) onto physical RAM (the physical address space), with less-used pages swapped out to disk and brought back in on demand.
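The mapping described above can be modeled as a simple page-table lookup. This is a toy sketch, not any real operating system's data structures: page sizes, frame numbers, and the single-free-frame page-fault handling are all invented for illustration.

```python
# Illustrative model of virtual-to-physical address translation: each virtual
# page maps either to a RAM frame or to a slot on disk, and a "page fault"
# brings a disk-resident page into RAM before the access completes.

PAGE_SIZE = 4096

class VirtualMemory:
    def __init__(self):
        self.page_table = {}  # virtual page -> ("ram", frame) or ("disk", slot)

    def map_page(self, vpage, location, number):
        self.page_table[vpage] = (location, number)

    def translate(self, vaddr):
        vpage, offset = divmod(vaddr, PAGE_SIZE)
        location, number = self.page_table[vpage]
        if location == "disk":
            # Page fault: the OS copies the page from its disk slot into a
            # free RAM frame, updates the page table, and retries.
            number = self.swap_in(vpage)
        return number * PAGE_SIZE + offset

    def swap_in(self, vpage, free_frame=7):
        # Simplification: assume frame 7 is always free.
        self.page_table[vpage] = ("ram", free_frame)
        return free_frame

vm = VirtualMemory()
vm.map_page(0, "ram", 3)    # virtual page 0 lives in RAM frame 3
vm.map_page(1, "disk", 42)  # virtual page 1 is paged out to disk slot 42

print(vm.translate(100))            # 3*4096 + 100 = 12388
print(vm.translate(PAGE_SIZE + 8))  # faults page 1 in, then 7*4096 + 8 = 28680
```

Note that the program only ever sees virtual addresses; whether a page was in RAM or had to be faulted in from disk shows up solely as a difference in access time, a theme that returns below.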
Cache refers to special memory the CPU can access much faster than RAM, enabling faster processing. Cache memory provides another layer in the memory hierarchy: slower than register access, but faster than RAM access. Modern CPUs typically have up to three levels of cache, each level smaller and faster the closer it sits to the CPU core. Figure 1 presents the relative speeds of different levels in the memory hierarchy, along with the speeds of other operations.
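The performance effect of the cache can be seen in a toy model. The cycle counts below are illustrative round numbers, not measurements of any particular CPU, and the direct-mapped design is the simplest possible cache organization.

```python
# Toy model of the memory hierarchy's effect on access cost.
# Latencies are illustrative round numbers, not real measurements.

LATENCY = {"l1_cache": 4, "ram": 200}  # modeled cycles per access

class Cache:
    """A tiny direct-mapped cache: each address maps to exactly one line."""
    def __init__(self, size=8):
        self.size = size
        self.lines = [None] * size

    def access(self, address):
        line = address % self.size
        if self.lines[line] == address:  # hit: served from fast cache
            return LATENCY["l1_cache"]
        self.lines[line] = address       # miss: fetch from RAM, fill the line
        return LATENCY["ram"]

cache = Cache()
first = cache.access(0x1000)   # cold miss -> RAM latency
second = cache.access(0x1000)  # now cached -> cache latency
print(first, second)           # 200 4
```

The same address costs 200 modeled cycles the first time and 4 the second. That gap between cached and uncached access times is exactly what the attacks described later measure.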
Figure 1. Relative Speeds of Computer Processes
While virtual memory is managed by the operating system, cache loading and flushing are under the control of the CPU itself, not the software. This brings us to the third type of performance optimization in modern CPUs: speculative execution. Underlying these optimizations is another layer of architecture that defines the CPU, called the micro-architecture. The micro-architecture is a set of features and mechanisms that are not directly visible to the software running on the CPU. Rather, it comprises a set of operations (micro-operations), registers, cache, and execution units out of which the program-visible architecture is built. To better understand this micro-architecture, consider a CPU instruction, for example:
ADD R1, R2
This instruction adds the contents of register R1 to the contents of register R2 and deposits the result in R1. This instruction can be broken down into a set of micro-operations which take place when the ADD instruction is executed. These behind-the-scenes micro-operations are invisible to the software.
In this case, they might be:
MOV R1 -> Adder Input 1
MOV R2 -> Adder Input 2
MOV Adder Output -> R1
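The decomposition above can be modeled in Python. This is a deliberately minimal sketch: the decode table, the latch names, and the rule that the adder fires once both inputs arrive are all invented to mirror the three micro-ops listed, not taken from any real CPU.

```python
# Illustrative model of decoding one architectural instruction (ADD R1, R2)
# into the three micro-operations listed above. All names are invented.

def decode(instruction):
    """Break an architectural instruction into micro-ops."""
    op, dst, src = instruction
    if op == "ADD":
        return [("MOV", dst, "adder_in1"),
                ("MOV", src, "adder_in2"),
                ("MOV", "adder_out", dst)]
    raise ValueError(f"unknown op {op}")

def run(instruction, registers):
    latches = {}  # internal adder inputs/outputs, invisible to software
    for _, src, dst in decode(instruction):
        value = registers.get(src, latches.get(src))
        if dst in registers:
            registers[dst] = value
        else:
            latches[dst] = value
        # The adder produces its output once both inputs are latched.
        if "adder_in1" in latches and "adder_in2" in latches:
            latches["adder_out"] = latches["adder_in1"] + latches["adder_in2"]
    return registers

regs = run(("ADD", "R1", "R2"), {"R1": 5, "R2": 7})
print(regs["R1"])  # 12
```

Software only ever observes the before and after architectural states (R1 goes from 5 to 12); the latches the micro-ops move data through are never visible.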
Over time, this micro-architecture has become very complex. It even includes the pre-fetch and execution of multiple instructions at once to enable faster execution times (e.g., instead of executing each instruction sequentially, the CPU executes the next three instructions at once). Because program logic may make calculations that depend on the results of prior calculations, the results of some of these pre-executions may be invalidated by earlier instructions. This leads to a micro-architecture that supports both the simultaneous execution of future instructions and the rollback of their results when the earlier calculations invalidate them. This is called speculative execution.
Speculative execution applies not only to operations that do not affect the flow of control of a program, like adds, subtracts, multiplies, loads, and stores; it can also be applied to branching logic that changes the flow of control. This is done through branch prediction, which statistically profiles likely branch paths based on prior execution and selects which path a branch is most likely to take. If the prior instructions cause a different path to be taken, the instructions following the predicted branch are abandoned and the resulting architectural state is rolled back. Figure 2 shows the complexity of the micro-architecture of a modern CPU that supports the scheduling of pre-executed instructions, instruction rollback capabilities, and multiple levels of cache memory.
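One common scheme for the statistical profiling behind branch prediction is a two-bit saturating counter per branch. The sketch below is a textbook simplification (real predictors combine history tables, pattern correlation, and more), with invented names throughout.

```python
# Sketch of a two-bit saturating-counter branch predictor.
# Counter states 0-1 predict "not taken"; states 2-3 predict "taken".
# Two bits mean a single surprise outcome can't flip a strong prediction.

class TwoBitPredictor:
    def __init__(self):
        self.counters = {}  # branch address -> counter in 0..3

    def predict(self, branch):
        return self.counters.get(branch, 1) >= 2  # True means "taken"

    def update(self, branch, taken):
        c = self.counters.get(branch, 1)
        self.counters[branch] = min(c + 1, 3) if taken else max(c - 1, 0)

p = TwoBitPredictor()
history = [True, True, False, True, True]  # a mostly-taken branch
correct = 0
for outcome in history:
    correct += (p.predict(0x400) == outcome)
    p.update(0x400, outcome)
print(correct, "of", len(history), "predicted correctly")  # 3 of 5
```

The predictor quickly locks onto the branch's dominant direction, and the CPU speculatively executes down the predicted path. Spectre works by deliberately training a predictor like this so the CPU speculates down a path the attacker chooses.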
Figure 2. Micro-architecture of a Modern CPU
These micro-architectural optimizations radically increase the complexity of CPU operation beyond the simple engine model of stored program CPUs described at the outset. Most important for understanding the newly discovered vulnerabilities, these optimizations are designed to be invisible to the running software.
The Meltdown and Spectre vulnerabilities share a common characteristic: they both depend on the micro-architectural state becoming visible to the software. That is, both are unintended side effects of optimizations, such as speculative execution, changing the contents of cache memory in such a way that a cleverly written program can detect the changed values.
In the Meltdown vulnerability, speculative execution of an instruction that accesses kernel virtual memory causes an exception (an error) that can be exploited by the attacking program before the operating system notices. If the attacking program can pull a value into a cached data array before the operating system handles the error, a program executing in the normal fashion outside the kernel can detect this value by measuring how fast elements of the data array can be accessed (the cached element is accessed faster than the non-cached ones). The effect is that protected kernel memory, which normally is not accessible to software without operating-system privileges, can be exposed. And that means secrets kept in the operating system kernel, such as passwords, can be stolen by a malicious program.
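The timing measurement at the heart of this attack can be simulated. To keep the sketch safe and deterministic, the code below models latencies rather than measuring them: a real attack would use an actual speculative out-of-bounds load and a hardware cycle counter. The probe array, latency numbers, and "secret" are all invented for illustration.

```python
# Simulation of the cache-timing measurement used by Meltdown
# (Flush+Reload style). Latencies are modeled, not measured; a real
# attack would time actual loads with a hardware cycle counter.

CACHE_HIT, CACHE_MISS = 40, 300   # modeled latencies in cycles

probe_cached = [False] * 256      # one probe-array slot per possible byte value

def speculative_leak(secret_byte):
    # During speculation, the doomed kernel-memory access indexes the probe
    # array by the secret byte. The exception rolls back the architectural
    # state, but the cache footprint of this access survives.
    probe_cached[secret_byte] = True

def access_time(index):
    return CACHE_HIT if probe_cached[index] else CACHE_MISS

# Attacker: trigger the speculative access, then time every probe slot.
speculative_leak(ord("K"))        # pretend the kernel secret is the byte 'K'
timings = [access_time(i) for i in range(256)]
recovered = timings.index(min(timings))
print(chr(recovered))             # K
```

Exactly one of the 256 slots comes back fast, and its index is the secret byte. Repeating this byte by byte walks an attacker through kernel memory.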
The Meltdown vulnerability affects only Intel CPUs, but millions of them. There is already an operating system fix for this vulnerability, based on a patch for a previously known vulnerability. Unfortunately, this fix slows down kernel system calls, which, depending on the type of program that is running, can cause a 5-25% decrease in performance. This performance hit will impact large enterprise servers and SaaS/cloud infrastructure most, since those users depend more heavily on CPU performance than individual PC or mobile phone users. Think of it as a tax on performance that will only be lifted by a new generation of CPU chips that don't have this vulnerability.
The Spectre vulnerability is more problematic. It too is enabled by speculative execution, in this case of branch instructions. Unlike Meltdown, these branch instructions do not have to touch the operating system kernel. Rather, Spectre allows an attacker's code to read data across process boundaries, violating the separation of different users' data that is assumed in multi-tenant systems.
The Spectre vulnerability is a characteristic of Intel, AMD, and ARM-licensed chips, so it affects virtually all CPUs on the market today. Because the flaw can be exploited entirely outside the operating system, it cannot be patched through the operating system as Meltdown was. Remediation of Spectre requires some very clever programming to avoid speculation around branches in sensitive areas of code.
Because it affects isolation of data between processes, it is particularly impactful on multi-tenant cloud providers such as AWS, Azure, and Google. While workarounds are already being developed by these vendors, they increase code complexity and undermine the perception of security in cloud-oriented multi-tenant systems. As with Meltdown, only a new generation of CPU chips will fully remediate this potential security hole.
At the conclusion of the Spectre paper, the team that discovered the vulnerability writes:
There are trade-offs between security and performance. The vulnerabilities…arise from a longstanding focus in the technology industry on maximizing performance. As a result, processors…have evolved compounding layers of complex optimizations that introduce security risks. As the costs of insecurity rise, these design choices need to be revisited.
Meltdown and Spectre are examples of attack vectors arising from the complexity of today’s CPU micro-architectures and the failure of chip designers and manufacturers to adequately understand the security impacts of performance optimizations. It is quite likely that more examples of this type of vulnerability exist in the wild. Because these vulnerabilities are not easily remediated and may have significant performance impacts on current CPUs, the effects of these discoveries will be with us for years to come. Perhaps knowing this will forever change how we look at the computing environment we so depend upon.
[Contributed by Rob Gurwitz, Executive Partner]