
Heterogeneous Isolated Execution for Commodity GPUs

Insu Jang
School of Computing, KAIST
Daejeon, Republic of Korea

Adrian Tang
Department of Computer Science, Columbia University

Taehoon Kim
School of Computing, KAIST
Daejeon, Republic of Korea

Simha Sethumadhavan
Department of Computer Science, Columbia University

Jaehyuk Huh
School of Computing, KAIST
Daejeon, Republic of Korea
Abstract
Traditional CPUs and the cloud systems built on them have embraced hardware-based trusted execution environments to securely isolate computation from malicious OS or hardware attacks. However, GPUs and their cloud deployments have yet to include such support for hardware-based trusted computing. As large amounts of sensitive data are offloaded to GPU acceleration in cloud environments, ensuring the security of that data is a current and pressing need. As deployed today, the outsourced GPU model is vulnerable to attacks from compromised privileged software. To support isolated remote execution on GPUs even under vulnerable operating systems, this paper proposes a novel hardware and software architecture called HIX (Heterogeneous Isolated eXecution). HIX does not require modifications to the GPU architecture to offer protection: instead, it offers security by modifying the I/O interconnect between the CPU and GPU, and by refactoring the GPU device driver to work from within the CPU trusted environment. A result of the architectural choices behind HIX is that the concept can be applied to other offload accelerators besides GPUs. This work implements the proposed HIX architecture on an emulated machine with KVM and QEMU. Experimental results from the emulated security support with a real GPU show that the performance overhead for security is curtailed to 26% on average for the Rodinia benchmark, while providing secure isolated GPU computing.

Keywords: Trusted execution, Heterogeneous computing, GPU security

ACM Reference Format:
Insu Jang, Adrian Tang, Taehoon Kim, Simha Sethumadhavan, and Jaehyuk Huh. 2019. Heterogeneous Isolated Execution for Commodity GPUs. In 2019 Architectural Support for Programming Languages and Operating Systems (ASPLOS ’19), April 13–17, 2019, Providence, RI, USA. ACM, New York, NY, USA, 14 pages. https://doi.org/10.1145/3297858.3304021

1 Introduction
In conventional CPU-based computation, hardware-based trusted execution environments (TEEs) such as Intel SGX and ARM TrustZone have been providing trusted and isolated computing environments to user applications. Such hardware-based TEEs reduce the trusted computing base (TCB) of the computation to the processor and the critical code running in the TEE. With TEE support, security-critical applications can be protected from compromised privileged software as well as from hardware-based attacks on the memory and system buses, enabling secure computation on untrusted remote cloud servers.

With the increasing use of general purpose GPU computing, from traditional high performance computing to data center acceleration and machine learning applications, securing GPU computation has become critical to protect security-sensitive data [34, 45, 56, 57]. However, although more and more critical data are processed on GPUs, trusted computing is yet to be supported for GPU computation. In the current system architecture, high performance discrete GPUs communicate with CPUs through I/O interconnects such as PCI Express (PCIe) buses, and the GPU driver, which is part of the operating system, controls the GPUs [25]. As the privileged operating system fully controls the hardware I/O interconnects and the GPU driver, computing on GPUs is vulnerable to potential attacks on the operating system [8].
Beyond GPU-based computing, the proliferation of various accelerator-based computing models has been increasing the demand for stronger security support for accelerators under vulnerable privileged software.
In existing architectures, both the code and the data in GPUs can be compromised by a privileged adversary. Recent work has demonstrated that the integrity of GPU code can be subverted by disrupting and replacing the code at runtime with an off-the-shelf reverse engineering tool [13]. In addition to code, data in the GPU can potentially be uncovered and leaked [45]. GPU data vulnerable to confidentiality attacks comprises both the communication data transferred to and from a GPU and the data being processed within a GPU. The susceptibility of GPUs to confidentiality and integrity attacks stems from the lack of access control on their interfaces, such as the I/O interconnects and memory-mapped I/O addresses.
To support secure computing in GPUs, this paper proposes a novel hardware and software architecture that isolates GPUs even from potentially malicious privileged software (the OS and hypervisor). The proposed architecture, called Heterogeneous Isolated eXecution (HIX), requires minor extensions to the current PCIe interconnect implementation and to the TEE support in CPUs. The goal of HIX is to extend the security guarantees of TEE technologies, namely the confidentiality and integrity of user data, to heterogeneous computing environments. At the time of writing, none of these technologies protect accelerators in heterogeneous systems from privileged software attacks; they only protect the code and data in trusted “enclaves” running on the processors. In this work, we expand the scope of a widely used trusted isolation technology, Intel SGX, to secure general purpose accelerators, in particular GPUs.
Our proposed architecture consists of four main hardware and software changes. First, key functions of the GPU driver are removed from the operating system (OS) and relocated to a separate process in its own GPU enclave. The GPU enclave is an extension of the current SGX enclave, designed to exclusively manage the GPU. Second, the PCIe interconnect architecture is slightly modified to prevent the OS from changing the routing configuration of the interconnect once the GPU enclave is completely initialized. Third, the memory management unit (MMU) is augmented to protect the memory-mapped GPU I/O region from unauthorized accesses. Fourth, the CPU counterpart process of a GPU application runs in an SGX enclave, and this enclave sets up a trusted communication path to the GPU enclave that is robust even against privileged adversaries.
Because HIX supports secure execution environments for GPUs without any GPU modification, it does not provide protection against direct hardware-based attacks: the PCIe buses and the memory of GPUs are exposed to such attacks in the current architecture. Although this security level is lower than that of the hardware TEEs for CPUs, HIX can be extended to other accelerators without requiring any modification of the accelerators themselves, as long as the accelerator is connected via an I/O interconnect.
We evaluate the proposed architecture in terms of security and performance. We have implemented a prototype of HIX on KVM and QEMU, adding extra instructions for the GPU enclave and separating the GPU driver from the operating system. The emulated prototype connected to a real GPU shows that the performance degradation introduced by HIX secure GPU computation is 26% compared to conventional unsecured GPU computation for benchmarks from the Rodinia suite.
We summarize the main contributions of this work as follows:
• We provide an attack surface assessment of GPU computation. We identify key GPU components that can be attacked from privileged software: the PCIe interconnect, the memory-mapped I/O region, and the GPU driver.
• We augment the design of the PCIe interconnect to block any routing change after GPU initialization, and to guarantee that the address mapping of the memory-mapped I/O region to the GPU is immutable.
• We extend the current SGX interface to support the GPU enclave, which runs the GPU driver in a secure way. The MMU design is extended to protect the GPU memory-mapped I/O region from unauthorized accesses.
• We implement a prototype on an emulated system with KVM and QEMU to evaluate the performance overhead of HIX. Although it is implemented in an emulated system due to the required hardware changes, it faithfully reflects the necessary changes in hardware interfaces and software architecture.
The rest of the paper is organized as follows. Section 2 describes the current architecture of SGX, PCIe, and the GPU driver. Section 3 discusses the threat model. Section 4 presents the proposed architecture. Section 5 discusses the security analysis and shows performance results. Section 6 presents prior work, and Section 7 concludes the paper.
2 Background
HIX is designed on top of the Intel SGX architecture and the PCI Express standard. We provide a brief overview of these technologies in this section.
2.1 Intel Software Guard Extensions (SGX)
Intel SGX is a hardware-based protection technology that provides a trusted execution environment (TEE), called an enclave, which is protected even from privileged software and direct hardware attacks. SGX protects the enclave memory and execution contexts to support strong isolated execution.

Figure 1. SGX enclave memory mapping structure (an enclave's virtual address space, alongside untrusted data and code, is mapped into main memory).
The SGX hardware-based isolated execution is augmented by an attestation service that verifies the integrity of the code running in the enclave [1, 35].
The main memory is untrusted under the SGX threat model, so SGX provides memory encryption and access restriction mechanisms to protect a small region of main memory reserved for enclaves, called the enclave page cache (EPC). Although SGX uses the virtual memory support provided by the untrusted OS, it protects EPC pages from unauthorized accesses with hardware-based verification. Figure 1 illustrates the structure of the SGX address space. In the figure, ELRANGE (Enclave Linear Address Range) is the protected virtual address range of the enclave, and pages in this range are guaranteed to be mapped to EPC pages. When an enclave is created, the system software registers the virtual address and the corresponding EPC physical address of each page in the protected memory using the EADD SGX instruction. While handling the EADD instruction, the hardware stores the mapping information in the enclave page cache map (EPCM), which is used to verify future accesses to the page during address translation in the MMU [9].
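To make the EADD/EPCM bookkeeping concrete, the following C sketch models a simplified EPCM lookup performed after a page walk resolves to an EPC page; the structure fields, table size, and check are illustrative assumptions for exposition, not the actual SGX microarchitecture.

```c
/* Simplified, illustrative model of EPCM-style verification of enclave
 * page mappings; names and layout are assumptions, not real SGX state. */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EPC_PAGES 1024

typedef struct {
    bool     valid;        /* entry populated by EADD                       */
    uint64_t enclave_id;   /* enclave that owns this EPC page               */
    uint64_t linear_addr;  /* page-aligned virtual address recorded at EADD */
} epcm_entry_t;

static epcm_entry_t epcm[EPC_PAGES];

/* Conceptually invoked by the MMU after a walk resolved `linear_addr` to EPC
 * page `epc_index`: the access is allowed only if the mapping matches what
 * was recorded when the page was added to the enclave. */
bool epcm_check(uint64_t enclave_id, uint64_t linear_addr, size_t epc_index)
{
    if (epc_index >= EPC_PAGES || !epcm[epc_index].valid)
        return false;
    return epcm[epc_index].enclave_id == enclave_id &&
           epcm[epc_index].linear_addr == (linear_addr & ~0xFFFULL);
}
```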
2.2 PCI Express Architecture
Modern GPUs are connected to the system via the PCI Express (PCIe) interface. The PCIe interface provides software with memory-mapped I/O (MMIO) access to PCIe devices. Since the MMIO mechanism maps the hardware registers and memory of a device into the system memory address space, software can transparently access PCIe devices using regular memory addresses. Figure 2 illustrates how the system routes device access requests to the device using the system memory address map [49]. The CPU is responsible for distinguishing accesses to MMIO regions from main memory accesses. It uses internal hardware registers, initialized by the BIOS at system boot time, to route MMIO access requests appropriately [19].
When the address of a memory access falls in an MMIO region, the PCIe root complex takes the request. Because PCIe devices are attached to the system as a tree with the PCIe root complex at its root, the root complex creates a PCIe transaction packet and routes it to the target device using its hardware routing registers [5, 43]. These registers are also initialized by the BIOS at system boot time to cover the entire physical address ranges of the attached devices.
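The routing decision described above can be pictured as a lookup against the boot-time-initialized address map. The C sketch below is a simplified illustration with made-up window addresses and a single device; it does not model real chipset registers.

```c
/* Minimal sketch of physical-address routing: an address either falls into a
 * device's MMIO window (derived from its BAR) or goes to main memory. */
#include <stdint.h>
#include <stdio.h>

typedef struct {
    uint64_t base;   /* start of the device's MMIO window (from its BAR) */
    uint64_t size;   /* length of the window                             */
    int      dev_id; /* which PCIe device the window belongs to          */
} mmio_window_t;

static const mmio_window_t windows[] = {
    { 0xF0000000ULL, 0x01000000ULL, 0 },  /* e.g., a GPU's 16 MiB MMIO BAR */
};

/* Returns the device id the access is routed to, or -1 for main memory. */
int route_physical_address(uint64_t paddr)
{
    for (unsigned i = 0; i < sizeof(windows) / sizeof(windows[0]); i++) {
        if (paddr >= windows[i].base && paddr < windows[i].base + windows[i].size)
            return windows[i].dev_id;
    }
    return -1;  /* falls through to the DRAM controller */
}

int main(void)
{
    printf("0xF0001000 -> %d\n", route_physical_address(0xF0001000ULL)); /* device 0 */
    printf("0x00100000 -> %d\n", route_physical_address(0x00100000ULL)); /* -1: DRAM */
    return 0;
}
```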
Figure 2. I/O path in PCI Express system architecture (CPU MMIO accesses are routed via the system address map and the PCIe root complex to the device's MMIO physical addresses, while device DMA accesses target main memory).

Modern PCIe devices use direct memory access (DMA) to read or write main memory directly without CPU intervention. The DMA arrows in Figure 2 show how the system routes a DMA request. An input/output memory management unit (IOMMU) can be used to translate device addresses to physical addresses for DMA [42].
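For completeness, the following is a minimal sketch of the IOMMU's remapping role, assuming a flat single-level table for brevity; real IOMMUs use multi-level page tables and per-device contexts.

```c
/* Tiny illustration of IOMMU translation: a device-visible address (IOVA)
 * is remapped to a host physical address before the DMA reaches memory. */
#include <stdbool.h>
#include <stdint.h>

#define IOMMU_PAGES 256
#define PAGE_SHIFT  12

/* IOVA page number -> host physical page number; 0 means unmapped here. */
static uint64_t iommu_map[IOMMU_PAGES];

bool iommu_translate(uint64_t iova, uint64_t *host_paddr)
{
    uint64_t page = iova >> PAGE_SHIFT;
    if (page >= IOMMU_PAGES || iommu_map[page] == 0)
        return false;  /* DMA fault: no mapping installed for this IOVA */
    *host_paddr = (iommu_map[page] << PAGE_SHIFT) |
                  (iova & ((1ULL << PAGE_SHIFT) - 1));
    return true;
}
```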
2.3 Controlling GPU in Software
Given the underlying hardware I/O path described in Section 2.2, software can control the GPU by writing commands to a GPU command buffer in the GPU MMIO region. Once a virtual address is assigned to the GPU MMIO physical address, the OS or a user process can access the GPU through that MMIO virtual address, provided the virtual address is accessible from the OS or the process [47]. Data such as GPU binary code or input data can be transferred to the GPU via MMIO or DMA, where DMA is optimized for bulk data transfers [15].
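As an illustration of MMIO-based device control from software, the sketch below maps a PCIe device's BAR0 through the Linux sysfs resource file and performs a single MMIO write. The device path, register offset, and command value are placeholders; real GPU command submission goes through vendor-specific command buffers managed by the driver.

```c
/* Illustrative user-space MMIO access on Linux: map BAR0 of a PCIe device
 * via sysfs and write a hypothetical command word. Requires root and a
 * memory (not I/O port) BAR exposed as resource0. */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define BAR_SIZE       0x1000   /* map only the first page of BAR0   */
#define CMD_REG_OFFSET 0x40     /* hypothetical command register     */

int main(void)
{
    /* Placeholder PCI address; substitute the target device's address. */
    int fd = open("/sys/bus/pci/devices/0000:01:00.0/resource0", O_RDWR | O_SYNC);
    if (fd < 0) { perror("open"); return 1; }

    volatile uint32_t *bar = mmap(NULL, BAR_SIZE, PROT_READ | PROT_WRITE,
                                  MAP_SHARED, fd, 0);
    if (bar == MAP_FAILED) { perror("mmap"); close(fd); return 1; }

    /* A store to the mapped region becomes an MMIO write routed over PCIe. */
    bar[CMD_REG_OFFSET / sizeof(uint32_t)] = 0x1;  /* hypothetical "start" command */

    munmap((void *)bar, BAR_SIZE);
    close(fd);
    return 0;
}
```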
3 Threat Model
3.1 Attacker Model and Assumptions
The adversarial model we address is a privileged adversary whose goal is to break the confidentiality and integrity of the data processed by GPUs. We focus on attack vectors comprising the hardware and software I/O data path between a user application and the GPU. We assume that the adversary has privileged software control over the target system. Specifically, the adversary can control all privileged software components, such as the OS kernel and device drivers within the kernel space. In addition to controlling code execution of these components, the adversary can inspect and observe data in main memory and manage the system address map, the set of information indicating where main memory and MMIO access requests should be routed. We also assume that the CPU package and the GPU card are trusted, and that the GPU has its own separate device memory.
3.2 Out of Scope
Consistent with the defense scope of SGX, we do not consider physical attacks on the CPU package or side-channel attacks [9].

Table 1. Required hardware and software changes for HIX.

Changed Component            SW/HW   Purpose                      Section
GPU enclave                  SW      Sole GPU control             4.2
SGX instructions             HW      HW support for GPU enclave   4.2
Internal data structures     HW
MMU page table walker        HW
PCIe root complex            HW
Inter-enclave communication  SW

Figure 3. HIX architecture overview (a user enclave communicates with the GPU enclave, which holds the GPU driver and GPU enclave metadata, on an SGX-enabled CPU; the HIX-protected path through the PCIe root complex thwarts a privileged attacker with a compromised OS).
It is not our goal to defend against implementation bugs in user code running within the enclaves and GPUs [11]. Availability attacks, such as refusing to schedule a specific process, are not in our scope.
Apart from the limitations we inherit from Intel SGX, HIX has several limitations specific to PCIe devices and the I/O interconnect architecture. Physical attacks on the PCIe interconnects and GPUs, such as directly injecting PCIe packets into the I/O communication path with special hardware or physically accessing the GPU memory, are out of scope for HIX. This is an inherent trade-off we make because this study is based on unmodified GPU hardware. PCIe peer-to-peer transactions with a GPU protected by HIX are not available. While the latest GPUs support an on-demand page-fault mechanism [10, 16], the GPU computing model that HIX supports is restricted to the conventional model, which requires all data to reside in the GPU device memory before a GPU kernel executes. In addition, we do not address availability attacks against GPUs in the form of resource exhaustion or denial-of-service attacks. We discuss these limitations in more detail in Section 5.6.
4 HIX Architecture
4.1 Architecture Overview
A key tenet of the HIX design is securing the command and data path from the user application to a GPU at both the software and hardware levels. In a typical unprotected setting, the GPU driver is part of the operating system (OS), and the I/O path to the GPU through MMIO is controlled by the OS. However, in the proposed HIX architecture, the GPU driver is separated from the OS and runs in a secure enclave, so the OS cannot affect the MMIO mapping and routing to the GPU. To provide secure GPU computing, the following software and hardware components must be supported.
Isolated GPU management with GPU enclave: For secure GPU computing under a vulnerable OS, HIX separates the GPU driver from the OS space. The GPU driver runs in a TEE environment, the GPU enclave, which exclusively owns the GPU MMIO region, protecting the GPU MMIO from the malicious OS.
Secure hardware I/O path: The GPU enclave manages the GPU exclusively by sending commands and data through MMIO, so the communication through MMIO must be secured from the OS and other applications. This requires several hardware extensions to the SGX support as well as to the PCIe architecture. First, similar to enclave memory protection, the OS is not allowed to change the virtual-to-physical address mapping for the GPU MMIO region once the mapping is established for the GPU enclave. Second, any access to the GPU MMIO region from anything other than the GPU enclave must be prohibited. Third, the GPU MMIO mapping and routing configuration in the PCIe root complex must not be changed once the GPU enclave is initialized. Finally, DMA data to and from the GPU must be protected from the malicious OS.
Trusted application-to-GPU communication: For secure GPU computation, GPU requests are transferred from the user enclave to the GPU enclave, and the GPU enclave sends the corresponding commands to the GPU on behalf of the user enclave. HIX leverages attestation and symmetric encryption to secure the communication between the user and GPU enclaves.
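As a rough sketch of what such sealed requests could look like, the following C code encrypts a GPU request with AES-256-GCM (via OpenSSL) under an attestation-derived session key. The message layout, request encoding, and key handling are illustrative assumptions, not the HIX wire format.

```c
/* Illustrative sealing of a user-enclave-to-GPU-enclave request with an AEAD.
 * Build with: cc seal.c -lcrypto */
#include <openssl/evp.h>
#include <openssl/rand.h>
#include <stdint.h>
#include <stdio.h>

#define IV_LEN  12
#define TAG_LEN 16

typedef struct {
    uint8_t  iv[IV_LEN];     /* fresh nonce per request                  */
    uint8_t  tag[TAG_LEN];   /* GCM tag: integrity of the sealed request */
    uint32_t len;            /* ciphertext length                        */
    uint8_t  payload[256];   /* encrypted GPU request (command + args)   */
} sealed_request_t;

/* Seal one request under the attestation-derived session key. */
int seal_request(const uint8_t key[32], const uint8_t *req, uint32_t req_len,
                 sealed_request_t *out)
{
    if (req_len > sizeof(out->payload)) return -1;
    RAND_bytes(out->iv, IV_LEN);

    EVP_CIPHER_CTX *ctx = EVP_CIPHER_CTX_new();
    if (!ctx) return -1;
    int outl = 0, tmplen = 0, ok = 0;
    if (EVP_EncryptInit_ex(ctx, EVP_aes_256_gcm(), NULL, key, out->iv) == 1 &&
        EVP_EncryptUpdate(ctx, out->payload, &outl, req, (int)req_len) == 1 &&
        EVP_EncryptFinal_ex(ctx, out->payload + outl, &tmplen) == 1 &&
        EVP_CIPHER_CTX_ctrl(ctx, EVP_CTRL_GCM_GET_TAG, TAG_LEN, out->tag) == 1) {
        out->len = (uint32_t)(outl + tmplen);
        ok = 1;
    }
    EVP_CIPHER_CTX_free(ctx);
    return ok ? 0 : -1;
}

int main(void)
{
    uint8_t key[32] = {0};  /* stand-in for the attestation-derived session key */
    const uint8_t req[] = "LAUNCH_KERNEL vecadd n=1024";  /* hypothetical request */
    sealed_request_t msg;
    if (seal_request(key, req, sizeof(req), &msg) == 0)
        printf("sealed %u bytes\n", msg.len);
    return 0;
}
```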
Table 1 summarizes the required hardware and software changes. With these changes, HIX provides trusted GPU services to user enclaves, supporting the confidentiality and integrity of their sensitive data and secure execution on them.
4.2 GPU Enclave
As illustrated in Figure 3, central to the HIX design is the user-mode GPU enclave, which is responsible for two functions: (1) sole control over the GPU, and (2) serving as the sole user access interface to the GPU. To reduce the attack surface, HIX separates the critical functionality for controlling the GPU from the OS-resident driver and isolates it within the GPU enclave. The role of the remaining part of the driver in the OS is reduced to offering benign kernel services, such as assigning new virtual addresses for MMIO regions allocated to the GPU enclave. During its initialization, the GPU enclave resets the GPU state to eliminate any untrusted GPU programs that may have been loaded in the GPU. A required extension for SGX to support the GPU enclave is to allow the GPU enclave to access the protected GPU MMIO region.
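The access-control rule implied by this extension can be sketched as a check appended to the page-table walk: a resolved physical address inside the GPU MMIO window is permitted only for the GPU enclave. The constants, identifiers, and single-window assumption below are hypothetical; in HIX the check is a hardware MMU extension, not software.

```c
/* Minimal model of the MMIO access check HIX adds to the page-table walk. */
#include <stdbool.h>
#include <stdint.h>

#define GPU_ENCLAVE_ID 1ULL            /* id assigned at GPU enclave initialization */
#define GPU_MMIO_BASE  0xF0000000ULL   /* hypothetical GPU MMIO window               */
#define GPU_MMIO_SIZE  0x01000000ULL

static inline bool in_gpu_mmio(uint64_t paddr)
{
    return paddr >= GPU_MMIO_BASE && paddr < GPU_MMIO_BASE + GPU_MMIO_SIZE;
}

/* Called after the walker resolves a physical address: accesses to the GPU
 * MMIO window from anything but the GPU enclave (including the OS) fault. */
bool mmio_access_allowed(uint64_t current_enclave_id, uint64_t paddr)
{
    if (!in_gpu_mmio(paddr))
        return true;  /* ordinary memory: existing SGX/MMU rules apply */
    return current_enclave_id == GPU_ENCLAVE_ID;
}
```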
