UFO: The Ultimate QoS-Aware CPU Core Management for Virtualized and Oversubscribed Public Cloud

Abstract

Public clouds typically adopt (1) multi-tenancy to increase server utilization, (2) virtualization to provide isolation between tenants, and (3) resource oversubscription to further increase resource efficiency. However, prior work focuses on optimizing only one or two of these elements and fails to bring QoS-aware multi-tenancy, virtualization, and resource oversubscription together. We identify three challenges that arise when the three elements coexist. First, double scheduling symptoms are $10\times$ worse with latency-critical (LC) workloads, which consist of numerous sub-millisecond tasks and differ significantly from conventional batch applications. Second, resource contention also occurs inside a VM, between threads of the same VM running LC applications, which calls for inner-VM core isolation. Third, in realistic public clouds the host cannot obtain application-level performance metrics to guide resource management. To address these challenges, we propose UFO, a QoS-aware core manager designed specifically to support co-location of multiple LC workloads in virtualized and oversubscribed public cloud environments. UFO addresses the three challenges by (1) coordinating guest and host CPU cores (vCPU-pCPU coordination) and (2) performing fine-grained inner-VM resource isolation, pushing core management in realistic public clouds to the extreme. Compared with state-of-the-art core managers, UFO saves up to 50% of physical cores under the same co-location scenario and covers up to $5.7\times$ more co-location scenarios with the same amount of core resources.

Publication
In the 21st USENIX Symposium on Networked Systems Design and Implementation (NSDI '24), USENIX.