Analytically-Driven Resource Management for Cloud-Native Microservices

Abstract

Resource management for cloud-native microservices has attracted considerable recent attention. Previous work has shown that machine learning (ML)-driven approaches outperform traditional techniques, such as autoscaling, in terms of both SLA maintenance and resource efficiency. However, ML-driven approaches also face challenges, including lengthy data collection processes and limited scalability. We present Ursa, a lightweight resource management system for cloud-native microservices that addresses these challenges. Ursa uses an analytical model that decomposes the end-to-end SLA into per-service SLAs, and maps each per-service SLA to individual resource allocations per microservice tier. To speed up the exploration process and avoid prolonged SLA violations, Ursa explores each microservice individually, and swiftly stops exploration if latency exceeds its SLA. We evaluate Ursa on a set of representative end-to-end microservice topologies, including a social network, a media service, and a video processing pipeline, each consisting of multiple classes and priorities of requests with different SLAs, and compare it against two representative ML-driven systems, Sinan and Firm. Compared to these ML-driven approaches, Ursa provides significant advantages: it shortens the data collection process by more than 128x, and its control plane is 43x faster. At the same time, Ursa does not sacrifice resource efficiency or SLAs. During online deployment, Ursa reduces the SLA violation rate by between 9.0% and 49.9%, and reduces CPU allocation by up to 86.2% compared to ML-driven approaches.
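
The per-service exploration with early stopping described above can be illustrated with a minimal sketch. This is not Ursa's actual algorithm, which the abstract does not specify. The function name `explore_service`, the `measure_latency_ms` callback, the list of candidate CPU levels, and the high-to-low search order are all assumptions made for illustration.

```python
from typing import Callable, List, Optional, Tuple


def explore_service(
    measure_latency_ms: Callable[[float], float],
    sla_ms: float,
    cpu_levels: List[float],
) -> Tuple[List[Tuple[float, float]], Optional[float]]:
    """Explore one microservice in isolation (hypothetical sketch).

    Tries candidate CPU allocations from largest to smallest and stops
    at the first allocation whose measured latency exceeds the
    per-service SLA, so the service is not held in violation.
    Returns every (cpu, latency) sample taken and the smallest
    allocation that still met the SLA (None if none did).
    """
    samples: List[Tuple[float, float]] = []
    smallest_ok: Optional[float] = None
    for cpu in sorted(cpu_levels, reverse=True):
        latency = measure_latency_ms(cpu)  # assumed measurement hook
        samples.append((cpu, latency))
        if latency > sla_ms:
            break  # early stop: smaller allocations are not tried
        smallest_ok = cpu
    return samples, smallest_ok


if __name__ == "__main__":
    # Toy latency model (an assumption): latency falls as CPU grows.
    samples, best = explore_service(lambda cpu: 100.0 / cpu, 50.0,
                                    [0.5, 1.0, 2.0, 4.0])
    print(samples, best)
```

In this toy run the 0.5-CPU level is never measured, because exploration ends as soon as the 1.0-CPU level breaks the 50 ms target.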
