Online services built from interconnected microservices perform well only when each component has the resources it needs. Performance models for these systems have traditionally been built by analyzing extensive logs and traces, but collecting too little or too much data can undermine the accuracy of those models. This highlights the need for a more efficient approach. Our study introduces a streamlined two-phase method that uses gradient boosting algorithms to identify the key data features needed to predict CPU and memory demands accurately. By focusing on feature importance, we reduced the amount of data required for analysis by more than 69% without compromising, and in some cases while improving, the accuracy of our models. We evaluated the method on a comprehensive dataset provided by Alibaba, which demonstrates its practical application and validation in a real-world, large-scale microservice environment. Further analysis of our results reveals that most of the features identified for data volume reduction relate to critical aspects of a microservice architecture, notably inter-service communication and resource access patterns. Our findings demonstrate that by concentrating on the most influential features of microservice trace data, it is possible to maintain, and potentially improve, system performance modeling with substantially less data, presenting a promising research direction for resource optimization in large-scale microservice performance modeling.
Article ID: 2024L31
Month: May
Year: 2024
Address: Online
Venue: The 37th Canadian Conference on Artificial Intelligence
Publisher: Canadian Artificial Intelligence Association
URL: https://caiac.pubpub.org/pub/dh3zxquj