Case Study of an Automated Approach to Managing Collections of Autonomic Systems
Thomas J. Glazier,
David Garlan and
Bradley Schmerl.
In Proceedings of the 2020 IEEE Conference on Autonomic Computing and Self-organizing Systems (ACSOS), Washington, D.C., 19-23 August 2020.
Abstract
Many applications have taken advantage of cloud-provided
autonomic capabilities, commonly auto-scaling, to harness
easily available compute capacity to maintain performance
against defined quality objectives. This has caused the management
complexity of enterprise applications to increase. It is
now common for an application to be a collection of autonomic
sub-systems. However, combining individual autonomic systems
to create an application can lead to behaviors that negatively
impact the global aggregate utility of the application and in
some cases can be conflicting and self-destructive. Commonly,
human administrators address these behaviors as part of a design-time
analysis of the situation or a run-time mitigation of the
undesired effects. However, the task of controlling and mitigating
undesirable behaviors is complex and error-prone. To handle the
complexity of managing a collection of autonomic systems, we
have previously proposed an automated approach to the creation
of a higher-level autonomic management system, referred to as a
Meta-Manager. In this paper, we improve upon prior work with
a more streamlined and understandable formal representation of
the approach, expand its capabilities to include global knowledge,
and test its potential applicability and effectiveness by managing
the complexity of a collection of autonomic systems in a case
study of a major outage suffered by the Google Cloud Platform.
Keywords: Meta-management, Self-adaptation.