Diagnosing architectural run-time failures
Paulo Casanova, David Garlan, Bradley Schmerl and Rui Abreu.
In Proceedings of the 8th International Symposium on Software Engineering for Adaptive and Self-Managing Systems, 20-21 May 2013. Received SEAMS 2013 Best Paper Award.
Abstract
Self-diagnosis is a fundamental capability of self-adaptive
systems. In order to recover from faults, systems need
to know which part is responsible for the incorrect behavior. In
previous work we showed how to apply a design-time diagnosis
technique at run time to identify faults at the architectural level
of a system. Our contributions address three major shortcomings
of our previous work: 1) we present an expressive, hierarchical
language to describe system behavior that can be used to
diagnose when a system is behaving differently from expectation; the
hierarchical language facilitates mapping low-level system events
to architecture-level events; 2) we provide an automatic way to
determine how much data to collect before an accurate diagnosis
can be produced; and 3) we develop a technique that allows the
detection of correlated faults between components. Our results
are validated experimentally by injecting several failures into a
system and accurately diagnosing them using our algorithm.
Keywords: Diagnosis, Self-adaptation.
@InProceedings{2013:Casanova/Garlan,
AUTHOR = {Casanova, Paulo and Garlan, David and Schmerl, Bradley and Abreu, Rui},
TITLE = {Diagnosing architectural run-time failures},
YEAR = {2013},
MONTH = {20-21 May},
BOOKTITLE = {Proceedings of the 8th International Symposium on Software Engineering for Adaptive and Self-Managing Systems},
PDF = {http://acme.able.cs.cmu.edu/pubs/uploads/pdf/darf.pdf},
ABSTRACT = {Self-diagnosis is a fundamental capability of self-adaptive
systems. In order to recover from faults, systems need
to know which part is responsible for the incorrect behavior. In
previous work we showed how to apply a design-time diagnosis
technique at run time to identify faults at the architectural level
of a system. Our contributions address three major shortcomings
of our previous work: 1) we present an expressive, hierarchical
language to describe system behavior that can be used to
diagnose when a system is behaving differently from expectation; the
hierarchical language facilitates mapping low-level system events
to architecture-level events; 2) we provide an automatic way to
determine how much data to collect before an accurate diagnosis
can be produced; and 3) we develop a technique that allows the
detection of correlated faults between components. Our results
are validated experimentally by injecting several failures into a
system and accurately diagnosing them using our algorithm.},
NOTE = {Received SEAMS 2013 Best Paper Award},
KEYWORDS = {Diagnosis, Self-adaptation} }