Adaptive root cause analysis and diagnosis
Date
2010-12-06T21:59:03Z
Authors
Zhu, Qin
Abstract
This dissertation describes the event processing autonomic computing reference architecture (EPACRA), a reference architecture that addresses several problems in adaptive root cause analysis and diagnosis (RCAD). In the course of defining EPACRA, we also identified a set of autonomic computing architecture patterns and proposed a new information seeking model called the net-casting model.
EPACRA is important because RCAD in enterprise systems is still largely performed manually by experienced system administrators. The goal of this research is to characterize, simplify, improve, and automate RCAD processes to ease selected tasks for system administrators and end-users. Research on RCAD processes involves three domains: (1) autonomic computing architecture patterns, (2) information seeking models, and (3) complex event processing (CEP) technologies. These domains, together with existing technologies and standards, form the body of knowledge synthesized in this dissertation.
To minimize human involvement in RCAD, we investigated architecture patterns that can be used in RCAD processes. We identified a set of autonomic computing architecture patterns and analyzed both how the feedback loops within each pattern interact and how the autonomic elements interact with one another. While illustrating these patterns, we found an ambiguity in the aggregator-escalator-peer pattern. We resolved it by adding a new architecture pattern, the chain-of-monitors pattern, to the lattice of autonomic computing architecture patterns.
To facilitate the autonomic information seeking process, we developed the net-casting information seeking model. After identifying the commonalities among three traditional information seeking models, we defined the net-casting model as a five-stage process and then tailored it to describe our automated RCAD process.
One of the main contributions of this dissertation is an autonomic computing reference architecture called the event processing autonomic computing reference architecture (EPACRA). This reference architecture is based on (1) complex event processing (CEP) concepts, (2) autonomic computing architecture patterns, (3) real use-case workflows, and (4) our net-casting information seeking model. It can be leveraged to relieve the system administrator’s burden of routinely performing RCAD tasks in a heterogeneous environment. EPACRA can be viewed as a variant of the IBM ACRA model, extended with CEP to deal with large event clouds in real-time environments. In the middle layer of the reference model, EPACRA introduces a design referred to as the use-case-unit event processing network (EPN) for RCAD, where a use case is the scenario of an RCAD process initiated by a symptom. Each use-case-unit EPN reflects our automation approach, which includes identifying events from the use cases and classifying those events into event types. In addition to defining individual event processing agents (EPAs) to process the different types of events, we construct use-case-unit EPNs dynamically, an approach that may lead to fully autonomic RCAD systems in the future.
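The idea of routing typed events through EPAs in a use-case-unit EPN can be illustrated with a minimal Python sketch. This is not the dissertation's implementation: the class `UseCaseUnitEPN`, the function `build_epn`, the two agents, and the event type labels (`symptom`, `candidate_cause`, `diagnosis`) are all hypothetical names chosen for illustration, and an EPA is assumed to be a plain function from one event to a list of derived events.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Event:
    event_type: str  # hypothetical type label, e.g. "symptom"
    payload: str


# Assumption: an EPA maps one event to zero or more derived events.
EPA = Callable[[Event], List[Event]]


@dataclass
class UseCaseUnitEPN:
    """A small network of EPAs assembled for one RCAD use case."""
    agents: Dict[str, EPA] = field(default_factory=dict)

    def process(self, event: Event) -> List[Event]:
        """Route events to the EPA for their type; events that no
        agent handles are returned as final results."""
        pending, finished = [event], []
        while pending:
            current = pending.pop(0)
            agent = self.agents.get(current.event_type)
            if agent is None:
                finished.append(current)
            else:
                pending.extend(agent(current))
        return finished


def symptom_agent(event: Event) -> List[Event]:
    # Hypothetical EPA: turn a symptom into a candidate cause to check.
    return [Event("candidate_cause", event.payload)]


def cause_agent(event: Event) -> List[Event]:
    # Hypothetical EPA: turn a candidate cause into a diagnosis event.
    return [Event("diagnosis", "root cause of " + event.payload)]


def build_epn(routes: List[Tuple[str, EPA]]) -> UseCaseUnitEPN:
    """Assemble an EPN at run time from (event type, EPA) pairs."""
    return UseCaseUnitEPN(agents=dict(routes))


epn = build_epn([("symptom", symptom_agent), ("candidate_cause", cause_agent)])
result = epn.process(Event("symptom", "slow login"))
print(result)
```

Because the network is built from a list of routes at run time, a different symptom could be served by a different set of agents without changing the routing logic, which is one simple reading of dynamic construction.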
Finally, this dissertation presents a case study for EPACRA: a prototype of a Web application intrusion detection tool that demonstrates the autonomic mechanisms of our RCAD process. Specifically, this tool recognizes two types of malicious attacks on Web application systems and then takes actions to prevent the intrusion attempts. This case study validates both our chain-of-monitors autonomic architecture pattern and our net-casting model. It also validates our use-case-unit EPN approach as a way of realizing RCAD workflows. We hope this research platform will benefit other RCAD projects and researchers with similar interests and goals.
Keywords
Event Processing Autonomic Computing Reference Architecture (EPACRA), Autonomic computing architecture patterns, Net-casting Information Seeking Model