Business Scenario-Based Performance Monitoring and Diagnosis

From NovaOrdis Knowledge Base

Internal

Overview

We're perfecting a methodology and a set of tools that make it possible to quantify the performance of complex systems, establish correlations and ultimately point out where the performance problems are coming from.

We have applied the methodology to synchronous request-based systems (web sites), and we're working on extending it to asynchronous message processing systems. At a very high level, the approach consists of defining business scenarios, which represent the overall system's performance better than lower-level HTTP requests do, then applying load, producing data, and analyzing that data while attempting to establish correlations with the evolution in time of the "primary" resources of the system (CPU cycles, memory, open file descriptors) or even higher-level resources, such as application locks. A certain amount of statistical analysis and automated reasoning goes into the analysis stage, and the results are higher-order synthetic data, such as graphs that make sense to humans or even root cause suggestions. We're working hard to increase the incidence of the latter, at the expense of the former.
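As a rough illustration of the correlation step described above, the following Python sketch ranks resource time series by how strongly they track the response time of one business scenario. It is not the tooling discussed on this page: the function names (`pearson`, `rank_resources`), the resource names and the sample numbers are all hypothetical, and plain Pearson correlation stands in for whatever statistical analysis the actual tools perform.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have the same length, at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A flat series carries no information about co-variation.
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_resources(scenario_ms, resources):
    """Sort resources by absolute correlation with the scenario series.

    scenario_ms: response time of one business scenario, one value per
                 sampling interval (hypothetical input format).
    resources:   dict mapping a resource name to its series, sampled on
                 the same intervals.
    """
    scores = {name: pearson(scenario_ms, series)
              for name, series in resources.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


if __name__ == "__main__":
    # Made-up sample data: a "checkout" scenario and three resource series.
    checkout_ms = [100, 200, 300, 400]
    resources = {
        "cpu_pct": [10, 20, 30, 40],
        "open_fds": [5, 5, 5, 5],
        "heap_mb": [40, 30, 20, 10],
    }
    for name, score in rank_resources(checkout_ms, resources):
        print(f"{name}: {score:+.2f}")
```

A strong correlation here is only a hint about where to look next, not a root cause by itself. The series are assumed to be aligned on the same sampling intervals.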

One interesting aspect of the methodology is that it works with "synthetic load" - the type of load applied by load generators such as JMeter, NeoLoad or LoadRunner - or with the production load itself.

The system attempts to be smart while injecting instrumentation into the active elements of the system. It is capable of identifying those elements and applying instrumentation automatically. While none of the operations people we know would even consider letting it loose to instrument their production environment on its own, the feature comes in handy in performance lab environments, where the tedium of applying instrumentation to an ever-changing configuration becomes the bane of performance testing and eventually kills the drive to "stay current".

Subjects