APM under the microscope

Application performance management puts valuable metrics for budget planning or troubleshooting at your fingertips -- but success starts with your OS

APM (application performance management) is a lot like exercise: Everyone knows they need it, but only a fraction practice it consistently, citing high costs and lengthy deployment times as excuses.

In these days of doing more with less, however, those who give APM a closer look can reap significant rewards. Supported by counters -- both existing metrics and custom counters exposed by in-house developers -- APM solutions monitor everything from CPU utilization and memory consumption to network throughput. Most include some kind of analysis component to interpret the counter data, highlight and log discrepancies, trigger alerts, and so on. The analyzed APM data can then be used to manage capacity, troubleshoot problems, ensure performance levels, or even plan future growth.  
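
To make the analysis step concrete, here is a minimal Python sketch of how an analysis component might flag a discrepancy in sampled counter data. The sample values, the two-standard-deviation tolerance, and the find_discrepancies helper are all assumptions made for illustration; commercial APM products use their own, usually more sophisticated, baselining methods.

```python
from statistics import mean, pstdev

def find_discrepancies(samples, tolerance=2.0):
    """Return (index, value) pairs that sit more than `tolerance`
    population standard deviations away from the mean of the samples."""
    if len(samples) < 2:
        return []
    average = mean(samples)
    spread = pstdev(samples)
    if spread == 0:
        return []  # perfectly flat data has no outliers
    return [(i, v) for i, v in enumerate(samples)
            if abs(v - average) > tolerance * spread]

# Hypothetical CPU utilization samples (percent), one per polling interval.
cpu_percent = [21.0, 23.5, 22.0, 24.0, 97.0, 22.5, 23.0, 21.5]

alerts = find_discrepancies(cpu_percent)
for index, value in alerts:
    print(f"ALERT: sample {index} at {value:.1f}% CPU is outside the normal range")
```

A real deployment would log the flagged samples and route the alert to an operator rather than print it, but the shape of the work is the same: collect counters, compare against a baseline, and report what falls outside it.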

Today's IT pros can tap a wealth of tools and resources that tell them exactly what's going on under the hood. Depending on your OS starting point, you may be able to stitch together a workable solution using nothing more than what's sitting around your IT lab.

In this article, we examine the state of the APM art as it relates to three leading OS platforms: Microsoft Windows Server, Linux, and Sun Solaris. Each takes a slightly different approach to the task of tracking application run-time behavior, so IT shops must shape their respective APM strategies to fit their chosen platform.

Counters, Counters Everywhere

The heart of any APM solution is the metrics counter. This unimposing entity is the ultimate insider, delivering critical data on the darkest regions of the run-time environment's underbelly. Whether woven into the OS kernel or tacked on to a virtual machine class library, metrics counters are what make systemwide APM possible.

Counters also say a lot about the ideology of the OS developer. For example, Solaris provides a wealth of counters for exploring the inner workings of Sun's own hardware, while Windows channels its broad array of system and application counters through a central library (the Performance Data Helper DLL), making them accessible to even casual developers.

The move to browser-based application architectures introduced an entirely new layer to the APM stack: the Web application server. With an emphasis on code portability and platform abstraction (through the aforementioned virtual machine model), Web application servers require their own instrumentation facilities that may or may not mesh with those of the underlying host OS, making APM trickier.

As the complexity level spirals upward and IT faces the challenge of managing parallel APM architectures, the ability of a given vendor solution to integrate these disparate resources -- without forcing customers to seek third-party help -- is rapidly becoming a key APM selling point.

How do you gauge your APM options? Consider three variables: deployment cost (time and effort as well as dollars), ease of use, and APM functionality. Put those factors together, and platform cohesiveness trumps best-of-breed in many situations.

Microsoft Windows Server System

Microsoft has a long tradition of bundling functionality with its operating system releases, and APM is no exception. The Windows Server platform oozes with metrics richness. From the core of the NT Executive (the kernel) to the most obscure run-time subsystem ("cache read pin hits %," for example), everything has a counter.

But perhaps even more important than the number of counters is the way in which Microsoft makes them available to the developer. For legacy Windows applications that predate the .Net framework, there's the PDH (Performance Data Helper) library, a single point of reference for accessing metrics counters across the OS. Virtually every major subsystem is represented, and Windows developers can easily extend their own applications to expose custom metrics through the PDH interface.

Incidentally, PDH is also used by the graphical PerfMon (Performance Monitor) application (part of the core Windows OS) to graph and log metrics data, so writing a PDH-enabled application has the added benefit of exposing custom counters to PerfMon.

With the advent of the .Net framework, Microsoft extended its pervasive metrics model into the realm of the Web application server. All aspects of framework operation, from the JIT (Just in Time) compiler for the CLR (Common Language Runtime) virtual machine to the ASP.Net Web services libraries, are fully instrumented and exposed to the developer, both through the legacy PDH library and the newer WMI (Windows Management Instrumentation) interfaces.

Plus, when Microsoft's ETW (Event Tracing for Windows) solution debuts later this quarter, you'll literally be able to track an application's execution path as it crosses various local and network boundaries (but not Web services boundaries), regardless of the underlying development model.

It's this kind of bi-directional APM support -- with tight integration between legacy OS and current generation Web application server resources -- that helps Microsoft win hearts and minds in the developer community. With so much functionality built into the base OS (everything described above is available on the Windows product CD or as a free download), many IT shops can make do with out-of-the-box functionality and avoid investing in a costly third-party APM framework.

Linux

If Windows Server System represents the pinnacle of single-vendor APM integration, Linux, in its various permutations and distributions, represents the polar opposite: a sometimes chaotic virtual playground for those who cherish the do-it-yourself spirit.

To be sure, there are vertical APM solutions you can find if you're looking for them -- ISM's PerfMan for Linux and portions of IBM's Tivoli suite come immediately to mind. But even the big-name vendor tools often build on obscure utilities that have become immortalized as part of the greater Linux pantheon.

For example, PerfMan, a powerful commercial metrics tracking and analysis solution, wraps around sysstat, an open source command-line front end to the Linux /proc pseudo file system that is written and maintained by one spectacularly talented young programmer in France, Sebastien Godard. Likewise, many of IBM's Linux performance initiatives revolve around OProfile, an open source kernel driver/probing utility that has its origins in a computer science student's attempts to get extra credit toward his master's degree.
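
To give a feel for what a /proc front end does, the following Python sketch derives CPU utilization from two snapshots of the aggregate "cpu" line in /proc/stat. This is not sysstat's actual code; the helper names and the embedded snapshot values are assumptions for illustration, and the decision to count iowait as idle time is one common convention rather than the only one.

```python
def parse_cpu_line(stat_text):
    """Return (busy, total) jiffies from the aggregate 'cpu' line
    of text in /proc/stat format."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            # Use at most the first eight columns (user through steal).
            # Older kernels report fewer; newer ones append guest columns
            # that are already included in user and nice.
            values = [int(v) for v in fields[1:9]]
            idle = values[3] + (values[4] if len(values) > 4 else 0)
            total = sum(values)
            return total - idle, total
    raise ValueError("no aggregate 'cpu' line found")

def cpu_utilization(before_text, after_text):
    """Percentage of non-idle time between two /proc/stat snapshots."""
    busy0, total0 = parse_cpu_line(before_text)
    busy1, total1 = parse_cpu_line(after_text)
    elapsed = total1 - total0
    if elapsed <= 0:
        return 0.0
    return 100.0 * (busy1 - busy0) / elapsed

# Hypothetical snapshots; on a Linux host you would read /proc/stat twice,
# a few seconds apart, instead of using these strings.
before = "cpu  1000 0 500 8000 500 0 0 0\ncpu0 1000 0 500 8000 500 0 0 0\n"
after = "cpu  1300 0 600 8500 600 0 0 0\ncpu0 1300 0 600 8500 600 0 0 0\n"

utilization = cpu_utilization(before, after)
print(f"CPU utilization over the interval: {utilization:.1f}%")
```

The counters in /proc/stat are cumulative since boot, which is why the sketch works from the difference between two readings rather than from a single one.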

The preceding examples illustrate why Linux is both exciting and terrifying to enterprise IT decision makers. It's exciting because there are so many useful freebies that can help trim up-front software cost and terrifying because there is no single point of accountability for many of the key building blocks that comprise APM functions.

Unlike the application server and database categories, which are fully stocked with traditional (in other words, well-supported and accordingly priced) solutions from first-tier vendors, APM has largely been farmed out to the open source community. This is great if you're into hacking Linux to serve your own purposes -- all of the aforementioned open source tools fall under the GNU General Public License. It's not so hot if you need cohesion across a multitiered run-time environment.

Which brings us to the issue of application serving under Linux. With few exceptions, the majority of large-scale Linux projects involve the use of one or more third-party Web application server platforms. That's because there is no real native Linux application server facility, unless you count Web scripting languages, such as PHP.

So although vendors may provide rich APM tools for their respective environments, these tools still know little about what's happening below them at the OS level. By the same token, OS-level APM tools such as sysstat know nothing about the Web app server beyond that it is a process that can be tracked using the /proc pseudo file system. Contrast that with Microsoft's Windows Server platform, which offers a fairly robust application server environment from day one. Moreover, the run-time environments in Windows, whether they be legacy Win32/COM-based or part of the newer .Net framework, are wired together through a consistent set of APM plumbing.
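
The limits of the OS-level view are easy to see in a short Python sketch that pulls per-process figures out of a /proc/<pid>/stat record. The field positions follow the proc(5) manual page; the parse_pid_stat helper and the sample record are hypothetical, and nothing here reflects how any particular APM product does the job.

```python
def parse_pid_stat(stat_line):
    """Extract a few fields from one line in /proc/<pid>/stat format."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so split around the first '(' and last ')'.
    open_paren = stat_line.index("(")
    close_paren = stat_line.rindex(")")
    pid = int(stat_line[:open_paren])
    name = stat_line[open_paren + 1:close_paren]
    rest = stat_line[close_paren + 1:].split()
    # rest[0] is field 3 (state), so field N sits at index N - 3.
    utime = int(rest[11])      # field 14: user-mode clock ticks
    stime = int(rest[12])      # field 15: kernel-mode clock ticks
    rss_pages = int(rest[21])  # field 24: resident set size in pages
    return {"pid": pid, "name": name,
            "cpu_ticks": utime + stime, "rss_pages": rss_pages}

# Hypothetical record for an application server process.
sample = ("4321 (app server) S 1 4321 4321 0 -1 4194560 1500 0 3 0 "
          "250 75 0 0 20 0 45 0 1000 2500000000 51200")

info = parse_pid_stat(sample)
print(info)
```

Everything the kernel offers here is a process ID, a command name, and a handful of resource counters. Requests served, sessions open, or transactions failed inside the application server are invisible at this level, which is exactly the disconnect described above.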

It's this disconnect and lack of cohesion, coupled with the aforementioned accountability factors, that can short-circuit the Linux cost advantage equation and send even die-hard open source advocates running for proprietary APM frameworks.

Sun Solaris

So far I've explored the two extremes of the APM model: single-vendor, vertical integration (Windows) and multivendor/author, horizontal integration (Linux). Sun's Solaris Operating Environment Version 9, in its third major release of the past decade, falls somewhere in the middle.

On one hand you have Sun, the open source advocate: holder of the Java keys, proponent of heterogeneous Web standards, and most recently, Linux cheerleader. On the other hand, you have Sun, the classic big iron player: lots of great functionality but key pieces are tied to proprietary hardware platforms.

For example, Sun introduced a powerful new APM solution, Sun Solaris 9 Resource Manager, to complement its latest platform release. Resource Manager is an IT planner's dream; it provides detailed feedback on application resource usage, allowing system administrators to audit run-time behavior for health analysis, charge-back accounting, and more. Combined with the Sun ONE Studio 7 development environment, Resource Manager makes for a cohesive, end-to-end APM package that spans both native Solaris and J2EE application types.
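
The arithmetic behind charge-back accounting is simple, as the following Python sketch suggests. The record layout, the project names, and the hourly rate are assumptions invented for illustration; they do not represent Resource Manager's actual accounting format or output.

```python
from collections import defaultdict

def charge_back(usage_records, rate_per_cpu_hour):
    """Total CPU seconds per project and convert them to a charge."""
    cpu_seconds = defaultdict(float)
    for project, seconds in usage_records:
        cpu_seconds[project] += seconds
    return {project: round(seconds / 3600 * rate_per_cpu_hour, 2)
            for project, seconds in cpu_seconds.items()}

# Hypothetical accounting records: (project, CPU seconds consumed).
records = [("payroll", 7200), ("web-store", 1800), ("payroll", 3600)]

bill = charge_back(records, rate_per_cpu_hour=4.00)
for project, amount in sorted(bill.items()):
    print(f"{project}: ${amount:.2f}")
```

The hard part in practice is not the multiplication but collecting trustworthy per-application usage data in the first place, which is the gap a resource accounting tool is meant to fill.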

There's just one problem. Both products were initially only available on Sun's proprietary Sparc processor platform, which meant you had to invest in both the company's software architecture and its hardware offerings to realize full APM benefits. (Resource Manager has since been ported to Solaris for x86 with Release 9.)

The resulting message is mixed. Sun pays lip service to the open source development model, but it tends to keep best-of-breed APM solutions close to the hardware home. Microsoft, with its hardware-agnostic, "we'll run anywhere it makes sense" strategy, looks positively transparent by comparison.

The fact that Sun is the source for all things Java is a double-edged sword: It gives them street cred with the open source crowd, but it also raises customer expectations regarding Sun's strategy and behavior. If Sun really wanted to demonstrate openness, they would port the majority of these APM products to Linux (promises have been made) or at least make them consistently available for Solaris on x86. However, things are improving with Release 9.

Meanwhile, IT shops need to consider the potential for unforeseen proprietary tie-ins when evaluating Sun's APM strategy. Such is the challenge when dealing with a vendor struggling to reconcile its role as an open source leader with the need to move high-margin, big iron hardware.

Shaping an APM Strategy

I said at the outset that the platform with the most cohesive APM facilities would likely also make the most sense economically. Thanks to a predilection toward bundling extra functionality with its OS releases, Microsoft wins this argument hands-down.

But base OS functionality is only part of the APM equation. In many situations, a single-vendor solution may not be a viable choice given an organization's installed base -- for example, when it would mean bringing Microsoft's Windows-centric model into a shop full of legacy Unix boxes.

Thankfully, a plethora of third-party APM tools promises to make sense of the chaos … for a price. I'll take a look at some of these tools -- specifically, those geared towards custom application development and instrumentation -- in future articles.

Copyright © 2004 IDG Communications, Inc.