How to eliminate problems before they occur

 
A tale of how to more intelligently manage people and technology

Running the front line of IT operations requires near-perfect data analysis, decisive action and flawless technical execution. As the triage unit in the IT organization, the front line of operations must spot and diagnose IT system failures before they undermine the business. Of equal importance, IT operations coordinates the numerous teams that manage and fix these systems when problems occur.

But running the front line of IT operations without effective tools and information is like a surgeon operating without instruments or the ability to monitor the patient. It puts the health of the business in jeopardy.

Fortunately, business service management (BSM) software has emerged as a solution to these challenges. BSM tools deliver numerous benefits to IT operations teams, including:

  • Speed and accuracy in pinpointing and fixing system failures before they become business service problems
  • Greater efficiency with second- and third-level IT experts because the correct teams are dispatched to fix problems at the outset
  • Better prioritization of IT issues by seeing the links between technology and business services

So what’s the payoff? Attainment of the IT mantra heard everywhere: business alignment.

To better understand the benefits of BSM software, let’s look at two hypothetical scenarios: one without BSM software and one with it.


Life without BSM tools

Rex, our main character, is an IT operations manager at Old First Bank, a fictitious institution. From an IT operations service desk, he monitors several critical business services, diagnoses IT alerts and mobilizes IT teams when problems occur.

For Rex, accurate information is critical, and time is money. One of the business services he monitors is an interbank fund-transfer service.

It’s 4:30 p.m. on a Friday, and Rex is winding things down before leaving at 5:00. Aside from the usual false alarms, all indicators suggest his systems are running fine.

Thirty minutes earlier, Rex’s colleague, Jim, initiated a multimillion-dollar interbank fund transfer for one of Old First Bank’s biggest clients. Such high-value transfers usually take only five minutes, so completing this one before the 5:00 p.m. deadline, after which large financial penalties are incurred, shouldn’t be a problem.

At 4:45 p.m., Jim reports that the multimillion-dollar transfer he began at 4:00 p.m. still hasn’t completed. The fraud check, a step that usually takes two minutes, is still processing. Apparently, one of the 20 underlying systems and infrastructure elements that enable the transfer is causing a delay. Fifteen minutes remain before the transfer deadline.

After speaking with Jim, Rex views his event console and list of alerts in hopes of identifying the culprit. Meanwhile, the clock is ticking. Nine minutes remain before the fund-transfer deadline.

After weeding through a thicket of high-priority alerts, Rex makes his best guess at five possible causes:

  • An unresponsive database
  • A new application server
  • A congested network path
  • A recently installed router
  • The new front-end interface for the fund-transfer application

Five minutes remain.

Meanwhile, Jim still sees “processing fraud check” on his computer screen. There’s been no update from Rex, and it appears Old First Bank will incur a major financial penalty and, more importantly, a damaged reputation for not processing the transfer on time.

With no way to correlate a single system alert to the fund-transfer application, Rex can’t pinpoint the cause. So he sends incident tickets to all four associated IT teams: networking, database administrators, application development and business services.

He hopes they’re not working on urgent projects, and he prays that one of their fixes will resolve the problem.

At 5:02 p.m., the boss calls. The transfer didn't go through and Old First Bank has been hit with a big financial penalty. It then occurs to Rex that three other high-value transfers were initiated around the same time. If those transfers failed, Old First Bank’s corporate reputation is in serious peril.


Life with BSM tools

It’s 1:00 p.m. on Friday afternoon, and Rex has just returned to his desk from lunch. An alert appears from his business transaction monitoring tool. It looks like the fraud check in the high-value fund-transfer service is exceeding the maximum threshold of two minutes.

Rex checks the list of scheduled transfers. One is scheduled for 1:30 p.m., and three others for 4:00 p.m. Fortunately, the business transaction monitoring tool proactively notified Rex that the fraud check's processing time was unusually slow.

Rex thinks, “Okay, isolate the problem, mobilize the right IT team, get a downtime estimate and alert the business stakeholders so they can take an alternate course of action.”

To pinpoint the cause of the problem, Rex pulls up his discovery and dependency mapping software, which visually links all of Old First Bank’s IT systems – hardware, software and everything in between – to the business services they enable.

In seconds, he can see the hierarchy of systems supporting the high-value fund-transfer service, allowing him to conduct a quick service impact analysis. There’s a large server that’s powering a database, which is accessed by a J2EE application server. And that application server is responsible for the fraud check.
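To make the idea of a dependency map concrete, here is a minimal sketch in Python. It is not how any particular BSM product works. The component names are illustrative labels taken loosely from Rex's story, and the two helper functions are hypothetical.

```python
from collections import deque

# Hypothetical dependency map: each key relies on the items in its list.
# The names are illustrative and loosely follow the story.
DEPENDS_ON = {
    "high-value fund-transfer service": ["fraud check"],
    "fraud check": ["J2EE application server"],
    "J2EE application server": ["database"],
    "database": ["database server"],
    "database server": [],
}


def supporting_components(service, depends_on):
    """Return every component beneath `service`, in breadth-first order."""
    found = []
    queue = deque(depends_on.get(service, []))
    while queue:
        node = queue.popleft()
        if node in found:
            continue  # already visited via another path
        found.append(node)
        queue.extend(depends_on.get(node, []))
    return found


def impacted_by(component, depends_on):
    """Return everything that directly or indirectly relies on `component`."""
    return [
        candidate
        for candidate in depends_on
        if component in supporting_components(candidate, depends_on)
    ]


stack = supporting_components("high-value fund-transfer service", DEPENDS_ON)
impact = impacted_by("database", DEPENDS_ON)
```

Walking the map downward gives the hierarchy Rex sees on screen. Walking it upward from a failing component is a simple form of service impact analysis.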

Rex looks at his consolidated event management console, which uses advanced technology to remove irrelevant failure events. Sure enough, the database is malfunctioning.
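The story does not say how the console decides which events are irrelevant. One plausible approach, sketched below purely as an assumption, is to keep only the alerts raised by components that support the service in question. The alert records and component names are made up for illustration.

```python
# Hypothetical alert feed; the records are invented for this example.
ALERTS = [
    {"component": "mail gateway", "message": "queue length high"},
    {"component": "database", "message": "query latency above limit"},
    {"component": "print server", "message": "toner low"},
]

# Assumed set of components that support the fund-transfer service.
SERVICE_COMPONENTS = {
    "fraud check",
    "J2EE application server",
    "database",
    "database server",
}


def relevant_alerts(alerts, components):
    """Keep only alerts raised by components the service depends on."""
    return [alert for alert in alerts if alert["component"] in components]


shortlist = relevant_alerts(ALERTS, SERVICE_COMPONENTS)
```

With the unrelated alerts filtered out, the single database alert is what remains for Rex to act on.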

Rex raises an incident ticket and routes it to Sally on the database administration team to notify her of the problem. She diagnoses it quickly, explaining that the database is running exceedingly slowly, which in turn is slowing down the fraud check. Sally says they need to rebuild the database indexes, which will take about 25 minutes.

Checking his watch, Rex sees that the rebuild will finish five minutes after the payment scheduled for 1:30 p.m.
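Rex's mental arithmetic can be written out as a short sketch. The current time of 1:10 p.m. is an assumption inferred from the 25-minute estimate and the five-minute overrun, the date is arbitrary, and the function name is hypothetical.

```python
from datetime import datetime, timedelta


def transfers_at_risk(now, repair_time, scheduled):
    """Return scheduled transfers that fall due before the fix completes."""
    restored_at = now + repair_time
    return [due for due in scheduled if now <= due < restored_at]


day = datetime(2024, 1, 5)              # arbitrary date, used only for the example
now = day.replace(hour=13, minute=10)   # assumed current time (inferred, not stated)
repair_time = timedelta(minutes=25)     # Sally's estimate from the story

# One transfer at 1:30 p.m. and three at 4:00 p.m., as in Rex's schedule.
scheduled = [day.replace(hour=13, minute=30)] + [day.replace(hour=16)] * 3

restored_at = now + repair_time
at_risk = transfers_at_risk(now, repair_time, scheduled)
```

Under these assumptions, only the 1:30 p.m. payment falls inside the repair window, and the three 4:00 p.m. transfers are unaffected.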

Rex quickly calls the business team responsible for the 1:30 payment to notify them of the problem.

“Thanks for the heads up, Rex, and no problem,” they respond. “We have about 20 minutes. That gives us plenty of time to make the payment through our contingency account funds, which are on a separate back-up system.”

Next, Rex posts a notice on the fund-transfer application interface, explaining the problem and alerting users that it will be back online at 1:35 p.m.


Do you know a "Rex" in IT operations?

The story of Rex saving the day at Old First Bank is one example of how BSM software can help maintain critical business services. Any business or organization that depends on IT for success, be it in healthcare, manufacturing, e-commerce, finance or government, could benefit from BSM software.

HP Software offers a full suite of business service management software tools to help improve the availability of your critical business applications.

