Disconcerting fact: Most IT departments use an average of nine—yes nine—different tools to monitor their environment.
In other words, they have given themselves another nine different boxes and nine different applications to manage. And the really weird part? Very rarely do these apps even talk to each other.
Wait, it gets worse. In settings where hybrid and public clouds are part of the mix, the number of monitoring tools can go much higher.
Managing multiple monitoring tools is not only cumbersome, but it’s also incredibly time-consuming. The IT team is too busy reacting to beeps and alerts to focus on its core mission of supporting the business. For IT to play a more strategic role and provide measurable value, a better approach is needed, one that’s based on the business impact of the events being monitored.
Most IT monitoring tools are geared toward element monitoring, a very granular approach that focuses on devices, not outcomes. When an alert is received, it identifies only the device that is experiencing an issue and says nothing about the impact the incident could have on the business. That leaves IT scratching its head, wondering how to react and prioritize, since any device-level issue could potentially have far-reaching effects, although most of them do not.
From the business’s point of view, it doesn’t matter much if a switch or a VPN connection fails unless important business applications that users depend on are impacted. For a more holistic view, and to clearly understand how an incident will impact critical business functions, IT needs to understand the relationship between its infrastructure components and the business functions they support. One effective technique for this is “dependency trees,” which are developed by mapping IT elements to the services they provide. Then, when an alert is received, IT can instantly determine whether it is a true threat or just “noise.”
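To make the idea concrete, here is a minimal sketch of a dependency tree in Python. Every element name, the `SUPPORTS` map, and the `CRITICAL_SERVICES` set are hypothetical, invented purely for illustration; a real environment would build the map from its own inventory or CMDB.

```python
from collections import deque

# Hypothetical dependency map: each element lists what it directly supports.
SUPPORTS = {
    "switch-12": ["vpn-gateway", "file-server"],
    "vpn-gateway": ["remote-access"],
    "remote-access": ["order-entry"],
    "file-server": [],
    "lab-printer": [],
}

# Hypothetical set of services the business considers critical.
CRITICAL_SERVICES = {"order-entry"}


def impacted_services(element):
    """Return the critical services that depend, directly or not, on element."""
    seen = {element}
    queue = deque([element])
    impacted = set()
    while queue:
        node = queue.popleft()
        if node in CRITICAL_SERVICES:
            impacted.add(node)
        for dependent in SUPPORTS.get(node, []):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return impacted


def classify(element):
    """Label an alert as a 'threat' if any critical service is affected."""
    return "threat" if impacted_services(element) else "noise"
```

With this map, an alert on `switch-12` would be classified as a threat because the order-entry service ultimately depends on it, while an alert on `lab-printer` would be treated as noise.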
Reducing noise
To operate more efficiently, IT needs to eliminate most of the noise generated by these systems. Noise is the root of all evil in IT, because it saps time, money and resources. For example, if a network links multiple sites and one of those links is temporarily unavailable, the monitoring systems will generate hundreds of event signals, even though they all stem from just a single and relatively minor incident. And once the link comes back up after 30 seconds, the system monitors will generate hundreds more.
Compounding this, many IT groups still process events manually. This is extremely time-consuming and can blind managers to the truly critical issues, which are buried in the cacophony. Techniques like dependency trees and automated scripts can significantly reduce noise levels; I have seen some organizations eliminate up to 90 percent of their distractions.
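As a rough illustration of what such an automated script might do, the sketch below collapses a flood of events into one incident per root element and flags short outages as transient. The `Event` fields, the `summarize` function, and the 60-second threshold are assumptions made for this example, and the root element of each event is assumed to be known already (for instance, from a dependency map).

```python
from dataclasses import dataclass


@dataclass
class Event:
    timestamp: float  # seconds since some reference point
    source: str       # device that raised the event
    root: str         # upstream element the source depends on (assumed known)
    state: str        # "down" or "up"


def summarize(events, flap_seconds=60):
    """Collapse raw events into one incident record per root element."""
    incidents = {}
    for ev in sorted(events, key=lambda e: e.timestamp):
        inc = incidents.setdefault(
            ev.root,
            {"root": ev.root, "first_down": None, "last_up": None, "event_count": 0},
        )
        inc["event_count"] += 1
        if ev.state == "down" and inc["first_down"] is None:
            inc["first_down"] = ev.timestamp
        elif ev.state == "up":
            inc["last_up"] = ev.timestamp

    for inc in incidents.values():
        recovered = (
            inc["first_down"] is not None
            and inc["last_up"] is not None
            and inc["last_up"] >= inc["first_down"]
        )
        inc["duration"] = inc["last_up"] - inc["first_down"] if recovered else None
        # An outage that recovered within the threshold is treated as a brief flap.
        inc["transient"] = bool(recovered and inc["duration"] <= flap_seconds)
    return list(incidents.values())
```

Fed the scenario above (hundreds of "down" events followed by hundreds of "up" events 30 seconds later, all tied to one link), this would yield a single incident marked as transient rather than hundreds of separate alerts.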
But to make the most of this, IT also needs a more efficient way to respond to the events that do matter. This can be provided by what’s known as the “single pane of glass,” one-click-away method for triage and troubleshooting. It spares IT from having to chase down passwords, IP addresses, and circuit IDs in some spreadsheet or SharePoint site by placing them “one click away.” This saves even more time, effort, and money.
Looking ahead, IT needs to consider how it can provide more agility and greater resilience for the entire organization. For instance, contextual awareness will become critical with software-defined networks. Think about an executive seeking to confer with a warehouse manager who has a lower QoS profile. One way around this would be to build a new profile for the warehouse manager during the call and then tear it down when the call terminates. This can be done automatically but will require different monitoring feeds, including NetFlow information, depending on device types, wired or wireless connectivity, and so forth.
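The build-then-tear-down pattern can be sketched with a Python context manager. This is only an illustration: `qos_profiles` is a hypothetical in-memory stand-in for whatever a real SDN controller exposes, and `temporary_qos` is an invented helper, not an actual controller API.

```python
from contextlib import contextmanager

# Hypothetical in-memory stand-in for a controller's per-user QoS settings.
qos_profiles = {"executive": "high", "warehouse-manager": "low"}


@contextmanager
def temporary_qos(user, level):
    """Apply a QoS level for the duration of a call, then restore the old one."""
    previous = qos_profiles.get(user)
    qos_profiles[user] = level
    try:
        yield
    finally:
        # Tear down the temporary profile even if the call ends abnormally.
        if previous is None:
            qos_profiles.pop(user, None)
        else:
            qos_profiles[user] = previous
```

A call handler could wrap the session in `with temporary_qos("warehouse-manager", "high"):` so the elevated profile exists only while the call is active.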
The overarching point here is that IT monitoring is no longer just an “up and down” function. We all need to plan for the future, and streamlined monitoring solutions are paramount to ensuring business growth and transformation.