The application of artificial intelligence (AI) and machine learning (ML) to IT infrastructure and operations (I&O) – known as AIOps – is a hot trend in the enterprise and service provider world, and for good reason: it can turbocharge IT operations. Done correctly, AIOps increases IT operational accuracy, agility and efficiency.
For this reason, AI and AIOps are high on many CIO/CISO agendas in financial services, healthcare, retail, manufacturing, service providers, government and other segments.
The challenge is that it’s very easy for AIOps initiatives to go wrong if the IT team skips over the necessary data considerations. While AIOps is put into action more on the operational side of IT, the high-quality data it relies upon comes from the IT infrastructure, particularly the network infrastructure. Without that foundation of accurate, precise and consistent input data, the move to AIOps provides little value – and can even lead IT teams astray.
How AIOps Works
AIOps uses AI/ML and analytics to consolidate alerts, events, issues, trouble tickets, etc., and either provides actionable insights to the IT team or takes corrective action automatically on its behalf. In this sense it provides focus: it eliminates the noise and fatigue that come with triaging a never-ending barrage of alerts and instead looks below the surface to identify what is actually going on with the underlying systems. In many cases AI/ML is better and faster than humans at detecting anomalies, recognizing patterns, predicting events and narrowing down root causes, and so provides critical insights for taking corrective action.
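As a rough illustration of the alert consolidation described above, the sketch below folds repeated alerts into a single incident when they share a device and alert type and arrive within a time window. It is a minimal example and not how any particular AIOps product works: the field names (`device`, `kind`, `ts`), the function name and the 300-second window are all assumptions made for illustration.

```python
def consolidate_alerts(alerts, window_seconds=300):
    """Fold alerts with the same (device, kind) into one incident when they
    occur within window_seconds of that incident's first alert.

    Each alert is a dict with hypothetical keys: "device", "kind", "ts"
    (timestamp in seconds).
    """
    incidents = []
    open_incident = {}  # (device, kind) -> most recent incident for that key

    for alert in sorted(alerts, key=lambda a: a["ts"]):
        key = (alert["device"], alert["kind"])
        incident = open_incident.get(key)

        # Start a new incident if none is open or the window has expired.
        if incident is None or alert["ts"] - incident["first_ts"] > window_seconds:
            incident = {
                "device": alert["device"],
                "kind": alert["kind"],
                "first_ts": alert["ts"],
                "last_ts": alert["ts"],
                "count": 0,
            }
            incidents.append(incident)
            open_incident[key] = incident

        incident["count"] += 1
        incident["last_ts"] = alert["ts"]

    return incidents
```

Even this simple grouping turns a burst of identical alerts into one item for the team to triage; a real system would add correlation across devices and alert types.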
It also provides actionable context, giving IT teams the ability to act quickly and decisively. Taken to its limit, this contextual insight from correlated and analyzed data can ultimately enable automated optimizations and corrections: in short, “autonomous network operation”. It can also help baseline the system’s performance and compare key metrics as changes are made to identify improvements, resulting in an ongoing process of correcting, assessing and tuning based on “closed-loop” automation. By continuously and automatically crunching through reams of data, AIOps eliminates a lot of guesswork, bias and finger-pointing while reducing mean time to resolution (MTTR) and improving operational efficiency.
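The idea of baselining a metric and comparing it after a change can be sketched in a few lines. The example below is a deliberately crude check, assuming a list of baseline samples and a list of post-change samples for one metric (for instance latency in milliseconds). The function name, the returned keys and the threshold of three standard deviations are illustrative assumptions, not a recommended statistical test.

```python
from statistics import mean, stdev


def compare_to_baseline(baseline, current, threshold=3.0):
    """Compare the mean of `current` samples with the `baseline` samples.

    The shift is expressed in baseline standard deviations and flagged as
    significant when its magnitude reaches `threshold` (an assumed cut-off).
    """
    base_mean = mean(baseline)
    base_std = stdev(baseline)  # requires at least two baseline samples
    cur_mean = mean(current)
    shift = cur_mean - base_mean

    if base_std > 0:
        score = shift / base_std
    elif shift == 0:
        score = 0.0
    else:
        # A perfectly flat baseline makes any change infinitely unusual.
        score = float("inf") if shift > 0 else float("-inf")

    return {
        "baseline_mean": base_mean,
        "current_mean": cur_mean,
        "score": score,
        "significant": abs(score) >= threshold,
    }
```

In a closed-loop setup, a result like this would be evaluated after each change to decide whether to keep it, tune further or roll back.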
Data, however, is AIOps’ Achilles’ heel: without the right data, in sufficient quantity, at the appropriate frequency and of high quality, the AI/ML can’t provide accurate analysis. In fact, it can have a detrimental effect by reaching erroneous conclusions.
How Data Drives AIOps
Your systems tell a story about what has happened, what is happening and what will happen throughout your infrastructure; it lies in the continuous stream of data generated by network and security systems. If you can read and interpret the story, you can fully understand what is going on. At its simplest level, that’s what AIOps does.
Operationally, AIOps data flows from its source through an extract, transform and load (ETL) pipeline, which normalizes it for processing and feeds it into a data lake for analysis. As with most AI/ML-based systems, the quality of this data – its consistency, reliability, completeness, accuracy and precision – matters as much as the data itself. For example, too few sources, or data that is inconsistent or incomplete, result in blind spots and skewed decisions, which could lead to bad experiences and lost business.
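To make the normalize-then-check step of such a pipeline concrete, here is a minimal sketch. It assumes records arrive as dictionaries with inconsistent field names from different sources; the target schema, the alias table and the function names are all hypothetical and stand in for whatever a real ETL pipeline would define.

```python
# Hypothetical target schema and source-specific aliases (assumptions).
REQUIRED_FIELDS = ("timestamp", "source", "metric", "value")
FIELD_ALIASES = {
    "ts": "timestamp",
    "time": "timestamp",
    "host": "source",
    "device": "source",
    "name": "metric",
    "val": "value",
}


def normalize(record):
    """Rename known aliases so every record uses the same field names."""
    return {FIELD_ALIASES.get(key, key): value for key, value in record.items()}


def validate(record):
    """Return a list of quality problems; an empty list means the record passes."""
    problems = [f"missing {field}" for field in REQUIRED_FIELDS
                if record.get(field) is None]
    value = record.get("value")
    if value is not None and not isinstance(value, (int, float)):
        problems.append("value is not numeric")
    return problems


def run_pipeline(raw_records):
    """Normalize each record, then split the batch into accepted and rejected."""
    accepted, rejected = [], []
    for raw in raw_records:
        record = normalize(raw)
        problems = validate(record)
        if problems:
            rejected.append((raw, problems))
        else:
            accepted.append(record)
    return accepted, rejected
```

Keeping the rejected records together with the reasons they failed makes blind spots visible: a source that is rejected often is a data-quality problem to fix upstream rather than something to drop silently.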
Essential Data Ingredient You Cannot Miss
If data is what fuels AI/ML and analytics, then when it comes to AIOps it’s network data in its purest form – i.e., network packet data – that matters. Since the network connects everything from applications to IoT devices to end users in the larger organization, it carries a lot of information about people, systems and experiences. At the same time, the network is often blamed first when problems arise, even if the root cause is located elsewhere, which makes network data even more important for showing that the network is not at fault.
In this sense the network serves as a proxy for the application performance, security and end-user experience of the overall IT ecosystem. Specifically, high-resolution network packet data can feed AIOps with insight into problems; information flows for users, applications, cloud, IoT, etc.; security threats, including malicious activity; and, of course, application performance and the related end-user experience.
But how do you gain access to high-quality packet or flow data? It starts with deploying devices such as network TAPs or virtual TAPs that mirror the raw network traffic. The collected data is centrally aggregated and consolidated using a Network Packet Broker (NPB), which processes, normalizes and forwards the high-resolution network packet data in real time to the AI/ML systems for AIOps. Some AIOps systems may also consume network flow data, which can be generated from the collected packet data by specialized appliances inserted after the packet broker.
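The relationship between packet data and flow data can be shown with a small sketch: packets that share the same source, destination, ports and protocol are summarized into one flow record with packet and byte counts. This is a simplified illustration only. The packet dictionary layout and the function name are assumptions, and real flow generation also handles details such as timeouts and flow direction that are omitted here.

```python
def packets_to_flows(packets):
    """Summarize packet records into flow records keyed by 5-tuple.

    Each packet is a dict with hypothetical keys: "src", "dst", "sport",
    "dport", "proto", "length" (bytes) and "ts" (timestamp in seconds).
    """
    flows = {}
    for packet in packets:
        key = (packet["src"], packet["dst"],
               packet["sport"], packet["dport"], packet["proto"])
        flow = flows.get(key)
        if flow is None:
            flow = flows[key] = {
                "packets": 0,
                "bytes": 0,
                "first_ts": packet["ts"],
                "last_ts": packet["ts"],
            }
        flow["packets"] += 1
        flow["bytes"] += packet["length"]
        # Packets may arrive out of order, so track both ends explicitly.
        flow["first_ts"] = min(flow["first_ts"], packet["ts"])
        flow["last_ts"] = max(flow["last_ts"], packet["ts"])
    return flows
```

The summary is far smaller than the packets it describes, which is why some systems prefer flow data, but the per-packet detail is lost in the process.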
Rolling Out AIOps
Beginning an AIOps pilot deployment by focusing on network data acquisition makes for a solid crawl-walk-run approach. It simplifies the initial deployment because the availability and ubiquity of network TAPs and packet brokers minimize the complexity and risk of data acquisition. It also ensures that the data is consistent, reliable, complete, accurate and precise, and it allows other data sources to be added later as AIOps shows value.
It’s quickly becoming an article of faith that AIOps’ use of AI/ML and analytics can help I&O teams become more efficient in the face of ever-increasing alerts, complexity and workloads by providing focus and clarity on the most important issues. Moreover, AIOps speeds up time to resolution by eliminating manual workflows. But the key to AIOps success (and ROI across the board for IT functions including DevOps, AppOps, SecOps, CloudOps and NetOps) is to start with a solid, network-centric data foundation.