This article is contributed. See the original author and article here.
Microsoft partners and customers have been building metrics monitoring solutions for AIOps and predictive maintenance, by leveraging the easy-to-use time-series anomaly detection Cognitive Service: Anomaly Detector. Because of its ability to analyze time-series individually, Anomaly Detector is benefiting the industry with its simplicity and scalability.
We are pleased to announce the new multi-variate capability of Anomaly Detector. The new multivariate anomaly detection APIs in Anomaly Detector further enable developers to easily integrate advanced AI of detecting anomalies from groups of metrics into their applications without the need for machine learning knowledge or labeled data. Dependencies and inter-correlations between different signals are now counted as key factors. The new feature protects your mission-critical systems and physical assets, such as software applications, servers, factory machines, spacecraft, or even your business, from failures with a holistic view.
Imagine 20 sensors from an auto engine generating 20 different signals, e.g., vibration, temperature, etc. The readings of those signals individually may not tell you much on system-level issues, but together, could represent the health of the engine. When the synergy of those signals turns odd, the multivariate anomaly detection feature can sense the anomaly like a seasoned floor expert. Moreover, the AI models are trained and customized for your data such that it understands your business. With the new APIs in Anomaly Detector, developers can now easily integrate the multivariate time-series anomaly detection capabilities as well as the interpretability of the anomalies into predictive maintenance solutions, or AIOps monitoring solutions for complex enterprise software, or business intelligence tools.
“Medical device production demands unprecedented precision. For this reason, the Siemens Healthineers team uses Multivariate Anomaly Detector (MVAD) in medical device stress tests during the final inspection in the production. We found MVAD easy to use and work almost out of the box with promising performance. With the ready-to-use model, we don’t need to develop a custom AD model, which ensures a short time to market. We plan to expand this technology also to other use cases. It is made easy due to good integration into our ML platform and processes.” – Dr. Jens Fürst, Head Digitalization and Automation at Siemens Healthineers
To better understand the health and condition of the aircraft and foresee and fix potential problems before they occur, Airbus deployed Anomaly Detector, part of Cognitive Services, to gather and analyze the telemetry data. It began as a proof of concept of the aircraft-monitoring application by loading telemetry data from multiple flights for analysis and model training. “Early tests have shown that for many cases, the out-of-the-box solution works beautifully, which helps us deploy our solutions faster. I would say that we save up to three months on development for our smaller use cases with Anomaly Detector.”
Marcel Rummens: Product Owner of Internal AI Platform, Airbus
Time-series anomaly detection is an important research topic in data mining and has a wide range of applications in the industry. Efficient and accurate anomaly detection helps companies to monitor their key metrics continuously and alert for potential incidents on time. In many real-world applications like predictive maintenance and SpaceOps, multiple time-series metrics are collected to reflect the health status of a system. Univariate time-series anomaly detection algorithms can find anomalies for a single metric. However, it could be problematic in deciding whether the whole system is running normally. For example, sudden changes of a certain metric do not necessarily mean failures of the system. As shown in Figure 1, there are obvious boosts in the volume of TIMESERIES RECEIVED and DATA RECEIVED ON FLINK in the green segment, but the system is still in a healthy state as these two features share a consistent tendency. However, in the red segment, GC shows an inconsistent pattern with other metrics, indicating a problem in garbage collection. Consequently, it is essential to take the correlations between different time series into consideration in a multivariate time-series anomaly detection system.
In this newly introduced feature, we productized a novel framework — MTAD-GAT (Multivariate Time-series Anomaly Detection via Graph Attention Network), to tackle the limitations of previous solutions. Our method considers each univariate time-series as an individual feature and tries to model the correlations between different features explicitly, while the temporal dependencies within each time-series are modeled at the same time. The key ingredients in our model are two graph attention layers, namely the feature-oriented graph attention layer and the time-oriented graph attention layer. The feature-oriented graph attention layer captures the causal relationships between multiple features, and the time-oriented graph attention layer underlines the dependencies along the temporal dimension. In addition, we jointly train a forecasting-based model and a reconstruction-based model for better representations of time-series data. The two models can be optimized simultaneously by a joint objective function.
The magic behind the scenes can be summarized as follows:
- A novel framework to solve the multivariate time-series anomaly detection problem in a self-supervised manner. Our model shows superior performances on two public datasets and establishes state-of-the-art scores in the literature.
- For the first time, we leverage two parallel graph attention (GAT) layers to learn the relationships between different time-series and timestamps dynamically. Especially, our model captures the correlations between different time-series successfully without any prior knowledge.
- We integrate the advantages of both forecasting-based and reconstruction-based models by introducing a joint optimization target. The forecasting-based model focuses on single-timestamp prediction, while the reconstruction-based model learns a latent representation of the entire time-series.
- Our network has good interpretability. We analyze the attention scores of multiple time-series learned by the graph attention layers, and the results correspond reasonably well to human intuition. We also show its capability of anomaly diagnosis.
Multivariate anomaly detection API overview
This new feature has a different workflow compared with the existing univariate feature. There are two phases to obtain the detection results, the training phase, and the inference phase. In the training phase, you need to provide some historical data to let the model learn past patterns. Then in the inference phase, you can call the inference API to acquire detection results of multivariate time-series in a given range.
Create and train model using training data
Get model info including training status and parameters used in the model
List models of a subscription
Submit inference task with user’s data, this is async
Get anomalies + root causes (the contribution scores of each variate for each incident)
Delete an existing multivariate model according to the modelId
Export Multivariate Anomaly Detection Model as Zip file
- Learning more from our documentation
- Artificial Intelligence for developers
Brought to you by Dr. Ware, Microsoft Office 365 Silver Partner, Charleston SC.