Introduction to Azure Anomaly Detection API
What is an Anomaly?
Anomaly comes from the Greek word ‘anomalia’, which means uneven. It can be defined as something that deviates from what is standard, normal, or expected. Deviations can be on either the positive or the negative side. Examples include:
- Spike in body temperature
- A movie theatre without seats
- A chocolate candy without sugar
- Irregular heartbeat (arrhythmia)
All of these are examples of anomalies. Anomalies are abnormal, but labelling them as good or bad isn’t so simple; the context largely determines that. An abnormal body temperature or heartbeat is usually a bad sign, whereas a candy without sugar is arguably healthier.
Businesses can benefit from detecting anomalies in several ways, such as:
- Detecting malfunctioning equipment through IoT monitoring
- Improving security by flagging a sudden increase in the number of requests (for example, a DoS attack)
- Catching a sudden spike in requests that can degrade the performance of compute resources such as databases, VM disks, Azure Analysis servers, and more
- Conversely, noticing a sudden decrease in compute utilisation, which may indicate that a resource that should be running has stopped for unknown reasons
- Spotting a cost spike when a large resource such as an Azure cluster is left running
Introduction to Azure Anomaly Detection API
The Azure Anomaly API, part of Azure Cognitive Services, was introduced to detect anomalies in the data fed to it. Anomaly detection is crucial for any business: catching an anomaly early can help prevent bill-shock moments or protect the image of the business in the market.
Azure uses machine learning models and performs time-series analysis on the data for this task. The details are abstracted from the user, and the service is exposed via a REST API, the Anomaly API. The API can perform the following types of operations:
1. Batch Detection – Batch Operation
Detects anomalies in a batch of time-series data. The service generates a model from this data and analyzes each data point within it.
2. Streaming Detection – Streaming Operation
Detects anomalies in streaming data. Data is continuously fed to the API, which uses it to generate a model. The API then analyzes only the latest data point for anomalies.
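To make the two operations concrete, here is a minimal sketch of how each one maps to a REST path. The paths are the v1.0 routes as I understand them, and the endpoint host and the `detect_url` helper are hypothetical, so check everything against the API reference for your own resource.

```python
# Assumed v1.0 REST paths for the two operations; confirm them in the API reference.
OPERATION_PATHS = {
    "batch": "/anomalydetector/v1.0/timeseries/entire/detect",
    "streaming": "/anomalydetector/v1.0/timeseries/last/detect",
}


def detect_url(endpoint: str, operation: str) -> str:
    """Join a resource endpoint with the path for the chosen operation."""
    if operation not in OPERATION_PATHS:
        raise ValueError(f"unknown operation: {operation!r}")
    return endpoint.rstrip("/") + OPERATION_PATHS[operation]


# The host below is a placeholder, not a real resource.
print(detect_url("https://example.cognitiveservices.azure.com/", "batch"))
```

Both operations accept the same kind of request body; the difference is whether every point or only the latest point is evaluated.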
How to Use the Service
Steps to follow:
- Create an Anomaly Detector resource in Azure from the portal
- Copy the Endpoint and key from the resource overview. Append the path for the type of operation you want to perform (discussed in the previous section) to the Endpoint, and authenticate each API call with the key
- Load the time-series data and send it in the request
- If there are anomalies, they are reported in the response’s isAnomaly key. For batch detection, this key holds one boolean per data point, and the positions of the true values are the indices of the anomalous data points.
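The steps above can be sketched with the Python standard library alone. This is a sketch under assumptions, not the official client: the key is sent in the `Ocp-Apim-Subscription-Key` header used by Cognitive Services, the batch response is assumed to carry `isAnomaly` as a list of booleans, and the helper names (`build_request`, `detect`, `anomaly_indices`) are hypothetical.

```python
import json
import urllib.request


def build_request(url: str, key: str, body: dict) -> urllib.request.Request:
    """Prepare an authenticated POST; nothing is sent until urlopen is called."""
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: the resource key is passed in this header.
            "Ocp-Apim-Subscription-Key": key,
        },
        method="POST",
    )


def detect(url: str, key: str, body: dict) -> dict:
    """Send the request and decode the JSON response (performs a network call)."""
    with urllib.request.urlopen(build_request(url, key, body)) as resp:
        return json.load(resp)


def anomaly_indices(response: dict) -> list[int]:
    """Return the positions at which the response flags an anomaly."""
    return [i for i, flag in enumerate(response.get("isAnomaly", [])) if flag]
```

With a real endpoint and key, `anomaly_indices(detect(url, key, body))` would give the positions of the flagged points in the submitted series.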
Things to Take Care of in the API
The API has some important parameters that have to be considered when making the request. They are:
- Granularity: the parameter that describes the periodicity of the data. It can be one of: minutely, hourly, daily, weekly, monthly, or yearly. More information can be found in the API documentation.
- Sensitivity: takes integer values from 0 to 99. The higher the value, the more sensitive the model is to anomalies. The right value has to be determined per use case.
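As an illustration of where these two parameters go, the sketch below assembles a request body. The JSON field names (`series`, `timestamp`, `value`, `granularity`, `sensitivity`) follow the REST API as I understand it and should be treated as assumptions; `build_body` and the default sensitivity of 95 are hypothetical choices for the example.

```python
from datetime import datetime, timedelta, timezone

GRANULARITIES = {"minutely", "hourly", "daily", "weekly", "monthly", "yearly"}


def build_body(values, start, step, granularity, sensitivity=95):
    """Build a request body from evenly spaced values starting at `start`."""
    if granularity not in GRANULARITIES:
        raise ValueError(f"unsupported granularity: {granularity!r}")
    if not isinstance(sensitivity, int) or not 0 <= sensitivity <= 99:
        raise ValueError("sensitivity must be an integer from 0 to 99")
    series = [
        {
            # Timestamps are formatted as ISO 8601 in UTC.
            "timestamp": (start + i * step).strftime("%Y-%m-%dT%H:%M:%SZ"),
            "value": float(v),
        }
        for i, v in enumerate(values)
    ]
    return {"series": series, "granularity": granularity, "sensitivity": sensitivity}


body = build_body(
    values=[10, 11, 10, 12],
    start=datetime(2024, 1, 1, tzinfo=timezone.utc),
    step=timedelta(days=1),
    granularity="daily",
)
```

The step between timestamps should match the declared granularity, which is why both are passed together here.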
You can make calls either from your own code and frameworks or from this platform by Azure.
During experimentation, I found that the API enforces certain conditions and returns HTTP Error Code 400 – BAD REQUEST if they are not met. They are:
- The request takes time-series data, which implies that the data points need to be in chronological order
- The API expects a minimum of 12 data points. Anything less is too small a data set to generate a model and detect anomalies.
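A small pre-flight check can catch both conditions before a request is sent. This is a sketch only: it assumes each data point is a dict with an ISO 8601 `timestamp`, the `validate_series` helper is hypothetical, and the service may enforce further rules that are not covered here.

```python
from datetime import datetime

MIN_POINTS = 12  # minimum number of data points noted above


def validate_series(series):
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    if len(series) < MIN_POINTS:
        problems.append(f"need at least {MIN_POINTS} points, got {len(series)}")
    # fromisoformat does not accept a trailing "Z" on older Python versions.
    stamps = [
        datetime.fromisoformat(p["timestamp"].replace("Z", "+00:00")) for p in series
    ]
    if any(later <= earlier for earlier, later in zip(stamps, stamps[1:])):
        problems.append("timestamps are not in strictly ascending order")
    return problems
```

Returning a list of messages, rather than raising on the first failure, lets the caller log every problem in one pass.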
Azure Anomaly Detection is a great service. It detects not only spikes and dips but also trend changes and off-cycle softness, all exposed through a single API endpoint. If you are using a custom-made tool to detect anomalies, Azure’s offering is definitely worth a look. It uses an ensemble of the following algorithms for detection:
- Fourier Transformation
- Extreme Studentized Deviate (ESD)
- STL Decomposition
- Dynamic Threshold
- Z-score detector
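To give a feel for the simplest item in this list, here is a textbook z-score detector. It is not Azure’s implementation, and the service combines several techniques; the `zscore_anomalies` function and its threshold of 3 standard deviations are illustrative assumptions.

```python
from statistics import mean, pstdev


def zscore_anomalies(values, threshold=3.0):
    """Return indices of values more than `threshold` std devs from the mean."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # a constant series has no outliers by this measure
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


readings = [10] * 20 + [100]
print(zscore_anomalies(readings))  # → [20]
```

A plain z-score ignores seasonality and trend, which is one reason it is usually paired with decomposition-based methods.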
This service is one of those good ones that you don’t know you need until you use it.
Whether your system is ingesting data frequently or you have time-series data sitting idle in your Data Warehouse, Azure Anomaly Detection API can probably add good value to the business.
Architect a Flow Around Anomaly Detection
A super simple system architecture can look like:
- Data is stored in a data store such as a disk, database, or warehouse
- A timer-triggered function reads the data periodically
- The function makes a REST call to the Anomaly Detection API
- The business is notified of any anomalies on Teams/Slack/Email so that someone can take action accordingly
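The flow above can be sketched as one function that a timer would invoke. Everything here is hypothetical glue: `read_series`, `detect`, and `notify` are placeholders for your own data access, API call, and Teams/Slack/Email integration, and the response is assumed to carry `isAnomaly` as one boolean per data point.

```python
from typing import Callable


def run_once(
    read_series: Callable[[], dict],
    detect: Callable[[dict], dict],
    notify: Callable[[str], None],
) -> int:
    """Read data, ask the detector, and notify if anything is flagged."""
    body = read_series()
    response = detect(body)
    # Pair each submitted point with its flag and keep the flagged ones.
    flagged = [
        point
        for point, is_anomaly in zip(body["series"], response.get("isAnomaly", []))
        if is_anomaly
    ]
    if flagged:
        lines = [f'{p["timestamp"]}: {p["value"]}' for p in flagged]
        notify("Anomalies detected:\n" + "\n".join(lines))
    return len(flagged)
```

Passing the three steps in as callables keeps the function easy to test with stubs and lets the same logic run behind an Azure Functions timer trigger, a cron job, or any other scheduler.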