ITOps 1.4.0 consists of the following three key features and minor enhancements.
The ITSM integration framework allows users to:
Leverage an existing framework for building new integrations.
Build on top of existing integrations by creating their own configurations.
This follows a no-code approach in which users do not depend on the technical team to get an integration done. The approach is aimed at giving users more freedom and supporting more integrations by providing a robust platform that can be customized as required.
For more information, refer to ITSM Integration for ITOps.
ITOps can now continually monitor alert data and form patterns based on alert volumes and other configurable parameters. It can therefore also identify changes in these patterns that might indicate a surge scenario.
When a surge is detected, the tool can flag the surge alerts separately and run surge-specific correlation rules. These override all other correlation policies that apply in a non-surge scenario. Users can set custom rules and correlation policies to handle surges, and the feature works by grouping all surge alerts into a single cluster for easy viewing and processing.
This ensures lower noise generation within the system and provides an effective method to identify and fix surge scenarios.
The start or end of an alert surge is detected based on the volume of alerts received by ITOps in a specific duration and on the surge configurations.
Surge detection runs over time-series data of certain percentiles of the alert count that comes into ITOps at fixed time intervals. To do this, ITOps has an analytics scheduler that runs at fixed intervals (every 1 minute by default) and calculates the count of alerts newly added to ITOps in that interval.
ITOps uses two percentiles of the alert count: one to detect the start of a surge and the other to detect its end. These are termed the Surge Start Percentile and the Surge End Percentile. The analytics scheduler calculates both percentiles against the alert counts collected over a fixed interval, which is determined by the Surge Analytics Interval (10 minutes by default). When the calculated value of the surge start percentile exceeds the Surge Start Percentile Threshold, a surge is detected.
Once a surge is detected, it remains on until the surge end percentile falls below the Surge End Percentile Threshold.
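The start and end checks described above can be sketched in Python as follows. This is a minimal illustration, not the ITOps implementation: the `SurgeDetector` class, its parameter names, the nearest-rank percentile method, and the sample thresholds are all assumptions made for the example. It also assumes one alert-count sample per scheduler run, so a window of 5 samples stands in for the Surge Analytics Interval.

```python
import math
from collections import deque


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty sequence (assumed method)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


class SurgeDetector:
    """Hypothetical sketch of percentile-based surge start/end detection."""

    def __init__(self, start_pct, start_threshold, end_pct, end_threshold, window):
        self.start_pct = start_pct
        self.start_threshold = start_threshold
        self.end_pct = end_pct
        self.end_threshold = end_threshold
        # Keep only the alert counts that fall inside the analytics interval.
        self.counts = deque(maxlen=window)
        self.in_surge = False

    def observe(self, alert_count):
        """Record one scheduler run's alert count and return the surge state."""
        self.counts.append(alert_count)
        if not self.in_surge:
            # Surge starts when the start percentile exceeds its threshold.
            if percentile(self.counts, self.start_pct) > self.start_threshold:
                self.in_surge = True
        elif percentile(self.counts, self.end_pct) < self.end_threshold:
            # Surge ends when the end percentile falls below its threshold.
            self.in_surge = False
        return self.in_surge


# Illustrative values only: one alert count per scheduler run.
detector = SurgeDetector(start_pct=50, start_threshold=100,
                         end_pct=20, end_threshold=30, window=5)
counts_per_run = [40, 45, 42, 300, 320, 310, 305, 40, 20, 15, 18]
states = [detector.observe(count) for count in counts_per_run]
print(states)
```

With these sample values the detector reports a surge from the sixth run, once most of the window holds high counts, and clears it on the ninth run, when the 20th percentile of the window drops below 30.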
Surge Patterns help users identify normal alerts that come into the system during a surge and handle them in the usual way. Users can enter multiple patterns whose values are likely to be identical across surge alerts. If the system identifies that the percentage of alerts following any one of the patterns exceeds the Surge Pattern Match Threshold, that pattern is set as the criterion for surge alerts. Any alert that does not meet the criterion is considered a normal alert. For example, if Surge Pattern is set to nodeName and Surge Pattern Match Threshold is set to 80, ITOps checks whether 80% or more of the alerts during the surge have the same node name. If they do, and the node name is, say, SWITCH100, ITOps identifies all alerts from SWITCH100 as surge alerts and treats alerts from other devices as normal alerts.
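The nodeName example can be sketched for a single pattern as follows. This is an illustrative sketch under stated assumptions, not the ITOps implementation: alerts are modelled as plain dictionaries, the function name `split_surge_alerts` is hypothetical, and the comparison follows the "80% or more" wording of the example.

```python
from collections import Counter


def split_surge_alerts(alerts, pattern_fields, match_threshold):
    """Split alerts into (surge, normal) lists for one surge pattern.

    Returns None when no single combination of field values is shared by
    at least match_threshold percent of the alerts.
    """
    keys = [tuple(alert.get(field) for field in pattern_fields) for alert in alerts]
    # Ignore alerts that lack one of the pattern fields when looking for a match.
    complete = [key for key in keys if None not in key]
    if not complete:
        return None
    dominant, hits = Counter(complete).most_common(1)[0]
    if hits * 100 < match_threshold * len(alerts):
        return None
    surge = [alert for alert, key in zip(alerts, keys) if key == dominant]
    normal = [alert for alert, key in zip(alerts, keys) if key != dominant]
    return surge, normal


# Eight of ten alerts come from SWITCH100, so an 80% threshold is met.
alerts = [{"nodeName": "SWITCH100", "id": i} for i in range(8)]
alerts += [{"nodeName": "ROUTER7", "id": 8}, {"nodeName": "FW2", "id": 9}]
surge, normal = split_surge_alerts(alerts, ["nodeName"], 80)
print(len(surge), len(normal))  # 8 2
```

A combined pattern would pass several field names, for example `["nodeName", "severity"]`, in which case all listed fields must match for an alert to count towards the pattern.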
To enable surge detection, set the following properties in the project configuration screen.
Surge Start Percentile – A numeric value that determines the percentile of alert count to be monitored for detecting surge start. For example, 50 stands for the 50th percentile.
Surge Start Percentile Threshold – A numeric value that determines the threshold for the surge start percentile. If the start percentile calculated at a point in time exceeds this value, a surge is detected.
Surge End Percentile – A numeric value that determines the percentile of alert count to be monitored for detecting surge end. For example, 20 stands for the 20th percentile.
Surge End Percentile Threshold – The threshold value for the surge end percentile. If the end percentile calculated at a point in time while a surge is on falls below this value, the surge comes to an end.
Surge Patterns – Individual alert fields or combinations of fields that may be identical in alerts that contribute towards a surge. A combination of fields is specified by separating each field with |. Multiple patterns can be given, separated by commas. For example: field1, field1|field2, field3|field4, field5
Surge Pattern Match Threshold – A numeric value that determines the percentage of alerts that should have an identical surge pattern for ITOps to detect a pattern among surge alerts. For example, 80 means the threshold is 80%.
Surge Analytics Interval – The time interval in minutes over which the percentile calculations are done. Defaults to 10.
Surge First Run Count – The minimum number of alert count records that should be captured before the surge check can effectively happen. Defaults to 10.
Surge First Run Count Interval – Alert count is calculated for the time interval between the last scheduler run and the current time. For the first run there is no previous run, so the user has to specify the time interval in minutes for which the alert count should be calculated. This should ideally be the same as the interval given for the analytics scheduler.
Ignore Surge Without Pattern – Checkbox that determines how to handle the alerts during a surge if no pattern is detected. The user can choose to consider all alerts coming in during the surge window as surge alerts or to treat them all as normal alerts.
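To illustrate the Surge Patterns syntax, the following Python sketch splits a configured value into individual patterns and their fields. The function name `parse_surge_patterns` is hypothetical, and the sketch assumes that field names contain no commas or | characters and that surrounding whitespace is not significant.

```python
def parse_surge_patterns(raw):
    """Parse a Surge Patterns value into a list of field-name lists."""
    patterns = []
    for entry in raw.split(","):
        # Fields of one combined pattern are separated by "|".
        fields = [name.strip() for name in entry.split("|")]
        fields = [name for name in fields if name]
        if fields:
            patterns.append(fields)
    return patterns


patterns = parse_surge_patterns("field1, field1|field2, field3|field4, field5")
print(patterns)
# [['field1'], ['field1', 'field2'], ['field3', 'field4'], ['field5']]
```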
ITOps 1.4 introduces the concept of templates for ticket creation. Installation engineers can, and should, define templates to use during the ticket creation step. This is a mandatory step; without it, ticket creation fails and alerts go to correlation incomplete.
With the introduction of templates, it is possible to cater to the needs of different ITSM tools or customer-specific rules during ticket creation.
/api/ticketTemplate is the API to use for creating ticket templates. For example, here are the steps to define templates for ticket creation, for a customer with alerts from Solarwinds, Verba, Forescout and Prognosis.
For more information, refer to Creating Ticket Template.
The following enhancements are available in the platform:
Custom project settings for each project to allow for ITSM configurations at project level.
This allows the user to set custom ITSM configurations at a project level.
Ability to reassign a ticket once it is already assigned to a team member.
Enables a ticket to be reassigned from one user to another for practical reasons.
Ability to perform bulk assignments from the alert console.
Helps users save time by enabling bulk assignments to a person or group instead of assigning alerts individually.
Ability to perform bulk acknowledgements.
Enables users to acknowledge multiple alerts at the same time.
Ability to specify the number of alerts per page in the alert console.
Provides the capability to select the number of alerts displayed per page from a dropdown, with values ranging from 10 to 100.