|
|
(142 intermediate revisions by the same user not shown) |
Line 1: |
− | == Use Cases ==
| |
− | # Create a new incident
| |
− | # Display all incidents in Ops Console
| |
− | # Display all open, acknowledged or resolved incidents in Ops Console
| |
− | # Display all open, acknowledged or resolved incidents assigned to a user in Ops Console
| |
− | # Acknowledge an incident in Ops Console
| |
− | # Resolve an incident in Ops Console
| |
| | | |
− | == Concepts ==
| |
− | * Incidents
| |
− | ** An incident is created when an alarm transitions to the ALARM or UNDETERMINED state, and it is associated with that alarm.
| |
− | ** Incidents enable alarms to
| |
− | *** Track status
| |
− | *** Be assigned to users
| |
− | *** Be commented on by users
| |
− | ** An incident has one of three statuses
| |
− | *** OPEN: When an incident is created, its status is OPEN.
| |
− | *** ACKNOWLEDGED: While an incident is being worked on, its status is ACKNOWLEDGED.
| |
− | *** RESOLVED: When an incident is closed, its status is RESOLVED.
| |
− | ** Some of the concepts around incidents are "borrowed" from PagerDuty. See https://developer.pagerduty.com/documentation/rest/incidents.
| |
− | * Alarm
| |
− | ** There are three states of an alarm
| |
− | *** OK
| |
− | *** ALARM
| |
− | *** UNDETERMINED
| |
− | * Alarm state transition event
| |
− | ** An event created by the Threshold Engine when an alarm transitions from one state to another.
| |
− | * Assignment/Owner
| |
− | ** The user that the incident is assigned to.
| |
− | * Comment
| |
− | ** A comment on an incident.
| |
− | * Actions
| |
− | ** Similar to alarm definition actions in Monasca, incidents can also have actions, which occur when an incident is modified.
| |
− |
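The concepts above can be sketched as a small data model. This is only an illustrative sketch: the class and field names (`Incident`, `AlarmStateTransitionEvent`, `owner`, `events`, `comments`) are hypothetical and are not taken from Monasca or any existing API.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class AlarmState(Enum):
    """The three alarm states listed above."""
    OK = "OK"
    ALARM = "ALARM"
    UNDETERMINED = "UNDETERMINED"


class IncidentStatus(Enum):
    """The three incident statuses listed above."""
    OPEN = "OPEN"
    ACKNOWLEDGED = "ACKNOWLEDGED"
    RESOLVED = "RESOLVED"


@dataclass
class AlarmStateTransitionEvent:
    """Hypothetical shape of the event emitted when an alarm changes state."""
    alarm_id: str
    old_state: AlarmState
    new_state: AlarmState


@dataclass
class Incident:
    """Hypothetical incident record associated with exactly one alarm."""
    alarm_id: str
    status: IncidentStatus = IncidentStatus.OPEN  # new incidents start as OPEN
    owner: Optional[str] = None                   # unassigned until a user is assigned
    events: List[AlarmStateTransitionEvent] = field(default_factory=list)
    comments: List[str] = field(default_factory=list)


incident = Incident(alarm_id="alarm-1")
incident.comments.append("Investigating high CPU usage.")
```

Keeping alarm states and incident statuses as separate enumerations mirrors the distinction drawn above: an alarm has a state, while an incident has a status.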
| |
− | == Incident Lifecycle ==
| |
− | This section describes the lifecycle of an incident.
| |
− |
| |
− | Alarm state transition events are processed as follows:
| |
− | # To ALARM
| |
− | ## Opens a new incident for the supplied alarm, or adds an alarm state transition event to an existing incident.
| |
− | ### If no incident exists for the alarm, or the existing incident is RESOLVED, a new incident is created with a status of OPEN.
| |
− | ### If an incident with a status of OPEN or ACKNOWLEDGED exists for the alarm, the alarm state transition event is added to that incident, and its status is not modified.
| |
− | # To OK
| |
− | ## Adds an alarm state transition event to an existing incident.
| |
− | ### If no incident exists for the alarm, or the existing incident is RESOLVED, nothing is done.
| |
− | ### If an incident with a status of OPEN or ACKNOWLEDGED exists for the alarm, the alarm state transition event is added to that incident, and its status is not modified.
| |
− | # To UNDETERMINED
| |
− | ## Opens a new incident for the supplied alarm, or adds an alarm state transition event to an existing incident.
| |
− | ### If no incident exists for the alarm, or the existing incident is RESOLVED, a new incident is created with a status of OPEN.
| |
− | ### If an incident with a status of OPEN or ACKNOWLEDGED exists for the alarm, the alarm state transition event is added to that incident, and its status is not modified.
| |
− |
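The transition rules above can be sketched as a single function. This is a minimal sketch under stated assumptions: the `Incident` class, the `process_transition` function, and the dictionary that maps each alarm ID to its most recent incident are all hypothetical, and states and statuses are plain strings for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Statuses for which an incident still accepts new transition events.
ACTIVE_STATUSES = {"OPEN", "ACKNOWLEDGED"}


@dataclass
class Incident:
    """Hypothetical incident record; events holds the new alarm states in order."""
    alarm_id: str
    status: str = "OPEN"
    events: List[str] = field(default_factory=list)


def process_transition(
    incidents: Dict[str, Incident], alarm_id: str, new_state: str
) -> Optional[Incident]:
    """Apply one alarm state transition event and return the affected incident, if any."""
    current = incidents.get(alarm_id)
    if current is not None and current.status in ACTIVE_STATUSES:
        # An OPEN or ACKNOWLEDGED incident exists: record the event, keep the status.
        current.events.append(new_state)
        return current
    if new_state in ("ALARM", "UNDETERMINED"):
        # No incident, or the last one is RESOLVED: open a new incident.
        created = Incident(alarm_id=alarm_id, events=[new_state])
        incidents[alarm_id] = created
        return created
    # Transition to OK with no active incident: nothing is done.
    return None


incidents: Dict[str, Incident] = {}
first = process_transition(incidents, "alarm-1", "ALARM")  # opens a new incident
process_transition(incidents, "alarm-1", "OK")             # appended to the same incident
```

Because ALARM and UNDETERMINED are handled identically, the sketch treats them in one branch; only a transition to OK can result in no action.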
| |
− | Acknowledge incident
| |
− | # Set the incident status to ACKNOWLEDGED.
| |
− | # If an incident is acknowledged, it won't generate any additional notifications, even if it receives new alarm state transition events.
| |
− |
| |
− | Resolve incident
| |
− | # Set the incident status to RESOLVED.
| |
− | # If an incident is resolved, it won't generate any additional notifications.
| |
− |
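The acknowledge and resolve steps above can be sketched as follows. The function names (`acknowledge`, `resolve`, `should_notify`) are hypothetical. The sketch also assumes that only OPEN incidents generate notifications, which is inferred from the rules above rather than stated outright, and it does not restrict which status an incident may be resolved from.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    """Hypothetical incident record with a string status."""
    alarm_id: str
    status: str = "OPEN"


def acknowledge(incident: Incident) -> None:
    """Mark the incident as being worked on."""
    incident.status = "ACKNOWLEDGED"


def resolve(incident: Incident) -> None:
    """Mark the incident as closed."""
    incident.status = "RESOLVED"


def should_notify(incident: Incident) -> bool:
    """Assumption: only OPEN incidents notify.

    ACKNOWLEDGED and RESOLVED incidents generate no additional notifications,
    even when new alarm state transition events arrive.
    """
    return incident.status == "OPEN"


incident = Incident(alarm_id="alarm-1")
notify_when_open = should_notify(incident)
acknowledge(incident)
notify_when_acknowledged = should_notify(incident)
resolve(incident)
notify_when_resolved = should_notify(incident)
```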
| |
− | Assignment and reassignment of incidents are processed as follows:
| |
− | # When an incident is created, it is initially unassigned. It can be assigned or reassigned later.
| |