AI-Powered Incident Management in Atlassian Service Collection

Kate Clavet-D'Amelio
Oct 24, 2025

This guide walks through AI-assisted incident management in Atlassian Service Collection. It shows how alerts are grouped, classified as signal or noise, and turned into incidents, so that responders can resolve issues faster and with less alert fatigue.

Step 1

Open the alert queue in Atlassian Service Collection. This is the entry point to the AI-powered incident management interface.

Screenshot

Step 2

Review the alerts displayed. In the AI view they are grouped by similarity, and each alert carries an additional classification.

Screenshot

Step 3

The classification model uses historical data and general information to assess which alert groups need immediate responder attention. It labels each group as signal or noise, and you can confirm or reject that label.

Screenshot

Step 4

The model improves over time as responders classify unclassified groups and accept suggested classifications. Alerts are de-duplicated within each group, so a responder is notified only once per group and can see which groups need attention.
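
The feedback loop described above can be illustrated with a toy sketch. This is not Atlassian's model, whose internals the source does not describe. It is a simple vote counter, and the class name `FeedbackClassifier`, the group key, and the label strings are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


class FeedbackClassifier:
    """Toy signal/noise classifier that learns only from responder feedback."""

    def __init__(self):
        # Maps a hypothetical group key (for example, an alert source)
        # to the count of "signal" and "noise" labels responders gave it.
        self._votes = defaultdict(Counter)

    def record_feedback(self, group_key, label):
        """Store a responder's confirmation or correction for a group."""
        if label not in ("signal", "noise"):
            raise ValueError(f"unknown label: {label}")
        self._votes[group_key][label] += 1

    def suggest(self, group_key):
        """Suggest the majority label, or "unclassified" without a majority."""
        votes = self._votes.get(group_key)
        if not votes:
            return "unclassified"
        signal, noise = votes["signal"], votes["noise"]
        if signal == noise:
            return "unclassified"
        return "signal" if signal > noise else "noise"


clf = FeedbackClassifier()
before = clf.suggest("cpu-monitor")
clf.record_feedback("cpu-monitor", "signal")
after = clf.suggest("cpu-monitor")
```

Each confirmation or rejection shifts the counts, so suggestions for a group change as feedback accumulates. That is the behavior the step describes, in a much reduced form.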

Screenshot

Step 5

Consider an alert group that contains 20 alerts. Instead of 20 notifications, responders receive a single notification, which reduces alert fatigue. In this example the group is prioritized as P2, but it is likely noise caused by scheduled maintenance.
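
A minimal sketch of the "20 alerts, one notification" idea follows. It assumes alerts are grouped by an exact match on service and check name; the product groups by similarity, and the source does not say how. The field names and both function names are hypothetical.

```python
from collections import defaultdict


def group_alerts(alerts):
    """Group alerts by a stand-in similarity key: (service, check)."""
    groups = defaultdict(list)
    for alert in alerts:
        groups[(alert["service"], alert["check"])].append(alert)
    return dict(groups)


def notifications_for(groups):
    """Build one notification per group instead of one per alert."""
    return [
        f"{service}/{check}: {len(members)} alert(s)"
        for (service, check), members in groups.items()
    ]


# Twenty alerts from the same service and check collapse into one group.
alerts = [
    {"id": i, "service": "checkout", "check": "disk-latency"}
    for i in range(20)
]
groups = group_alerts(alerts)
notes = notifications_for(groups)
```

Here `notes` holds a single entry covering all 20 alerts, so the responder is paged once for the group.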

Screenshot

Step 6

If a group looks like a genuine signal, such as high CPU utilization, responders should review it closely. Open that alert group to explore it further.

Screenshot

Step 7

Inside the alert group, you'll find AI-generated suggestions powered by Rovo that help responders resolve the alert group more efficiently.

Screenshot

Step 8

The system recommends responders based on similar past incidents and suggests reviewing the alert group's history. You can also open each individual alert for more detail.
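
One way to picture "recommend responders from similar past incidents" is the sketch below. It is an assumption-laden illustration, not the product's method. Similarity is plain Jaccard overlap on hypothetical tags, and `recommend_responders`, the threshold, and the sample data are invented for the example.

```python
from collections import Counter


def jaccard(a, b):
    """Jaccard similarity between two collections of tags."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def recommend_responders(current_tags, past_incidents, threshold=0.5, top_n=2):
    """Rank responders by how often they handled sufficiently similar incidents."""
    counts = Counter()
    for incident in past_incidents:
        if jaccard(current_tags, incident["tags"]) >= threshold:
            counts.update(incident["responders"])
    return [name for name, _ in counts.most_common(top_n)]


# Hypothetical incident history.
past = [
    {"tags": ["cpu", "checkout"], "responders": ["alex", "sam"]},
    {"tags": ["cpu", "checkout", "db"], "responders": ["alex"]},
    {"tags": ["disk", "billing"], "responders": ["jo"]},
]
suggested = recommend_responders(["cpu", "checkout"], past)
```

With this data, the two CPU-related incidents pass the threshold, so `alex` (two matches) ranks ahead of `sam` (one), and `jo` is not suggested.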

Screenshot

Step 9

When you create an incident, AI pre-fills the summary and description.

Screenshot

Step 10

AI also suggests an incident priority based on similar past incidents. It draws on the Teamwork Graph, Atlassian's data layer that connects teams' work, goals, and knowledge.
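
As a rough illustration of priority suggestion from similar past incidents, the sketch below takes a majority vote over their priorities. The source does not describe the real logic, so the function `suggest_priority`, the tie-breaking rule, and the default value are all assumptions.

```python
from collections import Counter

# Ordered from most to least urgent.
PRIORITY_ORDER = ["P1", "P2", "P3", "P4"]


def suggest_priority(similar_priorities, default="P3"):
    """Suggest the most common priority among similar past incidents.

    Unknown values are ignored. On a tie, the more urgent priority wins.
    """
    counts = Counter(p for p in similar_priorities if p in PRIORITY_ORDER)
    if not counts:
        return default
    best = max(counts.values())
    for priority in PRIORITY_ORDER:
        if counts[priority] == best:
            return priority
    return default  # Unreachable, kept as a defensive fallback.


majority = suggest_priority(["P2", "P2", "P3"])
tie = suggest_priority(["P1", "P3"])
```

Breaking ties toward the more urgent priority is a deliberately cautious choice for this sketch. A responder can still adjust the suggested value before saving the incident.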

Screenshot

Step 11

Once the incident is created, review it for accuracy and completeness.

Screenshot

Step 12

Rovo offers suggestions and actions for managing the incident efficiently. It provides a summary, a potential root cause, and similar historical incidents.

Screenshot

Step 13

Recommendations include assigning team members and adding affected services so the incident can be resolved quickly. If no affected services were added when the incident was created, add them now.

Screenshot

Step 14

Affected services help define the incident's impact and are critical for understanding its full scope. Consider updating severity and including additional team members who can assist in resolution.

Screenshot

Step 15

Resources from the Teamwork Graph, such as related incidents and knowledge base articles, provide essential context for quick resolution.

Screenshot

Step 16

When an incident is confirmed as major, activate the major incident toggle to initiate additional actions, such as creating a Slack channel for team collaboration and gathering observability data.
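
The toggle behavior can be pictured as a simple hook that runs follow-up actions only when an incident is marked major. This is a sketch with hypothetical function names and return values. It does not call Slack, New Relic, or any Atlassian API.

```python
def on_major_toggle(incident, actions):
    """Run each follow-up action if, and only if, the incident is major."""
    if not incident.get("major"):
        return []
    return [action(incident) for action in actions]


def open_chat_channel(incident):
    # Hypothetical stand-in: a real integration would call the chat tool's API.
    return f"channel:inc-{incident['id']}"


def collect_observability(incident):
    # Hypothetical stand-in for pulling data from an observability tool.
    return f"observability:{incident['service']}"


incident = {"id": 42, "service": "checkout", "major": True}
results = on_major_toggle(incident, [open_chat_channel, collect_observability])
```

Keeping the actions in a list makes it straightforward to add or remove follow-up steps without changing the toggle logic itself.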

Screenshot

Step 17

Join the Slack channel, where AI will summarize the incident for new responders, ensuring everyone is informed and ready to collaborate.

Screenshot

Step 18

Return to the incident record for further examination.

Screenshot

Step 19

With the major incident toggle activated, the incident record shows observability data integrated from New Relic, although the integration may occasionally run into technical issues.

Screenshot

Step 20

Gather additional context from the Rovo Ops agent to support your understanding and analysis of the incident.

Screenshot

Step 21

Ask for recent critical alerts from the New Relic account to get insight into the incident's scope and context. Then verify the potential root cause with help from the Rovo Ops agent.

Screenshot

Step 22

Review the documentation and analysis that support the suggested root cause, so you can be confident in it before resolving the incident.

Screenshot

Step 23

Once the incident is resolved, the Rovo Ops agent drafts a post-incident report (PIR), which streamlines documentation and review.

Screenshot

Step 24

Post-incident activities can be exhausting, so the draft PIR allows for efficient review and closure.

Screenshot

Step 25

The draft PIR requires human approval, which keeps responders involved in the final review.

Screenshot

Step 26

Examine the draft PIR document.

Screenshot

Step 27

Review the draft PIR, which Rovo automation has already published to Confluence.

Screenshot

Step 28

Ensure the PIR contains complete narratives, impacts, and information before final approval and distribution.
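
A completeness check like this one can be sketched as a small pre-approval helper. The section names below are hypothetical, because the source only asks for complete narratives, impacts, and information. The function names are likewise invented for illustration.

```python
# Hypothetical section names; adjust to match your own PIR template.
REQUIRED_SECTIONS = ("summary", "timeline", "impact", "root_cause")


def missing_sections(pir):
    """Return required sections that are absent or blank in the draft."""
    return [
        name
        for name in REQUIRED_SECTIONS
        if not str(pir.get(name) or "").strip()
    ]


def ready_for_approval(pir):
    """A draft is ready only when no required section is missing."""
    return not missing_sections(pir)


draft = {
    "summary": "Checkout latency spiked after a deploy.",
    "timeline": "10:02 alert fired; 10:40 rollback completed.",
    "impact": "",
}
gaps = missing_sections(draft)
```

In this example `gaps` lists the blank `impact` section and the absent `root_cause` section, so a reviewer knows what to fill in before approving the report.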

Screenshot

Step 29

The PIR also includes links to essential references and the Slack channel for further context.

Screenshot
