Sysdig Monitor Gets a Cutting-Edge Alert and Notification Module Makeover

Overview

Sysdig uses alerts to notify users about potential infrastructure issues based on changes in collected metrics. These metrics act as dials, and when a reading goes beyond a set threshold, an alert triggers to grab your attention.

Our goal was to make this process smoother and more user-friendly.

Timeline

February 2022 – January 2024

Contribution 

User Research / Snap Testing / Quick Prototype / Data Analysis

Tools 

Figma / Periscope / Pendo 

User Research & Pain Points Identification

To understand user needs, we employed a multi-pronged approach:

Data Analysis

Tools like Periscope and Pendo helped us analyze user behavior patterns. We looked at:

Frequently created (and discarded) alert types (indicating potential confusion)

Heavily modified types (suggesting usability issues)

Customer Workshops

We validated quantitative data through workshops, where users mapped their existing goals to proposed new user journeys.

Competitive Analysis

We analyzed competitor tools and market trends to identify gaps in our offerings.

Based on these insights, we identified key areas for improvement:

Limited Alert Types

We needed to expand functionality to address diverse monitoring needs.

Alert Configuration Challenges

Users struggled to configure thresholds accurately, often needing multiple edits.

Inefficient Investigation

Alert investigation lacked clear workflows for troubleshooting.

Designing for a Smoother User Experience

We addressed these pain points through several design interventions:

Real-time Alert Previews

While configuring an alert, users can now see how the metric has behaved over historical data, helping them set accurate thresholds on the first try.
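
The idea behind such a preview can be sketched in a few lines: replay a candidate threshold over historical samples and count how often the alert would have fired. This is only an illustration, not Sysdig's implementation; the function name, the sample readings, and the rule of counting one trigger per breach episode are all assumptions.

```python
from typing import Sequence


def count_triggers(samples: Sequence[float], threshold: float) -> int:
    """Count how many separate times the series rises above the threshold."""
    triggers = 0
    above = False  # whether the previous sample was already over the threshold
    for value in samples:
        if value > threshold and not above:
            triggers += 1  # a new breach episode starts here
        above = value > threshold
    return triggers


# Hypothetical CPU usage readings (percent), one per minute.
history = [42, 55, 81, 90, 70, 65, 85, 60]

# Compare a few candidate thresholds before committing to one.
for candidate in (60, 80, 95):
    print(candidate, count_triggers(history, candidate))
```

A threshold that would have fired constantly, or never, over the past data is a hint that it needs adjusting before the alert is saved.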

Metric Label Selector

This feature provides documentation and suggestions for labels, ensuring users choose the right metric for their alert.

Warning Thresholds

Users can define separate notification channels for warning thresholds, eliminating the need for duplicate alerts.
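
As a rough sketch of the concept, a single alert can carry two thresholds, each routed to its own channel. The class names, field names, and channel names below are hypothetical and chosen for illustration; they do not reflect Sysdig's data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Threshold:
    value: float
    channel: str  # hypothetical channel name, e.g. a chat room or a pager


@dataclass
class Alert:
    name: str
    critical: Threshold
    warning: Optional[Threshold] = None  # optional second, lower threshold

    def route(self, reading: float) -> Optional[str]:
        """Return the channel to notify for this reading, or None if healthy."""
        if reading > self.critical.value:
            return self.critical.channel
        if self.warning is not None and reading > self.warning.value:
            return self.warning.channel
        return None


# One alert definition covers both severities; no duplicate alert is needed.
disk_alert = Alert(
    name="Disk usage high",
    critical=Threshold(value=90, channel="pagerduty-oncall"),
    warning=Threshold(value=75, channel="slack-infra"),
)

print(disk_alert.route(95))  # critical channel
print(disk_alert.route(80))  # warning channel
print(disk_alert.route(50))  # no notification
```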

Enhanced Investigation

Alerts can now link to relevant dashboards and runbooks, streamlining troubleshooting workflows. These links are also embedded in notification channels for easy access.

Expanded Alert Types

New alert types like “Change Alert” and “Group Outlier Alert” provide users with more granular monitoring capabilities.

Prometheus Query (PromQL) Integration

Through form-based assistance, users can translate metric alerts into PromQL alerts and tap into PromQL’s power without advanced knowledge of the query language.
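
To make the translation concrete, here is a minimal sketch of how form fields might map onto a PromQL expression. The form fields, the function name, and the metric name are assumptions made for this example; the actual translation in the product may differ. The generated query uses standard PromQL constructs only: a label selector, an `*_over_time` function, and a `by` aggregation.

```python
from typing import Dict, Optional, Sequence

# Aggregations that exist both as PromQL operators and as *_over_time functions.
ALLOWED_AGGREGATIONS = {"avg", "min", "max", "sum"}


def to_promql(
    metric: str,
    aggregation: str,
    window: str,
    operator: str,
    threshold: float,
    scope: Optional[Dict[str, str]] = None,
    group_by: Sequence[str] = (),
) -> str:
    """Build a PromQL alert expression from hypothetical form fields."""
    if aggregation not in ALLOWED_AGGREGATIONS:
        raise ValueError(f"unsupported aggregation: {aggregation}")
    selector = ""
    if scope:
        # Label matchers, e.g. {env="prod"}
        pairs = ", ".join(f'{key}="{value}"' for key, value in sorted(scope.items()))
        selector = "{" + pairs + "}"
    # Time aggregation over the evaluation window, e.g. avg_over_time(m[5m])
    expr = f"{aggregation}_over_time({metric}{selector}[{window}])"
    if group_by:
        # Group aggregation, e.g. avg by (host) (...)
        expr = f"{aggregation} by ({', '.join(group_by)}) ({expr})"
    return f"{expr} {operator} {threshold}"


query = to_promql(
    metric="cpu_used_percent",  # hypothetical metric name
    aggregation="avg",
    window="5m",
    operator=">",
    threshold=80,
    scope={"env": "prod"},
    group_by=["host"],
)
print(query)
```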

Clear Alert Summaries

Alerts are presented in plain English, making technical details easier to understand.
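
One way to picture this is a small function that turns an alert's settings into a sentence. The wording template, the parameter names, and the metric name below are assumptions for illustration, not the summaries Sysdig actually generates.

```python
# Map technical operators and aggregations to everyday wording.
OPERATOR_WORDS = {
    ">": "is above",
    ">=": "is at or above",
    "<": "is below",
    "<=": "is at or below",
}
AGGREGATION_WORDS = {"avg": "average", "min": "minimum", "max": "maximum", "sum": "sum"}


def summarize(
    metric: str,
    aggregation: str,
    operator: str,
    threshold: float,
    window_minutes: int,
    unit: str = "",
) -> str:
    """Describe an alert condition as a plain-English sentence."""
    minutes = "minute" if window_minutes == 1 else "minutes"
    return (
        f"Trigger when the {AGGREGATION_WORDS[aggregation]} of {metric} "
        f"{OPERATOR_WORDS[operator]} {threshold}{unit} "
        f"over the last {window_minutes} {minutes}."
    )


# Hypothetical alert settings.
summary = summarize("cpu.used.percent", "avg", ">", 80, 5, unit="%")
print(summary)
```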

Highly Customizable Notification Channels

Users can tailor notification content to their needs. We also introduced metric behavior snapshots for quick visual analysis (initially available on Slack and email).

Improved Alerts List Page Filtering

The alerts list page now offers filters for currently triggering alerts, alerts with unreporting metrics, and deactivated alerts, making alerts easier to find.

Measuring Success: User Behavior & Feedback

Our data showed mostly positive changes in user behavior, along with one area that still needs work:

Users started creating a wider variety of alert types, indicating a better understanding of their monitoring options.

Warning thresholds were underutilized initially, likely due to the lack of a single-alert merge option. We're exploring solutions to address this.

Label editing after alert creation significantly decreased, suggesting users were making informed label choices with the new selector.

Notification channel customization was highly appreciated, prompting a second project phase to extend it to more channels and introduce a powerful webhook editor.

Conclusion

The revamped user experience demonstrates how design thinking can empower users with the information they need, presented in a clear and actionable way. We’re committed to continuously iterating and improving based on user feedback, ensuring Sysdig remains a user-centric platform for infrastructure monitoring and security.