
Set infrastructure alerts

Infrastructure alerts notify you and your team when there is an issue with your applications or addons.

Alert settings are configured account-wide. You can then add the alerts to notification integrations, which send them to webhooks or other platforms.


Types of infrastructure alerts

Container alerts

Notification type | Explanation
Container crash | Alerts trigger when a container crashes. This could be caused by a bad exit code, an out of memory error, or continual restarts (for services).
Container eviction | Alerts trigger when a container is evicted, which happens if it runs out of ephemeral storage. This occurs when a container image is too large, or when the container tries to write more data than the disk can hold.
CPU usage spike | Alerts trigger when a container's CPU usage reaches 90% or more for a short period of time.
CPU sustained usage | Alerts trigger when a container's CPU usage remains at 90% or more for 5 minutes.
Memory usage spike | Alerts trigger when a container's memory usage reaches 90% or more for a short period of time.
Memory sustained usage | Alerts trigger when a container's memory usage remains at 90% or more for 5 minutes.
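
The difference between a spike alert and a sustained usage alert can be sketched as follows. This is only an illustration, not how the platform evaluates metrics: the function name `classify_usage` is hypothetical, and it assumes one usage reading per minute, with each reading standing for a full minute. The source does not define "a short period of time", so here any run shorter than 5 minutes counts as a spike.

```python
def classify_usage(samples_percent, threshold=90, sustained_minutes=5):
    """Classify per-minute usage readings (oldest first).

    Assumption: one reading per minute; names and sampling are hypothetical.
    """
    longest_run = 0  # longest streak of readings at or above the threshold
    current_run = 0
    for value in samples_percent:
        current_run = current_run + 1 if value >= threshold else 0
        longest_run = max(longest_run, current_run)

    if longest_run >= sustained_minutes:
        return "sustained"
    if longest_run > 0:
        return "spike"
    return "normal"


print(classify_usage([50, 95, 60]))           # one high reading
print(classify_usage([91, 92, 95, 93, 90]))   # five consecutive high readings
```

The first call reports a spike and the second reports sustained usage, because only the second stays at 90% or more for 5 consecutive readings.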

Volume alerts

You can configure volume alerts to let you know when storage for a service or addon reaches 75% or 90% capacity. However, it's recommended that you increase the storage available to an addon when it exceeds 50% capacity.

Platform volumes refer to any persistent volumes attached to your workloads (separate from the ephemeral storage assigned to your containers), and addon volumes refer to the storage assigned to an addon.
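
The volume thresholds described above (alerts at 75% and 90%, with a recommendation to add storage once usage exceeds 50%) can be expressed as a small check. This is a minimal sketch: the function name `volume_status` and the returned labels are hypothetical and are not part of any Northflank API.

```python
def volume_status(used, capacity):
    """Map volume usage to the thresholds described in this section.

    `used` and `capacity` share the same unit (for example GB).
    The labels returned here are illustrative only.
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")

    # Compare with multiplication rather than division to avoid rounding issues.
    if used * 100 >= 90 * capacity:
        return "alert: 90% capacity"
    if used * 100 >= 75 * capacity:
        return "alert: 75% capacity"
    if used * 100 > 50 * capacity:
        return "consider increasing storage"
    return "ok"


for used in (4, 6, 8, 9):
    print(used, volume_status(used, 10))
```

Using a 10 GB volume, 4 GB used is fine, 6 GB exceeds the 50% recommendation, 8 GB passes the 75% alert level, and 9 GB reaches the 90% alert level.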

Cluster alerts

You can configure cluster alerts to let you know of issues with your clusters on other cloud providers.

Configure infrastructure alerts

From the alerts page in your account settings, you can limit how often you receive a notification for each type of infrastructure alert.

The time window set for each alert means that only one notification per resource is sent within that window, even if the alert triggers multiple times. You should configure these windows based on your own workloads and requirements, so that you are aware of issues without being overwhelmed by notifications.

For example, if the time window for your container crash alert is set to 30 minutes and you have 3 containers that each crash every 5 minutes, you receive 3 notifications every 30 minutes (one per container). If you set the window to 10 minutes instead, you receive 9 notifications over the same 30 minutes.
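
The arithmetic in this example can be checked with a short simulation. It is a sketch under stated assumptions: the function name `notifications_sent` and the container names are hypothetical, and the window is assumed to start when a notification is sent for a resource, which the source does not specify.

```python
def notifications_sent(crash_times_by_container, window_minutes):
    """Count notifications when at most one is sent per resource per window.

    Assumption: a resource's window starts when a notification is sent for it.
    """
    total = 0
    for crash_times in crash_times_by_container.values():
        last_sent = None  # minute at which the last notification went out
        for minute in sorted(crash_times):
            if last_sent is None or minute - last_sent >= window_minutes:
                total += 1
                last_sent = minute
    return total


# Three containers (hypothetical names), each crashing every 5 minutes
# during a 30-minute period: at minutes 0, 5, 10, 15, 20 and 25.
crashes = {name: list(range(0, 30, 5)) for name in ("web", "worker", "cron")}

print(notifications_sent(crashes, 30))  # 3
print(notifications_sent(crashes, 10))  # 9
```

With a 30-minute window each container produces one notification, giving 3 in total. With a 10-minute window each container produces a notification at minutes 0, 10, and 20, giving 9.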

Receive infrastructure alerts

To receive infrastructure alerts, configure a notification integration. Each event generated by your alerts is then sent via that integration.

You can add infrastructure alerts to existing or new notification integrations, and specify which alerts will be sent to your integration.

You can also restrict the alerts you receive to resources in specific projects by setting a filter in the notification integration.

© 2024 Northflank Ltd. All rights reserved.