Set infrastructure alerts
You can use infrastructure alerts to let you and your team know when there is an issue with your applications or addons.
Alert settings are configured on an account-wide basis, and can then be used in notification integrations to send alerts to webhooks or other platforms.
Types of infrastructure alerts
Container alerts
Notification type | Explanation |
---|---|
Container crash | Alerts trigger when a container crashes. This could be caused by a bad exit code, an out-of-memory error, or continual restarts (for services). |
Container eviction | Alerts trigger when a container is evicted, which happens if it runs out of ephemeral storage. This occurs when a container image is too large, or when the container tries to write more data than the disk can hold. |
CPU usage spike | Alerts trigger when a container's CPU usage reaches 90% or more for a short period of time. |
CPU sustained usage | Alerts trigger when a container's CPU usage remains at 90% or more for 5 minutes. |
Memory usage spike | Alerts trigger when a container's memory usage reaches 90% or more for a short period of time. |
Memory sustained usage | Alerts trigger when a container's memory usage remains at 90% or more for 5 minutes. |
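
The difference between a spike and sustained usage can be illustrated with a minimal sketch. It is not the platform's implementation: the function name, the sample format, and the rule that any single reading at or above the threshold counts as a spike are assumptions made for illustration, since the length of a "short period of time" is not specified here.

```python
def classify_usage(samples, threshold=90.0, sustained_seconds=300):
    """Classify usage samples as "sustained", "spike", or None.

    samples: list of (timestamp_seconds, usage_percent), sorted by time.
    Assumption: one reading at or above the threshold counts as a spike.
    """
    run_start = None   # start time of the current run at or above the threshold
    saw_spike = False
    for timestamp, usage in samples:
        if usage >= threshold:
            saw_spike = True
            if run_start is None:
                run_start = timestamp
            # Usage has stayed at or above the threshold for the full period
            if timestamp - run_start >= sustained_seconds:
                return "sustained"
        else:
            # Usage dropped below the threshold, so the run is broken
            run_start = None
    return "spike" if saw_spike else None


brief = [(0, 95.0), (60, 50.0)]
steady = [(0, 95.0), (150, 92.0), (300, 91.0)]
print(classify_usage(brief), classify_usage(steady))
```

In this sketch a dip below 90% resets the 5-minute timer, so two short bursts separated by a quiet period are reported as a spike rather than as sustained usage.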
Volume alerts
You can configure volume alerts to let you know when storage for a service or addon reaches 75% or 90% capacity. However, it's recommended that you increase the storage available to an addon when it exceeds 50% capacity.
Platform volumes refer to any persistent volumes attached to your workloads (separate from the ephemeral storage assigned to your containers), and addon volumes refer to the storage assigned to an addon.
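As a rough sketch, the 75% and 90% alert levels and the 50% recommendation can be expressed as a simple threshold check. The function name and the returned labels are hypothetical and only illustrate how the percentages relate to each other.

```python
def volume_status(used_bytes, capacity_bytes):
    """Map volume usage to a status label (labels are illustrative only)."""
    if capacity_bytes <= 0:
        raise ValueError("capacity_bytes must be positive")
    percent_used = used_bytes / capacity_bytes * 100
    if percent_used >= 90:
        return "alert-90"
    if percent_used >= 75:
        return "alert-75"
    if percent_used > 50:
        # No alert yet, but this is where increasing storage is recommended
        return "consider-expanding"
    return "ok"


print(volume_status(60, 100))
```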
Cluster alerts
You can configure cluster alerts to let you know of issues with your clusters on other cloud providers.
Configure infrastructure alerts
You can set a limit on how often you will receive a notification for each type of infrastructure alert from the alerts page in your account settings.
The time window for each alert means that only one notification per resource will be sent in that timeframe, even if the alert condition occurs multiple times. You should configure these time windows based on your own workloads and requirements, so that you can be aware of issues without being overwhelmed by notifications.
For example, if the time window for your container crash alert was set to 30 minutes, and you had 3 containers that each crashed every 5 minutes, you would get 3 alerts every 30 minutes (one per container). However, if you set the time window to 10 minutes, you would receive 9 alerts in the same 30-minute period (three per container).
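The arithmetic in this example can be checked with a short simulation. This is a sketch under one assumption: a resource's time window starts when a notification for it is sent. The function and variable names are hypothetical, and the platform may measure its windows differently.

```python
def count_notifications(crash_times_by_container, window_minutes):
    """Count notifications when each container gets at most one per window.

    crash_times_by_container: dict of container name -> crash times in minutes.
    Assumption: a window starts when a notification is sent for that container.
    """
    sent = 0
    for crash_times in crash_times_by_container.values():
        last_sent = None
        for crash_time in sorted(crash_times):
            if last_sent is None or crash_time - last_sent >= window_minutes:
                sent += 1
                last_sent = crash_time
    return sent


# Three containers, each crashing every 5 minutes over a 30-minute period
crashes = {name: list(range(0, 30, 5)) for name in ("a", "b", "c")}
print(count_notifications(crashes, 30))  # 3
print(count_notifications(crashes, 10))  # 9
```

Each container is tracked separately, which is why a shorter window multiplies the number of notifications by the number of affected resources.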
Receive infrastructure alerts
To receive infrastructure alerts, you can configure a notification integration, which will send each event generated by your alerts via the chosen integration.
You can add infrastructure alerts to existing or new notification integrations, and specify which alerts will be sent to your integration.
You can also filter the alerts you receive to come from resources in specific projects by setting this in the notification integration.
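The selection and project filtering described above can be modelled as follows. This is purely illustrative: the class names, field names, and alert type strings are hypothetical and do not reflect the actual event schema or integration settings. It also assumes that an integration with no project filter receives alerts from all projects.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AlertEvent:
    # Hypothetical fields, not the real event payload
    alert_type: str
    project_id: str
    resource_id: str


@dataclass
class Integration:
    alert_types: set                                # alert types selected for this integration
    project_ids: set = field(default_factory=set)   # empty set is assumed to mean all projects

    def should_deliver(self, event):
        """Return True if the event matches the selected alerts and projects."""
        if event.alert_type not in self.alert_types:
            return False
        return not self.project_ids or event.project_id in self.project_ids


integration = Integration(alert_types={"container-crash"}, project_ids={"shop"})
event = AlertEvent("container-crash", "shop", "api")
print(integration.should_deliver(event))
```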
Next steps
Receive notifications
Create notification integrations to be alerted when selected events occur in your account.
View logs
View detailed, real-time logs from builds, deployments, and more.
Monitor containers
Monitor the health and resource usage of deployments, and view detailed logs and metrics for individual containers.
Use the Northflank API
Learn how to create and manage projects on Northflank programmatically using the REST API.