
AI observability and analytics with Langfuse on Northflank
Adding AI capabilities to your product has never been easier. The rise of lightweight, open-source models has lowered the barrier to fine-tuning and deploying custom solutions for a wide range of use cases. But shipping an AI-powered feature is only the first step—the real challenge begins as you scale from early experiments to production systems handling millions or even billions of tokens daily.
At this scale, monitoring, evaluation, and optimisation become mission-critical. Understanding how your models behave in the real world, identifying failure modes, and continuously improving prompts, model selection, and system design are key to ensuring reliability, controlling costs, and delivering real value to your users. Without robust observability, it’s impossible to know whether your AI systems are working as intended—or how to make them better.
This is where Langfuse comes in.
In this guide, we’ll cover:
- What Langfuse is and how it can help
- The architecture of self-hosted Langfuse
- How to deploy Langfuse on Northflank
- How to integrate Langfuse with your codebase and start capturing insights, with a demo application
Langfuse is an open-source observability and analytics platform purpose-built for applications powered by large language models (LLMs). It provides developers with the tracing, evaluation, and monitoring tools needed to gain deep visibility into the behaviour and performance of AI systems in production. By capturing structured metadata—including prompt inputs, model outputs, latency, token usage, and user feedback—Langfuse makes it easier to analyse usage patterns, detect issues, and iterate effectively.
Langfuse integrates seamlessly with a wide range of LLM providers and frameworks, helping teams debug, optimise, and continuously improve their AI-powered features.
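For example, the Python SDK can trace an ordinary function with a decorator. A minimal sketch (the import path shown is the v2-style one and may differ between SDK versions; the summarise function is just a placeholder for your own LLM call):
from langfuse.decorators import observe

@observe()  # records the function's inputs, outputs, and timing as a trace in Langfuse
def summarise(text: str) -> str:
    # Placeholder for an actual LLM call
    return text[:100]

summarise("Langfuse captures this call as a trace with its input and output.")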
Before deploying Langfuse on Northflank, make sure you have completed the following:
- Sign up or log in to your Northflank account and create or select a team
- Optional: If you want to deploy GPU-accelerated AI workloads on Northflank you can integrate your AWS/GCP/Azure account to deploy a cluster with Northflank BYOC.
- Optional: If you want to run the example on your local machine, make sure you have Python (or Docker) installed
Deploying Langfuse on Northflank requires the following resources:
- Langfuse web deployment service (langfuse/langfuse:3) - serves the Langfuse UI and API
- Langfuse worker deployment service (langfuse/langfuse-worker:3) - asynchronously processes events
- ClickHouse deployment service (clickhouse/clickhouse-server:23.10) with a persistent volume - stores traces, observations, and scores
- PostgreSQL managed addon - database for transactional workloads
- Redis managed addon - used for queue and cache operations
- MinIO managed addon - S3-compatible storage for raw event storage, multi-modal inputs, and large exports
You can deploy Langfuse with just a click using Northflank’s stack template. Running the stack template will deploy all the required resources and create secret groups with the necessary environment variables to immediately start using Langfuse.
You can explore and edit the stack template using the visual editor. Secrets are automatically generated and securely stored in the template’s argument overrides under advanced configuration.
If you want to serve your Langfuse web service from your own subdomain, add your domain and subdomain to your Northflank team. You can then add it to your Langfuse deployment either by editing the stack template before running it, or after your Langfuse project is deployed.
Navigate to the Langfuse web service and open the networking section. Expand custom domains and security rules and click add custom domain. Select your subdomain, optionally disable the code.run domain, and save the changes.
After deploying Langfuse you can access it via the web service’s domain, found in the service header or on the service’s ports & DNS page.
Click sign up to create a new username and password.
Note: by default, anyone with access to the public domain can view the instance and register an account. You can add a Northflank security policy, or configure SSO for Langfuse and disable signups.
After creating your account or logging in using SSO, you’ll be prompted to create a new organisation, invite members, and create a project.
For this guide, we’ll name the organisation testing, skip adding members, and create the project test.
Langfuse will then prompt you to create an API key, and then provide you with the secret key, public key, and host needed to enable tracing in your application.
You should store these keys somewhere secure, as you will not be able to view them again. You can save them in a Northflank secret group in your Langfuse project, or in a password or secret manager.
They'll be used to provide the following environment variables to your AI application:
LANGFUSE_SECRET_KEY=""
LANGFUSE_PUBLIC_KEY=""
LANGFUSE_HOST=""
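Once these are set, you can quickly confirm that your application can reach Langfuse and that the keys are valid. A minimal sketch, assuming the v2-style Python client (newer SDK versions expose the same check through a client getter):
from langfuse import Langfuse

# Reads LANGFUSE_SECRET_KEY, LANGFUSE_PUBLIC_KEY, and LANGFUSE_HOST from the environment
langfuse = Langfuse()

# Returns True if the keys are valid and the Langfuse API is reachable
print("Langfuse connection OK:", langfuse.auth_check())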
For this example we’ll deploy Ollama in the same project as Langfuse. In Northflank, click create new, select deployment service, and name it ollama.
Choose external image as the deployment source, then enter ollama/ollama:latest as the image path. Make sure the port 11434 is added in networking and set to HTTP. You can publicly expose this port to access it locally.
In this case we’ll be running a small model to test our integration, so you can select a smaller compute plan such as 2 vCPU and 4GB memory and no GPU.
When the service has finished deploying, open a shell in the running container from the service's containers list. Run the command ollama pull ${model}, in this example ollama pull qwen:0.5b, to download the model and make it available for requests.
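Before wiring up the demo application, you can confirm the model was downloaded by querying Ollama's tags endpoint. A quick sketch in Python (the hostname ollama assumes you are calling the service from within the same Northflank project; use the service's public domain when testing from your own machine):
import json
import urllib.request

# List the models the Ollama server has downloaded (GET /api/tags)
with urllib.request.urlopen("http://ollama:11434/api/tags") as response:
    tags = json.load(response)

print([model["name"] for model in tags.get("models", [])])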
Langfuse is easy to integrate with many popular AI frameworks. You can use our example repository, a simple Python application that uses Langfuse’s OpenAI library to make requests to an OpenAI-compatible server, in this case the Ollama instance we just deployed.
You can also follow Langfuse's quickstart guides for different languages and frameworks.
import os

# Langfuse's drop-in replacement for the OpenAI client: requests are traced automatically
from langfuse.openai import OpenAI

# Point the client at the Ollama server's OpenAI-compatible endpoint
client = OpenAI(
    api_key=os.environ.get("OPENAI_API_KEY", "EMPTY"),
    base_url=os.environ.get("OPENAI_API_BASE", "Not set"),
)

model = os.environ.get("MODEL", "qwen:0.5b")

# List the models available on the server
models = client.models.list()
print("Available models:", models)

# Send a simple text completion request
completion = client.completions.create(
    model=model,
    prompt="San Francisco is a"
)
print("Completion result:", completion)

# Send a chat completion request
chat_response = client.chat.completions.create(
    model=model,
    messages=[
        {"role": "user", "content": "Why is the sky blue?"}
    ]
)
print("Chat response:", chat_response)
Create a new job in Northflank and select manual job. Call it langfuse test, choose version control as the source, enter https://github.com/northflank-examples/langfuse-python-integration as the repository, and select the main branch. Choose Dockerfile as the build type and then create the job.
You can also run this locally by making the environment variables listed below available in your local environment, replacing the Northflank private endpoints with the public domains.
Finally, we’ll create a new secret group in Northflank. Call it Langfuse integration, and add the following environment variables to it, replacing the values with the ones from your Langfuse integration:
# Provide the application with the Ollama server details
OPENAI_API_KEY="EMPTY"
OPENAI_API_BASE="http://ollama:11434/v1"
# Choose the model to request
MODEL="qwen:0.5b"
# Provide the Langfuse web API endpoint and secrets to enable tracing
LANGFUSE_SECRET_KEY="sk-lf-*****"
LANGFUSE_PUBLIC_KEY="pk-lf-*****"
LANGFUSE_HOST="http://web:3000"
Save the secret group and the variables will be available to your job on the next run.
Trigger a job run, which will send a couple of requests to the Ollama AI model server. Click on the job run container to view the logs, which should show the requests.
Return to your Langfuse dashboard and you should see the traces appear.
Open tracing to see traces and observations, and begin adding them to datasets.
You should now be able to use Langfuse to observe and analyse your AI models by integrating it with your existing stack, or creating applications from scratch with Langfuse enabled. You'll be able to collect user feedback on model responses, annotate traces and observations, create test and benchmark datasets, and monitor metrics for quality, cost, latency, and volume.
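For example, user feedback can be recorded as a score attached to a trace. A minimal sketch, assuming the v2-style Python client (newer SDK versions expose score creation under a slightly different method name), with a placeholder trace ID:
from langfuse import Langfuse

langfuse = Langfuse()  # reads the LANGFUSE_* variables from the environment

# Attach a user-feedback score to an existing trace
langfuse.score(
    trace_id="your-trace-id",
    name="user-feedback",
    value=1,
    comment="Helpful answer",
)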
This guide and stack template should provide you with a basic working setup for Langfuse, however you can expand and customise the stack depending on your requirements. Langfuse is configured by environment variables, which you can add to a secret group so that they can be inherited by both the Langfuse web and worker services.
You can find a full list of configuration options available by environment variable on the Langfuse Self Hosting docs.
The stack template deployment uses email & password for authentication by default, but you can enable SSO using your auth provider by adding the necessary environment variables.
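For example, to enable Google SSO you would add variables along these lines to a secret group inherited by the Langfuse web service (the values are placeholders from your auth provider, and the exact variable names for each provider are listed in the Langfuse self-hosting docs):
AUTH_GOOGLE_CLIENT_ID="your-client-id"
AUTH_GOOGLE_CLIENT_SECRET="your-client-secret"
# Optionally disable public email & password signups once SSO is working
AUTH_DISABLE_SIGNUP="true"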
If you’re using a custom domain, you’ll need to update the NEXTAUTH_URL environment variable in the Langfuse web service with your subdomain.
To allow users to receive password reset emails you’ll need to add email capability.
Project networking is secure and private by default and the Northflank load balancer handles SSL/TLS certificates and encryption for external traffic.
If you need to deploy Langfuse repeatedly with an infrastructure-as-code approach, for example across different environments or regions, you can add LANGFUSE_INIT_* environment variables to initialise your Langfuse project.
First, set the required argument overrides for your template on the settings page:
# ARGUMENT OVERRIDES
LANGFUSE_INIT_ORG_ID="northflank"
LANGFUSE_INIT_PROJECT_ID="example-project"
LANGFUSE_INIT_PROJECT_PUBLIC_KEY="lf_pk_1234567890"
LANGFUSE_INIT_PROJECT_SECRET_KEY="lf_sk_1234567890"
LANGFUSE_INIT_USER_EMAIL="user@example.com"
LANGFUSE_INIT_USER_PASSWORD="password123"
These will be securely stored on Northflank. Next, add the environment variable keys to a secret group node in your template, and obtain the values from the argument object:
# ENVIRONMENT VARIABLES
LANGFUSE_INIT_ORG_ID="${args.LANGFUSE_INIT_ORG_ID}"
LANGFUSE_INIT_PROJECT_ID="${args.LANGFUSE_INIT_PROJECT_ID}"
LANGFUSE_INIT_PROJECT_PUBLIC_KEY="${args.LANGFUSE_INIT_PROJECT_PUBLIC_KEY}"
LANGFUSE_INIT_PROJECT_SECRET_KEY="${args.LANGFUSE_INIT_PROJECT_SECRET_KEY}"
LANGFUSE_INIT_USER_EMAIL="${args.LANGFUSE_INIT_USER_EMAIL}"
LANGFUSE_INIT_USER_PASSWORD="${args.LANGFUSE_INIT_USER_PASSWORD}"
Now when you run the template you can use the same argument overrides to populate the configuration variables, or provide new values for different environments.
To send transactional emails, you can provide your SMTP server’s address with the SMTP_CONNECTION_URL and EMAIL_FROM_ADDRESS environment variables.
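For example (the connection string and sender address are placeholders; check the Langfuse docs for the exact URL format your SMTP provider requires):
SMTP_CONNECTION_URL="smtp://user:password@smtp.example.com:587"
EMAIL_FROM_ADDRESS="langfuse@example.com"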
You can add Langfuse to your OpenTelemetry (OTel) setup by adding the following environment variables: OTEL_EXPORTER_OTLP_ENDPOINT, OTEL_SERVICE_NAME, and OTEL_TRACE_SAMPLING_RATIO.
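For example (the collector endpoint is a placeholder and the service name and sampling ratio are illustrative values):
OTEL_EXPORTER_OTLP_ENDPOINT="http://otel-collector:4318"
OTEL_SERVICE_NAME="langfuse"
OTEL_TRACE_SAMPLING_RATIO="0.1"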
Refer to the following guides to learn how to deploy GPU-enabled workloads on Northflank with BYOC:
- Deploy Jupyter Notebook with TensorFlow in AWS, GCP, and Azure
- Self-host vLLM in your own cloud account with Northflank BYOC
- Self-host Deepseek R1 on AWS, GCP, Azure & K8s in Three Easy Steps
Northflank allows you to deploy your code and databases within minutes. Sign up for a Northflank account and create a free project to get started.
- Build, deploy, scale, and release from development to production
- Observe & monitor with real-time metrics & logs
- Deploy managed databases and storage
- Manage infrastructure as code
- Deploy clusters in your own cloud accounts
- Run GPU workloads