
AI observability and analytics with Langfuse on Northflank
Adding AI capabilities to your product has never been easier. The rise of lightweight, open-source models has lowered the barrier to fine-tuning and deploying custom solutions for a wide range of use cases. But shipping an AI-powered feature is only the first step—the real challenge begins as you scale from early experiments to production systems handling millions or even billions of tokens daily.
At this scale, monitoring, evaluation, and optimisation become mission-critical. Understanding how your models behave in the real world, identifying failure modes, and continuously improving prompts, model selection, and system design are key to ensuring reliability, controlling costs, and delivering real value to your users. Without robust observability, it’s impossible to know whether your AI systems are working as intended—or how to make them better.
This is where Langfuse comes in.
In this guide, we’ll cover:
- What Langfuse is and how it can help
- The architecture of self-hosted Langfuse
- How to deploy Langfuse on Northflank
- How to integrate Langfuse with your codebase and start capturing insights, with a demo application
Langfuse is an open-source observability and analytics platform purpose-built for applications powered by large language models (LLMs). It provides developers with the tracing, evaluation, and monitoring tools needed to gain deep visibility into the behaviour and performance of AI systems in production. By capturing structured metadata—including prompt inputs, model outputs, latency, token usage, and user feedback—Langfuse makes it easier to analyse usage patterns, detect issues, and iterate effectively.
Langfuse integrates seamlessly with a wide range of LLM providers and frameworks, helping teams debug, optimise, and continuously improve their AI-powered features.
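For example, the Python SDK can trace an ordinary function with a decorator. A minimal sketch (the import path shown is the v2-style one and may differ between SDK versions; the summarise function is just a placeholder for your own LLM call):
from langfuse.decorators import observe

@observe()  # records the function's inputs, outputs, and timing as a trace in Langfuse
def summarise(text: str) -> str:
    # Placeholder for an actual LLM call
    return text[:100]

summarise("Langfuse captures this call as a trace with its input and output.")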
Before deploying Langfuse on Northflank, make sure you have completed the following:
- Sign up or log in to your Northflank account and create or select a team
- Optional: If you want to deploy GPU-accelerated AI workloads on Northflank you can integrate your AWS/GCP/Azure account to deploy a cluster with Northflank BYOC.
- Optional: If you want to run the example on your local machine, make sure you have Python (or Docker) installed
Deploying Langfuse on Northflank requires the following resources:
- Langfuse web deployment service (langfuse/langfuse:3) - serves the Langfuse UI and API
- Langfuse worker deployment service (langfuse/langfuse-worker:3) - asynchronously processes events
- ClickHouse deployment service (clickhouse/clickhouse-server:23.10) with a persistent volume - stores traces, observations, and scores
- PostgreSQL managed addon - database for transactional workloads
- Redis managed addon - used for queue and cache operations
- MinIO managed addon - S3-compatible storage for raw event storage, multi-modal inputs, and large exports
You can deploy Langfuse with just a click using Northflank’s stack template. Running the stack template will deploy all the required resources and create secret groups with the necessary environment variables to immediately start using Langfuse.
You can explore and edit the stack template using the visual editor. Secrets are automatically generated and securely stored in the template’s argument overrides under advanced configuration.
If you want to serve your Langfuse web service from your own subdomain, add your domain and subdomain to your Northflank team. You can then add it to your Langfuse deployment either by editing the stack template before running it, or after your Langfuse project is deployed.
Navigate to the Langfuse web service and open the networking section. Expand custom domains and security rules and click add custom domain. Select your subdomain, optionally disable the code.run domain, and save the changes.
After deploying Langfuse you can access it via the web service’s domain, found in the service header or on the service’s ports & DNS page.
Click sign up to create a new username and password.
Note: by default, anyone with access to the public domain can view the instance and register an account. You can add a Northflank security policy, or configure SSO for Langfuse and disable signups.
After creating your account or logging in using SSO, you’ll be prompted to create a new organisation, invite members, and create a project.
For this guide, we’ll name the organisation testing, skip adding members, and create the project test.
Langfuse will then prompt you to create an API key, and then provide you with the secret key, public key, and host needed to enable tracing in your application.
You should store these keys somewhere secure, as you will not be able to view them again. You can save them in a Northflank secret group in your Langfuse project, or in a password or secret manager.
They'll be used to provide the following environment variables to your AI application:
LANGFUSE_SECRET_KEY=""
LANGFUSE_PUBLIC_KEY=""
LANGFUSE_HOST=""
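Once these are set, you can quickly confirm that your application can reach Langfuse and that the keys are valid. A minimal sketch, assuming the v2-style Python client (newer SDK versions expose the same check through a client getter):
from langfuse import Langfuse

# Reads LANGFUSE_SECRET_KEY, LANGFUSE_PUBLIC_KEY, and LANGFUSE_HOST from the environment
langfuse = Langfuse()

# Returns True if the keys are valid and the Langfuse API is reachable
print("Langfuse connection OK:", langfuse.auth_check())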
For this example we’ll deploy Ollama in the same project as Langfuse. In Northflank, click create new, select deployment service, and name it ollama.
Choose external image as the deployment source, then enter ollama/ollama:latest as the image path. Make sure the port 11434 is added in networking and set to HTTP. You can publicly expose this port to access it locally.
In this case we’ll be running a small model to test our integration, so you can select a smaller compute plan such as 2 vCPU and 4GB memory and no GPU.
When the service has finished deploying, open a shell in the running container from the service's containers list. Run the command ollama pull ${model}, in this example ollama pull qwen:0.5b, to download the model and make it available for requests.
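Before wiring up the demo application, you can confirm the model was downloaded by querying Ollama's tags endpoint. A quick sketch in Python (the hostname ollama assumes you are calling the service from within the same Northflank project; use the service's public domain when testing from your own machine):
import json
import urllib.request

# List the models the Ollama server has downloaded (GET /api/tags)
with urllib.request.urlopen("http://ollama:11434/api/tags") as response:
    tags = json.load(response)

print([model["name"] for model in tags.get("models", [])])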
Langfuse is easy to integrate with many popular AI frameworks. You can use our example repository, a simple Python application that uses Langfuse’s OpenAI library to make requests to an OpenAI-compatible server, in this case the Ollama instance we just deployed.
You can also follow Langfuse's quickstart guides for different languages and frameworks.
import os

# Langfuse's drop-in replacement for the OpenAI client: requests are traced automatically
from langfuse.openai import OpenAI

# Point the client at the Ollama server's OpenAI-compatible endpoint
client = OpenAI(
    api_key=os.environ.get("OPENAI_API_KEY", "EMPTY"),
    base_url=os.environ.get("OPENAI_API_BASE", "Not set"),
)

model = os.environ.get("MODEL", "qwen:0.5b")

# List the models available on the server
models = client.models.list()
print("Available models:", models)

# Send a simple text completion request
completion = client.completions.create(
    model=model,
    prompt="San Francisco is a"
)
print("Completion result:", completion)

# Send a chat completion request
chat_response = client.chat.completions.create(
    model=model,
    messages=[
        {"role": "user", "content": "Why is the sky blue?"}
    ]
)
print("Chat response:", chat_response)
Create a new job in Northflank and select manual job. Call it langfuse test, choose version control as the source, enter https://github.com/northflank-examples/langfuse-python-integration as the repository, and select the main branch. Choose Dockerfile as the build type and then create the job.
You can also run this locally by making the environment variables listed below available in your local environment, replacing the Northflank private endpoints with the public domains.
Finally, we’ll create a new secret group in Northflank. Call it Langfuse integration, and add the following environment variables to it, replacing the values with the ones from your Langfuse integration:
# Provide the application with the Ollama server details
OPENAI_API_KEY="EMPTY"
OPENAI_API_BASE="http://ollama:11434/v1"
# Choose the model to request
MODEL="qwen:0.5b"
# Provide the Langfuse web API endpoint and secrets to enable tracing
LANGFUSE_SECRET_KEY="sk-lf-*****"
LANGFUSE_PUBLIC_KEY="pk-lf-*****"
LANGFUSE_HOST="http://web:3000"
Save the secret group and the variables will be available to your job on the next run.
Trigger a job run, which will send a couple of requests to the Ollama AI model server. Click on the job run container to view the logs, which should show the requests.
Return to your Langfuse dashboard and you should see the traces appear.
Open tracing to see traces and observations, and begin adding them to datasets.
You should now be able to use Langfuse to observe and analyse your AI models by integrating it with your existing stack, or creating applications from scratch with Langfuse enabled. You'll be able to collect user feedback on model responses, annotate traces and observations, create test and benchmark datasets, and monitor metrics for quality, cost, latency, and volume.
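For example, user feedback can be recorded as a score attached to a trace. A minimal sketch, assuming the v2-style Python client (newer SDK versions expose score creation under a slightly different method name), with a placeholder trace ID:
from langfuse import Langfuse

langfuse = Langfuse()  # reads the LANGFUSE_* variables from the environment

# Attach a user-feedback score to an existing trace
langfuse.score(
    trace_id="your-trace-id",
    name="user-feedback",
    value=1,
    comment="Helpful answer",
)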
This guide and stack template should provide you with a basic working setup for Langfuse, however you can expand and customise the stack depending on your requirements. Langfuse is configured by environment variables, which you can add to a secret group so that they can be inherited by both the Langfuse web and worker services.
You can find a full list of configuration options available by environment variable on the Langfuse Self Hosting docs.
The stack template deployment uses email & password for authentication by default, but you can enable SSO using your auth provider by adding the necessary environment variables.
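For example, to enable Google SSO you would add variables along these lines to a secret group inherited by the Langfuse web service (the values are placeholders from your auth provider, and the exact variable names for each provider are listed in the Langfuse self-hosting docs):
AUTH_GOOGLE_CLIENT_ID="your-client-id"
AUTH_GOOGLE_CLIENT_SECRET="your-client-secret"
# Optionally disable public email & password signups once SSO is working
AUTH_DISABLE_SIGNUP="true"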
If you’re using a custom domain, you’ll need to update the NEXTAUTH_URL environment variable in the Langfuse web service with your subdomain.
To allow users to receive password reset emails you’ll need to add email capability.
Project networking is secure and private by default and the Northflank load balancer handles SSL/TLS certificates and encryption for external traffic.
If you need to deploy Langfuse repeatedly with an infrastructure-as-code approach, for example across different environments or regions, you can add LANGFUSE_INIT_* environment variables to initialise your Langfuse project.
First, set the required argument overrides for your template on the settings page:
# ARGUMENT OVERRIDES
LANGFUSE_INIT_ORG_ID="northflank"
LANGFUSE_INIT_PROJECT_ID="example-project"
LANGFUSE_INIT_PROJECT_PUBLIC_KEY="lf_pk_1234567890"
LANGFUSE_INIT_PROJECT_SECRET_KEY="lf_sk_1234567890"
LANGFUSE_INIT_USER_EMAIL="user@example.com"
LANGFUSE_INIT_USER_PASSWORD="password123"
These will be securely stored on Northflank. Next, add the environment variable keys to a secret group node in your template, and obtain the values from the argument object:
# ENVIRONMENT VARIABLES
LANGFUSE_INIT_ORG_ID="${args.LANGFUSE_INIT_ORG_ID}"
LANGFUSE_INIT_PROJECT_ID="${args.LANGFUSE_INIT_PROJECT_ID}"
LANGFUSE_INIT_PROJECT_PUBLIC_KEY="${args.LANGFUSE_INIT_PROJECT_PUBLIC_KEY}"
LANGFUSE_INIT_PROJECT_SECRET_KEY="${args.LANGFUSE_INIT_PROJECT_SECRET_KEY}"
LANGFUSE_INIT_USER_EMAIL="${args.LANGFUSE_INIT_USER_EMAIL}"
LANGFUSE_INIT_USER_PASSWORD="${args.LANGFUSE_INIT_USER_PASSWORD}"
Now when you run the template you can use the same argument overrides to populate the configuration variables, or provide new values for different environments.
To send transactional emails, you can provide your SMTP server’s address with the SMTP_CONNECTION_URL and EMAIL_FROM_ADDRESS environment variables.
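For example (the connection string and sender address are placeholders; check the Langfuse docs for the exact URL format your SMTP provider requires):
SMTP_CONNECTION_URL="smtp://user:password@smtp.example.com:587"
EMAIL_FROM_ADDRESS="langfuse@example.com"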
You can add Langfuse to your OpenTelemetry (OTel) setup by adding the following environment variables: OTEL_EXPORTER_OTLP_ENDPOINT, OTEL_SERVICE_NAME, and OTEL_TRACE_SAMPLING_RATIO.
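For example (the collector endpoint is a placeholder and the service name and sampling ratio are illustrative values):
OTEL_EXPORTER_OTLP_ENDPOINT="http://otel-collector:4318"
OTEL_SERVICE_NAME="langfuse"
OTEL_TRACE_SAMPLING_RATIO="0.1"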
Refer to the following guides to learn how to deploy GPU-enabled workloads on Northflank with BYOC:
- Deploy Jupyter Notebook with TensorFlow in AWS, GCP, and Azure
- Self-host vLLM in your own cloud account with Northflank BYOC
- Self-host Deepseek R1 on AWS, GCP, Azure & K8s in Three Easy Steps
Northflank allows you to deploy your code and databases within minutes. Sign up for a Northflank account and create a free project to get started.
- Build, deploy, scale, and release from development to production
- Observe & monitor with real-time metrics & logs
- Deploy managed databases and storage
- Manage infrastructure as code
- Deploy clusters in your own cloud accounts
- Run GPU workloads