

How to build an Internal Developer Platform (and why you might not want to)
An Internal Developer Platform (IDP) is a self-service layer that abstracts away the complexities of infrastructure so that developers can deploy code without touching Kubernetes manifests, Terraform modules, or CI/CD pipelines. It's the internal equivalent of platforms like Vercel or Heroku, built specifically for the needs (and constraints) of your company.
It centralizes the workflow for deploying and operating software: provisioning environments, managing secrets, deploying workloads, and monitoring health. Done well, it enables fast, secure, and reliable shipping, without asking every developer to become a DevOps expert.
On paper, building an IDP sounds strategic. It promises developer autonomy, consistency across environments, and a better security posture. Teams imagine faster deploys, fewer tickets to infra, and more time spent shipping product. But the dream and the reality are rarely aligned.
The platform team sets out to build abstractions over Kubernetes and cloud infrastructure.
They assemble a stack of open-source tools: Argo for GitOps, Vault for secrets, Prometheus for metrics, Istio or Linkerd for service mesh. They script templates, wire up automations, and create CLIs or dashboards for developers. It starts to look promising.
But what starts as a well-intentioned attempt to reduce friction often turns into a maintenance nightmare. The more features the team builds, the more surface area they commit to supporting. And the further they drift from the real goal: helping developers ship business logic that serves users.
Understand who you’re building for. What kind of workloads are being deployed: stateless microservices, cron jobs, stateful databases? Are developers comfortable with GitOps? Do they prefer CLIs or UIs? These questions will shape every layer of your platform.
You’ll be building on Kubernetes, but that’s just the beginning.
A functional IDP needs to handle provisioning, networking, observability, security, and developer interfaces. A typical stack includes:
- Kubernetes for orchestration (EKS, GKE, AKS, or self-managed)
- ArgoCD or Flux for GitOps deployment
- Prometheus, Grafana, Loki, and Tempo for observability
- Vault or Sealed Secrets for secrets management
- Cert-manager for TLS automation
- NGINX or Traefik for ingress
- CSI drivers for persistent storage
- KEDA for workload autoscaling
You’ll also need to set up a service mesh (Istio, Linkerd, or Cilium) if you're managing east-west traffic or want to enforce mTLS.
This is where most platform teams stall. The Kubernetes API is too low-level for product engineers. You’ll need to abstract these primitives into composable services:
- Define Helm or Kustomize templates for deploying services
- Create opinionated defaults for environments, secrets, autoscaling, and resource limits
- Build a UI or CLI that lets developers launch services, view logs, roll back deployments, and see metrics
- Implement GitOps flows so developers can commit YAML or JSON manifests and watch their workloads go live
- Integrate with your CI pipelines for push-to-deploy automation
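To make the idea of "opinionated defaults" concrete, here is a minimal sketch of the kind of abstraction described above: a small developer-facing spec expanded into a full Kubernetes Deployment manifest. The `ServiceSpec` fields, the default values, and the `render_deployment` helper are all hypothetical. A real platform would more likely express this through Helm or Kustomize templates.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceSpec:
    """Hypothetical developer-facing spec; field names and defaults are illustrative."""
    name: str
    image: str
    port: int = 8080
    replicas: int = 2
    cpu: str = "250m"
    memory: str = "256Mi"
    env: dict = field(default_factory=dict)


def render_deployment(spec: ServiceSpec, namespace: str) -> dict:
    """Expand a ServiceSpec into an apps/v1 Deployment manifest (as a dict)."""
    labels = {"app.kubernetes.io/name": spec.name}
    container = {
        "name": spec.name,
        "image": spec.image,
        "ports": [{"containerPort": spec.port}],
        "env": [{"name": k, "value": v} for k, v in sorted(spec.env.items())],
        # Assumed platform default: requests equal limits for predictable scheduling.
        "resources": {
            "requests": {"cpu": spec.cpu, "memory": spec.memory},
            "limits": {"cpu": spec.cpu, "memory": spec.memory},
        },
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": spec.name, "namespace": namespace, "labels": labels},
        "spec": {
            "replicas": spec.replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [container]},
            },
        },
    }


# The developer supplies two fields; the platform fills in everything else.
manifest = render_deployment(
    ServiceSpec(name="checkout", image="registry.example.com/checkout:1.4.2"),
    namespace="team-payments",
)
print(manifest["spec"]["replicas"])
```

Every default baked in here is something the platform team now owns and has to evolve, which is part of the maintenance cost discussed later.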
If you’re serving multiple product teams, you need tenant isolation. Namespaces alone aren’t enough. Implement:
- NetworkPolicies to isolate workloads
- RBAC across Kubernetes and platform UI
- Audit logs for compliance
- Per-team quotas on CPU, memory, and storage
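As a rough illustration of the isolation items above, the sketch below generates the per-tenant objects a platform might apply when onboarding a team: a namespace, a ResourceQuota, and a default-deny NetworkPolicy. The `tenant_manifests` helper, the `team-` naming scheme, the label key, and the quota values are assumptions made for the example.

```python
def tenant_manifests(team: str, cpu: str = "20", memory: str = "64Gi",
                     storage: str = "500Gi") -> list:
    """Return the baseline Kubernetes objects for one tenant (hypothetical layout)."""
    namespace = f"team-{team}"
    return [
        {
            "apiVersion": "v1",
            "kind": "Namespace",
            "metadata": {
                "name": namespace,
                "labels": {"platform.example.com/team": team},
            },
        },
        {
            # Caps the total CPU, memory, and storage the team can request.
            "apiVersion": "v1",
            "kind": "ResourceQuota",
            "metadata": {"name": "team-quota", "namespace": namespace},
            "spec": {
                "hard": {
                    "requests.cpu": cpu,
                    "requests.memory": memory,
                    "requests.storage": storage,
                }
            },
        },
        {
            # An empty podSelector matches every pod in the namespace; listing both
            # policy types with no rules denies all ingress and egress by default.
            "apiVersion": "networking.k8s.io/v1",
            "kind": "NetworkPolicy",
            "metadata": {"name": "default-deny", "namespace": namespace},
            "spec": {"podSelector": {}, "policyTypes": ["Ingress", "Egress"]},
        },
    ]


for obj in tenant_manifests("payments"):
    print(obj["kind"], obj["metadata"]["name"])
```

A default-deny egress policy also blocks DNS lookups, so in practice it would be paired with explicit allow rules for the traffic each team actually needs.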
💡 Also: think about secrets management from day one. Hardcoded secrets in Helm values files will come back to bite you.
A good platform is observable by default. This includes:
- Pre-wired metrics, logs, and traces for every deployed service
- Central dashboards for performance, error rates, and cost
- Alerts on pod restarts, CPU throttling, failed jobs
- Per-deployment views in both UI and CLI
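One way to make alerts "pre-wired" is to generate a standard Prometheus rule group for every namespace the platform creates. The sketch below assumes kube-state-metrics and cAdvisor metrics are being scraped. The `default_alert_rules` helper, the alert names, the thresholds, and the durations are illustrative choices, not recommendations.

```python
def default_alert_rules(namespace: str) -> dict:
    """Build a Prometheus rule group covering restarts, CPU throttling, and failed jobs."""
    selector = f'{{namespace="{namespace}"}}'
    rules = [
        {
            "alert": "PodRestartingFrequently",
            "expr": f"increase(kube_pod_container_status_restarts_total{selector}[15m]) > 3",
            "for": "5m",
        },
        {
            # Share of CPU scheduling periods in which the container was throttled.
            "alert": "CPUThrottlingHigh",
            "expr": (
                f"rate(container_cpu_cfs_throttled_periods_total{selector}[5m])"
                f" / rate(container_cpu_cfs_periods_total{selector}[5m]) > 0.25"
            ),
            "for": "15m",
        },
        {
            "alert": "JobFailed",
            "expr": f"kube_job_status_failed{selector} > 0",
            "for": "1m",
        },
    ]
    for rule in rules:
        # Assumed convention: route alerts to the owning team via a namespace label.
        rule["labels"] = {"severity": "warning", "namespace": namespace}
    return {"groups": [{"name": f"{namespace}-defaults", "rules": rules}]}


for rule in default_alert_rules("team-payments")["groups"][0]["rules"]:
    print(rule["alert"], "->", rule["expr"])
```

Serialized to YAML, the returned structure follows the layout of a Prometheus rule file, so it can be written out next to the rest of a team's generated config.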
Developers should be able to:
- Spin up preview environments from PRs
- Restart services
- Trigger rollbacks
- Rotate secrets
- Access logs and metrics
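Even a small piece of this list hides detail. For preview environments, for example, the platform has to derive a valid namespace name from each pull request. The sketch below assumes one namespace per PR and a hypothetical `<repo>-pr-<number>` naming scheme; the length limit comes from Kubernetes requiring namespace names to be valid DNS labels.

```python
import hashlib
import re

MAX_LABEL_LENGTH = 63  # namespace names must be valid DNS-1123 labels


def preview_namespace(repo: str, pr_number: int) -> str:
    """Derive a namespace name for a PR preview environment (hypothetical scheme)."""
    # Lowercase, collapse anything outside [a-z0-9] into single dashes, trim the ends.
    slug = re.sub(r"[^a-z0-9]+", "-", repo.lower()).strip("-") or "repo"
    suffix = f"-pr-{pr_number}"
    name = f"{slug}{suffix}"
    if len(name) <= MAX_LABEL_LENGTH:
        return name
    # Too long: truncate the slug and add a short hash so distinct repos stay distinct.
    digest = hashlib.sha256(repo.encode()).hexdigest()[:8]
    keep = MAX_LABEL_LENGTH - len(suffix) - len(digest) - 1
    return f"{slug[:keep].rstrip('-')}-{digest}{suffix}"


print(preview_namespace("Acme/Checkout_API", 128))  # acme-checkout-api-pr-128
```

Keeping the name deterministic means the same PR always maps to the same environment, so a later push can update it and closing the PR can tear it down.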
Security should be enforced by the platform, not downstream teams. Run containers as non-root, apply seccomp profiles, disable capabilities, and sandbox workloads when running untrusted or third-party code.
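A minimal sketch of what "enforced by the platform" can mean: before a pod spec is applied, the platform overwrites the security-sensitive fields on every container. The `harden_pod_spec` helper is hypothetical, and the sandbox runtime class name is cluster-specific, so it is passed in rather than assumed. In practice this kind of rule often lives in an admission controller or policy engine rather than in templating code.

```python
import copy
from typing import Optional

# Settings the platform imposes on every container.
HARDENED_SECURITY_CONTEXT = {
    "runAsNonRoot": True,
    "capabilities": {"drop": ["ALL"]},
    "seccompProfile": {"type": "RuntimeDefault"},
}


def harden_pod_spec(pod_spec: dict, sandbox_runtime_class: Optional[str] = None) -> dict:
    """Return a copy of pod_spec with platform security settings applied."""
    hardened = copy.deepcopy(pod_spec)
    for key in ("containers", "initContainers"):
        for container in hardened.get(key, []):
            # Platform values win over whatever the team supplied.
            container["securityContext"] = {
                **container.get("securityContext", {}),
                **copy.deepcopy(HARDENED_SECURITY_CONTEXT),
            }
    if sandbox_runtime_class is not None:
        # For untrusted code: run under a sandboxed runtime via a RuntimeClass
        # that the cluster operator has installed (the name varies per cluster).
        hardened["runtimeClassName"] = sandbox_runtime_class
    return hardened


pod = {"containers": [{"name": "app", "image": "registry.example.com/app:1.0",
                       "securityContext": {"runAsNonRoot": False, "runAsUser": 1000}}]}
print(harden_pod_spec(pod)["containers"][0]["securityContext"])
```

The team's own `runAsUser` survives, but its attempt to disable `runAsNonRoot` does not. That is the point of enforcing these settings centrally.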
IDPs only work if they reduce toil. You’ll need to automate:
- Environment creation
- DNS and TLS setup
- Canary and blue/green deployments
- Image scanning and policy enforcement
- Cost reporting
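Automating canary deployments ultimately comes down to a decision the platform makes on the developer's behalf: promote, roll back, or keep waiting. The sketch below shows one very simple form of that gate, comparing error rates between the stable version and the canary. The `Window` type, the `canary_decision` function, and both thresholds are assumptions for illustration; real canary analysis usually looks at more signals than error rate.

```python
from dataclasses import dataclass


@dataclass
class Window:
    """Request and error counts observed over one analysis window."""
    requests: int
    errors: int

    @property
    def error_rate(self) -> float:
        return self.errors / self.requests if self.requests else 0.0


def canary_decision(baseline: Window, canary: Window,
                    min_requests: int = 500, tolerance: float = 0.01) -> str:
    """Return 'wait', 'rollback', or 'promote' for the current canary."""
    if canary.requests < min_requests:
        return "wait"  # not enough traffic yet to judge the canary
    if canary.error_rate > baseline.error_rate + tolerance:
        return "rollback"  # canary is clearly worse than the stable version
    return "promote"


print(canary_decision(Window(10_000, 50), Window(1_000, 8)))   # promote
print(canary_decision(Window(10_000, 50), Window(1_000, 40)))  # rollback
print(canary_decision(Window(10_000, 50), Window(100, 0)))     # wait
```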
And finally: documentation. If developers can’t figure out how to use the platform, they’ll go around it.
The hard truth is that most internal platforms degrade over time. They rot. Tooling versions drift. APIs break. Documentation gets stale. And the team that built the platform becomes a bottleneck for every new feature request.
Worse, adoption often never comes. Developers keep bypassing the platform because it’s too rigid, too slow, or just too confusing. Suddenly, the team has spent 18 months and seven figures on a product no one wants to use.
This is a pattern repeated across the industry: teams build internal Herokus and end up maintaining brittle toolchains. They spend more time debugging their factory than producing anything in it.
There are cases where building your own platform makes sense. If you’re a FAANG-scale company with unique compliance requirements, massive scale, or infrastructure so custom that nothing off-the-shelf fits, then yes. Build it. But be ready to dedicate entire teams to it, like game studios maintaining custom engines.
For everyone else, the rationale starts to fall apart.
Buying a platform used to mean sacrificing control. Early PaaS tools like Heroku and Cloud Foundry were too rigid, too opinionated, and couldn’t support complex enterprise use cases. But that’s no longer true.
Modern platform solutions like Northflank give you the same abstractions (multi-cloud support, workload orchestration, secrets management, and observability) without the multi-year investment. They’re extensible. You can run them in your own cloud. They support GitOps, APIs, CLIs, and UI workflows out of the box.
These platforms aren’t built in isolation. They benefit from seeing tens of thousands of real-world deployments, absorbing patterns, and adapting to emerging use cases faster than any internal team could.
And they work. Companies using Northflank get time back. They avoid wasting quarters on YAML templates and Terraform modules and instead spend that time building actual product.
The metaphor is obvious, but worth repeating: you’re not in the business of building factories. You’re in the business of shipping software. The more time you spend optimizing the conveyor belts, the less time you spend delivering what your customers care about.
Your users don’t care about your CI pipeline. They care about whether your product works. Every hour your team spends fine-tuning a deployment script is an hour lost on feature development.
So yes, you can build your own Internal Developer Platform. But if your team’s goal is to move fast, stay secure, and ship features that make an impact, it’s probably not worth it.
Before you sink another month into writing Helm charts or building a UI for ephemeral environments, ask yourself a simple question:
Are we solving problems unique to our company, or just repeating the same work every other team is doing?
You already know the answer.
Northflank helps engineering teams skip the platform tax and go straight to delivery. Git-native. Developer-friendly.
Start deploying with Northflank today, for free.
A platform is the underlying system that automates and abstracts infrastructure: it handles deployment, secrets, observability, governance, and so on.
A portal is just the interface. It's how developers interact with the platform: a UI, a CLI, a set of APIs.
Building a portal without a robust platform underneath is like building a cockpit without an airplane. It might look slick, but it doesn't actually fly.
GitOps isn't technically required, but it's the current best practice. It brings auditability, rollback safety, and better alignment between code and infra. If you're not using GitOps, you're probably rebuilding a worse version of it with custom scripts.
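The core mechanism is small enough to sketch: Git holds the desired state, and a controller repeatedly compares it with what is live and works out how to converge the two. The `plan_sync` function and the `kind/namespace/name` key format below are hypothetical simplifications. Tools like Argo CD and Flux handle ordering, health checks, pruning rules, and much more.

```python
def plan_sync(desired: dict, live: dict) -> dict:
    """Compare desired state (from Git) with live state (from the cluster).

    Both arguments map a resource key such as "Deployment/team-payments/checkout"
    to its manifest. Returns the keys to create, update, and delete.
    """
    return {
        "create": sorted(k for k in desired if k not in live),
        "update": sorted(k for k in desired if k in live and desired[k] != live[k]),
        "delete": sorted(k for k in live if k not in desired),
    }


desired = {
    "Deployment/team-payments/checkout": {"image": "checkout:1.4.2"},
    "Deployment/team-payments/ledger": {"image": "ledger:2.0.0"},
}
live = {
    "Deployment/team-payments/checkout": {"image": "checkout:1.4.1"},
    "Deployment/team-payments/legacy-api": {"image": "legacy-api:0.9.0"},
}
print(plan_sync(desired, live))
```

Because the desired state lives in Git, a rollback is just a revert commit followed by the same comparison. That is the auditability and rollback safety custom scripts tend to reimplement badly.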
Backstage is a catalog and portal framework. It’s great for visibility, documentation, and onboarding, but it doesn’t provision environments, handle deployments, or enforce security policies out of the box. You still need to build or plug in the actual platform logic behind it.
If you want an IDP that developers can actually use, expect to spend at least 12 months building it with a dedicated team. Some teams have come to us after trying it for 3+ years. 🙂 And that's just to hit parity with modern out-of-the-box solutions. Keeping it maintained is a permanent cost center.
Signs that an internal platform is failing:
- Developers bypass it or complain it’s too rigid
- The platform team becomes a bottleneck for changes
- Onboarding takes longer than it should
- Documentation is outdated or nonexistent
- You're spending time maintaining glue code instead of building product
If your infrastructure or compliance requirements are so specific that no off-the-shelf tool fits, and you have the headcount to support it, building might be justified. But that’s rare. Most teams are better off buying and customizing an existing solution.