
EP3: Native Kubernetes deployment is officially working in Coolify. But getting there meant wrestling with vicious race conditions.

Originally published on Dev.to

If you missed Episode 2, we realized that Coolify's SSH-native engine is surprisingly cluster-friendly. The architecture wasn't locked to Docker; it simply lacked a translation layer.

In Episode 3, it was time to build that translation layer and prove the concept. Turning theory into reality required two massive, distinct phases of implementation, and along the way I hit one of the most stressful race conditions I've ever debugged, all while backpacking across 4 countries.

Here is the story of how the Kubernetes Proof of Concept (POC) came to life.

🏗️ Phase 1: The Struggle for a Cluster

Before you can deploy an application to a Kubernetes cluster, you must actually have a cluster to deploy to.

This was the first major hurdle. The official Coolify deployment currently ships with zero built-in Kubernetes infrastructure, and I knew I couldn't just build a deployment script: users also needed a way to spin up an environment to test against without ever leaving the UI.

I spent days reviewing different options for bootstrapping a cluster natively. Ultimately, I settled on K3s: an incredibly lightweight, production-ready Kubernetes distribution that is perfectly suited to the kinds of servers Coolify normally runs on.

I integrated it directly into Coolify, building out the UI and the underlying backend logic so that users can now do two things straight from the dashboard (a sketch of the provisioning step follows the list):

  1. Spin up a brand new K3s cluster from scratch on a server.
  2. Link securely to an existing Kubernetes cluster.
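
To make that first option concrete, here is a minimal sketch of what the "spin up K3s" step can look like when driven over SSH, using the official installer from get.k3s.io. The class and method names are illustrative assumptions for this post, not Coolify's actual internals.

```php
<?php

// Illustrative only: provision a single-node K3s server on a remote host
// over SSH, then read back the kubeconfig so the panel can talk to it.
// The class name and SSH handling are hypothetical, not Coolify's real code.

use Illuminate\Support\Facades\Process;

class ProvisionK3sCluster
{
    public function handle(string $sshTarget): string
    {
        // Run the official K3s installer on the target server.
        Process::timeout(600)
            ->run("ssh {$sshTarget} 'curl -sfL https://get.k3s.io | sh -'")
            ->throw();

        // K3s writes its kubeconfig to a well-known path; fetch it so the
        // new cluster can be registered in the dashboard.
        return Process::run("ssh {$sshTarget} 'cat /etc/rancher/k3s/k3s.yaml'")
            ->throw()
            ->output();
    }
}
```

Linking an existing cluster (option two) is essentially the second half of this flow: instead of generating a kubeconfig, you securely store the one the user provides.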

Phase 1 was a resounding success. I had the foundation.

🌍 Phase 2: Racing Across Borders

As I moved into Phase 2, life happened. I had to pause active work while I took a 5-day tour across 4 different countries.

But while I was navigating borders, the codebase I had just written to handle my deployments was locked in an intense race condition of its own.

I was literally racing across borders while trying to stop my codebase from "racing" and locking up the UI.

Here is what went wrong.

🐉 Deploying the Docker Image & Fighting the Clock

The second phase of the POC was the grand finale: taking a standard Docker image (I used Nginx), deploying it as a service directly to the K3s cluster I had just created, and ensuring it was accessible via an Ingress route.

The translation script worked flawlessly. My Docker manifests were effortlessly converted into K8s Deployments, Services, and Traefik Ingress rules. But making the status UI sync was a nightmare.
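As a rough picture of what that translation produces, the sketch below turns "an image plus a port" into a Kubernetes Deployment manifest and applies it with kubectl; the real flow also emits the matching Service and Traefik Ingress. The function and resource names are illustrative, not the actual script.

```php
<?php

// Illustrative sketch: translate "image + port" into a Kubernetes Deployment
// and apply it via kubectl, which accepts JSON manifests on stdin.

use Illuminate\Support\Facades\Process;

function deployImage(string $name, string $image, int $port): void
{
    $deployment = [
        'apiVersion' => 'apps/v1',
        'kind'       => 'Deployment',
        'metadata'   => ['name' => $name, 'labels' => ['app' => $name]],
        'spec' => [
            'replicas' => 1,
            'selector' => ['matchLabels' => ['app' => $name]],
            'template' => [
                'metadata' => ['labels' => ['app' => $name]],
                'spec' => [
                    'containers' => [[
                        'name'  => $name,
                        'image' => $image,
                        'ports' => [['containerPort' => $port]],
                    ]],
                ],
            ],
        ],
    ];

    // A Service and a Traefik Ingress would be generated and applied the same way.
    Process::input(json_encode($deployment))
        ->run('kubectl apply -f -')
        ->throw();
}

// The POC case: deployImage('nginx-poc', 'nginx:latest', 80);
```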

To ensure the deployment succeeded, I initially set the core action to run synchronously. The deployment worked! But because it waited for the cluster to finish, it held the PHP process hostage and completely locked up the Coolify frontend.

"Simple," I thought. "Just revert the deployment to an asynchronous background job."

The UI immediately became snappy again. But this introduced the ultimate syncing problem. The moment the async deployment fired, Coolify's Application Status Checker instantly polled the K8s API.
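For reference, "revert to an asynchronous background job" looks roughly like this in Laravel terms, with hypothetical names and reusing the deployImage() helper from the earlier sketch. The moment dispatch() returns, the web request is done, and the race with the status checker begins.

```php
<?php

// Illustrative sketch: the deployment runs on a queue worker instead of the
// web request, so the UI stays responsive. Job and property names are made up.

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;

class DeployToKubernetesJob implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public function __construct(
        public string $name,
        public string $image,
        public int $port,
    ) {}

    public function handle(): void
    {
        // Long-running kubectl work happens here, off the request thread.
        deployImage($this->name, $this->image, $this->port);
    }
}

// Controller side: fire and forget.
// DeployToKubernetesJob::dispatch('nginx-poc', 'nginx:latest', 80);
```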

Because Kubernetes is eventually consistent, it takes a few seconds to pull the image and schedule the pods. The API responded, accurately, that there were zero pods running. Instead of understanding that the app was just booting up, the Coolify orchestrator aggressively flagged the perfectly healthy deployment as "Exited" or "Failed."

A fully functional deployment was showing a glaring red error state.

🚀 The Resolution: The POC is Alive

You cannot force an intrinsically asynchronous system (Kubernetes scheduling) to behave linearly against a strict synchronous status check. The absence of a resource immediately after creation is an expected state, not a failure.

I solved the "racing codebase" by injecting an intelligent two-minute grace window into the status pipeline. If the checker polls an application within two minutes of an update and finds zero pods, it simply holds the UI status at "Starting" until the pods are scheduled. The moment the K8s API confirms the pods are healthy, it seamlessly flips the interface to "Running."
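
As a sketch of the idea (again with hypothetical names): zero pods inside the grace window means "starting"; zero pods after it means a genuine failure.

```php
<?php

// Illustrative sketch of the two-minute grace window in the status checker.

use Carbon\Carbon;

function resolveStatus(int $runningPods, Carbon $lastDeployedAt): string
{
    if ($runningPods > 0) {
        return 'running';
    }

    // Kubernetes is eventually consistent: right after an apply there are
    // legitimately zero pods while the image is pulled and the scheduler
    // places them. Treat that window as "starting", not as a failure.
    if ($lastDeployedAt->gt(Carbon::now()->subMinutes(2))) {
        return 'starting';
    }

    // Still no pods well past the grace window: now it really is down.
    return 'exited';
}
```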

The end-to-end "Docker Image -> K8s" flow is now incredibly fast, fully observable, and completely robust. I conquered the K3s installation, I defeated the deployment race conditions, and I proved that Native Kubernetes fits perfectly inside Coolify.

⏭️ Next in the Investigation

Letting the automated systems handle the eventual consistency of Kubernetes meant I could actually close my laptop and enjoy the rest of my tour across countries. But the work is far from over.

Next up, I dive deeper into linking external production clusters and polishing the features for robust availability. You will not want to miss what's coming next!

GitHub Issue: https://github.com/coollabsio/coolify/issues/2390

Connect with me: Twitter/X, LinkedIn, Telegram

This is the third post in a series documenting my investigation into building native Kubernetes support for Coolify. Next up: Connecting robust external clusters.
