Every developer has experienced the stomach-churning moment of deploying to production, watching the site go down, and hoping everything comes back up cleanly. For small teams and solo developers, downtime during deployment feels inevitable. After all, the big zero-downtime strategies are built for companies with dedicated platform engineering teams, right? Not necessarily. With a handful of straightforward techniques and tools you probably already have installed, you can deploy without your users ever noticing.
Why Zero-Downtime Matters More Than You Think
Even brief downtime has compounding costs. Search engine crawlers that hit a 503 error may deprioritize your pages. Users who encounter an error page rarely hit refresh—they leave. API consumers that get connection resets may trigger their own cascading failures. And if you deploy frequently, those 30-second windows add up fast.
For a static site like GGAMES.MOBI, the stakes are lower than for a transactional application, but the principles apply universally. The good news is that the simplest zero-downtime techniques are also the cheapest to implement.
Blue-Green Deployments with Symlinks
The blue-green pattern is the most intuitive zero-downtime strategy. You maintain two identical production environments—call them "blue" and "green." At any given time, one is live and the other is idle. You deploy to the idle environment, verify it works, then switch traffic over instantly.
For a small team running on a single VPS, you do not need Kubernetes or a cloud load balancer to pull this off. A symlink swap is enough:
```bash
# Directory structure
/var/www/blue/                     # Contains current live release
/var/www/green/                    # Deploy target
/var/www/current -> /var/www/blue  # Symlink nginx follows

# Deploy new version to green
rsync -az --delete ./dist/ /var/www/green/

# Run smoke tests against green
curl -f http://localhost:8081/health || exit 1

# Atomic swap
ln -sfn /var/www/green /var/www/current

# Nginx picks up the new target on next request
# No reload needed if using $realpath_root
```
The critical detail is the swap itself. ln -sfn removes the old link and creates the new one, which is near-instant but not strictly atomic; for a guaranteed atomic swap, create the new symlink under a temporary name and rename it over the live one with mv -T, since rename() never leaves the path missing. Either way, every request sees either the old release or the new one, never a half-deployed state. Nginx resolves the symlink on each request when you configure it with $realpath_root instead of $document_root, so no reload is required.
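As a sketch, the nginx side of this setup might look as follows. The directive names are real, but the paths and port are assumptions, not from this article:

```nginx
# Illustrative vhost (paths and port are assumptions). For static files,
# nginx resolves the symlink on each open, so the swap takes effect on the
# next request; keep open_file_cache off (the default) or short-lived, or
# workers may keep serving files from the previous release.
server {
    listen 80;
    root /var/www/current;
    try_files $uri $uri/ =404;
}

# For FastCGI backends, pass $realpath_root instead of $document_root in
# SCRIPT_FILENAME so opcode caches key on the real release directory.
```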
Benefits of this approach:
- Instant rollback—just swap the symlink back to the previous directory
- No special tooling required beyond rsync and ln
- Works on any Linux server, any hosting provider
- The idle environment doubles as a staging/preview environment
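The swap can be rehearsed safely anywhere. Here is a throwaway-directory sketch (directory names are illustrative) that also shows the strictly atomic variant, where a temporary symlink is renamed over the live one:

```shell
# Rehearse the blue-green swap in a temp directory (names illustrative).
set -eu
root=$(mktemp -d)
mkdir "$root/blue" "$root/green"
ln -s "$root/blue" "$root/current"

# ln -sfn unlinks then relinks; renaming a temp symlink over the live one
# uses rename(), which is atomic, so the path is never missing:
ln -sfn "$root/green" "$root/current.tmp"
mv -T "$root/current.tmp" "$root/current"

readlink "$root/current"   # now ends in /green
```

Running this twice in a row flips the link back, which is exactly the rollback procedure.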
Rolling Updates for Multi-Server Setups
If you have more than one server behind a load balancer, rolling updates are natural. The idea is simple: pull one server out of the load balancer pool, deploy to it, verify it works, put it back, then repeat for the next server.
Even with just two servers, this works well. A basic script looks like this:
```bash
#!/bin/bash
set -euo pipefail

SERVERS=("web1.example.com" "web2.example.com")

for server in "${SERVERS[@]}"; do
  echo "Deploying to $server..."

  # Remove from load balancer (comment out the server's upstream line)
  ssh lb "sudo sed -i '/$server/s/^/#/' /etc/nginx/upstream.conf"
  ssh lb "sudo nginx -s reload"
  sleep 5  # Allow in-flight requests to complete

  # Deploy
  rsync -az --delete ./dist/ "$server:/var/www/app/"

  # Health check
  ssh "$server" "curl -sf http://localhost/health" || {
    echo "FAILED on $server - aborting"
    exit 1
  }

  # Re-add to load balancer (uncomment the line)
  ssh lb "sudo sed -i '/$server/s/^#//' /etc/nginx/upstream.conf"
  ssh lb "sudo nginx -s reload"
  sleep 2
done
```
The sleep after removing a server gives in-flight requests time to complete. This is called connection draining. For a small team, five seconds is usually plenty. Production load balancers like HAProxy and AWS ALB have built-in connection draining with configurable timeouts.
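Instead of a fixed sleep, you can poll until in-flight connections reach zero. A sketch, where count_conns is a hypothetical stand-in for a real probe such as `ssh "$server" "ss -Htn state established '( sport = :80 )' | wc -l"`:

```shell
# Drain by polling a connection count instead of sleeping a fixed time.
# count_conns is a hypothetical stand-in for a real connection probe.
drain_server() {
  local waited=0 timeout=${1:-30}
  while [ "$(count_conns)" -gt 0 ]; do
    [ "$waited" -ge "$timeout" ] && return 1  # give up so the deploy can abort
    sleep 1
    waited=$((waited + 1))
  done
}

# Stubbed demo: pretend two in-flight requests finish, one per poll
state=$(mktemp)
echo 2 > "$state"
count_conns() { local n; n=$(cat "$state"); echo $((n - 1)) > "$state"; echo "$n"; }

if drain_server 10; then status=drained; else status=timeout; fi
echo "$status"
```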
Database Migrations: The Expand-Contract Pattern
The hardest part of zero-downtime deployment is rarely the application code—it is the database. A naive migration that renames a column or changes a type will break the currently-running application the instant the migration executes.
The expand-contract pattern solves this by splitting every breaking migration into two or three non-breaking steps:
- Expand: Add the new column (or table) alongside the old one. Deploy code that writes to both columns but reads from the old one.
- Migrate: Backfill the new column with data from the old one. Once complete, deploy code that reads from the new column.
- Contract: Drop the old column in a later deployment once you have confirmed the new column is fully in use.
Consider renaming a username column to display_name:
```sql
-- Step 1: Expand (deploy 1)
ALTER TABLE users ADD COLUMN display_name VARCHAR(255);

-- Step 2: Backfill
UPDATE users SET display_name = username WHERE display_name IS NULL;

-- Step 3: Contract (deploy 2, days later)
ALTER TABLE users DROP COLUMN username;
```
Between deploy 1 and deploy 2, your application code must handle both columns. This is more work than a single migration, but it means you never have a moment where the running code disagrees with the database schema.
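If touching every write path in the application is impractical, a temporary trigger can carry the dual-write for you during the expand phase. A PostgreSQL sketch (version 11+ syntax; this is an alternative to app-level dual-writes, not part of the original recipe):

```sql
-- Temporary dual-write trigger for the expand phase (PostgreSQL 11+).
-- Any code path that still writes username keeps display_name in sync;
-- drop the trigger along with the old column in the contract step.
CREATE FUNCTION sync_display_name() RETURNS trigger AS $$
BEGIN
  NEW.display_name := NEW.username;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER users_sync_display_name
BEFORE INSERT OR UPDATE OF username ON users
FOR EACH ROW EXECUTE FUNCTION sync_display_name();
```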
Health Checks: The Foundation of Everything
Every zero-downtime strategy depends on knowing whether the new deployment is healthy before sending traffic to it. A health check endpoint does not need to be complicated. At minimum, it should verify:
- The application process is running and accepting HTTP connections
- The application can reach its database (if applicable)
- Critical configuration values are present
- The response includes a version identifier so you can confirm the new code is live
A simple implementation returns JSON with a version and status:
```json
// GET /health
{
  "status": "ok",
  "version": "2026.04.03.1",
  "database": "connected",
  "uptime": 42
}
```
For static sites, a health check can be as simple as verifying that a known file returns 200. The key is to automate the check so your deployment script can make a go/no-go decision without human intervention.
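A minimal automated gate for the static-site case might look like this. The check function and the version file are illustrative, and a file:// URL stands in for the real endpoint so the sketch runs anywhere curl exists:

```shell
# Go/no-go gate: fetch a URL and compare the body to the expected value.
check() {
  local url=$1 expected=$2 body
  body=$(curl -sf --max-time 5 "$url") || return 1
  [ "$body" = "$expected" ]
}

# Demo against a local file standing in for a deployed version endpoint
version_file=$(mktemp)
printf '2026.04.03.1' > "$version_file"

if check "file://$version_file" "2026.04.03.1"; then
  echo "go"
else
  echo "no-go"
fi
```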
Rollback Strategies
No deployment strategy is complete without a rollback plan. The best rollback strategies share a common trait: they are fast and require no thought under pressure.
- Blue-green: Swap the symlink back. Rollback takes under one second.
- Rolling: Redeploy the previous artifact to all servers using the same rolling script.
- Container-based: Point to the previous image tag. The old image is cached locally.
- Database: This is the hard one. If you followed expand-contract, the old code still works against the current schema because the old column is still present.
A common mistake is to treat rollback as an afterthought. Test your rollback procedure before you need it. The worst time to discover your rollback script has a bug is during a production incident at 2 AM.
Practical Tooling for Small Teams
You do not need Kubernetes, Terraform, or a dedicated DevOps engineer to achieve zero-downtime deployments. Here is a minimal toolset that covers most scenarios:
- rsync: Fast, incremental file transfer. The --delete flag ensures the target matches the source exactly.
- ln -sfn: Atomic symlink swap. The core of blue-green on a single server.
- nginx -s reload: Graceful reload that does not drop connections. Nginx starts new worker processes with the new configuration while old workers finish serving in-flight requests.
- curl: Health check verification in deployment scripts.
- ssh: Remote command execution for multi-server deployments.
- A simple bash script: Ties all of the above together into a repeatable, automated process.
If you want to level up slightly, tools like Caddy (with its API-driven config reload) or a simple Docker setup with docker-compose up -d --no-deps --build can provide the same benefits with a bit more structure.
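For the Docker route, a compose service with a built-in health check might look like this sketch (the service name, port, and /health endpoint are assumptions):

```yaml
# Illustrative compose file; service name, port, and /health are assumptions.
services:
  web:
    build: .
    ports:
      - "8080:80"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost/health"]
      interval: 10s
      timeout: 3s
      retries: 3
```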
Putting It All Together
Here is the deployment script we use for a simple static site with blue-green on a single server:
```bash
#!/bin/bash
set -euo pipefail

LIVE=$(readlink /var/www/current)
TARGET=$([ "$LIVE" = "/var/www/blue" ] && echo "/var/www/green" || echo "/var/www/blue")

echo "Live: $LIVE"
echo "Deploying to: $TARGET"

rsync -az --delete ./dist/ "$TARGET/"

# Smoke test the new deployment directly
# (assumes a second nginx vhost on :8081 serving the idle slot)
HTTP_CODE=$(curl -so /dev/null -w "%{http_code}" "http://localhost:8081/")
if [ "$HTTP_CODE" != "200" ]; then
  echo "Smoke test failed (HTTP $HTTP_CODE). Aborting."
  exit 1
fi

ln -sfn "$TARGET" /var/www/current
echo "Deployed successfully. Live is now: $TARGET"
```
This script is under 20 lines, uses only standard Unix tools, and provides atomic switching with instant rollback capability. For a small team, that is more than enough.
Zero-downtime deployment is not about having the fanciest infrastructure. It is about understanding the fundamental patterns—atomic swaps, health verification, and backward-compatible changes—and applying them with whatever tools you have. Start with the symlink swap. Add health checks. Practice your rollback. You will sleep better on deployment days.