## The Problem
My Docker Swarm cluster had a critical issue: everything was pinned to p0. While this worked initially, it created several problems:
- Single point of failure: All services running on one node
- Resource imbalance: p0 was heavily loaded while p1, p2, p3 sat idle
- No redundancy: Losing p0 meant losing everything
- Memory pressure: Multiple heavy services competing for resources
## The Solution
### Phase 1: Promote All Nodes to Managers
Originally, only p0 was a manager. I promoted p1, p2, and p3 to managers:
docker node promote p1 p2 p3
This created a 4-node manager quorum. Since Raft requires a majority (3 of 4 managers), the cluster can tolerate losing 1 node while maintaining cluster operations.
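To sanity-check the promotion, docker node ls shows each node's role; the MANAGER STATUS column should show one Leader and the rest as Reachable:
docker node ls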
### Phase 2: Verify Shared Storage
Since my cluster uses GlusterFS mounted at /home/doc/swarm-data/, I verified all nodes could access the shared storage:
docker service create --name gluster-test \
--mode global \
--constraint 'node.role==manager' \
--mount type=bind,src=/home/doc/swarm-data,dst=/data \
alpine ls /data
All nodes successfully accessed the shared filesystem.
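To see each node's listing (and to clean up afterwards, since a global service keeps restarting a one-shot container), the test service's logs can be checked and the service removed:
docker service logs gluster-test
docker service rm gluster-test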
### Phase 3: Remove Hostname Constraints
I removed unnecessary node.hostname == p0 constraints from service configurations:
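For a service created directly with the CLI (rather than through a stack file), the same pin can be dropped in place with --constraint-rm; a sketch, using the adminer service as an example:
docker service update \
  --constraint-rm 'node.hostname == p0' \
  adminer_adminer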
Services Updated:
- adminer
- n8n
- paperless (webserver + redis)
- authentik (server, worker, redis)
- uptime-kuma
- tracker-nginx
Services Left Pinned:
- traefik (p0) - Needs published ports 80/443 with stable IP
- portainer (p0) - Management UI convenience
- rsync - Already flexible with node.role == manager
### Phase 4: Redeploy Services
I force-updated services to redistribute them:
docker service update --force adminer_adminer
docker service update --force n8n_n8n
docker service update --force authentik_redis
docker service update --force authentik_authentik_server
docker service update --force authentik_authentik_worker
docker service update --force paperless_paperless_redis
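Placement after each update can be spot-checked with docker service ps, for example:
docker service ps adminer_adminer \
  --format '{{.Name}} -> {{.Node}} ({{.CurrentState}})'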
### Phase 5: Fix Portainer Connectivity
After promoting nodes to managers, Portainer agents couldn’t find manager nodes (they cached the old worker role). Fixed by restarting the agents:
docker service update --force portainer_agent
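To confirm the agents came back up cleanly, their running tasks can be listed:
docker service ps portainer_agent --filter desired-state=running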
## Results
### Before
p0: traefik, portainer, uptime-kuma, adminer, n8n, paperless, authentik (all 3 services)
p1: (mostly idle)
p2: (mostly idle)
p3: (mostly idle)
### After
p0: traefik, portainer, rsync
p1: authentik_redis, paperless_redis, tracker-nginx
p2: adminer, authentik_server, uptime-kuma
p3: authentik_worker, n8n, paperless_webserver
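A per-node snapshot like the one above can be reproduced by listing the tasks on every node at once:
docker node ps $(docker node ls -q)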
## Benefits Achieved
- ✅ Balanced workload - Services distributed across all 4 nodes
- ✅ High availability - 4-node manager quorum (survives the loss of 1 manager)
- ✅ Self-balancing - Services automatically redistribute on node failures
- ✅ Better resource utilization - All nodes actively participating
## Lessons Learned
- GlusterFS enables true flexibility - Shared storage means services can run anywhere without storage constraints
- Manager overhead is minimal - With only 4 nodes, the Raft consensus overhead is negligible
- Portainer agents cache node roles - Always restart agents after promoting nodes to managers
- Pin only what’s necessary - Only services with published ports or specific requirements need constraints (see the sketch after this list)
- Let Swarm do its job - Without constraints, the scheduler does a good job distributing workload
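For reference, pinning a service that genuinely needs it (like traefik here) can be done with --constraint-add; this is just a sketch, and <service-name> is a placeholder:
# pin only services that truly need a fixed node; <service-name> is a placeholder
docker service update \
  --constraint-add 'node.hostname == p0' \
  <service-name>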

