🏠 Homelab Infrastructure — Dibesh Shrestha

Self-hosted infrastructure from scratch

A complete homelab built on a single Proxmox node, with no cloud hosting and no managed services. Two VMs are active: one runs about 30 Docker containers, the other a single-node k3s Kubernetes cluster managed by Rancher. GitLab CI/CD pipelines deploy three Zabbix environments via Terraform and Helm, and 20 public subdomains are reachable through Cloudflare Tunnel. Everything documented here is real, running, and maintained as a learning platform.

2 · Active VMs on Proxmox
30+ · Docker containers
19+ · Public subdomains
360+ · CI/CD pipeline runs
34d · K8s cluster uptime
10+ · Archived experiment VMs
Proxmox PVE 9.1 · Docker Compose · k3s v1.34 · Rancher 2.14 · Terraform v1.15 · Helm v3.20 · GitLab CI/CD · Traefik v2.11 · Authentik SSO · Graylog + OpenSearch · cert-manager · CrowdSec IDS · Cloudflare Tunnel · Fleet GitOps · CAPI (rancher-turtles)
🌐 Network

Network layout

Single flat /24 LAN · all VMs on vmbr0 bridge · k3s uses Flannel CNI (10.42.0.0/16) · all external access via Cloudflare Tunnel, no open inbound ports

Proxmox PVE node
Physical host · vmbr0 bridge · enp0s31f6 NIC · kernel 6.17.9-1-pve
192.168.178.200
VM 400 — homelab-docker
Ubuntu 24.04 · Docker host · GitLab Runner · homelab bridge 172.18.0.0/16
192.168.178.215
VM 500 — cicd-automation
Debian 12 · k3s control plane · Traefik LoadBalancer · flannel.1 10.42.0.0/32 · cni0 10.42.0.0/24
192.168.178.220
k3s pod network (Flannel)
Pod CIDR: 10.42.0.0/16 · Service CIDR: 10.43.0.0/16 · CoreDNS: 10.43.0.10
10.42.0.0/16
zabbix-postgres (cross-VM)
Running on VM 400 · consumed by Zabbix pods on VM 500 via LAN IP
192.168.178.215:5432
External access layer
Cloudflare Tunnel on VM 400 → Traefik → services · zero inbound firewall ports required
*.abc-server.date
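
The address plan above can be sanity-checked with a few lines of Python. This is a minimal sketch using only the standard library; the LAN prefix 192.168.178.0/24 is an assumption derived from the host IPs and the "/24 LAN" note, and the function name `check_layout` is made up for illustration.

```python
import ipaddress

# Ranges as listed in the network layout.
# LAN is an assumption inferred from the host IPs and the "/24 LAN" note.
LAN = ipaddress.ip_network("192.168.178.0/24")
POD_CIDR = ipaddress.ip_network("10.42.0.0/16")
SERVICE_CIDR = ipaddress.ip_network("10.43.0.0/16")
DOCKER_BRIDGE = ipaddress.ip_network("172.18.0.0/16")

HOSTS = {
    "proxmox": "192.168.178.200",
    "homelab-docker": "192.168.178.215",
    "cicd-automation": "192.168.178.220",
}


def check_layout() -> dict:
    """Return a few consistency checks on the documented address plan."""
    nets = [LAN, POD_CIDR, SERVICE_CIDR, DOCKER_BRIDGE]
    return {
        # Every VM and the Proxmox host sit on the same flat LAN.
        "hosts_in_lan": all(
            ipaddress.ip_address(ip) in LAN for ip in HOSTS.values()
        ),
        # CoreDNS gets its address from the service range.
        "coredns_in_service_cidr": ipaddress.ip_address("10.43.0.10") in SERVICE_CIDR,
        # The node's cni0 subnet is carved out of the pod range.
        "node_subnet_in_pod_cidr": ipaddress.ip_network("10.42.0.0/24").subnet_of(POD_CIDR),
        # None of the four ranges collide with each other.
        "no_overlap": not any(
            a.overlaps(b) for i, a in enumerate(nets) for b in nets[i + 1:]
        ),
    }


if __name__ == "__main__":
    for name, ok in check_layout().items():
        print(f"{name}: {ok}")
```

A check like this is mostly useful before adding another bridge or a second k3s node, when a new range could collide with an existing one.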

All public subdomains (via Cloudflare Tunnel → Traefik)

actual.abc-server.date | Actual Budget
auth.abc-server.date | Authentik SSO portal
finanzblick.abc-server.date | FinanzBlick (custom app)
graylog.abc-server.date | Graylog log management
homepage.abc-server.date | Homepage service dashboard
immich.abc-server.date | Immich photo backup
jellyfin.abc-server.date | Jellyfin media server
monitoring-hub.abc-server.date | Custom Zabbix dashboard (live)
n8n.abc-server.date | n8n workflow automation
netdata.abc-server.date | Netdata real-time metrics
pihole.abc-server.date | Pihole DNS admin
portainer.abc-server.date | Portainer container management
proxmox-dashboard.abc-server.date | Custom Proxmox dashboard (live)
traefik.abc-server.date | Traefik dashboard
uptime-kuma.abc-server.date | Uptime Kuma monitoring
zabbix-hub.abc-server.date | Zabbix hub app
zabbix-dev.abc-server.date | Zabbix dev (K8s, Helm)
zabbix-staging.abc-server.date | Zabbix staging (K8s, Helm)
zabbix-prod.abc-server.date | Zabbix prod (K8s, Helm)
rancher.abc-server.date | Rancher 2.14 cluster mgmt

All subdomains protected by Authentik SSO (except where noted). No inbound firewall ports — all traffic via Cloudflare Tunnel.

⬡ Proxmox

Proxmox PVE 9.1 — node overview

Single bare-metal node · 8 vCPUs (4 cores + HT) · 23.3 GB RAM · ~960 GB ci-storage pool · Legacy BIOS · pve-manager/9.1.5

💻
CPU
8 vCPUs · 4 cores · x86_64 HT
Kernel: 6.17.9-1-pve
🧠
Memory
23.3 GB total · ~21.3 GB used (91%)
💾
Storage
ci-storage: 960 GB · 18% used
local-lvm: 148 GB · 39% used
local: 71 GB · 8% used

All VMs (12 total · 2 running · 10 archived)

VMID | Name | Status | RAM | Disk | Notes
400 | homelab-docker | ● running | 10 GB | 150 GB | Docker host · GitLab Runner · 192.168.178.215 · Ubuntu 24.04
500 | cicd-automation | ● running | 10 GB | 320 GB | k3s v1.34 · Rancher 2.14 · 192.168.178.220 · Debian 12
200 | terra-automation | stopped | 4 GB | 35 GB | Archived: early Terraform VM experiments
203 | terra-monitoring | stopped | 4 GB | 30 GB | Archived: standalone monitoring VM experiment
204 | terra-k8s-master | stopped | 4 GB | 40 GB | Archived: multi-node K8s experiment (master node)
205 | terra-k8s-worker1 | stopped | 1 GB | 15 GB | Archived: multi-node K8s worker 1
206 | terra-k8s-worker2 | stopped | 1 GB | 15 GB | Archived: multi-node K8s worker 2
300 | debian-12-template | stopped | 2 GB | 20 GB | Base template for cloning new VMs
301 | zabbix-db | stopped | 1 GB | 20 GB | Archived: standalone Zabbix DB experiment
302 | zabbix-server1 | stopped | 512 MB | 20 GB | Archived: Zabbix HA experiment server 1
303 | zabbix-server2 | stopped | 512 MB | 20 GB | Archived: Zabbix HA experiment server 2
304 | zabbix-web | stopped | 512 MB | 20 GB | Archived: Zabbix HA web frontend experiment

No LXC containers in use. All workloads run as full VMs. Archived VMs document the evolution from standalone Zabbix/K8s VMs to the current consolidated architecture.

🗺 Architecture diagram

⬡ Proxmox PVE 9.1 · Single node · 8 vCPUs · 24 GB RAM · ~960 GB storage · 192.168.178.200

🐳 homelab-docker · VM 400 · Ubuntu 24.04 · 192.168.178.215 · 30 containers
🛡 Access & security: Traefik · Cloudflare Tunnel · Authentik · CrowdSec · Pihole
👁 Observability: Graylog · MongoDB · Netdata · Uptime Kuma · zabbix-postgres:5432
📦 Self-hosted apps: Immich · Jellyfin · Actual Budget · FinanzBlick
🤖 AI / automation: Ollama · n8n · Telegram bot · Proxmox API
⚙️ GitLab Runner: executes CI/CD jobs → kubeconfig → VM 500
🔧 Management: Portainer · Homepage · Watchtower ⚠ restart loop

cicd-automation · VM 500 · Debian 12 · 192.168.178.220 · Rancher 2.14
🌐 k3s cluster (single node): Traefik LB · cert-manager · K8s Dashboard · metrics-server
🧪 zabbix-dev: 9 revisions · ✓ healthy
🔬 zabbix-staging: 2 revisions · ✓ healthy
🚀 zabbix-prod: prod env · ✓ healthy
📋 IaC layer: Terraform v1.15 · Helm v3.20 · GitLab HTTP state
🔀 GitLab CI pipelines: Plan (auto) → Apply (manual) · 360+ runs
🐴 Rancher system: Fleet · CAPI (rancher-turtles) · upgrade-controller

Data flow: homelab-docker → cicd-automation (GitLab Runner runs jobs) · cicd-automation → homelab-docker (postgres :5432)

🐳 Docker — VM 400 · 192.168.178.215

homelab-docker containers

Ubuntu 24.04 · 10 GB RAM (4.1 GB used) · 148 GB disk (47% used) · Kernel 6.8.0-111 · Two Compose stacks: home-lab (25 services) + crowdsec (2 services)

Container | Image | Ports / role
traefik | traefik:v2.11 | :80 :443 :8081 (dashboard)
cloudflared | cloudflare/cloudflared:latest | no ports, tunnel only
authentik | ghcr.io/goauthentik/server:latest | SSO · auth.abc-server.date
authentik-worker | ghcr.io/goauthentik/server:latest | background task worker
authentik-postgres | postgres:15 | :5432 (internal)
authentik-redis | redis:7 | :6379 (internal)
crowdsec | crowdsecurity/crowdsec:latest | IDS engine (crowdsec stack)
crowdsec-bouncer | fbonalair/traefik-crowdsec-bouncer | blocks malicious IPs at Traefik
pihole | pihole/pihole:latest | :53 (DNS) :80 (admin)
docker-socket-proxy | tecnativa/docker-socket-proxy | :2375 (restricted socket access)
graylog-mongo | mongo:7.0 | :27017 (internal) · Graylog backend
netdata | netdata/netdata:v1.47.5 | :19999 · netdata.abc-server.date
uptime-kuma | louislam/uptime-kuma | :3001 · uptime-kuma.abc-server.date
zabbix-postgres | postgres:15 | 0.0.0.0:5432 → K8s Zabbix pods
immich | ghcr.io/immich-app/immich-server:release | :2283 · immich.abc-server.date
immich-postgres | tensorchord/pgvecto-rs:pg14-v0.2.0 | :5432 (internal) · vector ext.
immich-redis | redis:7 | :6379 (internal)
jellyfin | jellyfin/jellyfin | :8096 · jellyfin.abc-server.date
actual | actualbudget/actual-server:latest | :5006 · actual.abc-server.date
finanzblick | home-lab-finanzblick (custom build) | :8000 · finanzblick.abc-server.date
ollama | ollama/ollama | 0.0.0.0:11434 (LLM API)
n8n | n8nio/n8n:latest | 0.0.0.0:5678 · n8n.abc-server.date
homepage | ghcr.io/gethomepage/homepage:latest | :3000 · homepage.abc-server.date
portainer | portainer/portainer-ce:latest | :9000/:9443 · portainer.abc-server.date
proxmox-dashboard | home-lab-proxmox-dashboard (custom) | :8080 · proxmox-dashboard.abc-server.date
monitoring-hub | home-lab-monitoring-hub (custom) | :8000 · monitoring-hub.abc-server.date
zabbix-hub | home-lab-zabbix-hub (custom) | :8000 · zabbix-hub.abc-server.date
gitlab-runner | gitlab/gitlab-runner:latest | no ports, executes CI pipeline jobs
watchtower ⚠ | containrrr/watchtower | Restarting (exit 1), under investigation
☸ Kubernetes — VM 500 · 192.168.178.220

k3s cluster — cicd-automation

Debian 12 · 10 GB RAM (7.3 GB used) · 320 GB disk (12% used) · Kernel 6.1.0-44-amd64 · containerd 2.2.2 · Single control-plane node · 34d uptime

k3s v1.34.6+k3s1
Node: cicd-automation · Ready
Role: control-plane · 34d uptime
🐴
Rancher 2.14.1
3 replicas running · 20d
rancher.abc-server.date
🔒
cert-manager
3 pods (cainjector, webhook, main)
32d uptime · 3 restarts each

Helm releases

Release | Namespace | Chart | App version | Rev. | Status
zabbix-dev | zabbix-dev | zabbix-7.0.12 | 7.0.16 | 9 | ✓ healthy
zabbix-staging | zabbix-staging | zabbix-7.0.12 | 7.0.16 | 2 | ✓ all healthy
zabbix-prod | zabbix-prod | zabbix-7.0.12 | 7.0.16 | 1 | ✓ healthy
rancher | cattle-system | rancher-2.14.1 | v2.14.1 | 3 | ✓ healthy
fleet | cattle-fleet-system | fleet-109.0.1 | 0.15.1 | 4 | ✓ healthy
rancher-turtles | cattle-turtles-system | rancher-turtles-109.0.1 | 0.26.1 | 2 | ✓ healthy
rancher-webhook | cattle-system | rancher-webhook-109.0.1 | 0.10.4 | 3 | ✓ healthy
traefik | kube-system | traefik-39.0.501 | v3.6.10 | 1 | ✓ healthy
system-upgrade-controller | cattle-system | system-upgrade-controller-109.0.1 | v0.19.1 | 2 | ✓ healthy

Ingress resources

Namespace | Host | Ports | Address
cattle-system | rancher.abc-server.date | 80, 443 | 192.168.178.220
zabbix-dev | zabbix-dev.abc-server.date | 80 | 192.168.178.220
zabbix-staging | zabbix-staging.abc-server.date | 80 | 192.168.178.220
zabbix-prod | zabbix-prod.abc-server.date | 80 | 192.168.178.220
🔀 CI/CD

GitLab CI → Terraform → Helm

GitLab Runner on VM 400 executes jobs targeting VM 500 via kubeconfig · Terraform manages K8s state · Remote state in GitLab HTTP backend

🔀
Pipeline flow
1. commit: push to GitLab (cicd-zabbix or cicd-demo repo)
2. plan: terraform plan runs automatically and shows the diff without changing anything
3. apply: manual trigger → terraform apply → Helm release update
Idempotent: re-run at any time; Terraform checks state and applies only the diffs
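
The plan/apply split and the idempotency claim can be illustrated with a toy model. The sketch below is not how Terraform computes diffs internally; it only shows the idea that a plan is the difference between desired and recorded state, and that applying the same plan twice changes nothing the second time. The names `plan`, `apply`, and `REMOVE` are invented for this example.

```python
# Sentinel marking a resource that exists in state but is no longer desired.
REMOVE = object()


def plan(desired: dict, current: dict) -> dict:
    """Compute the changes needed to move `current` to `desired` (read-only)."""
    changes = {}
    for key, value in desired.items():
        if current.get(key) != value:
            changes[key] = value
    for key in current.keys() - desired.keys():
        changes[key] = REMOVE
    return changes


def apply(changes: dict, current: dict) -> dict:
    """Return a new state with `changes` applied; the input state is not mutated."""
    new_state = dict(current)
    for key, value in changes.items():
        if value is REMOVE:
            new_state.pop(key, None)
        else:
            new_state[key] = value
    return new_state


if __name__ == "__main__":
    desired = {"zabbix-dev": "zabbix-7.0.12", "zabbix-staging": "zabbix-7.0.12"}
    state = {}
    first = plan(desired, state)      # both releases show up as changes
    state = apply(first, state)       # the manual "apply" step
    second = plan(desired, state)     # nothing left to do
    print(len(first), len(second))
```

Keeping `plan` free of side effects is what makes it safe to run automatically on every push, while `apply` stays behind a manual gate.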
📋
Terraform details
Version: v1.15.1 (v1.15.4 available)
Backend: GitLab HTTP (remote state)
Providers: hashicorp/helm + hashicorp/kubernetes
Resources: helm_release + kubernetes_ingress_v1
Chart: zabbix-community/zabbix v7.0.12
Secrets: masked CI/CD variables in GitLab
Total runs: 360+ pipeline executions

Environment status

All three environments share the same Terraform config and Helm chart. The namespace variable switches between them.

Env | Namespace | DB | Revisions | Current state
zabbix-dev | zabbix-dev | 192.168.178.215:5432 | 9 | ✓ All pods healthy
zabbix-staging | zabbix-staging | 192.168.178.215:5432 | 2 | ✓ All 3 pods healthy
zabbix-prod | zabbix-prod | 192.168.178.215:5432 | 1 | ✓ All pods healthy
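
One way to picture "same config, different namespace" is a single function that derives every per-environment value from one input. The real setup does this with a Terraform variable; the Python below is only an analogy, and the names `ZabbixEnv` and `render_env` are hypothetical. The hostnames, namespaces, and database endpoint are taken from the tables above.

```python
from dataclasses import dataclass

ENVIRONMENTS = ("dev", "staging", "prod")


@dataclass(frozen=True)
class ZabbixEnv:
    namespace: str   # Kubernetes namespace, also used as the Helm release name
    release: str
    host: str        # ingress hostname
    db_host: str     # shared PostgreSQL on VM 400
    db_port: int


def render_env(env: str) -> ZabbixEnv:
    """Derive all per-environment settings from the environment name alone."""
    if env not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {env!r}")
    name = f"zabbix-{env}"
    return ZabbixEnv(
        namespace=name,
        release=name,
        host=f"{name}.abc-server.date",
        db_host="192.168.178.215",
        db_port=5432,
    )


if __name__ == "__main__":
    for env in ENVIRONMENTS:
        print(render_env(env))
```

Because only the environment name varies, promoting a change from dev to prod means re-running the same pipeline with a different input rather than maintaining three configurations.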
🔬 Projects

Learning by Building

Real infrastructure experiments — built to understand, not to impress. AI-assisted where noted. These projects are also featured on the main portfolio page.

Live demo

monitoring-hub

Custom dashboard built on top of the Zabbix API — shows live CPU, RAM, disk, ping, problems and maintenance status for all monitored hosts. Includes per-host drawer with enable/disable/maintenance actions. AI-assisted development — I designed the architecture and monitoring logic; code generated with AI help.

Python · FastAPI · Zabbix API · Docker · Homelab · AI-Assisted
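
For readers unfamiliar with the Zabbix API, the kind of call a dashboard like this relies on looks roughly as follows. This is a sketch, not the monitoring-hub source: it builds a JSON-RPC `host.get` request with the standard library and does not send it. The helper name `build_host_get_request`, the example URL, and the token are placeholders; Bearer-token authentication is assumed, as supported by recent Zabbix versions.

```python
import json
import urllib.request


def build_host_get_request(api_url: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a Zabbix JSON-RPC request listing monitored hosts."""
    payload = {
        "jsonrpc": "2.0",
        "method": "host.get",
        "params": {
            # Only the fields a status overview needs.
            "output": ["hostid", "host", "status", "maintenance_status"],
        },
        "id": 1,
    }
    return urllib.request.Request(
        api_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json-rpc",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder URL and token; replace with real values before sending.
    req = build_host_get_request("https://zabbix.example/api_jsonrpc.php", "API_TOKEN")
    print(req.get_method(), req.full_url)
    # To execute: urllib.request.urlopen(req) and json-decode the response body.
```

Separating request construction from sending keeps the function easy to test without a live Zabbix server.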
Running in production

Homelab Platform

Self-hosted Docker Compose platform running 30+ services on a single VM (Ubuntu 24.04, 10 GB RAM). Traefik as reverse proxy, Cloudflare Tunnel for zero-trust external access, Authentik SSO protecting every service, CrowdSec for intrusion detection. Graylog + OpenSearch for centralised log aggregation, Netdata + Uptime Kuma for real-time monitoring. Notable apps: Immich (photo backup), Jellyfin (media), Actual Budget, Pihole DNS, Portainer, and a Homepage dashboard aggregating all services.

Infrastructure: Single Proxmox node (PVE 9.1, 8 vCPUs, 24 GB RAM, ~960 GB storage) running two active VMs: homelab-docker (all Docker services + GitLab Runner) and cicd-automation (k3s cluster + Terraform + Helm). Previously explored multi-VM Zabbix HA and multi-node K8s setups — consolidated to the current architecture.

Docker Compose · Traefik · Authentik · CrowdSec · Cloudflare · Graylog · Netdata · Uptime Kuma · Immich · Jellyfin · Portainer · n8n · Homelab
Running daily

AI Infrastructure Bot

n8n workflow that queries the Docker socket proxy daily at 15:00, sends container health data to a local Ollama LLM for analysis, then pushes a formatted status report to Telegram. A second flow responds to on-demand commands like /proxmox with live VM CPU/RAM/status from the Proxmox API. Built in n8n's visual workflow builder; the only code is the JavaScript nodes that parse, prepare, and aggregate data.

n8n · Ollama · Telegram · Proxmox API · Homelab
Running in homelab

FinanzBlick

Self-hosted personal finance app with PDF/CSV import from ING bank, monthly analytics, savings rate, recurring transaction detection, and year-over-year comparisons. AI-assisted development: I designed the features and data model; code generated with AI help. Runs as a Docker container on VM 400 (homelab-docker), deployed manually.

FastAPI · Python · Docker · SQLite · Homelab · AI-Assisted
🗺 Roadmap

Learning roadmap

Progressing from enterprise monitoring expert into Kubernetes/DevOps/IaC territory — building evidence through real homelab projects, not just tutorials.

14+ years enterprise monitoring — Zabbix Certified Professional 7.0
✓ Completed · April 2026
4 Zabbix certifications (Professional, Specialist, Security, SNMP). 30,000+ monitored hosts across 200+ enterprise clients. Led Nagios-to-Zabbix migrations at scale. Deep knowledge of enterprise monitoring architecture, escalation management, and platform ownership. This is the professional foundation everything else builds on.
✓ done · CP-2604-039 · CS-2412-151 · ZEX03 · ZEX05
Homelab foundation — Proxmox · Docker · Cloudflare Tunnel · SSO · Observability
✓ Completed · 2024–2025
Built from zero: single Proxmox node, 30 Docker services, zero-trust external access via Cloudflare Tunnel, Authentik SSO protecting all services, CrowdSec IDS, Graylog centralised logging, Pihole DNS, Netdata metrics. 16+ public subdomains. All production-grade personal infrastructure — not just tutorials.
✓ done · Proxmox PVE · Docker Compose · Cloudflare Tunnel · Authentik · Graylog · CrowdSec
k3s + Rancher + GitLab CI/CD + Terraform + Helm — full IaC pipeline
✓ Completed · April–May 2026
Deployed single-node k3s and manage it through Rancher 2.14. Built GitLab CI pipelines (plan auto / apply manual) triggering Terraform → Helm to deploy Zabbix to three isolated K8s environments. Remote Terraform state in GitLab HTTP backend. 360+ pipeline runs. Added CAPI via rancher-turtles, Fleet GitOps, system-upgrade-controller. This is real IaC, not a demo.
✓ done · k3s v1.34 · Rancher 2.14 · Terraform · Helm · GitLab CI · Fleet · CAPI
Deepen K8s debugging and observability skills
▶ Active — now
All three Zabbix environments (dev, staging, prod) are running. Next step is going deeper into K8s observability: understanding pod logs, resource metrics, event streams, and how to read cluster state. This builds real troubleshooting muscle beyond "it deployed".
active · kubectl logs · kubectl describe · K8s debugging
Deploy FinanzBlick + monitoring-hub via CI/CD pipeline
▶ Next — ongoing
Currently both custom apps are deployed manually. The natural next step is to add them to GitLab CI with proper Docker build → push → deploy pipelines. This closes the gap between "I can deploy Zabbix via CI" and "I can deploy any service via CI" — which is what a DevOps role actually requires.
next · GitLab CI · Docker build · Helm
Multi-node k3s + GitOps with Fleet + Prometheus/Grafana stack
→ Progressive — next
Expand from single-node to multi-node k3s using archived VMs (204–206) or new VMs. Implement full GitOps via Rancher Fleet. Add Prometheus + Grafana (subdomains already defined: prometheus.abc-server.date, grafana.abc-server.date) as the monitoring stack. This mirrors real production K8s environments and directly demonstrates consulting-level skills.
next · multi-node · Fleet GitOps · Prometheus · Grafana
Full IaC for homelab — Proxmox provider + Cloudflare provider + OpenTofu
○ Future
Currently only Zabbix K8s deployments are managed as IaC. Goal: declaratively manage VM creation (Proxmox Terraform provider), DNS records (Cloudflare provider), and potentially Docker Compose stacks. This closes the full platform engineering loop and demonstrates infrastructure-as-code beyond K8s.
future · OpenTofu · Proxmox provider · Cloudflare provider
K8s/DevOps/IaC consultant — internal or new employer
○ Long-term goal
Transition from Zabbix monitoring expert to K8s/DevOps/IaC consultant. The homelab documents a credible learning trajectory: monitoring foundation → containerisation → Kubernetes → CI/CD → IaC. The project portfolio makes this a realistic career pivot — moving toward platform engineering or cloud infrastructure roles.
future · consulting · K8s · IaC · platform engineering
🤖 Automation

n8n infrastructure automation

Two live workflows running in the homelab: a scheduled daily report and an on-demand Telegram command handler. Both are built in n8n's visual workflow builder, with JavaScript nodes for parsing and aggregation.

Daily container health report
Runs every day at 15:00 automatically
Schedule (15:00 daily) → HTTP GET (docker-socket-proxy) → JS Prepare (format data) → JS Aggregate (summarise) → Ollama AI (POST :11434) → Telegram (send report)

Every day at 15:00, n8n queries the Docker socket proxy to get the status of all 30 containers, passes the data through JavaScript to prepare and aggregate it, sends it to the local Ollama LLM for analysis, then pushes a formatted health report to Telegram.
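
The prepare/aggregate/LLM steps of this workflow can be approximated in Python. The actual workflow uses n8n JavaScript nodes whose code is not shown here, so treat this as an equivalent sketch: it assumes the container list has the shape returned by the Docker Engine API's container listing (`Names`, `State`), and the function names and the model name are placeholders.

```python
import json
from collections import Counter


def summarise_containers(containers: list[dict]) -> dict:
    """Aggregate a Docker container listing into counts and a problem list."""
    states = Counter(c.get("State", "unknown") for c in containers)
    not_running = sorted(
        c["Names"][0].lstrip("/")   # Docker prefixes names with "/"
        for c in containers
        if c.get("State") != "running"
    )
    return {
        "total": len(containers),
        "by_state": dict(states),
        "not_running": not_running,
    }


def build_ollama_request(summary: dict, model: str) -> dict:
    """Build the JSON body for a non-streaming Ollama generate call."""
    prompt = (
        "Summarise this container health data for a short status report:\n"
        + json.dumps(summary, sort_keys=True)
    )
    return {"model": model, "prompt": prompt, "stream": False}


if __name__ == "__main__":
    sample = [
        {"Names": ["/traefik"], "State": "running"},
        {"Names": ["/watchtower"], "State": "restarting"},
    ]
    summary = summarise_containers(sample)
    # "example-model" is a placeholder; the source does not name the model used.
    print(json.dumps(build_ollama_request(summary, "example-model"), indent=2))
```

Aggregating before calling the LLM keeps the prompt small and means the report still has hard numbers even if the model's wording varies.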

✈️
On-demand /proxmox command
Reply to any Telegram message containing /proxmox
Telegram (on message) → JS Parse (parse cmd) → Switch (/proxmox?) → Proxmox API (get VM stats) → Telegram (send reply)

Send /proxmox to the Telegram bot and get an instant reply with live VM stats — CPU %, RAM %, disk usage, and running/stopped status for all 12 VMs — pulled directly from the Proxmox API.

Live Telegram response to /proxmox
🖥 Proxmox
🖥 HOST
CPU: 25%
RAM: 84%
Disk: 9%
🧱 VMS
200 | terra-automation | 🔴 stopped
203 | terra-monitoring | 🔴 stopped
204 | terra-k8s-master | 🔴 stopped
400 | homelab-docker | 🟢 running | CPU: 6% | RAM: 85%
500 | cicd-automation | 🟢 running | CPU: 23% | RAM: 92%
This message was sent automatically with n8n
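
The VM lines in the reply above could be produced by a formatter along these lines. This is a Python sketch of the formatting step only, not the n8n node itself; it assumes each VM record carries `vmid`, `name`, `status`, `cpu` (as a 0 to 1 fraction), `mem`, and `maxmem`, which is the shape of the Proxmox API's per-node VM listing. The function names are invented.

```python
def format_vm_line(vm: dict) -> str:
    """Render one VM as 'id | name | status [| CPU | RAM]'."""
    running = vm.get("status") == "running"
    icon = "🟢" if running else "🔴"
    line = f"{vm['vmid']} | {vm['name']} | {icon} {vm['status']}"
    if running:
        cpu_pct = round(vm.get("cpu", 0.0) * 100)
        # Guard against a missing or zero maxmem to avoid division by zero.
        ram_pct = round(vm["mem"] / vm["maxmem"] * 100) if vm.get("maxmem") else 0
        line += f" | CPU: {cpu_pct}% | RAM: {ram_pct}%"
    return line


def format_reply(vms: list[dict]) -> str:
    """Render the VM section of the Telegram reply, sorted by VM ID."""
    ordered = sorted(vms, key=lambda v: int(v["vmid"]))
    return "\n".join(["🧱 VMS"] + [format_vm_line(vm) for vm in ordered])


if __name__ == "__main__":
    vms = [
        {"vmid": 400, "name": "homelab-docker", "status": "running",
         "cpu": 0.06, "mem": 85, "maxmem": 100},
        {"vmid": 200, "name": "terra-automation", "status": "stopped"},
    ]
    print(format_reply(vms))
```

Stopped VMs report no meaningful CPU or memory figures, which is why the formatter omits those columns for them.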

Dibesh Shrestha · Wetzlar, Germany · Portfolio · GitHub · GitLab · LinkedIn

Infrastructure data accurate as of May 2026 · No credentials or secrets included