Homelab Infrastructure — Dibesh Shrestha

🌐 Network

Network layout

Single flat /24 LAN · all VMs on vmbr0 bridge · k3s uses Flannel CNI (10.42.0.0/16) · all external access via Cloudflare Tunnel, no open inbound ports

Proxmox PVE node

Physical host · vmbr0 bridge · enp0s31f6 NIC · kernel 6.17.9-1-pve

192.168.178.200

VM 400 — homelab-docker

Ubuntu 24.04 · Docker host · GitLab Runner · homelab bridge 172.18.0.0/16

192.168.178.215

VM 500 — cicd-automation

Debian 12 · k3s control plane · Traefik LoadBalancer · flannel.1 10.42.0.0/32 · cni0 10.42.0.0/24

192.168.178.220

k3s pod network (Flannel)

Pod CIDR: 10.42.0.0/16 · Service CIDR: 10.43.0.0/16 · CoreDNS: 10.43.0.10

10.42.0.0/16
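
The address plan above can be sanity-checked with Python's standard `ipaddress` module. This is a minimal sketch, not part of the homelab tooling; the LAN prefix `192.168.178.0/24` is an assumption inferred from the host IPs, and the helper names are hypothetical.

```python
import ipaddress

# CIDRs quoted in the network layout above.
POD_CIDR = ipaddress.ip_network("10.42.0.0/16")
SERVICE_CIDR = ipaddress.ip_network("10.43.0.0/16")
NODE_POD_SUBNET = ipaddress.ip_network("10.42.0.0/24")  # cni0 on VM 500
DOCKER_BRIDGE = ipaddress.ip_network("172.18.0.0/16")   # homelab bridge on VM 400

# Assumption: the flat /24 LAN is 192.168.178.0/24 (inferred from the host IPs).
LAN = ipaddress.ip_network("192.168.178.0/24")

HOSTS = {
    "proxmox": "192.168.178.200",
    "homelab-docker": "192.168.178.215",
    "cicd-automation": "192.168.178.220",
}


def in_network(address, network):
    """Return True if the address belongs to the given network."""
    return ipaddress.ip_address(address) in network


def ranges_overlap(networks):
    """Return True if any two of the given networks share addresses."""
    nets = list(networks)
    return any(a.overlaps(b) for i, a in enumerate(nets) for b in nets[i + 1:])


if __name__ == "__main__":
    for name, ip in HOSTS.items():
        print(name, ip, "on LAN:", in_network(ip, LAN))
    print("CoreDNS in service CIDR:", in_network("10.43.0.10", SERVICE_CIDR))
    print("cni0 inside pod CIDR:", NODE_POD_SUBNET.subnet_of(POD_CIDR))
    print("ranges overlap:", ranges_overlap([LAN, POD_CIDR, SERVICE_CIDR, DOCKER_BRIDGE]))
```

Keeping the LAN, pod, service, and Docker bridge ranges disjoint is what lets pods on VM 500 reach LAN addresses such as 192.168.178.215 without ambiguity.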

zabbix-postgres (cross-VM)

Running on VM 400 · consumed by Zabbix pods on VM 500 via LAN IP

192.168.178.215:5432
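
Because the Zabbix pods depend on a database on another VM, a quick reachability probe is a useful first debugging step. The sketch below only checks that a TCP connection can be opened; it does not authenticate against PostgreSQL. The function name `can_connect` is hypothetical.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Address taken from the card above; run this from VM 500 or from a pod.
    print("zabbix-postgres reachable:", can_connect("192.168.178.215", 5432))
```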

External access layer

Cloudflare Tunnel on VM 400 → Traefik → services · zero inbound firewall ports required

*.abc-server.date

All public subdomains (via Cloudflare Tunnel → Traefik)

actual.abc-server.date · Actual Budget

auth.abc-server.date · Authentik SSO portal

finanzblick.abc-server.date · FinanzBlick (custom app)

graylog.abc-server.date · Graylog log management

homepage.abc-server.date · Homepage service dashboard

immich.abc-server.date · Immich photo backup

jellyfin.abc-server.date · Jellyfin media server

monitoring-hub.abc-server.date · Custom Zabbix dashboard (live)

n8n.abc-server.date · n8n workflow automation

netdata.abc-server.date · Netdata real-time metrics

pihole.abc-server.date · Pihole DNS admin

portainer.abc-server.date · Portainer container management

proxmox-dashboard.abc-server.date · Custom Proxmox dashboard (live)

traefik.abc-server.date · Traefik dashboard

uptime-kuma.abc-server.date · Uptime Kuma monitoring

zabbix-hub.abc-server.date · Zabbix hub app

zabbix-dev.abc-server.date · Zabbix dev (K8s, Helm)

zabbix-staging.abc-server.date · Zabbix staging (K8s, Helm)

zabbix-prod.abc-server.date · Zabbix prod (K8s, Helm)

rancher.abc-server.date · Rancher 2.14 cluster mgmt

All subdomains protected by Authentik SSO (except where noted). No inbound firewall ports — all traffic via Cloudflare Tunnel.

⬡ Proxmox

Proxmox PVE 9.1 — node overview

Single bare-metal node · 8 vCPUs (4 cores + HT) · 23.3 GB RAM · ~960 GB ci-storage pool · Legacy BIOS · pve-manager/9.1.5

💻

CPU

8 vCPUs · 4 cores · x86_64 HT

Kernel: 6.17.9-1-pve

🧠

Memory

23.3 GB total · ~21.3 GB used (91%)

💾

Storage

ci-storage: 960 GB · 18% used

local-lvm: 148 GB · 39% used

local: 71 GB · 8% used

All VMs (12 total · 2 running · 10 stopped: 9 archived + 1 template)

VMID	Name	Status	RAM	Disk	Notes
400	homelab-docker	● running	10 GB	150 GB	Docker host · GitLab Runner · 192.168.178.215 · Ubuntu 24.04
500	cicd-automation	● running	10 GB	320 GB	k3s v1.34 · Rancher 2.14 · 192.168.178.220 · Debian 12
200	terra-automation	stopped	4 GB	35 GB	Archived — early Terraform VM experiments
203	terra-monitoring	stopped	4 GB	30 GB	Archived — standalone monitoring VM experiment
204	terra-k8s-master	stopped	4 GB	40 GB	Archived — multi-node K8s experiment (master node)
205	terra-k8s-worker1	stopped	1 GB	15 GB	Archived — multi-node K8s worker 1
206	terra-k8s-worker2	stopped	1 GB	15 GB	Archived — multi-node K8s worker 2
300	debian-12-template	stopped	2 GB	20 GB	Base template for cloning new VMs
301	zabbix-db	stopped	1 GB	20 GB	Archived — standalone Zabbix DB experiment
302	zabbix-server1	stopped	512 MB	20 GB	Archived — Zabbix HA experiment server 1
303	zabbix-server2	stopped	512 MB	20 GB	Archived — Zabbix HA experiment server 2
304	zabbix-web	stopped	512 MB	20 GB	Archived — Zabbix HA web frontend experiment

No LXC containers in use. All workloads run as full VMs. Archived VMs document the evolution from standalone Zabbix/K8s VMs to the current consolidated architecture.
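
The VM table above lends itself to a quick capacity check. The following sketch copies the table values by hand (it does not query Proxmox) and sums the allocations; the function names are hypothetical.

```python
# (vmid, name, status, ram_mb, disk_gb) copied from the VM table above.
VMS = [
    (400, "homelab-docker", "running", 10 * 1024, 150),
    (500, "cicd-automation", "running", 10 * 1024, 320),
    (200, "terra-automation", "stopped", 4 * 1024, 35),
    (203, "terra-monitoring", "stopped", 4 * 1024, 30),
    (204, "terra-k8s-master", "stopped", 4 * 1024, 40),
    (205, "terra-k8s-worker1", "stopped", 1 * 1024, 15),
    (206, "terra-k8s-worker2", "stopped", 1 * 1024, 15),
    (300, "debian-12-template", "stopped", 2 * 1024, 20),
    (301, "zabbix-db", "stopped", 1 * 1024, 20),
    (302, "zabbix-server1", "stopped", 512, 20),
    (303, "zabbix-server2", "stopped", 512, 20),
    (304, "zabbix-web", "stopped", 512, 20),
]

HOST_RAM_GB = 23.3  # from the node overview


def ram_gb(vms, only_running=True):
    """Sum the RAM allocation in GB, optionally restricted to running VMs."""
    return sum(
        ram for _, _, status, ram, _ in vms
        if status == "running" or not only_running
    ) / 1024


def total_disk_gb(vms):
    """Sum the provisioned disk size of all VMs in GB."""
    return sum(disk for *_, disk in vms)


if __name__ == "__main__":
    running = ram_gb(VMS)
    print(f"RAM allocated to running VMs: {running} GB "
          f"({running / HOST_RAM_GB:.0%} of host RAM)")
    print(f"RAM if every VM were started: {ram_gb(VMS, only_running=False)} GB")
    print(f"Provisioned disk across all VMs: {total_disk_gb(VMS)} GB")
```

With these numbers the two running VMs already claim 20 GB of the 23.3 GB host RAM, while starting all twelve would need 38.5 GB, so the stopped VMs could not all run alongside the current pair.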

🗺 Architecture diagram

⬡ Proxmox PVE 9.1 · Single node · 8 vCPUs · 24 GB RAM · ~960 GB storage · 192.168.178.200

🐳

homelab-docker

VM 400 · Ubuntu 24.04 · 192.168.178.215

30 containers

🛡

Access & security

Traefik · Cloudflare Tunnel · Authentik · CrowdSec · Pihole

👁

Observability

Graylog · MongoDB · Netdata · Uptime Kuma · zabbix-postgres:5432

📦

Self-hosted apps

Immich · Jellyfin · Actual Budget · FinanzBlick

🤖

AI / automation

Ollama · n8n · Telegram bot · Proxmox API

⚙️

GitLab Runner

Executes CI/CD jobs → kubeconfig → VM 500

🔧

Management

Portainer · Homepage · Watchtower ⚠ restart loop

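Data flow between the VMs: GitLab Runner (VM 400) runs jobs against VM 500 · Zabbix pods (VM 500) connect to postgres :5432 on VM 400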

☸

cicd-automation

VM 500 · Debian 12 · 192.168.178.220

Rancher 2.14

🌐

k3s cluster (single node)

Traefik LB · cert-manager · K8s Dashboard · metrics-server

🧪

zabbix-dev

9 revisions · ✓ healthy

🔬

zabbix-staging

2 revisions · ✓ healthy

🚀

zabbix-prod

prod env · ✓ healthy

📋

IaC layer

Terraform v1.15 · Helm v3.20 · GitLab HTTP state

🔀

GitLab CI pipelines

Plan (auto) → Apply (manual) · 360+ runs

🐴

Rancher system

Fleet · CAPI (rancher-turtles) · upgrade-controller

Legend: Docker VM · Kubernetes VM · Data flow

🐳 Docker — VM 400 · 192.168.178.215

homelab-docker containers

Ubuntu 24.04 · 10 GB RAM (4.1 GB used) · 148 GB disk (47% used) · Kernel 6.8.0-111 · Two Compose stacks: home-lab (25 services) + crowdsec (2 services)

traefik

traefik:v2.11

:80 :443 :8081 (dashboard)

cloudflared

cloudflare/cloudflared:latest

no ports — tunnel only

authentik

ghcr.io/goauthentik/server:latest

SSO — auth.abc-server.date

authentik-worker

ghcr.io/goauthentik/server:latest

background task worker

authentik-postgres

postgres:15

:5432 (internal)

authentik-redis

redis:7

:6379 (internal)

crowdsec

crowdsecurity/crowdsec:latest

IDS engine (crowdsec stack)

crowdsec-bouncer

fbonalair/traefik-crowdsec-bouncer

blocks malicious IPs at Traefik

pihole

pihole/pihole:latest

:53 (DNS) :80 (admin)

docker-socket-proxy

tecnativa/docker-socket-proxy

:2375 (restricted socket access)

graylog-mongo

mongo:7.0

:27017 (internal) — Graylog backend

netdata

netdata/netdata:v1.47.5

:19999 — netdata.abc-server.date

uptime-kuma

louislam/uptime-kuma

:3001 — uptime-kuma.abc-server.date

zabbix-postgres

postgres:15

0.0.0.0:5432 → K8s Zabbix pods

immich

ghcr.io/immich-app/immich-server:release

:2283 — immich.abc-server.date

immich-postgres

tensorchord/pgvecto-rs:pg14-v0.2.0

:5432 (internal) — vector ext.

immich-redis

redis:7

:6379 (internal)

jellyfin

jellyfin/jellyfin

:8096 — jellyfin.abc-server.date

actual

actualbudget/actual-server:latest

:5006 — actual.abc-server.date

finanzblick

home-lab-finanzblick (custom build)

:8000 — finanzblick.abc-server.date

ollama

ollama/ollama

0.0.0.0:11434 (LLM API)

n8n

n8nio/n8n:latest

0.0.0.0:5678 — n8n.abc-server.date

homepage

ghcr.io/gethomepage/homepage:latest

:3000 — homepage.abc-server.date

portainer

portainer/portainer-ce:latest

:9000/:9443 — portainer.abc-server.date

proxmox-dashboard

home-lab-proxmox-dashboard (custom)

:8080 — proxmox-dashboard.abc-server.date

monitoring-hub

home-lab-monitoring-hub (custom)

:8000 — monitoring-hub.abc-server.date

zabbix-hub

home-lab-zabbix-hub (custom)

:8000 — zabbix-hub.abc-server.date

gitlab-runner

gitlab/gitlab-runner:latest

no ports — executes CI pipeline jobs

watchtower ⚠

containrrr/watchtower

Restarting (exit 1) — investigating

☸ Kubernetes — VM 500 · 192.168.178.220

k3s cluster — cicd-automation

Debian 12 · 10 GB RAM (7.3 GB used) · 320 GB disk (12% used) · Kernel 6.1.0-44-amd64 · containerd 2.2.2 · Single control-plane node · 34d uptime

☸

k3s v1.34.6+k3s1

Node: cicd-automation · Ready

Role: control-plane · 34d uptime

🐴

Rancher 2.14.1

3 replicas running · 20d

rancher.abc-server.date

🔒

cert-manager

3 pods (cainjector, webhook, main)

32d uptime · 3 restarts each

Helm releases

Release	Namespace	Chart	App version	Rev.	Status
zabbix-dev	zabbix-dev	zabbix-7.0.12	7.0.16	9	✓ healthy
zabbix-staging	zabbix-staging	zabbix-7.0.12	7.0.16	2	✓ all healthy
zabbix-prod	zabbix-prod	zabbix-7.0.12	7.0.16	1	✓ healthy
rancher	cattle-system	rancher-2.14.1	v2.14.1	3	✓ healthy
fleet	cattle-fleet-system	fleet-109.0.1	0.15.1	4	✓ healthy
rancher-turtles	cattle-turtles-system	rancher-turtles-109.0.1	0.26.1	2	✓ healthy
rancher-webhook	cattle-system	rancher-webhook-109.0.1	0.10.4	3	✓ healthy
traefik	kube-system	traefik-39.0.501	v3.6.10	1	✓ healthy
system-upgrade-controller	cattle-system	system-upgrade-controller-109.0.1	v0.19.1	2	✓ healthy

Ingress resources

Namespace	Host	Ports	Address
cattle-system	rancher.abc-server.date	80, 443	192.168.178.220
zabbix-dev	zabbix-dev.abc-server.date	80	192.168.178.220
zabbix-staging	zabbix-staging.abc-server.date	80	192.168.178.220
zabbix-prod	zabbix-prod.abc-server.date	80	192.168.178.220

🔀 CI/CD

GitLab CI → Terraform → Helm

GitLab Runner on VM 400 executes jobs targeting VM 500 via kubeconfig · Terraform manages K8s state · Remote state in GitLab HTTP backend

🔀

Pipeline flow

1. commit: Push to GitLab (cicd-zabbix or cicd-demo repo)

2. plan: terraform plan runs automatically and shows the diff without changing anything

3. apply: Manual trigger → terraform apply → Helm release update

idempotent: Re-run at any time. Terraform checks state and only applies diffs
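
The plan/apply split and the idempotency claim above can be illustrated with a toy model. This is a conceptual sketch only: it is not how Terraform computes plans internally, and the resource keys and values are hypothetical placeholders.

```python
def plan(desired, current):
    """Return only the entries whose desired value differs from the current state."""
    changes = {}
    for key, value in desired.items():
        if current.get(key) != value:
            action = "update" if key in current else "create"
            changes[key] = (action, value)
    for key in current.keys() - desired.keys():
        changes[key] = ("delete", None)
    return changes


def apply(changes, current):
    """Return a new state with the planned changes applied."""
    new_state = dict(current)
    for key, (action, value) in changes.items():
        if action == "delete":
            new_state.pop(key, None)
        else:
            new_state[key] = value
    return new_state


if __name__ == "__main__":
    # Hypothetical resource addresses and values, for illustration only.
    desired = {
        "helm_release.zabbix": "chart 7.0.12",
        "kubernetes_ingress_v1.zabbix": "zabbix-dev.abc-server.date",
    }
    state = {}
    first = plan(desired, state)    # plan stage: shows the diff, changes nothing
    state = apply(first, state)     # apply stage: manual trigger in the pipeline
    print("first plan:", first)
    print("second plan:", plan(desired, state))  # empty: nothing left to apply
```

The second plan being empty is the property that makes re-running the pipeline safe.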

📋

Terraform details

Version	v1.15.1 (v1.15.4 available)
Backend	GitLab HTTP (remote state)
Providers	hashicorp/helm + hashicorp/kubernetes
Resources	helm_release + kubernetes_ingress_v1
Chart	zabbix-community/zabbix v7.0.12
Secrets	Masked CI/CD variables in GitLab
Total runs	360+ pipeline executions

Environment status

All three environments share the same Terraform config and Helm chart. The namespace variable switches between them.

Env	Namespace	DB	Revisions	Current state
zabbix-dev	zabbix-dev	192.168.178.215:5432	9	✓ All pods healthy
zabbix-staging	zabbix-staging	192.168.178.215:5432	2	✓ All 3 pods healthy
zabbix-prod	zabbix-prod	192.168.178.215:5432	1	✓ All pods healthy
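
How a single namespace value can select an environment is sketched below in Python rather than in Terraform. The derivation rules (release name equals namespace, ingress host is the namespace plus the domain) are read off the Helm and Ingress tables above; the class and function names are hypothetical, and the actual Terraform variable layout is not shown in this document.

```python
from dataclasses import dataclass

DOMAIN = "abc-server.date"
DB_ENDPOINT = "192.168.178.215:5432"  # zabbix-postgres on VM 400
CHART = "zabbix-community/zabbix"
CHART_VERSION = "7.0.12"
ENVIRONMENTS = ("zabbix-dev", "zabbix-staging", "zabbix-prod")


@dataclass(frozen=True)
class EnvConfig:
    namespace: str
    release: str
    ingress_host: str
    db_endpoint: str
    chart: str
    chart_version: str


def env_config(namespace):
    """Derive every per-environment value from the namespace alone."""
    if namespace not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {namespace}")
    return EnvConfig(
        namespace=namespace,
        release=namespace,                       # release name matches namespace
        ingress_host=f"{namespace}.{DOMAIN}",    # e.g. zabbix-dev.abc-server.date
        db_endpoint=DB_ENDPOINT,                 # shared PostgreSQL host
        chart=CHART,
        chart_version=CHART_VERSION,
    )


if __name__ == "__main__":
    for name in ENVIRONMENTS:
        print(env_config(name))
```

Whether each environment uses its own database on the shared PostgreSQL instance is not stated above, so the sketch only carries the endpoint.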

🔬 Projects

Learning by Building

Real infrastructure experiments — built to understand, not to impress. AI-assisted where noted. These projects are also featured on the main portfolio page.

Live — ongoing pipeline runs

cicd-zabbix

Homelab CI/CD project: GitLab pipelines trigger Terraform and Helm to deploy Zabbix 7.0 (chart 7.0.12, app version 7.0.16) to a Rancher-managed k3s cluster. Three isolated environments (zabbix-dev, zabbix-staging, and zabbix-prod) are deployed entirely from code. The plan stage runs automatically; the apply stage is manual. All secrets are stored as masked CI/CD variables. GitLab Runner runs on the Docker host (VM 400) and executes Terraform and Helm commands that target the k3s cluster on a separate VM (VM 500 / cicd-automation) via kubeconfig. Terraform state is stored remotely in GitLab's HTTP backend. Built to learn IaC and K8s deployment pipelines hands-on.

GitLab CI · Terraform · Helm · Kubernetes · Rancher · Zabbix 7.0 · Homelab

Live demo

monitoring-hub

Custom dashboard built on top of the Zabbix API — shows live CPU, RAM, disk, ping, problems and maintenance status for all monitored hosts. Includes per-host drawer with enable/disable/maintenance actions. AI-assisted development — I designed the architecture and monitoring logic; code generated with AI help.

Python · FastAPI · Zabbix API · Docker · Homelab · AI-Assisted
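
The Zabbix API is JSON-RPC 2.0, so the dashboard's host list and enable/disable actions come down to a few request bodies. The sketch below only builds those bodies; how monitoring-hub actually structures its calls is not shown here, so the helper names and the sample host ID are assumptions.

```python
import itertools
import json

_request_ids = itertools.count(1)


def rpc(method, params):
    """Build a JSON-RPC 2.0 request body as used by the Zabbix API."""
    return {"jsonrpc": "2.0", "method": method, "params": params,
            "id": next(_request_ids)}


def list_hosts():
    """host.get with a reduced field list; status 0 = monitored, 1 = unmonitored."""
    return rpc("host.get", {"output": ["hostid", "host", "status"]})


def set_host_enabled(hostid, enabled):
    """host.update toggling the monitoring status of one host."""
    return rpc("host.update", {"hostid": hostid, "status": 0 if enabled else 1})


def auth_headers(api_token):
    """Headers for token-based authentication (API token passed as Bearer)."""
    return {
        "Content-Type": "application/json-rpc",
        "Authorization": f"Bearer {api_token}",
    }


if __name__ == "__main__":
    # "10084" is a placeholder host ID; the bodies would be POSTed to api_jsonrpc.php.
    print(json.dumps(list_hosts(), indent=2))
    print(json.dumps(set_host_enabled("10084", enabled=False), indent=2))
```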

Running in production

Homelab Platform

Self-hosted Docker Compose platform running 30+ services on a single VM (Ubuntu 24.04, 10 GB RAM). Traefik as reverse proxy, Cloudflare Tunnel for zero-trust external access, Authentik SSO protecting every service, CrowdSec for intrusion detection. Graylog + OpenSearch for centralised log aggregation, Netdata + Uptime Kuma for real-time monitoring. Notable apps: Immich (photo backup), Jellyfin (media), Actual Budget, Pihole DNS, Portainer, and a Homepage dashboard aggregating all services.

Infrastructure: Single Proxmox node (PVE 9.1, 8 vCPUs, 24 GB RAM, ~960 GB storage) running two active VMs: homelab-docker (all Docker services + GitLab Runner) and cicd-automation (k3s cluster + Terraform + Helm). Previously explored multi-VM Zabbix HA and multi-node K8s setups — consolidated to the current architecture.

Docker Compose · Traefik · Authentik · CrowdSec · Cloudflare · Graylog · Netdata · Uptime Kuma · Immich · Jellyfin · Portainer · n8n · Homelab

Running daily

AI Infrastructure Bot

n8n workflow that queries the Docker socket proxy daily at 15:00, sends container health data to a local Ollama LLM for analysis, then pushes a formatted status report to Telegram. A second flow responds to on-demand commands like /proxmox with live VM CPU/RAM/status from the Proxmox API. No coding required — built entirely with n8n's visual workflow builder.

n8n · Ollama · Telegram · Proxmox API · Homelab

Running in homelab

FinanzBlick

Self-hosted personal finance app with PDF/CSV import from ING bank, monthly analytics, savings rate, recurring transaction detection, and year-over-year comparisons. AI-assisted development: I designed the features and data model; code generated with AI help. Runs as a Docker container on VM 400, deployed manually.

FastAPI · Python · Docker · SQLite · Homelab · AI-Assisted
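
As an illustration of the savings-rate metric, here is one common definition, (income - expenses) / income per month. This is an assumption about the formula, not FinanzBlick's actual code, and the sample amounts are invented for the example.

```python
from collections import defaultdict
from datetime import date
from decimal import Decimal


def monthly_savings_rate(transactions):
    """Map 'YYYY-MM' to (income - expenses) / income, skipping months without income.

    Each transaction is a (booking_date, amount) pair; positive amounts count
    as income and negative amounts as expenses.
    """
    income = defaultdict(Decimal)
    expenses = defaultdict(Decimal)
    for booked_on, amount in transactions:
        month = booked_on.strftime("%Y-%m")
        if amount >= 0:
            income[month] += amount
        else:
            expenses[month] += -amount
    return {
        month: (income[month] - expenses[month]) / income[month]
        for month in sorted(income)
        if income[month] > 0
    }


if __name__ == "__main__":
    # Invented sample data, not real bank exports.
    sample = [
        (date(2026, 1, 1), Decimal("3000")),
        (date(2026, 1, 5), Decimal("-2400")),
    ]
    print(monthly_savings_rate(sample))
```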

🗺 Roadmap

Learning roadmap

Progressing from enterprise monitoring expert into Kubernetes/DevOps/IaC territory — building evidence through real homelab projects, not just tutorials.

✓

14+ years enterprise monitoring — Zabbix Certified Professional 7.0

✓ Completed · April 2026

4 Zabbix certifications (Professional, Specialist, Security, SNMP). 30,000+ monitored hosts across 200+ enterprise clients. Led Nagios-to-Zabbix migrations at scale. Deep knowledge of enterprise monitoring architecture, escalation management, and platform ownership. This is the professional foundation everything else builds on.

✓ done · CP-2604-039 · CS-2412-151 · ZEX03 · ZEX05

✓

Homelab foundation — Proxmox · Docker · Cloudflare Tunnel · SSO · Observability

✓ Completed · 2024–2025

Built from zero: single Proxmox node, 30 Docker services, zero-trust external access via Cloudflare Tunnel, Authentik SSO protecting all services, CrowdSec IDS, Graylog centralised logging, Pihole DNS, Netdata metrics. 16+ public subdomains. All production-grade personal infrastructure — not just tutorials.

✓ done · Proxmox PVE · Docker Compose · Cloudflare Tunnel · Authentik · Graylog · CrowdSec

✓

k3s + Rancher + GitLab CI/CD + Terraform + Helm — full IaC pipeline

✓ Completed · April–May 2026

Deployed single-node k3s via Rancher 2.14. Built GitLab CI pipelines (plan auto / apply manual) triggering Terraform → Helm to deploy Zabbix to three isolated K8s environments. Remote Terraform state in GitLab HTTP backend. 360+ pipeline runs. Added CAPI via rancher-turtles, Fleet GitOps, system-upgrade-controller. This is real IaC, not a demo.

✓ done · k3s v1.34 · Rancher 2.14 · Terraform · Helm · GitLab CI · Fleet · CAPI

▶

Deepen K8s debugging and observability skills

▶ Active — now

All three Zabbix environments (dev, staging, prod) are running. Next step is going deeper into K8s observability: understanding pod logs, resource metrics, event streams, and how to read cluster state. This builds real troubleshooting muscle beyond "it deployed".

active · kubectl logs · kubectl describe · K8s debugging

▶

Deploy FinanzBlick + monitoring-hub via CI/CD pipeline

▶ Next — ongoing

Currently both custom apps are deployed manually. The natural next step is to add them to GitLab CI with proper Docker build → push → deploy pipelines. This closes the gap between "I can deploy Zabbix via CI" and "I can deploy any service via CI" — which is what a DevOps role actually requires.

next · GitLab CI · Docker build · Helm

→

Multi-node k3s + GitOps with Fleet + Prometheus/Grafana stack

→ Progressive — next

Expand from single-node to multi-node k3s using archived VMs (204–206) or new VMs. Implement full GitOps via Rancher Fleet. Add Prometheus + Grafana (subdomains already defined: prometheus.abc-server.date, grafana.abc-server.date) as the monitoring stack. This mirrors real production K8s environments and directly demonstrates consulting-level skills.

next · multi-node · Fleet GitOps · Prometheus · Grafana

○

Full IaC for homelab — Proxmox provider + Cloudflare provider + OpenTofu

○ Future

Currently only Zabbix K8s deployments are managed as IaC. Goal: declaratively manage VM creation (Proxmox Terraform provider), DNS records (Cloudflare provider), and potentially Docker Compose stacks. This closes the full platform engineering loop and demonstrates infrastructure-as-code beyond K8s.

future · OpenTofu · Proxmox provider · Cloudflare provider

○

K8s/DevOps/IaC consultant — internal or new employer

○ Long-term goal

Transition from Zabbix monitoring expert to K8s/DevOps/IaC consultant. The homelab documents a credible learning trajectory: monitoring foundation → containerisation → Kubernetes → CI/CD → IaC. The project portfolio makes this a realistic career pivot — moving toward platform engineering or cloud infrastructure roles.

future · consulting · K8s · IaC · platform engineering

🤖 Automation

n8n infrastructure automation

Two live workflows running in the homelab — one scheduled daily report, one on-demand Telegram command handler. No coding required — built entirely with n8n's visual workflow builder.

⏰

Daily container health report

Runs every day at 15:00 automatically

⏰ Schedule (15:00 daily) → 🌐 HTTP GET (docker-socket-proxy) → {} JS Prepare (format data) → {} JS Aggregate (summarise) → 🌐 Ollama AI (POST :11434) → ✈️ Telegram (send report)

Every day at 15:00, n8n queries the Docker socket proxy to get the status of all 30 containers, passes the data through JavaScript to prepare and aggregate it, sends it to the local Ollama LLM for analysis, then pushes a formatted health report to Telegram.
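
The prepare and aggregate steps described above can be sketched outside n8n as plain functions. The real workflow uses n8n JavaScript nodes, so this Python version is only an illustration of the logic; it assumes the container list has the shape returned by the Docker Engine API container listing (`Names`, `State`, `Status`), and the model name `llama3` is a placeholder.

```python
from collections import Counter


def summarise(containers):
    """Count containers per state and collect the ones that are not running."""
    states = Counter(c["State"] for c in containers)
    unhealthy = [
        {
            "name": c["Names"][0].lstrip("/"),  # Docker prefixes names with "/"
            "state": c["State"],
            "status": c["Status"],
        }
        for c in containers
        if c["State"] != "running"
    ]
    return {"total": len(containers), "states": dict(states), "unhealthy": unhealthy}


def ollama_request(summary, model="llama3"):
    """Build a non-streaming request body for Ollama's POST /api/generate."""
    lines = [f"{c['name']}: {c['state']} ({c['status']})" for c in summary["unhealthy"]]
    prompt = (
        f"{summary['total']} containers, states: {summary['states']}.\n"
        "Containers that are not running:\n"
        + ("\n".join(lines) or "none")
        + "\nWrite a short health report."
    )
    return {"model": model, "prompt": prompt, "stream": False}


if __name__ == "__main__":
    # Hand-written sample in the shape of the Docker API response.
    sample = [
        {"Names": ["/traefik"], "State": "running", "Status": "Up 3 days"},
        {"Names": ["/watchtower"], "State": "restarting",
         "Status": "Restarting (1) 5 seconds ago"},
    ]
    print(ollama_request(summarise(sample)))
```

Aggregating before calling the LLM keeps the prompt short: the model sees counts and the few problem containers rather than the full JSON for every container.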

✈️

On-demand /proxmox command

Replies to any Telegram message containing /proxmox

✈️ Telegram (on message) → {} JS Parse (parse cmd) → 🔀 Switch (/proxmox?) → >_ Proxmox API (get VM stats) → ✈️ Telegram (send reply)

Send /proxmox to the Telegram bot and get an instant reply with live VM stats — CPU %, RAM %, disk usage, and running/stopped status for all 12 VMs — pulled directly from the Proxmox API.

Live Telegram response to /proxmox

🖥 Proxmox

🖥 HOST

CPU: 25%

RAM: 84%

Disk: 9%

🧱 VMS

200 | terra-automation | 🔴 stopped

203 | terra-monitoring | 🔴 stopped

204 | terra-k8s-master | 🔴 stopped

…

400 | homelab-docker | 🟢 running | CPU: 6% | RAM: 85%

500 | cicd-automation | 🟢 running | CPU: 23% | RAM: 92%

This message was sent automatically with n8n
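
The per-VM lines in the reply above can be produced by a small formatter. The sketch assumes input records shaped like the Proxmox API VM listing (`vmid`, `name`, `status`, `cpu` as a 0 to 1 fraction, `mem` and `maxmem` in bytes); the actual n8n nodes are not shown in this document, and the function names are hypothetical.

```python
from typing import Optional


def percent(fraction):
    """Format a 0..1 fraction as a whole-number percentage."""
    return f"{round(fraction * 100)}%"


def format_vm(vm):
    """Render one VM as a reply line; stopped VMs carry no CPU/RAM figures."""
    line = f"{vm['vmid']} | {vm['name']} | "
    if vm["status"] == "running":
        ram = vm["mem"] / vm["maxmem"]
        return line + f"🟢 running | CPU: {percent(vm['cpu'])} | RAM: {percent(ram)}"
    return line + "🔴 stopped"


def handle_message(text, vms) -> Optional[str]:
    """Return the reply for /proxmox, or None for any other message."""
    words = text.strip().split()
    if not words or words[0] != "/proxmox":
        return None
    lines = [format_vm(vm) for vm in sorted(vms, key=lambda v: v["vmid"])]
    return "\n".join(["🧱 VMS"] + lines)


if __name__ == "__main__":
    MIB = 1024 ** 2
    # Hand-written sample records, not live API output.
    sample = [
        {"vmid": 400, "name": "homelab-docker", "status": "running",
         "cpu": 0.06, "mem": 8704 * MIB, "maxmem": 10240 * MIB},
        {"vmid": 200, "name": "terra-automation", "status": "stopped",
         "cpu": 0, "mem": 0, "maxmem": 4096 * MIB},
    ]
    print(handle_message("/proxmox", sample))
```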

Self-hosted infrastructure from scratch
