Hardware Infrastructure

Overview

The HomeLab runs on virtualized infrastructure using Proxmox VE as the hypervisor, with dedicated VMs for Kubernetes nodes.

Cluster Composition

Kubernetes Nodes

| Role | Count | Purpose |
|------|-------|---------|
| Master | 1 | Control plane, etcd, API server |
| Worker | 4 | Application workloads |

Node Specifications

Each Kubernetes node runs as a VM with:

OS: Linux (optimized for containers)
Runtime: containerd
Kubernetes: K3s v1.34.3

Virtualization Layer

Proxmox VE

Role: Hypervisor for all VMs
Features:
Live migration
Snapshot management
Resource pooling

Storage Architecture

Longhorn CSI

Longhorn provides distributed block storage across worker nodes:

graph LR
    subgraph Worker 1
        L1[Longhorn Replica]
    end
    subgraph Worker 2
        L2[Longhorn Replica]
    end
    subgraph Worker 3
        L3[Longhorn Replica]
    end
    subgraph Worker 4
        L4[Longhorn Replica]
    end

    PVC[Persistent Volume Claim]
    PVC --> L1
    PVC --> L2
    PVC --> L3
    PVC --> L4

Features:

Automatic replication (default: 3 replicas)
Snapshot and backup support
Dynamic provisioning
Volume expansion

Storage Classes

| Class | Provisioner | Reclaim Policy |
|-------|-------------|----------------|
| `longhorn` | Longhorn | Delete |
| `longhorn-retain` | Longhorn | Retain |
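
As a rough sketch of how the two classes above differ, the snippet below builds StorageClass manifests as plain dictionaries and prints them as JSON, which `kubectl apply -f -` accepts. The provisioner name `driver.longhorn.io` and the `numberOfReplicas` parameter are Longhorn's standard CSI settings; `allowVolumeExpansion` and the replica count of 3 are assumptions taken from the feature list, not from the cluster's actual manifests, and the helper name is hypothetical.

```python
import json


def longhorn_storage_class(name: str, reclaim_policy: str, replicas: int = 3) -> dict:
    """Build a StorageClass manifest for the Longhorn CSI driver (illustrative helper)."""
    if reclaim_policy not in ("Delete", "Retain"):
        raise ValueError(f"unsupported reclaim policy: {reclaim_policy}")
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": "driver.longhorn.io",
        "reclaimPolicy": reclaim_policy,
        # Assumed from the "Volume expansion" feature listed for Longhorn.
        "allowVolumeExpansion": True,
        # StorageClass parameters must be strings.
        "parameters": {"numberOfReplicas": str(replicas)},
    }


storage_classes = [
    longhorn_storage_class("longhorn", "Delete"),
    longhorn_storage_class("longhorn-retain", "Retain"),
]

# Wrap both classes in a v1 List so they can be applied in one call.
print(json.dumps({"apiVersion": "v1", "kind": "List", "items": storage_classes}, indent=2))
```

With `Retain`, the underlying PersistentVolume is kept after its claim is deleted, so it has to be cleaned up by hand; with `Delete`, the volume is removed together with the claim.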

Resource Allocation

Typical Workload Distribution

| Namespace | CPU Request | Memory Request |
|-----------|-------------|----------------|
| monitoring | 500m | 2Gi |
| databases | 1000m | 4Gi |
| hub | 200m | 512Mi |
| argocd | 250m | 512Mi |
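
To see what these requests add up to, the following sketch parses the Kubernetes quantity strings from the table and totals them. The parser handles only the suffixes that appear here (`m` for millicores, `Mi` and `Gi` for memory); it is a simplified illustration, not a full quantity parser.

```python
# Requests copied from the table above: namespace -> (CPU, memory).
REQUESTS = {
    "monitoring": ("500m", "2Gi"),
    "databases": ("1000m", "4Gi"),
    "hub": ("200m", "512Mi"),
    "argocd": ("250m", "512Mi"),
}


def cpu_to_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as '500m' or '2' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(float(quantity) * 1000)


def memory_to_mib(quantity: str) -> int:
    """Convert a memory quantity with a Mi or Gi suffix to MiB."""
    if quantity.endswith("Gi"):
        return int(quantity[:-2]) * 1024
    if quantity.endswith("Mi"):
        return int(quantity[:-2])
    raise ValueError(f"unsupported memory suffix in {quantity!r}")


total_cpu_m = sum(cpu_to_millicores(cpu) for cpu, _ in REQUESTS.values())
total_mem_mi = sum(memory_to_mib(mem) for _, mem in REQUESTS.values())

print(f"Total CPU request: {total_cpu_m}m")        # 1950m
print(f"Total memory request: {total_mem_mi}Mi")   # 7168Mi, i.e. 7Gi
```

The listed namespaces together request just under 2 CPU cores and 7Gi of memory, which is the floor the four workers must be able to schedule before any limits or bursts are considered.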

High Availability Considerations

Control Plane

Single master node (acceptable for homelab)
etcd data backed up regularly

Worker Nodes

4 worker nodes provide redundancy
Pod anti-affinity spreads workloads
Longhorn replicates data across nodes
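
A minimal sketch of the anti-affinity rule mentioned above, written as the `affinity` stanza of a pod spec. The `app` label key, the `hub` label value, and the choice of a soft (preferred) rule are assumptions for illustration; the cluster's real manifests may use a hard (required) rule or different labels.

```python
import json


def spread_across_nodes(app_label: str, weight: int = 100) -> dict:
    """Return a pod-spec 'affinity' stanza that prefers one replica per node."""
    return {
        "podAntiAffinity": {
            # Soft rule: the scheduler tries to honor it but still places the
            # pod if every node already runs a matching replica.
            "preferredDuringSchedulingIgnoredDuringExecution": [
                {
                    "weight": weight,
                    "podAffinityTerm": {
                        "labelSelector": {"matchLabels": {"app": app_label}},
                        # One topology domain per node.
                        "topologyKey": "kubernetes.io/hostname",
                    },
                }
            ]
        }
    }


affinity = spread_across_nodes("hub")
print(json.dumps({"affinity": affinity}, indent=2))
```

A soft rule suits a four-worker cluster because a Deployment with more replicas than nodes can still schedule; a required rule would leave the extra replicas pending.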

Failure Scenarios

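| Failure | Impact | Recovery |
|---------|--------|----------|
| 1 worker down | Minimal; pods reschedule | Automatic |
| 2 workers down | Degraded; some PVCs unavailable | Manual intervention |
| Master down | API server unavailable: no new deployments or rescheduling; running pods keep running | Restore from backup |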
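
The worker-failure rows can be explored with a small enumeration. The sketch below counts, for every possible replica placement and every combination of failed workers, how many replicas of a volume survive in the worst case. It assumes each replica of a volume lives on a different worker, and the worker names are hypothetical. It counts replicas only: a volume can still be unavailable for other reasons, such as its pod being unable to reschedule or reattach.

```python
from itertools import combinations

WORKERS = ["worker-1", "worker-2", "worker-3", "worker-4"]  # hypothetical names


def worst_case_surviving_replicas(workers: list[str], replicas: int, failed_count: int) -> int:
    """Return the minimum number of replicas left over all placements and failures."""
    worst = replicas
    for placement in combinations(workers, replicas):
        for failed in combinations(workers, failed_count):
            surviving = len(set(placement) - set(failed))
            worst = min(worst, surviving)
    return worst


for replica_count in (2, 3):
    for failed_count in (1, 2):
        left = worst_case_surviving_replicas(WORKERS, replica_count, failed_count)
        print(f"replicas={replica_count} failed={failed_count} worst-case surviving={left}")
```

Under these assumptions, a volume with three replicas keeps at least one copy when two workers fail, while a volume configured with two replicas can lose both.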

Monitoring Hardware Health

Metrics Collected

CPU utilization
Memory usage
Disk I/O
Network throughput
Node conditions

Alerts

Node NotReady
High CPU/Memory usage
Disk pressure
Network unavailable
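
Several of these alerts map directly onto the condition list that Kubernetes reports in each node's `status.conditions`. The sketch below shows one way to turn that list into alert strings; the input is a hand-written sample shaped like the API output, and the function name and message format are hypothetical, not the cluster's actual alerting rules.

```python
# Node conditions that indicate a problem when their status is "True".
PROBLEM_CONDITIONS = ("MemoryPressure", "DiskPressure", "PIDPressure", "NetworkUnavailable")


def node_alerts(node_name: str, conditions: list[dict]) -> list[str]:
    """Derive alert messages from a node's status.conditions entries."""
    status = {c["type"]: c["status"] for c in conditions}
    alerts = []
    # Ready must be exactly "True"; "False" and "Unknown" both count as NotReady.
    if status.get("Ready") != "True":
        alerts.append(f"{node_name}: NotReady")
    for condition in PROBLEM_CONDITIONS:
        if status.get(condition) == "True":
            alerts.append(f"{node_name}: {condition}")
    return alerts


sample_conditions = [
    {"type": "Ready", "status": "Unknown"},
    {"type": "DiskPressure", "status": "True"},
    {"type": "MemoryPressure", "status": "False"},
]
print(node_alerts("worker-2", sample_conditions))
```

High CPU or memory usage is not a node condition, so those alerts would have to come from utilization metrics rather than from this check.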

Capacity Planning

Current Utilization

Monitor via Grafana dashboards:

Cluster CPU usage
Cluster memory usage
Storage capacity
Network bandwidth

Scaling Options

Vertical: Increase VM resources
Horizontal: Add more worker nodes
Storage: Expand Longhorn pool
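
When weighing the storage options, it helps to remember that every Longhorn volume consumes its size once per replica. The sketch below estimates usable capacity from raw disk sizes; the 200 GiB disks and the reserved fraction are made-up figures for illustration, not measurements from this cluster.

```python
def usable_capacity_gib(disk_sizes_gib: list[float], replicas: int = 3,
                        reserved_fraction: float = 0.0) -> float:
    """Estimate usable volume capacity given raw disks and a replica count."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    if not 0.0 <= reserved_fraction < 1.0:
        raise ValueError("reserved_fraction must be in [0, 1)")
    # Space left for replicas after holding back a share of each disk.
    schedulable = sum(size * (1.0 - reserved_fraction) for size in disk_sizes_gib)
    # Each GiB of volume data is stored once per replica.
    return schedulable / replicas


four_workers = usable_capacity_gib([200, 200, 200, 200])
five_workers = usable_capacity_gib([200, 200, 200, 200, 200])
print(f"4 workers: {four_workers:.1f} GiB usable")
print(f"5 workers: {five_workers:.1f} GiB usable")
```

With three replicas, adding a fifth worker of the same size raises usable capacity by only a third of that worker's disk, which is worth factoring in when choosing between adding nodes and enlarging existing disks.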