Hardware Infrastructure

Overview

The HomeLab runs on virtualized infrastructure using Proxmox VE as the hypervisor, with dedicated VMs for Kubernetes nodes.

Cluster Composition

Kubernetes Nodes

Role     Count   Purpose
Master   1       Control plane, etcd, API server
Worker   4       Application workloads

Node Specifications

Each Kubernetes node runs as a VM with:

  • OS: Linux (optimized for containers)
  • Runtime: containerd
  • Kubernetes: K3s v1.34.3

Virtualization Layer

Proxmox VE

  • Role: Hypervisor for all VMs
  • Features:
      • Live migration
      • Snapshot management
      • Resource pooling

Storage Architecture

Longhorn CSI

Longhorn provides distributed block storage across the worker nodes. Any of the four workers can hold a replica of a volume:

graph LR
    subgraph Worker 1
        L1[Longhorn Replica]
    end
    subgraph Worker 2
        L2[Longhorn Replica]
    end
    subgraph Worker 3
        L3[Longhorn Replica]
    end
    subgraph Worker 4
        L4[Longhorn Replica]
    end

    PVC[Persistent Volume Claim]
    PVC --> L1
    PVC --> L2
    PVC --> L3
    PVC --> L4

Features:

  • Automatic replication (default: 3 replicas)
  • Snapshot and backup support
  • Dynamic provisioning
  • Volume expansion

Storage Classes

Class             Provisioner   Reclaim Policy
longhorn          Longhorn      Delete
longhorn-retain   Longhorn      Retain
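
The two classes differ only in reclaim policy, which can be sketched as manifests built from one helper. This is a minimal illustration, not the cluster's actual configuration: the helper name `longhorn_storage_class` is hypothetical, and the `numberOfReplicas` parameter and `allowVolumeExpansion` setting are assumptions based on the replica default and volume expansion feature listed above. `driver.longhorn.io` is the provisioner name that Longhorn's CSI driver registers.

```python
import json


def longhorn_storage_class(name: str, reclaim_policy: str, replicas: int = 3) -> dict:
    """Build a StorageClass manifest as a plain dict (hypothetical helper)."""
    if reclaim_policy not in ("Delete", "Retain"):
        raise ValueError(f"unsupported reclaim policy: {reclaim_policy}")
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": "driver.longhorn.io",
        "reclaimPolicy": reclaim_policy,
        # Assumed settings, inferred from the feature list above.
        "allowVolumeExpansion": True,
        "parameters": {"numberOfReplicas": str(replicas)},
    }


STORAGE_CLASSES = [
    longhorn_storage_class("longhorn", "Delete"),
    longhorn_storage_class("longhorn-retain", "Retain"),
]

if __name__ == "__main__":
    print(json.dumps(STORAGE_CLASSES, indent=2))
```

With `Retain`, the underlying volume is kept after its claim is deleted and has to be cleaned up by hand, which suits data that must survive an accidental PVC deletion.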

Resource Allocation

Typical Workload Distribution

Namespace    CPU Request   Memory Request
monitoring   500m          2Gi
databases    1000m         4Gi
hub          200m          512Mi
argocd       250m          512Mi
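
To see what these requests add up to, the quantities can be summed with a small script. This is a rough sketch that handles only the suffixes used in the table (`m` for CPU, `Mi` and `Gi` for memory); the function names are illustrative and not part of any Kubernetes client.

```python
# Requests per namespace, copied from the table above: (CPU, memory).
REQUESTS = {
    "monitoring": ("500m", "2Gi"),
    "databases": ("1000m", "4Gi"),
    "hub": ("200m", "512Mi"),
    "argocd": ("250m", "512Mi"),
}


def cpu_to_millicores(value: str) -> int:
    """Convert a CPU quantity such as '500m' or '1' to millicores."""
    if value.endswith("m"):
        return int(value[:-1])
    return int(float(value) * 1000)


def memory_to_mib(value: str) -> int:
    """Convert a memory quantity with a Mi or Gi suffix to MiB."""
    for suffix, factor in (("Gi", 1024), ("Mi", 1)):
        if value.endswith(suffix):
            return int(value[: -len(suffix)]) * factor
    raise ValueError(f"unsupported memory quantity: {value}")


def total_requests(requests: dict) -> tuple:
    """Return (total millicores, total MiB) across all namespaces."""
    cpu = sum(cpu_to_millicores(c) for c, _ in requests.values())
    mem = sum(memory_to_mib(m) for _, m in requests.values())
    return cpu, mem


if __name__ == "__main__":
    cpu, mem = total_requests(REQUESTS)
    print(f"CPU: {cpu}m, memory: {mem / 1024:.1f}Gi")
```

For the table above this comes to 1950m of CPU and 7Gi of memory in requests, before system components and any namespaces not listed.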

High Availability Considerations

Control Plane

  • Single master node, so the control plane is not highly available (acceptable for a homelab)
  • etcd data backed up regularly

Worker Nodes

  • 4 worker nodes provide redundancy
  • Pod anti-affinity spreads workloads
  • Longhorn replicates data across nodes
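
As an illustration of the anti-affinity point, the stanza below is the kind of `affinity` block a workload could carry to prefer one pod per node. It is a sketch under assumptions: the `app` label key, the label value, and the helper name are hypothetical, and the actual workloads may use a required rule instead of a preferred one.

```python
def preferred_anti_affinity(app_label: str, weight: int = 100) -> dict:
    """Build a pod spec 'affinity' stanza that prefers spreading pods
    with the same app label across different nodes (hypothetical helper)."""
    if not 1 <= weight <= 100:
        raise ValueError("weight must be between 1 and 100")
    return {
        "podAntiAffinity": {
            "preferredDuringSchedulingIgnoredDuringExecution": [
                {
                    "weight": weight,
                    "podAffinityTerm": {
                        # Assumed label; match whatever the workload actually uses.
                        "labelSelector": {"matchLabels": {"app": app_label}},
                        # One topology domain per node.
                        "topologyKey": "kubernetes.io/hostname",
                    },
                }
            ]
        }
    }


if __name__ == "__main__":
    import json

    print(json.dumps(preferred_anti_affinity("example-app"), indent=2))
```

A preferred rule still lets pods share a node when no other node has room, which is usually the safer choice on a four-worker cluster.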

Failure Scenarios

Failure          Impact                            Recovery
1 worker down    Minimal, pods reschedule          Automatic
2 workers down   Degraded, some PVCs unavailable   Manual intervention
Master down      No new deployments                Restore from backup
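
The worker rows can be explored by counting how many replicas of a volume survive a given number of worker failures. The sketch below assumes 3 replicas per volume, each on a different worker, and enumerates every placement and failure combination. The worker names are placeholders, and the model ignores scheduling capacity, rebuild time, and volumes configured with fewer replicas.

```python
from itertools import combinations

WORKERS = ["worker-1", "worker-2", "worker-3", "worker-4"]  # placeholder names
REPLICAS = 3  # default replica count stated above


def surviving_replicas(placement, failed) -> int:
    """Number of replicas on workers that are still up."""
    return len(set(placement) - set(failed))


def survival_range(workers, replicas: int, failures: int) -> tuple:
    """Return (worst case, best case) surviving replicas over all
    replica placements and all sets of failed workers."""
    counts = [
        surviving_replicas(placement, failed)
        for placement in combinations(workers, replicas)
        for failed in combinations(workers, failures)
    ]
    return min(counts), max(counts)


if __name__ == "__main__":
    for failures in range(1, len(WORKERS) + 1):
        worst, best = survival_range(WORKERS, REPLICAS, failures)
        print(f"{failures} worker(s) down: {worst} to {best} replicas survive")
```

Under these assumptions a volume keeps at least two replicas with one worker down and at least one replica with two workers down, so the unavailability listed for the two-worker case would come from factors the sketch does not model, such as remaining capacity for rescheduled pods.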

Monitoring Hardware Health

Metrics Collected

  • CPU utilization
  • Memory usage
  • Disk I/O
  • Network throughput
  • Node conditions

Alerts

  • Node NotReady
  • High CPU/Memory usage
  • Disk pressure
  • Network unavailable
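
The condition-based alerts (NotReady, disk pressure, network unavailable) map onto the standard Kubernetes node condition types. A minimal sketch of that check follows, taking conditions in the shape found under a node's `status.conditions`; the function name and sample data are illustrative, and the real alerts are presumably defined in the monitoring stack and not in Python.

```python
# Healthy status for each standard node condition type.
EXPECTED_STATUS = {
    "Ready": "True",
    "MemoryPressure": "False",
    "DiskPressure": "False",
    "PIDPressure": "False",
    "NetworkUnavailable": "False",
}


def unhealthy_conditions(conditions: list) -> list:
    """Return 'Type=Status' strings for conditions that deviate from
    their healthy value. An 'Unknown' status counts as unhealthy."""
    problems = []
    for cond in conditions:
        expected = EXPECTED_STATUS.get(cond["type"])
        if expected is not None and cond["status"] != expected:
            problems.append(f"{cond['type']}={cond['status']}")
    return problems


if __name__ == "__main__":
    # Made-up sample resembling a node under disk pressure.
    sample = [
        {"type": "Ready", "status": "True"},
        {"type": "MemoryPressure", "status": "False"},
        {"type": "DiskPressure", "status": "True"},
        {"type": "NetworkUnavailable", "status": "False"},
    ]
    print(unhealthy_conditions(sample))
```

High CPU and memory usage are not node conditions, so those alerts would need thresholds on the utilization metrics listed above.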

Capacity Planning

Current Utilization

Monitor via Grafana dashboards:

  • Cluster CPU usage
  • Cluster memory usage
  • Storage capacity
  • Network bandwidth

Scaling Options

  1. Vertical: Increase VM resources
  2. Horizontal: Add more worker nodes
  3. Storage: Expand Longhorn pool