
Homelab Status Dashboard

Overview

A custom-built service health monitoring dashboard that auto-discovers and monitors all Kubernetes workloads in the cluster.

Architecture

graph LR
    subgraph External
        Users[Users]
        CF[Cloudflare Tunnel]
    end

    subgraph Kubernetes
        subgraph homelab-status namespace
            App[Status Dashboard]
        end

        subgraph All Namespaces
            D1[Deployments]
            D2[StatefulSets]
            D3[DaemonSets]
        end
    end

    Users --> CF
    CF --> App
    App -->|K8s API| D1
    App -->|K8s API| D2
    App -->|K8s API| D3

Components

Status Dashboard

| Property | Value |
|----------|-------|
| Framework | Python Flask |
| WSGI server | Gunicorn |
| Port | 5000 |
| URL | dashboard.ajandrews.pro |

Features

Auto-Discovery

The dashboard automatically discovers all workloads:

  • Deployments - Standard application pods
  • StatefulSets - Stateful applications (databases, etc.)
  • DaemonSets - Node-level services
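
The discovery pass over these three workload types can be sketched as follows. This is a minimal illustration, not the dashboard's actual code: `discover_workloads` and the fake API object are hypothetical, `nextcloud-db` is an invented workload name, and `apps_api` is assumed to behave like `kubernetes.client.AppsV1Api` from the official Python client.

```python
from types import SimpleNamespace

# Namespaces skipped by default; mirrors the documented IGNORED_NAMESPACES default.
DEFAULT_IGNORED = {"kube-system", "kube-public", "kube-node-lease"}


def discover_workloads(apps_api, ignored=DEFAULT_IGNORED):
    """List Deployments, StatefulSets and DaemonSets across all namespaces.

    `apps_api` is expected to behave like kubernetes.client.AppsV1Api.
    """
    listers = {
        "deployment": apps_api.list_deployment_for_all_namespaces,
        "statefulset": apps_api.list_stateful_set_for_all_namespaces,
        "daemonset": apps_api.list_daemon_set_for_all_namespaces,
    }
    found = []
    for kind, list_fn in listers.items():
        for item in list_fn().items:
            if item.metadata.namespace in ignored:
                continue  # excluded namespace
            found.append({
                "name": item.metadata.name,
                "namespace": item.metadata.namespace,
                "type": kind,
            })
    return found


def _fake_list(*pairs):
    """Build a stand-in for an API list call (demo only)."""
    items = [SimpleNamespace(metadata=SimpleNamespace(name=name, namespace=ns))
             for ns, name in pairs]
    return lambda: SimpleNamespace(items=items)


# Hypothetical stand-in so the sketch runs without a cluster.
fake_api = SimpleNamespace(
    list_deployment_for_all_namespaces=_fake_list(
        ("hub", "hub-api"), ("kube-system", "coredns")),
    list_stateful_set_for_all_namespaces=_fake_list(("nextcloud", "nextcloud-db")),
    list_daemon_set_for_all_namespaces=_fake_list(("kube-system", "kube-proxy")),
)
workloads = discover_workloads(fake_api)
```

Inside the cluster, the fake object would be replaced by a real client, typically created with `kubernetes.config.load_incluster_config()` followed by `kubernetes.client.AppsV1Api()`.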

Health Monitoring

| Metric | Description |
|--------|-------------|
| Ready Replicas | Current vs desired pod count |
| Status | Up/Down based on readiness |
| Duration | Time in current state |
| Namespace | Logical grouping |
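
A rough sketch of how the Ready Replicas, Status, and Duration values might be derived. The function names are hypothetical. For Deployments and StatefulSets the inputs would come from `status.readyReplicas` and `spec.replicas`; for DaemonSets, from `status.numberReady` and `status.desiredNumberScheduled`. Treating a workload scaled to zero as down is an assumption, not something the dashboard is confirmed to do.

```python
def evaluate_health(ready, desired):
    """Return the 'ready' string and up/down status for one workload."""
    ready = ready or 0      # the API reports None when no replica is ready
    desired = desired or 0
    # Assumption: a workload scaled to zero is reported as down.
    status = "up" if desired > 0 and ready >= desired else "down"
    return {"ready": f"{ready}/{desired}", "status": status}


def format_duration(seconds):
    """Render elapsed seconds compactly, using the two largest units."""
    minutes = int(seconds) // 60
    days, rem = divmod(minutes, 1440)
    hours, mins = divmod(rem, 60)
    if days:
        return f"{days}d {hours}h"
    if hours:
        return f"{hours}h {mins}m"
    return f"{mins}m"


print(evaluate_health(1, 1))   # {'ready': '1/1', 'status': 'up'}
print(format_duration(9000))   # 2h 30m
```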

Client-Side Latency

The dashboard measures latency from the user's browser to external services:

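async function measurePing(url) {
    const start = performance.now();
    // 'no-cors' yields an opaque response: status and body are unreadable,
    // but the round trip can still be timed.
    await fetch(url, { mode: 'no-cors' });
    // Elapsed time in whole milliseconds.
    return Math.round(performance.now() - start);
}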

Deployment

Kubernetes Resources

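apiVersion: apps/v1
kind: Deployment
metadata:
  name: homelab-status
  namespace: homelab-status
spec:
  replicas: 1
  selector:                      # required by apps/v1; must match the pod labels
    matchLabels:
      app: homelab-status        # label key and value are assumptions
  template:
    metadata:
      labels:
        app: homelab-status
    spec:
      serviceAccountName: homelab-status   # account referenced by the ClusterRoleBinding
      containers:
      - name: homelab-status
        image: ajxfear/homelab-status:latest
        ports:
        - containerPort: 5000
        resources:
          requests:
            memory: "128Mi"
            cpu: "100m"
          limits:
            memory: "256Mi"
            cpu: "250m"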

RBAC Configuration

The dashboard requires cluster-wide read access:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: homelab-status-reader
rules:
- apiGroups: [""]
  resources: ["namespaces"]
  verbs: ["list"]
- apiGroups: ["apps"]
  resources: ["deployments", "statefulsets", "daemonsets"]
  verbs: ["list", "get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: homelab-status-reader
subjects:
- kind: ServiceAccount
  name: homelab-status
  namespace: homelab-status
roleRef:
  kind: ClusterRole
  name: homelab-status-reader
  apiGroup: rbac.authorization.k8s.io

API Endpoints

Status API

GET /api/status

Returns service status grouped by namespace:

{
  "summary": {
    "total": 38,
    "up": 38,
    "down": 0,
    "health_percent": 100.0
  },
  "services": {
    "hub": [
      {
        "name": "hub-api",
        "namespace": "hub",
        "type": "deployment",
        "status": "up",
        "ready": "1/1",
        "duration": "2h 30m",
        "external_url": "https://api.ajandrews.pro"
      }
    ]
  },
  "last_check": "2026-01-10T12:00:00",
  "check_interval": 30
}
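
To make the response shape concrete, here is a hedged sketch of how such a payload could be assembled from a flat list of services. `build_status_payload` is a hypothetical helper, the `wiki` namespace is invented for the example, and reporting an empty cluster as 100% healthy is an assumption.

```python
from collections import defaultdict


def build_status_payload(services, last_check, check_interval=30):
    """Group services by namespace and compute the summary block."""
    grouped = defaultdict(list)
    for svc in services:
        grouped[svc["namespace"]].append(svc)

    total = len(services)
    up = sum(1 for svc in services if svc["status"] == "up")
    # Assumption: an empty cluster is reported as 100% healthy.
    health = round(100.0 * up / total, 1) if total else 100.0

    return {
        "summary": {"total": total, "up": up, "down": total - up,
                    "health_percent": health},
        "services": dict(grouped),
        "last_check": last_check,
        "check_interval": check_interval,
    }


payload = build_status_payload(
    [
        {"name": "hub-api", "namespace": "hub", "status": "up"},
        {"name": "hub-web", "namespace": "hub", "status": "up"},
        {"name": "wiki", "namespace": "wiki", "status": "down"},
    ],
    last_check="2026-01-10T12:00:00",
)
print(payload["summary"])
```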

Endpoints API

GET /api/endpoints

Returns external endpoint URLs for client-side ping testing.

Health Check

GET /health

Returns the health status of the dashboard service itself.

Configuration

Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| CHECK_INTERVAL | 30 | Seconds between checks |
| IGNORED_NAMESPACES | kube-system,kube-public,kube-node-lease | Namespaces to exclude |
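
One plausible way to read these variables is sketched below. `load_settings` is a hypothetical helper; only the variable names and defaults come from the table above, and the tolerance for stray spaces and trailing commas is an added convenience.

```python
import os

DEFAULT_IGNORED = "kube-system,kube-public,kube-node-lease"


def load_settings(env=os.environ):
    """Read CHECK_INTERVAL and IGNORED_NAMESPACES with the documented defaults."""
    interval = int(env.get("CHECK_INTERVAL", "30"))
    raw = env.get("IGNORED_NAMESPACES", DEFAULT_IGNORED)
    # Tolerate spaces and trailing commas in the comma-separated list.
    ignored = {ns.strip() for ns in raw.split(",") if ns.strip()}
    return {"check_interval": interval, "ignored_namespaces": ignored}


# Passing a plain dict makes the function easy to exercise without
# touching the real environment.
settings = load_settings({
    "CHECK_INTERVAL": "60",
    "IGNORED_NAMESPACES": "kube-system, monitoring,",
})
```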

External Endpoints Map

Configure which services have external URLs for latency testing:

EXTERNAL_ENDPOINTS = {
    'hub-web': 'https://ajandrews.pro',
    'hub-api': 'https://api.ajandrews.pro',
    'wiki': 'https://wiki.ajandrews.pro',
    'nextcloud': 'https://cloud.ajandrews.pro',
    'prometheus-grafana': 'https://grafana.ajandrews.pro',
    'foundryvtt': 'https://dnd.ajandrews.pro',
    'argocd-server': 'https://argocd.ajandrews.pro',
}
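
The sample `/api/status` response pairs `hub-api` with `https://api.ajandrews.pro`, which suggests the map is keyed by workload name. Under that assumption, the lookup could be as simple as this sketch (`attach_external_url` is a hypothetical helper, and the map is abbreviated):

```python
# Abbreviated copy of the map, for illustration only.
EXTERNAL_ENDPOINTS = {
    "hub-api": "https://api.ajandrews.pro",
    "wiki": "https://wiki.ajandrews.pro",
}


def attach_external_url(service, endpoints=EXTERNAL_ENDPOINTS):
    """Return a copy of the service dict with external_url set, or None."""
    # Assumption: keys in the map match the Kubernetes workload name.
    return {**service, "external_url": endpoints.get(service["name"])}


with_url = attach_external_url({"name": "hub-api", "namespace": "hub"})
without_url = attach_external_url({"name": "internal-job", "namespace": "hub"})
```

Services without an entry get `None`, so the UI can skip the latency test and the external link for them.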

UI Features

Summary Panel

Displays cluster health at a glance:

  • Total services count
  • Operational services (green)
  • Down services (red)
  • Health percentage

Namespace Groups

Services are grouped by namespace, with:

  • Collapsible sections
  • Service count badges
  • Status indicators (red badge if any down)

Service Cards

Each service displays:

  • Service name and type
  • Status indicator (dot)
  • Ready replica count
  • Time in current state
  • Client-side latency (for external services)
  • External URL link

Styling

The dashboard matches the Nexus Navigator theme:

| Element | Color |
|---------|-------|
| Background | #0a0a0f |
| Cards | #1a1a24 |
| Accent (Cyan) | #00f0ff |
| Accent (Purple) | #a855f7 |
| Status Up | #22c55e |
| Status Down | #ef4444 |

Monitoring

Background Checker

The service runs a background thread that:

  1. Queries the Kubernetes API every CHECK_INTERVAL seconds (30 by default)
  2. Updates the global service status
  3. Tracks status change timestamps
  4. Guards the shared state with a lock for thread safety
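
The steps above can be sketched with the standard `threading` module. This is an illustrative outline under stated assumptions, not the dashboard's actual implementation: `StatusStore`, `run_checker`, and the one-shot check function are hypothetical names, and the real check would query the Kubernetes API instead of returning a fixed dict.

```python
import threading
import time


class StatusStore:
    """Shared status map guarded by a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._services = {}  # key -> {"status": str, "since": float}

    def update(self, key, status, now=None):
        now = time.time() if now is None else now
        with self._lock:
            previous = self._services.get(key)
            # Reset the timestamp only when the status actually changes.
            if previous is None or previous["status"] != status:
                self._services[key] = {"status": status, "since": now}

    def snapshot(self):
        with self._lock:
            return {k: dict(v) for k, v in self._services.items()}


def run_checker(store, check_fn, interval, stop_event):
    """Poll check_fn until stop_event is set, recording each result."""
    while not stop_event.is_set():
        try:
            for key, status in check_fn().items():
                store.update(key, status)
        except Exception as exc:
            # An uncaught exception would end the thread and leave stale data.
            print(f"status check failed: {exc}")
        stop_event.wait(interval)  # sleeps, but wakes early on shutdown


store = StatusStore()
stop = threading.Event()


def one_shot_check():
    """Demo check: report one service, then ask the loop to stop."""
    stop.set()
    return {"hub/hub-api": "up"}


worker = threading.Thread(target=run_checker,
                          args=(store, one_shot_check, 30, stop), daemon=True)
worker.start()
worker.join()
```

Waiting on an `Event` rather than calling `time.sleep` lets the thread exit promptly on shutdown, and catching exceptions inside the loop avoids the silent thread death that leads to stale data.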

Auto-Refresh

The frontend auto-refreshes every 30 seconds using JavaScript fetch.

Troubleshooting

Common Issues

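| Issue | Cause | Resolution |
|-------|-------|------------|
| No services shown | RBAC missing | Apply the ClusterRole and ClusterRoleBinding |
| 403 from K8s API | ServiceAccount issue | Check that the pod runs as the homelab-status ServiceAccount and that its token is mounted |
| Stale data | Background thread died | Restart the pod |
| High latency values | CORS or network overhead | Expected for some services |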