Homelab Status Dashboard¶
Overview¶
A custom-built service health monitoring dashboard that auto-discovers and monitors all Kubernetes workloads in the cluster.
Architecture¶
graph LR
    subgraph External
        Users[Users]
        CF[Cloudflare Tunnel]
    end
    subgraph Kubernetes
        subgraph homelab-status namespace
            App[Status Dashboard]
        end
        subgraph All Namespaces
            D1[Deployments]
            D2[StatefulSets]
            D3[DaemonSets]
        end
    end
    Users --> CF
    CF --> App
    App -->|K8s API| D1
    App -->|K8s API| D2
    App -->|K8s API| D3
Components¶
Status Dashboard¶
| Property | Value |
|---|---|
| Framework | Python Flask |
| WSGI server | Gunicorn |
| Port | 5000 |
| URL | dashboard.ajandrews.pro |
Features¶
Auto-Discovery¶
The dashboard automatically discovers all workloads:
- Deployments - Standard application pods
- StatefulSets - Stateful applications (databases, etc.)
- DaemonSets - Node-level services
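As a rough illustration of the discovery step, the sketch below lists the three workload kinds with the official `kubernetes` Python client. It assumes the pod uses in-cluster configuration and queries all namespaces in one call per kind; the function names `to_record` and `discover_workloads` and the shape of the returned dicts are hypothetical, not taken from the dashboard's source.

```python
from typing import Any, Dict, Iterable, List


def to_record(kind: str, item: Any) -> Dict[str, Any]:
    """Flatten one workload object into a plain dict (hypothetical shape)."""
    status = item.status
    if kind == "daemonset":
        # DaemonSets have no replica count; they report scheduled vs ready nodes.
        desired = status.desired_number_scheduled or 0
        ready = status.number_ready or 0
    else:
        desired = item.spec.replicas or 0
        ready = status.ready_replicas or 0  # the API omits this field when zero
    return {
        "name": item.metadata.name,
        "namespace": item.metadata.namespace,
        "type": kind,
        "ready": ready,
        "desired": desired,
    }


def discover_workloads(ignored: Iterable[str] = ()) -> List[Dict[str, Any]]:
    """List Deployments, StatefulSets and DaemonSets outside the ignored namespaces."""
    # Imported lazily so to_record() can be used without the client installed.
    from kubernetes import client, config

    config.load_incluster_config()  # reads the pod's mounted ServiceAccount token
    apps = client.AppsV1Api()
    listings = [
        ("deployment", apps.list_deployment_for_all_namespaces().items),
        ("statefulset", apps.list_stateful_set_for_all_namespaces().items),
        ("daemonset", apps.list_daemon_set_for_all_namespaces().items),
    ]
    skip = set(ignored)
    return [
        to_record(kind, item)
        for kind, items in listings
        for item in items
        if item.metadata.namespace not in skip
    ]
```

Calling `discover_workloads({"kube-system"})` inside the cluster would return one dict per workload, skipping anything in `kube-system`.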
Health Monitoring¶
| Metric | Description |
|---|---|
| Ready Replicas | Current vs desired pod count |
| Status | Up/Down based on readiness |
| Duration | Time in current state |
| Namespace | Logical grouping |
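The metrics in this table can be derived from replica counts and a timestamp. The following is a minimal sketch under stated assumptions: the rule that a workload is "up" only when every desired replica is ready, the treatment of zero desired replicas as "down", and the helper names `derive_status` and `format_duration` are illustrative choices, not confirmed behaviour of the dashboard.

```python
from datetime import timedelta


def derive_status(ready: int, desired: int) -> str:
    """Return 'up' when all desired replicas are ready (assumed rule)."""
    return "up" if desired > 0 and ready >= desired else "down"


def format_duration(delta: timedelta) -> str:
    """Render a timedelta using its two largest units, e.g. '2h 30m'."""
    total = max(int(delta.total_seconds()), 0)
    days, rem = divmod(total, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, seconds = divmod(rem, 60)
    if days:
        return f"{days}d {hours}h"
    if hours:
        return f"{hours}h {minutes}m"
    if minutes:
        return f"{minutes}m {seconds}s"
    return f"{seconds}s"
```

For example, `format_duration(timedelta(hours=2, minutes=30))` gives `"2h 30m"`, the same style as the `duration` field in the status payload.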
Client-Side Latency¶
The dashboard measures latency from the user's browser to external services:
async function measurePing(url) {
  const start = performance.now();
  // 'no-cors' yields an opaque response, so only the elapsed time is usable.
  await fetch(url, { mode: 'no-cors' });
  return Math.round(performance.now() - start);
}
Deployment¶
Kubernetes Resources¶
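apiVersion: apps/v1
kind: Deployment
metadata:
  name: homelab-status
  namespace: homelab-status
spec:
  replicas: 1
  selector:
    matchLabels:
      app: homelab-status  # label name is illustrative; apps/v1 requires a selector
  template:
    metadata:
      labels:
        app: homelab-status  # must match the selector
    spec:
      serviceAccountName: homelab-status  # the account bound in the RBAC configuration
      containers:
        - name: homelab-status
          image: ajxfear/homelab-status:latest
          ports:
            - containerPort: 5000
          resources:
            requests:
              memory: "128Mi"
              cpu: "100m"
            limits:
              memory: "256Mi"
              cpu: "250m"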
RBAC Configuration¶
The dashboard requires cluster-wide read access:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: homelab-status-reader
rules:
  - apiGroups: [""]
    resources: ["namespaces"]
    verbs: ["list"]
  - apiGroups: ["apps"]
    resources: ["deployments", "statefulsets", "daemonsets"]
    verbs: ["list", "get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: homelab-status-reader
subjects:
  - kind: ServiceAccount
    name: homelab-status
    namespace: homelab-status
roleRef:
  kind: ClusterRole
  name: homelab-status-reader
  apiGroup: rbac.authorization.k8s.io
API Endpoints¶
Status API¶
Returns service status grouped by namespace:
{
  "summary": {
    "total": 38,
    "up": 38,
    "down": 0,
    "health_percent": 100.0
  },
  "services": {
    "hub": [
      {
        "name": "hub-api",
        "namespace": "hub",
        "type": "deployment",
        "status": "up",
        "ready": "1/1",
        "duration": "2h 30m",
        "external_url": "https://api.ajandrews.pro"
      }
    ]
  },
  "last_check": "2026-01-10T12:00:00",
  "check_interval": 30
}
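To make the payload structure concrete, here is a hedged sketch of how such a response could be assembled from a flat list of service records. The function name `build_status_payload` is hypothetical, and reporting `0.0` for `health_percent` when no services exist is an assumption; the field names follow the example response.

```python
from collections import defaultdict
from typing import Any, Dict, List


def build_status_payload(
    records: List[Dict[str, Any]], last_check: str, check_interval: int
) -> Dict[str, Any]:
    """Group records by namespace and compute the summary block."""
    services = defaultdict(list)
    for rec in records:
        services[rec["namespace"]].append(rec)

    total = len(records)
    up = sum(1 for rec in records if rec["status"] == "up")
    # Assumed convention: an empty cluster reports 0.0 rather than dividing by zero.
    health = round(100.0 * up / total, 1) if total else 0.0

    return {
        "summary": {"total": total, "up": up, "down": total - up, "health_percent": health},
        "services": dict(services),
        "last_check": last_check,
        "check_interval": check_interval,
    }
```

A Flask view could pass the returned dict to `jsonify` to produce the response body.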
Endpoints API¶
Returns external endpoint URLs for client-side ping testing.
Health Check¶
Returns service health status.
Configuration¶
Environment Variables¶
| Variable | Default | Description |
|---|---|---|
| CHECK_INTERVAL | 30 | Seconds between checks |
| IGNORED_NAMESPACES | kube-system,kube-public,kube-node-lease | Namespaces to exclude |
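These two variables could be read at startup roughly as follows. This is a sketch, not the dashboard's actual code: the helper name `load_settings` is hypothetical, and splitting `IGNORED_NAMESPACES` on commas with whitespace trimmed is an assumption based on the default value's format.

```python
import os
from typing import Mapping, Set, Tuple

DEFAULT_IGNORED = "kube-system,kube-public,kube-node-lease"


def load_settings(env: Mapping[str, str] = os.environ) -> Tuple[int, Set[str]]:
    """Return (check interval in seconds, set of namespaces to skip)."""
    interval = int(env.get("CHECK_INTERVAL", "30"))
    raw = env.get("IGNORED_NAMESPACES", DEFAULT_IGNORED)
    # Tolerate spaces and trailing commas in the comma-separated list.
    ignored = {name.strip() for name in raw.split(",") if name.strip()}
    return interval, ignored
```

Passing the environment as a parameter keeps the function easy to exercise with a plain dict.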
External Endpoints Map¶
Configure which services have external URLs for latency testing:
EXTERNAL_ENDPOINTS = {
    'hub-web': 'https://ajandrews.pro',
    'hub-api': 'https://api.ajandrews.pro',
    'wiki': 'https://wiki.ajandrews.pro',
    'nextcloud': 'https://cloud.ajandrews.pro',
    'prometheus-grafana': 'https://grafana.ajandrews.pro',
    'foundryvtt': 'https://dnd.ajandrews.pro',
    'argocd-server': 'https://argocd.ajandrews.pro',
}
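One plausible way to use this map is a lookup by workload name when building each service record. The sketch assumes the keys match workload names exactly (as `hub-api` does in the Status API example); `attach_external_url` is a hypothetical helper, and the map is abbreviated here to two entries.

```python
from typing import Any, Dict, Mapping

# Abbreviated copy of the map, for illustration only.
EXTERNAL_ENDPOINTS = {
    "hub-web": "https://ajandrews.pro",
    "hub-api": "https://api.ajandrews.pro",
}


def attach_external_url(
    record: Dict[str, Any], endpoints: Mapping[str, str] = EXTERNAL_ENDPOINTS
) -> Dict[str, Any]:
    """Return a copy of the record with 'external_url' set, or None if unmapped."""
    enriched = dict(record)  # copy so the caller's record is not mutated
    enriched["external_url"] = endpoints.get(record["name"])
    return enriched
```

Services without an entry get `None`, which the frontend could use to skip the latency test.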
UI Features¶
Summary Panel¶
Displays cluster health at a glance:
- Total services count
- Operational services (green)
- Down services (red)
- Health percentage
Namespace Groups¶
Services are grouped by namespace, with:
- Collapsible sections
- Service count badges
- Status indicators (red badge if any service is down)
Service Cards¶
Each service displays:
- Service name and type
- Status indicator (dot)
- Ready replica count
- Time in current state
- Client-side latency (for external services)
- External URL link
Styling¶
The dashboard matches the Nexus Navigator theme:
| Element | Color |
|---|---|
| Background | #0a0a0f |
| Cards | #1a1a24 |
| Accent (Cyan) | #00f0ff |
| Accent (Purple) | #a855f7 |
| Status Up | #22c55e |
| Status Down | #ef4444 |
Monitoring¶
Background Checker¶
The service runs a background thread that:
- Queries the Kubernetes API every CHECK_INTERVAL seconds (30 by default)
- Updates the global service status
- Tracks status change timestamps
- Guards the shared state with a lock for thread safety
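The behaviour listed above can be sketched with the standard library alone. Everything named here (`StatusStore`, `start_checker`, the `fetch` callable that returns a mapping of `(namespace, name)` to status) is hypothetical; the real implementation may be organised differently. Pruning of workloads that disappear from the cluster is omitted for brevity.

```python
import threading
from datetime import datetime, timezone
from typing import Callable, Dict, Tuple

Key = Tuple[str, str]  # (namespace, name)


class StatusStore:
    """Holds the latest status per workload behind a lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._state: Dict[Key, Tuple[str, datetime]] = {}

    def update(self, statuses: Dict[Key, str], now: datetime) -> None:
        with self._lock:
            for key, status in statuses.items():
                previous = self._state.get(key)
                # Keep the old timestamp unless the status actually changed.
                if previous is None or previous[0] != status:
                    self._state[key] = (status, now)

    def snapshot(self) -> Dict[Key, Tuple[str, datetime]]:
        with self._lock:
            return dict(self._state)


def start_checker(
    store: StatusStore,
    fetch: Callable[[], Dict[Key, str]],
    interval: float,
    stop: threading.Event,
) -> threading.Thread:
    """Run fetch() every `interval` seconds until `stop` is set."""

    def loop() -> None:
        while not stop.is_set():
            try:
                store.update(fetch(), datetime.now(timezone.utc))
            except Exception as exc:  # keep the thread alive on transient errors
                print(f"status check failed: {exc}")
            stop.wait(interval)

    thread = threading.Thread(target=loop, name="status-checker", daemon=True)
    thread.start()
    return thread
```

Catching exceptions inside the loop is one way to reduce the chance of the checker thread dying silently and serving stale data.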
Auto-Refresh¶
The frontend re-fetches the status data every 30 seconds using the JavaScript fetch API.
Troubleshooting¶
Common Issues¶
| Issue | Cause | Resolution |
|---|---|---|
| No services shown | RBAC missing | Apply the ClusterRole and ClusterRoleBinding |
| 403 from K8s API | ServiceAccount issue | Check that the pod runs as the homelab-status ServiceAccount and that its token is mounted |
| Stale data | Background thread died | Restart the pod |
| High latency values | CORS/network overhead | Expected for some services |