Log Analysis

Overview

Centralized log collection and analysis support security monitoring, troubleshooting, and compliance reporting.

Architecture

graph LR
    subgraph Sources
        Pods[K8s Pods]
        Nodes[Node Logs]
        Net[Network Logs]
    end

    subgraph Collection
        Agent[Elastic Agent]
    end

    subgraph Storage
        ES[(Elasticsearch)]
    end

    subgraph Analysis
        Kibana[Kibana]
        Alerts[Alert Rules]
    end

    Pods --> Agent
    Nodes --> Agent
    Net --> Agent
    Agent --> ES
    ES --> Kibana
    ES --> Alerts

Log Sources

Kubernetes Logs

| Source | Log Type | Index Pattern |
|---|---|---|
| Pod stdout/stderr | Container logs | logs-kubernetes.* |
| kubelet | Node logs | logs-system.* |
| API server | Audit logs | logs-kubernetes.* |

System Logs

| Source | Log Type | Index Pattern |
|---|---|---|
| systemd | Service logs | logs-system.* |
| auth | Authentication | logs-system.auth-* |
| syslog | System messages | logs-system.syslog-* |

Application Logs

| Application | Log Format | Index Pattern |
|---|---|---|
| Hub API | JSON | logs-generic.* |
| Nginx | Combined | logs-nginx.* |
| MongoDB | JSON | logs-mongodb.* |

Log Collection

Elastic Agent Configuration

Deploying Elastic Agent as a DaemonSet runs one agent pod on each node, so logs are collected from every node:

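apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: elastic-agent
  namespace: monitoring
spec:
  # apps/v1 requires a selector matching the pod template labels (label name is illustrative)
  selector:
    matchLabels:
      app: elastic-agent
  template:
    metadata:
      labels:
        app: elastic-agent
    spec:
      containers:
      - name: elastic-agent
        image: docker.elastic.co/beats/elastic-agent:8.17.0
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: containers
          mountPath: /var/lib/docker/containers
      # hostPath volumes back the mounts above (host paths assumed to mirror the mount paths)
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: containers
        hostPath:
          path: /var/lib/docker/containers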

Log Parsing

JSON Logs

{
  "timestamp": "2024-01-15T10:30:00Z",
  "level": "error",
  "message": "Connection timeout",
  "service": "hub-api",
  "trace_id": "abc123"
}
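
A record like the one above can be validated before it is shipped. The following is a minimal Python sketch, assuming the four fields `timestamp`, `level`, `message`, and `service` are mandatory and `trace_id` is optional; that split is an assumption for illustration, and `parse_log_line` is a hypothetical helper, not part of any Elastic tooling.

```python
import json
from datetime import datetime, timezone

# Assumed set of mandatory fields; adjust to the real logging contract.
REQUIRED_FIELDS = ("timestamp", "level", "message", "service")


def parse_log_line(line: str) -> dict:
    """Parse one JSON log line and normalize its timestamp and level."""
    record = json.loads(line)
    missing = [name for name in REQUIRED_FIELDS if name not in record]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    stamp = record["timestamp"]
    # datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11,
    # so rewrite it as an explicit UTC offset first.
    if stamp.endswith("Z"):
        stamp = stamp[:-1] + "+00:00"
    record["timestamp"] = datetime.fromisoformat(stamp).astimezone(timezone.utc)
    record["level"] = record["level"].lower()
    return record


sample = (
    '{"timestamp": "2024-01-15T10:30:00Z", "level": "error", '
    '"message": "Connection timeout", "service": "hub-api", "trace_id": "abc123"}'
)
event = parse_log_line(sample)
print(event["service"], event["level"], event["timestamp"].isoformat())
# prints: hub-api error 2024-01-15T10:30:00+00:00
```

Rejecting incomplete records early keeps malformed documents out of the index, where they are harder to find and fix.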

Grok Patterns

%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{GREEDYDATA:message}
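
To show what the pattern above extracts, here is a rough Python equivalent using named regex groups. The expressions are simplified stand-ins for `TIMESTAMP_ISO8601`, `LOGLEVEL`, and `GREEDYDATA`, not the full Grok definitions, and `parse_plain_line` is a hypothetical helper name.

```python
import re

# Simplified approximations of the Grok patterns, for illustration only.
LINE_RE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}"
    r"(?:\.\d+)?(?:Z|[+-]\d{2}:?\d{2})?)"
    r" (?P<level>[A-Za-z]+)"
    r" (?P<message>.*)$"
)


def parse_plain_line(line: str):
    """Return the named fields of a matching line, or None if it does not match."""
    match = LINE_RE.match(line)
    return match.groupdict() if match else None


fields = parse_plain_line("2024-01-15T10:30:00Z ERROR Connection timeout")
print(fields)
# prints: {'timestamp': '2024-01-15T10:30:00Z', 'level': 'ERROR', 'message': 'Connection timeout'}
```

Lines that do not match return `None`, which mirrors a Grok parse failure and gives a natural place to count or quarantine unparsed input.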

Security Analysis

Failed Authentication

GET /logs-*/_search
{
  "query": {
    "bool": {
      "must": [
        {"match": {"event.category": "authentication"}},
        {"match": {"event.outcome": "failure"}}
      ]
    }
  }
}

Suspicious Activity

| Pattern | Indicator |
|---|---|
| Multiple auth failures | Brute force |
| Unusual hours access | Unauthorized access |
| Large data transfer | Data exfiltration |
| New admin accounts | Privilege escalation |

Alerting

Security Alerts

| Alert | Condition | Severity |
|---|---|---|
| Auth Brute Force | >10 failures/5min | High |
| Privilege Escalation | New admin user | Critical |
| Error Spike | >100 errors/min | Warning |

Alert Configuration

rule:
  name: "Authentication Brute Force"
  type: threshold
  query: |
    event.category: "authentication" AND
    event.outcome: "failure"
  threshold: 10
  timeframe: 5m
  actions:
    - webhook
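
The threshold logic in this rule can be illustrated outside the alerting engine. The Python sketch below is not how Kibana evaluates rules; it is a small sliding-window counter, assuming events arrive in time order, and `ThresholdRule` is a hypothetical class name.

```python
from collections import deque
from datetime import datetime, timedelta


class ThresholdRule:
    """Fire when more than `threshold` events fall inside a sliding window."""

    def __init__(self, threshold: int, timeframe: timedelta):
        self.threshold = threshold
        self.timeframe = timeframe
        self._events = deque()

    def observe(self, when: datetime) -> bool:
        """Record one matching event; return True if the rule fires."""
        self._events.append(when)
        # Drop events that have aged out of the window.
        cutoff = when - self.timeframe
        while self._events and self._events[0] <= cutoff:
            self._events.popleft()
        return len(self._events) > self.threshold


rule = ThresholdRule(threshold=10, timeframe=timedelta(minutes=5))
start = datetime(2024, 1, 15, 10, 30)
# Eleven failures, ten seconds apart: the eleventh crosses the threshold.
results = [rule.observe(start + timedelta(seconds=10 * i)) for i in range(11)]
print(results.index(True))
# prints: 10
```

Ten failures inside the window do not fire; the eleventh does, which matches the ">10 failures/5min" condition in the Security Alerts table.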

Dashboards

Security Overview

  • Failed authentication attempts
  • Error rates by service
  • Top client IPs
  • Threat intelligence matches

Application Logs

  • Log volume over time
  • Error distribution
  • Response time percentiles
  • Service health

Log Retention

| Log Type | Retention | Storage |
|---|---|---|
| Security logs | 90 days | Hot → Warm |
| Application logs | 30 days | Hot → Delete |
| Debug logs | 7 days | Hot → Delete |
| Audit logs | 1 year | Hot → Cold |
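
One way to express these tiers is an index lifecycle management (ILM) policy per log type. The Python sketch below only builds the JSON body that would be sent to `PUT _ilm/policy/<name>`. It assumes data is deleted once the retention period ends and treats "1 year" as 365 days. The table does not say when data moves to warm or cold, so the 7-day and 30-day transition ages are placeholders, and the policy names and `build_ilm_policy` helper are hypothetical.

```python
import json


def build_ilm_policy(retention_days, later_phase=None, later_phase_after_days=None):
    """Build an ILM policy body that deletes data after `retention_days`.

    `later_phase` is "warm" or "cold"; `later_phase_after_days` is the age at
    which data moves there and must be supplied by the caller.
    """
    phases = {"hot": {"actions": {}}}
    if later_phase is not None:
        if later_phase not in ("warm", "cold"):
            raise ValueError("later_phase must be 'warm' or 'cold'")
        if later_phase_after_days is None or later_phase_after_days >= retention_days:
            raise ValueError("transition age must be set and shorter than retention")
        phases[later_phase] = {"min_age": f"{later_phase_after_days}d", "actions": {}}
    phases["delete"] = {"min_age": f"{retention_days}d", "actions": {"delete": {}}}
    return {"policy": {"phases": phases}}


# Transition ages (7 and 30 days) are placeholders, not values from this document.
policies = {
    "security-logs": build_ilm_policy(90, "warm", 7),
    "application-logs": build_ilm_policy(30),
    "debug-logs": build_ilm_policy(7),
    "audit-logs": build_ilm_policy(365, "cold", 30),
}
print(json.dumps(policies["application-logs"], indent=2))
```

A production policy would normally also define rollover in the hot phase; it is left out here because the document does not specify rollover limits.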

Compliance

Logged Events

| Event | Required |
|---|---|
| Authentication | Yes |
| Authorization | Yes |
| Data access | Yes |
| Admin actions | Yes |
| System changes | Yes |

Log Protection

  • Immutable indices
  • Access controls
  • Encryption at rest
  • Audit trail

Best Practices

  1. Structured logging - Use JSON format
  2. Correlation IDs - Track requests across services
  3. Log levels - Use appropriate severity
  4. Avoid PII - Don't log sensitive data
  5. Retention policies - Balance storage and compliance
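
Several of these practices can be combined in one logging setup. The sketch below uses Python's standard `logging` module to emit one JSON object per line, carry a correlation ID, and mask selected keys. The `SENSITIVE_KEYS` list, the `context` extra field, and the `JsonFormatter` class are illustrative assumptions, not an existing convention of the services described here.

```python
import json
import logging
from datetime import datetime, timezone

SENSITIVE_KEYS = {"password", "token", "email"}  # illustrative list only


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": datetime.fromtimestamp(record.created, timezone.utc).isoformat(),
            "level": record.levelname.lower(),
            "message": record.getMessage(),
            "service": record.name,
            # Correlation ID passed by the caller through `extra`.
            "trace_id": getattr(record, "trace_id", None),
        }
        # Copy caller-supplied context, masking keys that may hold PII.
        for key, value in getattr(record, "context", {}).items():
            payload[key] = "[REDACTED]" if key in SENSITIVE_KEYS else value
        return json.dumps(payload)


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("hub-api")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.propagate = False

logger.error(
    "Connection timeout",
    extra={
        "trace_id": "abc123",
        "context": {"upstream": "mongodb", "email": "user@example.com"},
    },
)
```

Masking by key name is only a safety net; the more reliable approach is not to pass sensitive values to the logger in the first place.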

Troubleshooting

Missing Logs

  1. Check agent status
  2. Verify index patterns
  3. Check ingest pipeline
  4. Review agent logs

Query Performance

  1. Use time filters
  2. Limit returned fields
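  3. Prefer filter context over query context for exact matches (filters skip scoring and can be cached)
  4. Make sure high-cardinality fields used in filters are indexed (for example, mapped as keyword)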
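
The first three tips above can be applied together when building a search request. This Python sketch only assembles the request body; the field names `source.ip` and `user.name`, the 15-minute window, and the `build_search_body` helper are assumptions for illustration.

```python
import json


def build_search_body(filters, start, end, fields, size=100):
    """Build a search body with a time range, limited fields, and filter context."""
    clauses = [{"term": {name: value}} for name, value in filters.items()]
    # Tip 1: always bound the search with a time range.
    clauses.append({"range": {"@timestamp": {"gte": start, "lt": end}}})
    return {
        "size": size,
        # Tip 2: return only the fields the caller needs.
        "_source": list(fields),
        # Tip 3: filter context skips relevance scoring and can be cached.
        "query": {"bool": {"filter": clauses}},
        "sort": [{"@timestamp": "desc"}],
    }


body = build_search_body(
    filters={"event.category": "authentication", "event.outcome": "failure"},
    start="now-15m",
    end="now",
    fields=["@timestamp", "source.ip", "user.name"],
)
print(json.dumps(body, indent=2))
```

The resulting body can be sent to `GET /logs-*/_search` in place of an unbounded `match` query.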