Log Analysis
Overview
Centralized log collection and analysis enable security monitoring, troubleshooting, and compliance reporting.
Architecture
graph LR
    subgraph Sources
        Pods[K8s Pods]
        Nodes[Node Logs]
        Net[Network Logs]
    end
    subgraph Collection
        Agent[Elastic Agent]
    end
    subgraph Storage
        ES[(Elasticsearch)]
    end
    subgraph Analysis
        Kibana[Kibana]
        Alerts[Alert Rules]
    end
    Pods --> Agent
    Nodes --> Agent
    Net --> Agent
    Agent --> ES
    ES --> Kibana
    ES --> Alerts
Log Sources
Kubernetes Logs
| Source | Log Type | Index Pattern |
|--------|----------|---------------|
| Pod stdout/stderr | Container logs | logs-kubernetes.* |
| kubelet | Node logs | logs-system.* |
| API server | Audit logs | logs-kubernetes.* |
System Logs
| Source | Log Type | Index Pattern |
|--------|----------|---------------|
| systemd | Service logs | logs-system.* |
| auth | Authentication | logs-system.auth-* |
| syslog | System messages | logs-system.syslog-* |
Application Logs
| Application | Log Format | Index Pattern |
|-------------|------------|---------------|
| Hub API | JSON | logs-generic.* |
| Nginx | Combined | logs-nginx.* |
| MongoDB | JSON | logs-mongodb.* |
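To check which of the wildcard patterns above would pick up a given data stream, the matching can be sketched in Python. This assumes Elastic's `<type>-<dataset>-<namespace>` data stream naming (for example `logs-nginx.access-default`). The helper name `matching_patterns` and the sample stream names are illustrative and not part of this setup.

```python
from fnmatch import fnmatchcase

# Wildcard index patterns taken from the tables above.
INDEX_PATTERNS = [
    "logs-kubernetes.*",
    "logs-system.*",
    "logs-generic.*",
    "logs-nginx.*",
    "logs-mongodb.*",
]


def matching_patterns(data_stream: str) -> list[str]:
    """Return every pattern whose wildcard matches the data stream name."""
    return [p for p in INDEX_PATTERNS if fnmatchcase(data_stream, p)]


if __name__ == "__main__":
    # Hypothetical data stream names following <type>-<dataset>-<namespace>.
    for name in ("logs-nginx.access-default", "logs-system.auth-default"):
        print(name, "->", matching_patterns(name))
```

A stream that matches no pattern returns an empty list, which is a quick way to spot a source that would not show up in any of the views listed here.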
Log Collection
Elastic Agent Configuration
Deploying Elastic Agent as a DaemonSet runs one agent pod on every node, so logs are collected from all nodes:
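apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: elastic-agent
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: elastic-agent  # assumed label; must match the template labels
  template:
    metadata:
      labels:
        app: elastic-agent
    spec:
      containers:
        - name: elastic-agent
          image: docker.elastic.co/beats/elastic-agent:8.17.0
          volumeMounts:
            - name: varlog
              mountPath: /var/log
            - name: containers
              mountPath: /var/lib/docker/containers
      volumes:  # hostPath volumes backing the mounts above
        - name: varlog
          hostPath:
            path: /var/log
        - name: containers
          hostPath:
            path: /var/lib/docker/containers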
Log Parsing
JSON Logs
{
  "timestamp": "2024-01-15T10:30:00Z",
  "level": "error",
  "message": "Connection timeout",
  "service": "hub-api",
  "trace_id": "abc123"
}
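A consumer of such records can validate them before indexing. The following is a minimal sketch that assumes all five fields in the sample above are required; the function name `parse_log_line` is hypothetical.

```python
import json
from datetime import datetime

# Assumption: every record carries the five fields from the sample above.
REQUIRED_FIELDS = {"timestamp", "level", "message", "service", "trace_id"}


def parse_log_line(line: str) -> dict:
    """Parse one JSON log line and normalize its timestamp and level."""
    record = json.loads(line)
    missing = REQUIRED_FIELDS - set(record)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    # Replace the trailing "Z" so fromisoformat also works before Python 3.11.
    record["timestamp"] = datetime.fromisoformat(
        record["timestamp"].replace("Z", "+00:00")
    )
    record["level"] = record["level"].lower()
    return record


if __name__ == "__main__":
    sample = (
        '{"timestamp": "2024-01-15T10:30:00Z", "level": "error", '
        '"message": "Connection timeout", "service": "hub-api", '
        '"trace_id": "abc123"}'
    )
    print(parse_log_line(sample))
```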
Grok Patterns
%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{GREEDYDATA:message}
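For testing sample lines outside the ingest pipeline, the Grok pattern above can be approximated with a Python regular expression. This is only a rough stand-in: Grok's own `TIMESTAMP_ISO8601` and `LOGLEVEL` definitions accept more variants than the expression below, and the name `parse_plain_line` is made up for this sketch.

```python
import re

# Rough regex stand-in for the Grok pattern above; Grok's own
# TIMESTAMP_ISO8601 and LOGLEVEL definitions accept more variants.
LINE_RE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}"
    r"(?:[.,]\d+)?(?:Z|[+-]\d{2}:?\d{2})?)"
    r" (?P<level>TRACE|DEBUG|INFO|NOTICE|WARN(?:ING)?|ERR(?:OR)?|"
    r"CRIT(?:ICAL)?|FATAL|ALERT|EMERG(?:ENCY)?)"
    r" (?P<message>.*)$",
    re.IGNORECASE,
)


def parse_plain_line(line: str):
    """Return timestamp, level, and message as a dict, or None if no match."""
    match = LINE_RE.match(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(parse_plain_line("2024-01-15T10:30:00Z ERROR Connection timeout"))
```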
Security Analysis
Failed Authentication
GET /logs-*/_search
{
  "query": {
    "bool": {
      "must": [
        {"match": {"event.category": "authentication"}},
        {"match": {"event.outcome": "failure"}}
      ]
    }
  }
}
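The same query can be extended to show where the failures come from. The sketch below builds a request body that adds a time range and a terms aggregation per client address. It assumes the documents follow ECS and carry `@timestamp` and `source.ip`; the function name `failed_auth_by_source` and its defaults are illustrative.

```python
import json


def failed_auth_by_source(since: str = "now-15m", top_n: int = 10) -> dict:
    """Build a search body that counts failed logins per source IP."""
    return {
        "size": 0,  # only the aggregation buckets are needed, not the hits
        "query": {
            "bool": {
                "must": [
                    {"match": {"event.category": "authentication"}},
                    {"match": {"event.outcome": "failure"}},
                ],
                # Assumption: documents carry the ECS @timestamp field.
                "filter": [{"range": {"@timestamp": {"gte": since}}}],
            }
        },
        "aggs": {
            # Assumption: the client address is stored in ECS source.ip.
            "by_source": {"terms": {"field": "source.ip", "size": top_n}}
        },
    }


if __name__ == "__main__":
    # Paste the printed body into a GET /logs-*/_search request.
    print(json.dumps(failed_auth_by_source(), indent=2))
```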
Suspicious Activity
| Pattern | Indicator |
|---------|-----------|
| Multiple auth failures | Brute force |
| Unusual hours access | Unauthorized access |
| Large data transfer | Data exfiltration |
| New admin accounts | Privilege escalation |
Alerting
Security Alerts
| Alert | Condition | Severity |
|-------|-----------|----------|
| Auth Brute Force | >10 failures/5min | High |
| Privilege Escalation | New admin user | Critical |
| Error Spike | >100 errors/min | Warning |
Alert Configuration
rule:
  name: "Authentication Brute Force"
  type: threshold
  query: |
    event.category: "authentication" AND
    event.outcome: "failure"
  threshold: 10
  timeframe: 5m
  actions:
    - webhook
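The threshold condition itself is a sliding-window count. The sketch below illustrates that logic in plain Python and is not how the alerting engine is implemented. It reads the threshold as "more than 10", following the ">10 failures/5min" condition in the Security Alerts table; the function name `exceeds_threshold` is hypothetical.

```python
from collections import deque
from datetime import datetime, timedelta


def exceeds_threshold(timestamps, threshold=10, window=timedelta(minutes=5)):
    """Return True if more than `threshold` events fall within any `window`."""
    recent = deque()
    for ts in sorted(timestamps):
        recent.append(ts)
        # Drop events that are older than the window ending at `ts`.
        while ts - recent[0] >= window:
            recent.popleft()
        if len(recent) > threshold:
            return True
    return False


if __name__ == "__main__":
    start = datetime(2024, 1, 15, 10, 30)
    # Eleven failures, ten seconds apart: all fall inside one 5-minute window.
    burst = [start + timedelta(seconds=10 * i) for i in range(11)]
    print(exceeds_threshold(burst))
```

The same function covers the error-spike alert by passing `threshold=100` and `window=timedelta(minutes=1)`.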
Dashboards
Security Overview
- Failed authentication attempts
- Error rates by service
- Top client IPs
- Threat intelligence matches
Application Logs
- Log volume over time
- Error distribution
- Response time percentiles
- Service health
Log Retention
| Log Type | Retention | Storage |
|----------|-----------|---------|
| Security logs | 90 days | Hot → Warm |
| Application logs | 30 days | Hot → Delete |
| Debug logs | 7 days | Hot → Delete |
| Audit logs | 1 year | Hot → Cold |
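In an Elastic deployment, retention like this is normally enforced by index lifecycle policies. The sketch below only illustrates the age arithmetic behind the table. The short keys in `RETENTION_DAYS`, the reading of "1 year" as 365 days, and the function name `is_expired` are assumptions made for the example.

```python
from datetime import date, timedelta

# Retention periods from the table above; "1 year" is taken as 365 days.
RETENTION_DAYS = {
    "security": 90,
    "application": 30,
    "debug": 7,
    "audit": 365,
}


def is_expired(log_type: str, created: date, today: date) -> bool:
    """Return True once an index is older than its retention period."""
    return today - created > timedelta(days=RETENTION_DAYS[log_type])


if __name__ == "__main__":
    print(is_expired("debug", date(2024, 1, 1), date(2024, 1, 9)))
```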
Compliance
Logged Events
| Event | Required |
|-------|----------|
| Authentication | Yes |
| Authorization | Yes |
| Data access | Yes |
| Admin actions | Yes |
| System changes | Yes |
Log Protection
- Immutable indices
- Access controls
- Encryption at rest
- Audit trail
Best Practices
- Structured logging: use JSON format
- Correlation IDs: track requests across services
- Log levels: use the appropriate severity
- Avoid PII: do not log sensitive data
- Retention policies: balance storage and compliance
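The first two practices can be combined in a small logging setup. The following is a minimal sketch using Python's standard `logging` module that emits the same fields as the JSON sample under Log Parsing. The class name `JsonFormatter`, the helper `build_logger`, and passing the correlation ID through `extra=` are choices made for this example, not a description of how the Hub API logs today.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def __init__(self, service: str):
        super().__init__()
        self.service = service

    def format(self, record: logging.LogRecord) -> str:
        created = datetime.fromtimestamp(record.created, tz=timezone.utc)
        payload = {
            "timestamp": created.isoformat(),
            "level": record.levelname.lower(),
            "message": record.getMessage(),
            "service": self.service,
            # Correlation ID passed through `extra=`; None when absent.
            "trace_id": getattr(record, "trace_id", None),
        }
        return json.dumps(payload)


def build_logger(service: str) -> logging.Logger:
    """Create a logger that writes JSON lines to stderr."""
    logger = logging.getLogger(service)
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter(service))
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger


if __name__ == "__main__":
    log = build_logger("hub-api")
    log.error("Connection timeout", extra={"trace_id": "abc123"})
```

Passing the same `trace_id` on every log call made while handling one request lets the entries from different services be joined later with a single query on that field.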
Troubleshooting
Missing Logs
- Check agent status
- Verify index patterns
- Check ingest pipeline
- Review agent logs
Slow Queries
- Use time filters
- Limit returned fields
- Use filters over queries
- Index high-cardinality fields
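The query-tuning tips above (time filters, limited fields, filters over queries) can be combined in one request body. The sketch below assumes a keyword field named `service` and the ECS `@timestamp` field; the function name `tuned_search`, the field list, and the defaults are illustrative.

```python
import json


def tuned_search(service: str, since: str = "now-1h", size: int = 100) -> dict:
    """Build a search body that applies the query-tuning tips above."""
    return {
        "size": size,
        # Limit returned fields instead of fetching whole documents.
        "_source": ["@timestamp", "level", "message", "trace_id"],
        "query": {
            "bool": {
                # Filter context skips relevance scoring and can be cached.
                "filter": [
                    # Assumption: `service` is mapped as a keyword field.
                    {"term": {"service": service}},
                    # Time filter keeps the search on recent data only.
                    {"range": {"@timestamp": {"gte": since}}},
                ]
            }
        },
        "sort": [{"@timestamp": "desc"}],
    }


if __name__ == "__main__":
    # Paste the printed body into a GET /logs-*/_search request.
    print(json.dumps(tuned_search("hub-api"), indent=2))
```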