End-to-End QA Scenario

Distributed Load Testing with Observability

k6 generators in an EC2 Auto Scaling group drive load against EKS services; results are correlated through X-Ray and CloudWatch; FIS injects chaos mid-run.

Architecture

EC2 ASG (k6 workers, 20 nodes) ──► ALB ──► EKS (services A,B,C) ──► RDS / DynamoDB
                  │                                       │
                  ▼                                       ▼
              S3 results               X-Ray traces + CloudWatch metrics
                  │                                       │
                  └────► Athena ◄──── CloudWatch Logs Insights
                            │
                  FIS experiment: throttle service B mid-run, observe SLO
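
The 20 k6 workers in the diagram each need a distinct share of the total load. One way to sketch that split, assuming the controller hands out k6 execution segments (the scenario does not say how it coordinates workers), is to generate one `--execution-segment` value per worker plus the shared `--execution-segment-sequence`. The script name `load.js` is a hypothetical placeholder.

```python
from fractions import Fraction


def _bounds(workers: int) -> list[Fraction]:
    """Segment boundaries 0, 1/n, 2/n, ..., 1 as exact fractions."""
    if workers < 1:
        raise ValueError("workers must be >= 1")
    return [Fraction(i, workers) for i in range(workers + 1)]


def execution_segments(workers: int) -> list[str]:
    """One 'from:to' segment per worker, e.g. '0:1/4' for the first of four."""
    bounds = _bounds(workers)
    return [f"{lo}:{hi}" for lo, hi in zip(bounds, bounds[1:])]


def segment_sequence(workers: int) -> str:
    """The full boundary list that every worker receives unchanged."""
    return ",".join(str(b) for b in _bounds(workers))


def worker_command(index: int, workers: int, script: str = "load.js") -> list[str]:
    """Build the k6 argv for worker `index` (0-based). `script` is hypothetical."""
    return [
        "k6", "run",
        "--execution-segment", execution_segments(workers)[index],
        "--execution-segment-sequence", segment_sequence(workers),
        script,
    ]


if __name__ == "__main__":
    for i in range(3):
        print(" ".join(worker_command(i, 20)))
```

Fractions are printed in reduced form (for example `1/10` rather than `2/20`), which denotes the same boundary. The exact arithmetic keeps adjacent segments from overlapping or leaving gaps.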

Workflow steps

1. Generators
   A Launch Template and Auto Scaling group run 20 EC2 c6i.large instances with k6, coordinated by a controller node.
2. Target
   Microservices on EKS behind an ALB, with X-Ray tracing enabled across the call chain.
3. Correlate
   k6 sets an `x-qa-run-id` header on every request; services propagate it as an X-Ray annotation so traces can be filtered per run.
4. Chaos mid-load
   Halfway through the test, FIS throttles the network on service B's pods; verify that p95 latency stays within the SLO.
5. Persist
   k6 streams JSON results to S3; pod logs ship to CloudWatch Logs.
6. Analyze
   Athena queries the k6 results and ALB access logs; a dashboard correlates RPS, p95 latency, error rate, CPU, and saturation.
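
The check in steps 4 to 6 can be sketched offline against k6's line-delimited JSON output: keep the `http_req_duration` points for one run, split them into baseline and chaos phases by timestamp, and compare each phase's p95 to the SLO. This is a minimal sketch under assumptions, not the scenario's actual analysis. The `run_id` tag name, the chaos window bounds, and the SLO value are hypothetical inputs, and the percentile uses the nearest-rank method.

```python
import json
import re
from datetime import datetime

_FRACTION = re.compile(r"\.(\d+)")


def parse_k6_time(stamp: str) -> datetime:
    """Parse a k6 timestamp, trimming nanoseconds to microseconds."""
    trimmed = _FRACTION.sub(
        lambda m: "." + m.group(1)[:6].ljust(6, "0"), stamp, count=1
    )
    if trimmed.endswith("Z"):
        trimmed = trimmed[:-1] + "+00:00"
    return datetime.fromisoformat(trimmed)


def p95(values: list[float]) -> float:
    """Nearest-rank 95th percentile."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) in integer math
    return ordered[rank - 1]


def p95_by_phase(lines, run_id, chaos_start, chaos_end,
                 metric="http_req_duration"):
    """Return {'baseline': p95, 'chaos': p95} for one run's samples.

    `lines` is any iterable of k6 JSON output lines. `run_id` matches a
    hypothetical `run_id` tag on each point. The chaos bounds must be
    timezone-aware datetimes.
    """
    phases = {"baseline": [], "chaos": []}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if record.get("type") != "Point" or record.get("metric") != metric:
            continue
        data = record["data"]
        tags = data.get("tags") or {}
        if tags.get("run_id") != run_id:
            continue
        when = parse_k6_time(data["time"])
        phase = "chaos" if chaos_start <= when < chaos_end else "baseline"
        phases[phase].append(float(data["value"]))
    return {name: p95(vals) for name, vals in phases.items() if vals}


def within_slo(p95s: dict, slo_ms: float) -> bool:
    """True when every phase's p95 is at or below the SLO (milliseconds)."""
    return all(value <= slo_ms for value in p95s.values())
```

Reporting the two phases separately matters here: a single p95 over the whole run can hide a chaos-window regression behind a long, healthy baseline.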

Key takeaways

Tag every layer with the run ID; without it, you can't correlate findings later.
Chaos + load together expose issues neither finds alone.
Cheap long-term storage (S3 + Athena) beats expensive APM retention for historical runs.
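
To make the run-ID takeaway concrete, the sketch below derives every layer's identifier from one value so the layers cannot drift apart. Only the `x-qa-run-id` header comes from the scenario. The annotation key `qa_run_id`, the `run_id` k6 tag, the S3 prefix layout, and the allowed run-ID characters are all assumptions for illustration. The annotation key avoids hyphens because X-Ray annotation keys accept only letters, digits, and underscores.

```python
import re
from dataclasses import dataclass

HEADER_NAME = "x-qa-run-id"    # from the scenario
ANNOTATION_KEY = "qa_run_id"   # assumption: hyphen-free key for X-Ray
_SAFE_ID = re.compile(r"[A-Za-z0-9_-]+")  # assumption: keeps quoting trivial


@dataclass(frozen=True)
class RunTags:
    """All per-layer identifiers for a single load-test run."""

    run_id: str

    def __post_init__(self):
        if not _SAFE_ID.fullmatch(self.run_id):
            raise ValueError(f"unsupported run id: {self.run_id!r}")

    def http_headers(self) -> dict:
        """Header that k6 attaches to every request."""
        return {HEADER_NAME: self.run_id}

    def k6_tag_args(self) -> list:
        """CLI arguments that tag every k6 metric point (hypothetical tag name)."""
        return ["--tag", f"run_id={self.run_id}"]

    def xray_filter_expression(self) -> str:
        """Filter expression selecting this run's traces by annotation."""
        return f'annotation.{ANNOTATION_KEY} = "{self.run_id}"'

    def s3_prefix(self) -> str:
        """Hypothetical partition-style prefix for the run's k6 results."""
        return f"k6-results/run_id={self.run_id}/"


if __name__ == "__main__":
    tags = RunTags("run-2024-01-01-a")
    print(tags.http_headers())
    print(tags.xray_filter_expression())
    print(tags.s3_prefix())
```

A `key=value` prefix like the one above is a common layout for Athena partitioning, so historical runs can be pruned by run ID without scanning every result file.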
