End-to-End QA Scenario

Distributed Load Testing with Observability

k6 generators in an EC2 Auto Scaling group drive load against services on EKS; results are correlated with X-Ray traces and CloudWatch metrics, and AWS FIS injects chaos mid-run.

Architecture

EC2 ASG (k6 workers, 20 nodes) ──► ALB ──► EKS (services A,B,C) ──► RDS / DynamoDB
                  │                                       │
                  ▼                                       ▼
              S3 results               X-Ray traces + CloudWatch metrics
                  │                                       │
                  └────► Athena ◄──── CloudWatch Logs Insights
                            │
                  FIS experiment: throttle service B mid-run, observe SLO

Workflow steps

  1. Generators

    A Launch Template and Auto Scaling group run 20 c6i.large EC2 instances as k6 workers, coordinated by a controller node.

  2. Target

    Microservices run on EKS behind an ALB, with X-Ray tracing enabled on every service in the call chain.

  3. Correlate

    k6 sets an `x-qa-run-id` header on every request; each service forwards it downstream and records it as an X-Ray annotation, so traces can be filtered per run.

  4. Chaos mid-load

    Halfway through the test, FIS throttles network traffic on the service B pods; verify that p95 latency stays within the SLO.

  5. Persist

    k6 streams JSON results to S3; pod logs are shipped to CloudWatch Logs.

  6. Analyze

    Athena queries the k6 results and ALB access logs; a dashboard correlates RPS, p95 latency, error rate, CPU, and saturation.
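
The SLO check in steps 4 and 6 can be sketched offline against the k6 JSON output (one JSON object per line, as written by `k6 run --out json=...`). This is a minimal sketch, not the scenario's actual tooling. It assumes each sample carries a `run_id` tag (for example set with `k6 run --tag run_id=...`; the tag name is hypothetical) and that the FIS window boundaries are known. The function names and the SLO threshold are illustrative.

```python
import json
import math
import re
from datetime import datetime

_TS = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(\.\d+)?(Z|[+-]\d{2}:\d{2})$")


def parse_k6_time(ts):
    """Parse an RFC 3339 timestamp, truncating the fraction to microseconds."""
    match = _TS.match(ts)
    if match is None:
        raise ValueError(f"unrecognized timestamp: {ts!r}")
    base, frac, tz = match.groups()
    frac = (frac or ".")[:7].ljust(7, "0")  # "." plus exactly six digits
    return datetime.fromisoformat(base + frac + ("+00:00" if tz == "Z" else tz))


def percentile(values, pct):
    """Nearest-rank percentile; returns None for an empty sample."""
    if not values:
        return None
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def p95_by_phase(lines, run_id, chaos_start, chaos_end):
    """Split one run's http_req_duration samples into baseline and chaos phases."""
    phases = {"baseline": [], "chaos": []}
    for line in lines:
        if not line.strip():
            continue
        record = json.loads(line)
        # k6 emits "Metric" declarations and "Point" samples; only samples matter here.
        if record.get("type") != "Point" or record.get("metric") != "http_req_duration":
            continue
        data = record["data"]
        # Assumption: the run id was attached as a k6 tag named "run_id".
        if (data.get("tags") or {}).get("run_id") != run_id:
            continue
        when = parse_k6_time(data["time"])
        phase = "chaos" if chaos_start <= when < chaos_end else "baseline"
        phases[phase].append(data["value"])
    return {name: percentile(values, 95) for name, values in phases.items()}


def within_slo(p95_ms, slo_ms):
    """True when a p95 value exists and does not exceed the SLO threshold."""
    return p95_ms is not None and p95_ms <= slo_ms
```

Comparing the `chaos` p95 against the `baseline` p95 for the same run shows how much of the latency budget the injected fault consumed; at the scale described here the same split would more likely be expressed as an Athena query over the S3 results.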

Key takeaways

  • Tag every layer with the run ID; without it, you can't correlate findings later.
  • Chaos + load together expose issues neither finds alone.
  • Cheap long-term storage (S3 + Athena) beats expensive APM retention for historical runs.
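
One way to keep the run ID consistent across layers is to derive every tag from a single helper. This is a rough sketch under stated assumptions. Only the `x-qa-run-id` header name comes from the scenario. The annotation key `qa_run_id`, the `k6-results/` prefix, the run-ID format, and the function names are all hypothetical. The annotation key avoids hyphens because X-Ray annotation keys accept only letters, digits, and underscores. The S3 prefix uses a Hive-style `run_id=...` segment, which Athena can treat as a partition.

```python
import re

RUN_ID_HEADER = "x-qa-run-id"  # header name from the scenario
ANNOTATION_KEY = "qa_run_id"   # assumed key; X-Ray keys allow letters, digits, underscores
_SAFE_RUN_ID = re.compile(r"^[A-Za-z0-9_-]{1,64}$")  # assumed run-id format


def run_id_from_headers(headers):
    """Case-insensitive lookup of the run-id header; returns None when absent."""
    for name, value in headers.items():
        if name.lower() == RUN_ID_HEADER:
            return value
    return None


def layer_tags(run_id):
    """Derive the per-layer tags for one run from a single run id."""
    if not _SAFE_RUN_ID.match(run_id):
        raise ValueError(f"unsafe run id: {run_id!r}")
    return {
        "http_header": {RUN_ID_HEADER: run_id},
        "xray_annotation": (ANNOTATION_KEY, run_id),
        # X-Ray filter expression selecting the traces of this run.
        "xray_filter": f'annotation.{ANNOTATION_KEY} = "{run_id}"',
        # Hypothetical prefix; the run_id=... segment is a Hive-style partition.
        "s3_prefix": f"k6-results/run_id={run_id}/",
    }


def annotate(segment, headers):
    """Record the run id on a trace segment.

    `segment` is any object exposing put_annotation(key, value), such as a
    segment from the X-Ray SDK. Returns False when the header is missing.
    """
    run_id = run_id_from_headers(headers)
    if run_id is None:
        return False
    segment.put_annotation(ANNOTATION_KEY, run_id)
    return True
```

Validating the run ID before building paths and filter expressions keeps a malformed header value from leaking into S3 keys or queries.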