End-to-End QA Scenario

Distributed Playwright Grid on ECS Fargate

Run thousands of Playwright tests in parallel using SQS-backed Fargate workers, autoscaled on queue depth.

Architecture

Pipeline ─► Lambda enqueues N test shards ─► SQS queue
                                                       │
                       ECS Fargate Service (auto-scales on ApproximateNumberOfMessagesVisible)
                       ├─► Worker 1 ─► run shard ─► JUnit + traces to S3
                       ├─► Worker 2 ─► ...
                       └─► Worker M ─► ...
                                                       │
                       EventBridge "queue empty" ─► Lambda aggregator ─► report
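
The scaling relationship in the diagram can be sketched as simple arithmetic. This is a minimal illustration, not how ECS Service Auto Scaling is configured: in practice the scaling policy attached to the service makes this decision. The function name and the `shards_per_worker` parameter are hypothetical; the 0 and 100 bounds come from the worker counts mentioned in this scenario.

```python
import math


def desired_workers(visible_messages: int,
                    shards_per_worker: int = 1,
                    min_workers: int = 0,
                    max_workers: int = 100) -> int:
    """Return a worker count for the current queue backlog.

    visible_messages stands in for the ApproximateNumberOfMessagesVisible
    metric; shards_per_worker is a hypothetical tuning knob for how many
    queued shards one worker is expected to drain.
    """
    if visible_messages < 0:
        raise ValueError("visible_messages must be non-negative")
    if shards_per_worker < 1:
        raise ValueError("shards_per_worker must be at least 1")
    # Round up so a partial backlog still gets a worker.
    needed = math.ceil(visible_messages / shards_per_worker)
    # Clamp to the configured capacity range.
    return max(min_workers, min(max_workers, needed))
```

With an empty queue the result is the minimum (0 here), which is what lets the service idle at no cost between runs.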

Workflow steps

1. Shard: Lambda splits the Playwright spec list into N shards (size tuned from historical durations) and enqueues them to SQS.
2. Scale: ECS Service Auto Scaling tracks `ApproximateNumberOfMessagesVisible` and scales from 1 to 100 workers in minutes.
3. Execute: Each Fargate task pulls a shard, runs `npx playwright test --shard=<index>/<total>`, and uploads JUnit + trace.zip + video to S3.
4. Aggregate: When the queue is empty, EventBridge invokes a Lambda that merges the per-shard results into a single HTML report.
5. Cleanup: The service scales back to 0 tasks; a CloudWatch dashboard shows runtime, cost per run, and flake rate.
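
The sharding step can be sketched as a greedy "longest test first" packing: sort specs by historical duration and always add the next spec to the lightest shard. This is one plausible way to tune shard size from durations, not necessarily what the Lambda in this scenario does. The function names and the message fields (`run_id`, `shard`, `specs`) are hypothetical, and the actual SQS send is left out.

```python
import heapq
import json


def pack_shards(durations: dict[str, float], shard_count: int) -> list[list[str]]:
    """Distribute spec files across shards so total durations stay balanced."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    # Min-heap of (accumulated_seconds, shard_index): the lightest shard pops first.
    heap = [(0.0, i) for i in range(shard_count)]
    shards: list[list[str]] = [[] for _ in range(shard_count)]
    # Longest specs first; ties broken by name so the output is deterministic.
    for spec, seconds in sorted(durations.items(), key=lambda kv: (-kv[1], kv[0])):
        load, index = heapq.heappop(heap)
        shards[index].append(spec)
        heapq.heappush(heap, (load + seconds, index))
    return shards


def to_messages(shards: list[list[str]], run_id: str) -> list[str]:
    """Serialize each non-empty shard as a JSON message body (hypothetical schema)."""
    return [
        json.dumps({"run_id": run_id, "shard": index, "specs": specs})
        for index, specs in enumerate(shards)
        if specs
    ]
```

For example, `pack_shards({"a": 10, "b": 8, "c": 5, "d": 4, "e": 3}, 2)` places `a` and `d` in one shard (14 s) and `b`, `c`, `e` in the other (16 s). Each body from `to_messages` would then be enqueued as one SQS message for a worker to pick up.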

Key takeaways

Wall-clock time collapses from hours to minutes by trading cost for parallelism.
Queue-driven scaling means you never pay for idle workers between runs.
Traces and videos in S3 make every failure debuggable post-hoc.
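
The cost-for-parallelism trade in the first takeaway can be made concrete with a rough model. It assumes perfectly balanced shards and a fixed per-worker startup overhead. Every input is a hypothetical placeholder, and the price in particular is not an actual Fargate rate.

```python
def estimate_run(total_test_seconds: float,
                 workers: int,
                 startup_seconds: float,
                 price_per_worker_second: float) -> tuple[float, float]:
    """Return (wall_clock_seconds, cost) for a run split evenly across workers."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    # Every worker pays the startup overhead, then runs its equal share of tests.
    wall_clock = startup_seconds + total_test_seconds / workers
    # All workers are billed for the full wall-clock duration.
    cost = workers * wall_clock * price_per_worker_second
    return wall_clock, cost


# Hypothetical inputs: 2 hours of tests, 60 s startup, made-up unit price.
serial = estimate_run(7200, workers=1, startup_seconds=60, price_per_worker_second=0.00002)
parallel = estimate_run(7200, workers=100, startup_seconds=60, price_per_worker_second=0.00002)
```

Under these inputs the serial run takes 7260 s and the 100-worker run takes 132 s. The parallel run costs more only because the startup overhead is paid 100 times, so long startup times or tiny shards erode the benefit.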
