Local AWS emulation with Kumo: a practical CI and dev workflow guide
Hands-on guide to using Kumo as a lightweight AWS emulator for local dev and CI — setup, persistence tradeoffs, S3/SQS/DynamoDB examples and flaky test fixes.
When your tests depend on AWS services, running them against the real cloud is slow, costly and flaky. Kumo is a lightweight AWS emulator that works well as a LocalStack alternative — a single binary (or a container) that can run 73 AWS services locally with optional data persistence. This guide walks through replacing cloud-dependent unit and integration tests with Kumo, explains persistence tradeoffs, shows hands-on examples for S3, SQS and DynamoDB, and offers CI patterns to avoid flaky tests.
Why Kumo?
Key advantages that make Kumo attractive for CI and local development:
- Small, fast startup — good for ephemeral CI jobs.
- No authentication required — simplifies SDK configuration in tests.
- Single binary and Docker support — easy to distribute and run.
- AWS SDK v2 compatible — integrates cleanly with modern AWS SDKs.
- Optional data persistence via a KUMO_DATA_DIR — survive restarts when you want to.
It’s not a 1:1 replacement for every AWS behavior, but it's stable enough for unit and integration test suites that don't require 100% edge-case parity.
Getting started: run Kumo locally
Two common ways to run Kumo in development and CI: as a single binary or inside Docker. Below are minimal examples you can adapt.
Docker (recommended for CI)
docker run -d --name kumo -p 4566:4566 -v /tmp/kumo-data:/data -e KUMO_DATA_DIR=/data ghcr.io/sivchari/kumo:latest
Notes:
- Port 4566 is commonly used by emulators; adjust if you configure Kumo differently.
- Mount a host directory to persist state between restarts (optional).
Run the binary
# download + run
./kumo --data-dir /tmp/kumo-data
Once Kumo is running, you can target it with AWS tooling by specifying an endpoint override. For example, with the AWS CLI:
aws --endpoint-url http://localhost:4566 s3 ls
Configuring your SDKs and tests
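Because Kumo disables authentication by default, tests need no real AWS credentials, which reduces boilerplate. The AWS CLI and most SDKs still expect some credentials and a region to be configured before they will sign a request, so export placeholder values if you see credential errors. Typical patterns: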
- AWS CLI: use --endpoint-url on each command.
- Go AWS SDK v2: provide a custom endpoint resolver that points to Kumo.
- Boto3: use boto3.client('s3', endpoint_url='http://localhost:4566').
Example Go SDK v2 snippet to override endpoints:
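cfg, err := config.LoadDefaultConfig(context.TODO())
if err != nil {
	log.Fatalf("loading AWS config: %v", err)
}
// Route every service and region to the local Kumo endpoint.
// Newer SDK releases deprecate this global resolver in favor of a per-service BaseEndpoint option.
resolver := aws.EndpointResolverFunc(func(service, region string) (aws.Endpoint, error) {
	return aws.Endpoint{URL: "http://localhost:4566"}, nil
})
cfg.EndpointResolver = resolver
s3Client := s3.NewFromConfig(cfg)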
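For Python test suites, the same endpoint override can be centralized in one helper. The following is a minimal sketch, not Kumo's documented setup: the function name kumo_client_kwargs, the AWS_ENDPOINT variable and the placeholder credentials are all assumptions made for illustration. It only builds the keyword arguments, so it runs without Boto3 installed.

```python
import os


def kumo_client_kwargs(default_endpoint="http://localhost:4566"):
    """Build keyword arguments for boto3.client() that target a local emulator.

    AWS_ENDPOINT is a hypothetical variable name chosen for this sketch.
    """
    return {
        "endpoint_url": os.environ.get("AWS_ENDPOINT", default_endpoint),
        "region_name": os.environ.get("AWS_DEFAULT_REGION", "us-east-1"),
        # Placeholder credentials: the SDK needs values to sign requests,
        # even though the emulator is described as not validating them.
        "aws_access_key_id": "test",
        "aws_secret_access_key": "test",
    }
```

A test would then create clients with boto3.client("s3", **kumo_client_kwargs()), which keeps the endpoint in one place if the port or host changes between local runs and CI.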
Persistence: choices and tradeoffs
Kumo supports optional persistence through KUMO_DATA_DIR. Decide between ephemeral and persistent modes based on test goals:
- Ephemeral (no persistence)
- Pros: isolated, deterministic clean-state every run; great for CI where each job must start fresh.
- Cons: slower if you need to bootstrap complex fixture sets on every run.
- Persistent (KUMO_DATA_DIR)
- Pros: faster local dev because state survives restarts; useful for manual debugging and iterative work.
- Cons: risk of state leakage between tests; you must build reset/cleanup steps to keep tests deterministic.
Best practice: use ephemeral mode in CI. In local dev, enable persistence but include small scripts to snapshot and reset your workspace when necessary.
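A reset script for the persistent mode can be very small. The sketch below assumes that Kumo keeps all of its state as files under the data directory and that the emulator is stopped while the directory is wiped; neither is confirmed here, and reset_data_dir is a hypothetical helper name.

```python
import shutil
from pathlib import Path


def reset_data_dir(data_dir):
    """Delete everything inside data_dir but keep the directory itself.

    Assumes the emulator is stopped while this runs; wiping files under a
    running process may leave it in an inconsistent state.
    Returns the number of top-level entries removed.
    """
    root = Path(data_dir)
    root.mkdir(parents=True, exist_ok=True)
    removed = 0
    for entry in root.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()
        removed += 1
    return removed
```

Keeping the directory itself in place means an existing Docker bind mount (such as /tmp/kumo-data in the example above) stays valid after the reset.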
Practical examples: S3, SQS and DynamoDB
The following examples show common patterns you can plug into test setup/teardown steps.
S3: create bucket, upload, assert
# create a bucket and upload an object using AWS CLI
aws --endpoint-url http://localhost:4566 s3api create-bucket --bucket test-bucket
aws --endpoint-url http://localhost:4566 s3 cp ./fixture.json s3://test-bucket/fixture.json
# verify
aws --endpoint-url http://localhost:4566 s3api head-object --bucket test-bucket --key fixture.json
Test tips:
- Use unique names for buckets per test run (append a short UUID) to avoid cross-test interference when running tests in parallel.
- Explicitly delete buckets and objects during teardown if using persistent storage.
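The unique-name tip can be sketched as a small helper. The name unique_bucket_name and the 8-character suffix length are illustrative choices; the sanitizing step follows S3's general bucket naming rules (lowercase letters, digits and hyphens, 3 to 63 characters).

```python
import re
import uuid


def unique_bucket_name(prefix="test"):
    """Return an S3-style bucket name with a short random suffix.

    The prefix is lowercased and reduced to letters, digits and hyphens so
    the result stays within S3's naming rules (3-63 chars, lowercase).
    """
    cleaned = re.sub(r"[^a-z0-9-]+", "-", prefix.lower()).strip("-") or "test"
    suffix = uuid.uuid4().hex[:8]
    # Truncate the prefix to leave room for the hyphen and the suffix.
    return f"{cleaned[:54].rstrip('-')}-{suffix}"
```

Passing a branch name or CI job id as the prefix makes leftover buckets easy to trace back to the run that created them.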
SQS: queue lifecycle and message visibility
# create queue
QUEUE_URL=$(aws --endpoint-url http://localhost:4566 sqs create-queue --queue-name test-queue --query 'QueueUrl' --output text)
# send message
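aws --endpoint-url http://localhost:4566 sqs send-message --queue-url "$QUEUE_URL" --message-body 'hello'
# receive one message and capture its receipt handle
RECEIPT=$(aws --endpoint-url http://localhost:4566 sqs receive-message --queue-url "$QUEUE_URL" --max-number-of-messages 1 --query 'Messages[0].ReceiptHandle' --output text)
# delete by receipt handle
aws --endpoint-url http://localhost:4566 sqs delete-message --queue-url "$QUEUE_URL" --receipt-handle "$RECEIPT"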
Test tips:
- Adjust visibility timeouts in tests to make retries predictable.
- Use short polling where you need fast assertions; add small retries/backoff in tests to avoid flaky timing.
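The "small retries" tip for message assertions can be expressed as a bounded polling loop. This is a generic sketch: poll_until is a hypothetical helper, and fetch stands in for whatever wrapper your tests use around an SQS receive call.

```python
import time


def poll_until(fetch, timeout=5.0, interval=0.1):
    """Call fetch() until it returns a truthy value or the timeout expires.

    fetch is any zero-argument callable, for example a wrapper around an SQS
    receive call that returns the list of messages (possibly empty).
    """
    deadline = time.monotonic() + timeout
    while True:
        result = fetch()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no result within {timeout} seconds")
        time.sleep(interval)
```

Compared with a fixed sleep, the loop returns as soon as the message shows up and fails with a clear error when it never does.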
DynamoDB: create table and run assertions
# create table
aws --endpoint-url http://localhost:4566 dynamodb create-table --table-name TestTable \
  --attribute-definitions AttributeName=id,AttributeType=S \
  --key-schema AttributeName=id,KeyType=HASH \
  --provisioned-throughput ReadCapacityUnits=1,WriteCapacityUnits=1
# put item
aws --endpoint-url http://localhost:4566 dynamodb put-item --table-name TestTable --item '{"id": {"S":"1"}, "value": {"S":"hello"}}'
# get item
aws --endpoint-url http://localhost:4566 dynamodb get-item --table-name TestTable --key '{"id": {"S":"1"}}'
Test tips:
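- Prefer on-demand billing mode (--billing-mode PAY_PER_REQUEST instead of --provisioned-throughput) where your tooling and Kumo accept it, since it shortens table setup; Kumo is typically fast and does not enforce throughput limits like production AWS.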
- Clean up tables between tests if running with persistent data.
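Cleanup is easiest to get right when it is tied to creation. One possible shape, shown here as an SDK-agnostic sketch with a hypothetical temporary_resource helper, is a context manager that always runs the delete step, even when the test body raises.

```python
from contextlib import contextmanager


@contextmanager
def temporary_resource(create, delete):
    """Create a resource, yield it, and always delete it afterwards.

    create() returns a handle (for example a table name); delete(handle)
    removes it. Both are supplied by the caller, so this helper makes no
    assumptions about which SDK is in use.
    """
    handle = create()
    try:
        yield handle
    finally:
        delete(handle)
```

In a test, create would wrap the table-creation call and delete the table-deletion call, so a failed assertion cannot leave a table behind in persistent storage.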
CI workflow patterns to avoid flaky tests
Flaky tests often stem from race conditions, uninitialized state, or network/timeouts. The patterns below fit most CI systems (GitHub Actions / GitLab / CircleCI etc.).
- Start Kumo as a service or job step: Run Kumo in the same job as your tests to ensure network locality and speed. Use a health check (HTTP /health or a simple API call) to wait until Kumo is ready.
- Bootstrap test fixtures once per job: After Kumo is ready, create the minimal set of buckets/queues/tables your test suite needs. Keep bootstrapping scripts idempotent.
- Use deterministic names: Prefix resources with the CI job id, branch name or a short random suffix to avoid collisions when running in parallel.
- Retry on transient errors with limits: Wrap calls that may race with small retries and exponential backoff rather than long fixed sleeps. This reduces flakiness without masking real failures.
- Prefer ephemeral Kumo for CI: Start Kumo with an empty data directory inside the job so every CI run starts from a clean slate.
- Expose health endpoints and assert readiness: Use curl or a small script to poll a Kumo health route (or a simple API call such as list buckets) before starting tests.
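The retry pattern from the list above can be sketched in a few lines. The helper name retry_with_backoff, the default delays and the choice of ConnectionError and TimeoutError as "transient" are assumptions for illustration; substitute the exception types your SDK actually raises.

```python
import random
import time


def retry_with_backoff(call, attempts=5, base_delay=0.1, max_delay=2.0,
                       retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Run call() and retry on the listed exceptions with exponential backoff.

    Delays grow as base_delay * 2**n, capped at max_delay, with jitter.
    Other exception types propagate immediately so real failures stay visible.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return call()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * (2 ** attempt))
            sleep(delay * random.uniform(0.5, 1.0))
```

The attempt limit and the narrow retry_on tuple are what keep this from masking genuine bugs: anything outside the tuple fails on the first try.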
Example GitHub Actions fragment
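- name: Start Kumo
  run: |
    docker run -d --name kumo -p 4566:4566 ghcr.io/sivchari/kumo:latest
- name: Wait for Kumo
  env:
    # Placeholder values: the AWS CLI needs credentials and a region before it will send a request.
    AWS_ACCESS_KEY_ID: test
    AWS_SECRET_ACCESS_KEY: test
    AWS_DEFAULT_REGION: us-east-1
  run: |
    for i in {1..30}; do
      aws --endpoint-url http://localhost:4566 s3 ls && exit 0
      sleep 1
    done
    # Fail the step instead of silently continuing when Kumo never came up.
    echo "Kumo did not become ready within 30 attempts" >&2
    exit 1
- name: Bootstrap fixtures
  run: ./scripts/bootstrap-test-fixtures.sh
- name: Run tests
  run: go test ./...
  env:
    AWS_ENDPOINT: http://localhost:4566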
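If the AWS CLI is not available on the runner, the readiness wait can also be done from a test helper. The sketch below only checks that the TCP port accepts connections, which is a weaker signal than a successful API call; wait_for_port is a hypothetical name, and port 4566 is taken from the examples above.

```python
import socket
import time


def wait_for_port(host="localhost", port=4566, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, else False.

    This only proves the port is accepting connections; it does not verify
    that any particular service API is ready.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A session-level test fixture can call this once and abort the run early with a clear message when it returns False.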
Limitations and tradeoffs
Kumo aims to be a fast emulator, but keep these caveats in mind:
- Behavioral differences: not all edge cases and eventual-consistency behaviors in AWS are emulated identically.
- Service coverage: while extensive, some service-specific features may be missing or simplified.
- Not a security mirror: since authentication is disabled by default, tests that verify IAM behavior need different approaches.
For user-facing, production-critical checks you should still run a smaller number of acceptance tests against real AWS. Kumo is best for fast feedback loops and CI-friendly integration tests.
Further reading and links
If you want to explore developer tooling and automation workflows in more depth, check out our article on how AI is changing developer tooling: Revolutionizing Web Scraping: How AI is Changing the Game for Developers. For local emulation, Kumo's project page (GitHub) is the primary source for releases, flags and service updates.
Summary: practical checklist
- Prefer ephemeral Kumo runs in CI; use persistent data locally for fast iteration.
- Bootstrap required resources at job start; use deterministic names and cleanup scripts.
- Override endpoints in SDKs and tools; avoid relying on authentication in tests.
- Add health checks, small retries and timeouts to reduce flakiness.
- Keep a small acceptance suite that runs against real AWS for final verification.
With these patterns, Kumo can replace many cloud-dependent tests, making your CI faster, cheaper and more reliable. Try it in a small repository first, iterate on bootstrap scripts, and you'll quickly get predictable local and CI test runs.
Alex Carter
Senior SEO Editor
Senior editor and content strategist. Writing about technology, design, and the future of digital media. Follow along for deep dives into the industry's moving parts.