Microservices Migration
The subject is a composite B2B company with a ten-year-old Ruby monolith: in production and generating revenue, but slowing the team down with 40-minute deploys, coarse scaling, and weeks of onboarding. The constraints were 30 engineers, a one-year horizon for meaningful progress, and no tolerance for freezing feature work.
The team used the Strangler Fig pattern, rejecting both a big-bang rewrite (too risky, undocumented business logic) and a pure lift-and-shift (would not fix deployment coupling).
The Strangler Fig Approach
The principle: every new feature lands in a new service, every refactor is an extraction opportunity, and the monolith only shrinks. A routing layer — CloudFront in front of an ALB whose listener rules route by path to target groups (one per service plus one for the monolith) — lets requests move to new services without changing clients.
A new service launches by adding an ALB rule that initially sends a small traffic percentage to the new target group; the team watches metrics and shifts weight or rolls back. The hard-won discipline: ALB target-group weights are the source of truth, so every request must go through the ALB. Business feature flags live in application code, not at the ALB.
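The weighted rollout described above can be sketched as the forward action an ALB listener rule would carry. This is a minimal illustration, not the team's actual tooling: the function name, the ARNs, and the 5 percent canary step are hypothetical. The returned dictionary follows the `Actions` shape that boto3's `elbv2` client accepts for `create_rule` and `modify_rule`; no AWS call is made here.

```python
from typing import Any


def weighted_forward_action(
    monolith_tg_arn: str, service_tg_arn: str, service_percent: int
) -> dict[str, Any]:
    """Build a forward action that splits traffic between two target groups.

    service_percent is the share (0-100) sent to the new service; the
    remainder stays on the monolith. Setting it back to 0 is the rollback.
    """
    if not 0 <= service_percent <= 100:
        raise ValueError("service_percent must be between 0 and 100")
    return {
        "Type": "forward",
        "ForwardConfig": {
            "TargetGroups": [
                {"TargetGroupArn": monolith_tg_arn, "Weight": 100 - service_percent},
                {"TargetGroupArn": service_tg_arn, "Weight": service_percent},
            ]
        },
    }


# Hypothetical ARNs, for illustration only.
MONOLITH_TG = "arn:aws:elasticloadbalancing:region:account:targetgroup/monolith/abc"
BILLING_TG = "arn:aws:elasticloadbalancing:region:account:targetgroup/billing/def"

canary = weighted_forward_action(MONOLITH_TG, BILLING_TG, 5)    # 95/5 split
rollback = weighted_forward_action(MONOLITH_TG, BILLING_TG, 0)  # all traffic to monolith
```

Applying a step would then be a single call such as `elbv2.modify_rule(RuleArn=rule_arn, Actions=[canary])`, which keeps the ALB rule as the one place where the split is recorded.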
Service Boundaries and Data Migration
Services were extracted along data-ownership boundaries, not codebase organization — the Billing service owns charges, subscriptions, and invoices. Two 'logic services' that owned no data were extracted and rolled back. After 18 months: Identity, Billing, Catalog, and a greenfield Notification service; the monolith still holds the core product, to be attacked last.
Data migration used one of three techniques, chosen per service: greenfield (an empty database for new entities), DMS-based cutover (one-way change data capture, a read-only canary on GETs, then a brief write freeze and flip; used for Billing), and application-level dual-write (used for Identity). After cutover the new service is the single owner and the monolith stops writing. A nightly schema-drift comparison job caught three subtle bugs.
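The source does not describe how the nightly schema-drift job works. As one possible shape, the sketch below compares two schema snapshots and lists the differences. The snapshot format (table to column to type), the function name, and the sample tables are assumptions made for illustration.

```python
# A schema snapshot: table name -> column name -> column type.
Schema = dict[str, dict[str, str]]


def schema_drift(source: Schema, target: Schema) -> list[str]:
    """Return human-readable differences between two schema snapshots."""
    findings: list[str] = []
    for table in sorted(source.keys() | target.keys()):
        if table not in target:
            findings.append(f"{table}: missing in target")
            continue
        if table not in source:
            findings.append(f"{table}: missing in source")
            continue
        src_cols, tgt_cols = source[table], target[table]
        for column in sorted(src_cols.keys() | tgt_cols.keys()):
            src_type = src_cols.get(column)  # None if the column is absent
            tgt_type = tgt_cols.get(column)
            if src_type != tgt_type:
                findings.append(
                    f"{table}.{column}: source={src_type} target={tgt_type}"
                )
    return findings


# Hypothetical snapshots, e.g. as read from each database's information_schema.
monolith = {"invoices": {"id": "bigint", "total_cents": "integer", "due_on": "date"}}
billing = {"invoices": {"id": "bigint", "total_cents": "bigint"}}

report = schema_drift(monolith, billing)
# report lists the missing due_on column and the total_cents type mismatch
```

A job like this would fail loudly (alert or non-zero exit) whenever `report` is non-empty, so drift is caught the night it appears rather than at cutover.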
Communication, Observability, and CI/CD
Services communicate via EventBridge for publish/subscribe domain events (the default) and direct service-to-service calls only for unavoidable synchronous needs (token validation). The team rejected a service mesh, relying on mTLS at the ALB plus application-level auth. Observability — X-Ray distributed tracing, CloudWatch Application Signals, consistent structured logging — is what made the migration manageable.
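Publishing a domain event can be sketched as building one entry for the EventBridge `PutEvents` API. The field names (`Source`, `DetailType`, `Detail`, `EventBusName`) are those of the API, and `Detail` must be a JSON string. The helper name, bus name, event name, and payload below are hypothetical.

```python
import json
from typing import Any


def domain_event_entry(
    source: str, detail_type: str, detail: dict[str, Any], bus_name: str
) -> dict[str, str]:
    """Build one PutEvents entry for a domain event."""
    return {
        "Source": source,              # the publishing service
        "DetailType": detail_type,     # the event name subscribers match on
        "Detail": json.dumps(detail),  # payload, serialized as a JSON string
        "EventBusName": bus_name,
    }


# Hypothetical event emitted by the Billing service.
entry = domain_event_entry(
    source="billing",
    detail_type="InvoicePaid",
    detail={"invoice_id": "inv_123", "amount_cents": 4200},
    bus_name="domain-events",
)
```

Sending it would be `boto3.client("events").put_events(Entries=[entry])`. Subscribers such as the Notification service then match on `Source` and `DetailType` in an EventBridge rule, so the publisher never needs to know who is listening.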
CI/CD migrated in stages: year one ran Jenkins for the monolith and CodePipeline for new services; year two retired Jenkins. The lesson: do not replace the deployment toolchain and split the monolith at the same time.
Alternatives Considered
- Big-bang rewrite (rejected): freezes feature work and bets on undocumented business logic surviving a from-scratch rebuild.
- Lift-and-shift (rejected): containerizing the monolith does not fix the deployment-coupling problem.
- Strangler Fig (chosen): route around the monolith with an ALB and replace it feature by feature, incrementally and reversibly.
Common Pitfalls
- Extracting services along codebase-organization or request-lifecycle boundaries instead of data-ownership boundaries.
- Creating 'logic' microservices that own no data: they add latency and complexity and tend to be rolled back.
- Letting requests bypass the ALB, breaking the routing-layer-as-source-of-truth that the Strangler Fig depends on.
- Forgetting to remove the monolith's write paths after cutover, so the two stores diverge silently.
- Replacing the CI/CD toolchain and splitting the monolith at the same time, doubling the risk.
- Starting with the most critical service (Identity) instead of the lowest-risk one (Notification) to learn the patterns.
Key Takeaways
- Use the Strangler Fig pattern: route around the monolith with an ALB and replace it piece by piece.
- Extract services along data-ownership boundaries; a service that owns no data is probably a library.
- Treat ALB target-group weights as the source of truth and keep business feature flags in application code.
- Choose the data-migration technique per context (greenfield, DMS cutover, or dual-write) and make the new service the single owner after cutover.
- Communicate via EventBridge domain events by default; minimize synchronous cross-service calls.
- Invest in observability before extracting, and either change the toolchain or split the monolith, never both at once.
Knowledge Check
What is the source of truth for which target serves which path in the Strangler Fig migration?
- ALB listener rules and target-group weights — so every request must go through the ALB
- Route 53 record TTLs governing how long name lookups stay cached, with shorter TTLs steering each request path to whichever target currently owns it
- Business feature flags evaluated deep inside the application code
- The legacy monolith's own internal request-dispatch router
Along what boundaries did the team successfully extract services?
- Data-ownership boundaries — a service owns a bounded slice of the domain including its data
- Codebase-organization boundaries, where each source file or module in the repository is carved out into its own independently deployed service
- Request-lifecycle boundaries, such as a validator and a formatter service
- Team-seating boundaries, matching whichever engineers sit near each other
After a data cutover, what must the monolith stop doing?
- Writing to the extracted service's tables — otherwise the two stores diverge silently
- Reading any data whatsoever from the shared store during the transition window, even for the routes it still legitimately owns and serves
- Serving any HTTP traffic for its remaining routes via the ALB
- Running on its current EC2 compute platform underneath
What is the lesson about CI/CD during the migration?
- Do not replace the deployment toolchain and split the monolith at the same time — one large change at a time
- Rip out Jenkins and cut every build over to CodePipeline on day one, before extracting any service at all, so the new toolchain is fully in place first
- Stand up a separate dedicated pipeline for every single commit
- Avoid CodePipeline entirely and keep everything on Jenkins