Topic 52

Project and Repository Layout

Layout · Structure

Once Terraform grows past a single directory, layout becomes the dominant factor in how maintainable it is. There is no official structure, but there are strong patterns and clear anti-patterns, and the choice between them is decided by one tension: keeping things DRY versus keeping environments isolated. The layouts that hold up are the ones that pick a side of that tension deliberately rather than letting the tree grow by accident.

The single decision that matters most is where your state boundaries fall, because everything else — plan speed, blast radius, who can break what — follows from it. Directory layout is mostly a way of expressing those state boundaries in a form a human can navigate.

From One Directory to Many

A single directory holds one root module and one state. It works until a plan starts taking minutes, until an apply that touches a load balancer also has the power to destroy a database, or until two teams keep colliding on the same state lock. The first split should follow blast radius and change frequency: the foundational, rarely changed layer (VPC, IAM, DNS) goes in its own state, and the fast-moving application layer goes in another. A plan over 30 resources is faster to run and far easier to review than one over 800.
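As a minimal sketch of that first split, assume an S3 backend and two root directories named foundation/ and app/. The bucket name, region, and paths are hypothetical; the point is only that each directory gets its own state key, so each has its own lock and its own blast radius.

```hcl
# foundation/backend.tf (hypothetical path): VPC, IAM, DNS
terraform {
  backend "s3" {
    bucket = "example-tf-state"              # hypothetical bucket name
    key    = "foundation/terraform.tfstate"  # state for the slow-changing layer
    region = "us-east-1"                     # assumed region
  }
}
```

```hcl
# app/backend.tf (hypothetical path): fast-moving application resources
terraform {
  backend "s3" {
    bucket = "example-tf-state"       # same bucket, different key
    key    = "app/terraform.tfstate"  # separate state, separate lock
    region = "us-east-1"
  }
}
```

Running plan or apply in app/ then reads and locks only the app state, so nothing in the foundation layer is within its reach.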

The Modules Directory

A mature repo separates two kinds of directory cleanly. A modules/ tree holds reusable, parameterized internal modules — a network module, an ECS-service module — that have no backend and are never applied directly. The environment root modules compose those modules, hold the backend and environment specifics, and are the directories you actually run apply in. Mixing the two into one undifferentiated tree is the layout mistake that makes a repo unnavigable.

Figure: Two halves of a mature repo. Under infrastructure/, modules/ (network, ecs-service, rds) holds libraries with no backend that are never applied directly; environments/ (dev, staging, prod) holds root modules, each with its own backend and state, where apply is run.

A monorepo with a modules tree and per-environment roots

infrastructure/
├── modules/              # reusable, no backend, never applied directly
│   ├── network/
│   ├── ecs-service/
│   └── rds/
└── environments/
    ├── dev/              # root module: backend + module calls + tfvars
    ├── staging/
    └── prod/             # own state, own backend, isolated
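A root module in this tree might look like the sketch below. It assumes an S3 backend and a network module that declares a `cidr_block` input variable; the bucket name, region, and variable name are hypothetical and not taken from any particular module.

```hcl
# environments/prod/main.tf
# The root module owns the backend; the module under modules/ does not.
terraform {
  backend "s3" {
    bucket = "example-tf-state"        # hypothetical bucket name
    key    = "prod/terraform.tfstate"  # prod has its own state
    region = "us-east-1"               # assumed region
  }
}

# Compose the reusable module with prod-specific values.
module "network" {
  source     = "../../modules/network" # relative path into the modules tree
  cidr_block = "10.20.0.0/16"          # hypothetical input variable
}
```

The dev and staging roots would call the same module with different values and a different state key, which is where the DRY side (shared module) and the isolation side (separate state) meet.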

Monorepo vs Polyrepo

Putting all infrastructure in one repository gives you a single place to see and change everything and lets one pull request touch related layers together. The cost is coarse access control, since anyone with write access can change anything, and a large surface for a careless change. Splitting into per-team or per-service repositories isolates ownership and blast radius, but makes cross-cutting changes a multi-repo dance and turns shared-module versioning into a real chore. Many teams land on a monorepo with strong state boundaries: one repo to navigate, many states so no single apply can touch everything.

State File Granularity

Layout maps directly onto state boundaries, and state boundaries are what actually determine blast radius and plan speed. Smaller states mean faster plans, a smaller blast radius per apply, and a lock that only one small team contends for. The opposite failure mode is real too: split into dozens of tiny states and every value one state needs from another becomes a remote-state lookup, until the cross-references are a tangle harder to reason about than the monolith you were escaping. Granularity is a judgment call, not something to maximize.
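For reference, a single cross-state lookup looks roughly like this. The sketch assumes the foundation layer stores its state in S3 and exports an output named `vpc_id`; the bucket, key, region, and output name are hypothetical.

```hcl
# Read the outputs of another state (here, a hypothetical foundation layer).
data "terraform_remote_state" "foundation" {
  backend = "s3"
  config = {
    bucket = "example-tf-state"              # hypothetical bucket name
    key    = "foundation/terraform.tfstate"  # the other layer's state key
    region = "us-east-1"                     # assumed region
  }
}

# Consume one exported value; only root-module outputs are visible this way.
resource "aws_security_group" "app" {
  name   = "app"
  vpc_id = data.terraform_remote_state.foundation.outputs.vpc_id
}
```

One such block is easy to follow. The tangle described above appears when every small state carries several of them pointing at each other.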

Monorepo vs polyrepo for infrastructure

Monorepo — all infrastructure in one repository with cross-references in reach and one place to change everything. Choose it when you want a single navigable tree and can enforce state boundaries; pair it with separate states per layer so blast radius stays small despite the single repo.

Polyrepo — per-team or per-service repositories that isolate ownership and blast radius at the repo level. Choose it when ownership boundaries are sharp and teams must not touch each other's code, accepting that cross-cutting changes and shared-module version bumps become multi-repo work.
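In a polyrepo, shared modules are typically consumed from another repository at a pinned version. A minimal sketch, assuming a shared modules repository at a hypothetical URL with a `network` subdirectory and a `v1.4.0` tag:

```hcl
# Pin the shared module to a tag so each consuming repo upgrades deliberately.
module "network" {
  # Hypothetical repository URL; "//network" selects a subdirectory, "ref" pins the version.
  source     = "git::https://example.com/platform/terraform-modules.git//network?ref=v1.4.0"
  cidr_block = "10.20.0.0/16" # hypothetical input variable
}
```

Every consuming repository carries its own `ref`, which is why a module change in a polyrepo becomes a series of version bumps instead of one pull request.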

Common Mistakes

Keeping a whole estate in one giant state and directory, so a routine plan takes minutes and any apply has the power to destroy everything.
Splitting into so many tiny states that cross-references become a tangle of terraform_remote_state lookups harder to follow than the monolith.
Mixing reusable modules and environment root modules in one undifferentiated tree, so nobody can tell what gets applied from what gets called.
Letting layout grow organically with no convention, so each corner of the repo is structured differently and onboarding means re-learning the map each time.
Drawing state boundaries by service type instead of by blast radius, so a fast-moving app layer shares a state and a lock with the foundational network layer.

Best Practices

Map state boundaries to blast radius and change frequency, separating the foundational layers from the fast-moving application layers.
Keep a clear split between modules/ for reusable code and environment root modules that are composed and applied.
Choose a monorepo with strong state boundaries unless ownership clearly demands separate repositories.
Adopt one consistent convention for directory structure and apply it across the whole repo so every corner reads the same way.
Keep states small enough for fast plans but large enough that related resources stay together without remote-state lookups.

Comparable tools

Pulumi projects and stacks face the same layout and blast-radius questions.
CloudFormation nested stacks and StackSets partition an estate similarly.
Terragrunt exists largely to manage this layout problem.

Knowledge Check

What primarily drives splitting infrastructure into multiple states and directories?

Blast radius and change frequency — keeping a small, slow-changing apply from sharing fate with a large, fast-moving one
A hard Terraform limit on the number of resources that a single state file is allowed to track, which forces a split once a layer grows past that fixed ceiling
The requirement that every AWS region be given its own separate, dedicated state file, so any deployment touching more than one region must be split apart by directory
A rule that module calls and plain resource blocks cannot coexist in the same directory, so each layer has to be peeled out into a directory of its own

What is the central trade-off of a monorepo versus polyrepo for infrastructure?

A monorepo gives one place to see and change everything but coarse access control; polyrepos isolate ownership but make cross-cutting changes harder
A monorepo always plans faster because Terraform caches plan results across every directory in the repository and reuses them on the next run, while a polyrepo throws that cache away each time
Polyrepos remove the need for state files entirely because each repo just tracks its own resources directly from the source, so only a monorepo ever has to keep a state file around
A monorepo forces every layer into one single shared state while polyrepos force many tiny separate ones, so the repo choice fully dictates how the state ends up partitioned

Why does state granularity map to blast radius?

An apply can only affect what is in its state, so smaller states mean a smaller set of resources any one apply can damage
Smaller state files are encrypted with a stronger algorithm than large state files are, so a tighter split hardens each apply against tampering and shrinks the blast radius
Terraform refuses to destroy any resource unless it shares a single state file with the trigger, so packing more into one state is what actually widens what an apply can reach
State size has no bearing on blast radius at all; only IAM policies actually control it, so splitting state files does nothing once the access rules are already in place

What distinguishes a modules/ directory from an environment root module?

Modules are reusable and never applied directly; root modules hold the backend and are the directories you run apply in
Modules hold the backend configuration and the root modules borrow it from them at apply time, so the reusable tree is the one place that decides where state actually lands
Root modules are the reusable libraries that get called from elsewhere, while the modules tree holds the deployable units you actually run apply against directly
There is no real difference; the two names are fully interchangeable conventions, so a single flat tree of mixed directories works exactly as well as any split
