Topic 14

What State Is and Why It Exists

State Fundamentals

Terraform's state is the file that maps your configuration to the real-world objects it created — the bridge between aws_instance.web in your code and i-0abc123 in AWS. When you run apply, Terraform creates resources and writes down which real object answers to which config address. Every later plan reads that file first.

Without state, Terraform would have no memory. It could not know which real resource matches which block, whether something already exists, or what to destroy. Almost every Terraform behavior that confuses newcomers — "why did it want to recreate everything?", "why is the resource still in AWS after I deleted it from the code?" — traces back to a misunderstanding of state.

What state sits between

Configuration (.tf)

Resources named by address — aws_instance.web — the desired shape you wrote.

State (the map)

The lookup table pinning each address to a real ID, so Terraform can compute a precise diff.

Real cloud objects

The actual AWS resources — i-0abc123 — known only by their real IDs.

Why State Is Necessary

Your configuration describes resources by address — aws_instance.web, aws_s3_bucket.logs. AWS knows those objects only by their real IDs — i-0abc123, a bucket name. State is the lookup table between the two. With it, Terraform compares the recorded reality against your config and computes a precise diff: this attribute changed, this resource is new, this one should go. Without it, the only safe move would be to recreate everything from scratch on every run, because Terraform would have no way to know what it already built.
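To make the diff idea concrete, here is a minimal sketch in Python. It is a toy model, not Terraform's actual algorithm: the `plan` function and the simplified `config` and `state` dictionaries are hypothetical stand-ins, and the address `aws_instance.old` and ID `i-0def456` are invented for illustration.

```python
# Toy model of the diff Terraform computes; not Terraform's real algorithm.
# `config` and `state` are hypothetical, simplified stand-ins.

def plan(config: dict, state: dict) -> dict:
    """Classify each address as create, update, delete, or no-op."""
    actions = {}
    for address, desired in config.items():
        if address not in state:
            actions[address] = "create"
        elif state[address]["attributes"] != desired:
            actions[address] = "update"
        else:
            actions[address] = "no-op"
    # Anything recorded in state but absent from config is scheduled for removal.
    for address in state:
        if address not in config:
            actions[address] = "delete"
    return actions


config = {
    "aws_instance.web": {"instance_type": "t3.small"},
    "aws_s3_bucket.logs": {"bucket": "example-logs"},
}
state = {
    "aws_instance.web": {"id": "i-0abc123", "attributes": {"instance_type": "t3.micro"}},
    "aws_instance.old": {"id": "i-0def456", "attributes": {"instance_type": "t3.micro"}},
}

result = plan(config, state)
for address, action in sorted(result.items()):
    print(f"{address}: {action}")

# With an empty state every address looks new, so everything would be created again.
empty_state_result = plan(config, {})
print(empty_state_result)
```

The second call shows the failure mode described above: with nothing recorded, every address is classified as a create, even though the real objects may already exist.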

A trimmed state file makes the mapping concrete. The instance block in your config becomes a JSON record pinning the address to the real instance ID and its known attributes.

terraform.tfstate — the config-to-reality mapping (trimmed)

{
  "resources": [
    {
      "type": "aws_instance",
      "name": "web",
      "instances": [
        {
          "attributes": {
            "id": "i-0abc123",
            "instance_type": "t3.micro",
            "private_ip": "10.0.1.42"
          }
        }
      ]
    }
  ]
}
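As a rough illustration of reading that mapping back out, the Python sketch below loads the trimmed excerpt and builds an address-to-ID lookup. The helper name `address_to_ids` is hypothetical, and the sketch deliberately ignores modules, data sources, and count or for_each instance keys, all of which a real state file can contain.

```python
import json

# Trimmed state matching the excerpt above; a real file has many more fields.
STATE_JSON = """
{
  "resources": [
    {
      "type": "aws_instance",
      "name": "web",
      "instances": [
        {"attributes": {"id": "i-0abc123", "instance_type": "t3.micro", "private_ip": "10.0.1.42"}}
      ]
    }
  ]
}
"""


def address_to_ids(state: dict) -> dict:
    """Map "type.name" addresses to the real IDs recorded for their instances."""
    mapping = {}
    for resource in state.get("resources", []):
        address = f'{resource["type"]}.{resource["name"]}'
        mapping[address] = [inst["attributes"]["id"] for inst in resource.get("instances", [])]
    return mapping


lookup = address_to_ids(json.loads(STATE_JSON))
print(lookup)
```

For day-to-day inspection, prefer the terraform state subcommands over parsing the file yourself; the sketch only shows what the lookup table contains.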

What State Stores

State records far more than IDs. For every managed resource it stores the full set of resolved attribute values (some of them sensitive, like a generated database password or a private key) along with the dependency edges Terraform inferred and metadata such as the Terraform version that last wrote the file, the state format version, and the provider and schema version behind each resource. That version metadata is why a state file written by a newer Terraform can refuse to load in an older one: the schema moved on.
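The sketch below, in Python, walks a sample shaped like a state file and pulls out those three kinds of content: file-level metadata, dependency edges, and attributes whose names look like secrets. Everything in `sample_state` is invented for illustration, many real fields are omitted, and the `SECRET_HINTS` name match is a crude heuristic assumed for this example, not how Terraform marks sensitive values.

```python
# Hypothetical sample shaped like a version 4 state file; values are invented
# and many real fields are omitted.
sample_state = {
    "version": 4,
    "terraform_version": "1.5.7",
    "serial": 12,
    "lineage": "3f8a1c2e-0000-0000-0000-000000000000",
    "resources": [
        {
            "mode": "managed",
            "type": "aws_db_instance",
            "name": "main",
            "provider": 'provider["registry.terraform.io/hashicorp/aws"]',
            "instances": [
                {
                    "schema_version": 1,
                    "attributes": {"id": "db-EXAMPLE", "username": "app", "password": "hunter2"},
                    "dependencies": ["aws_db_subnet_group.main"],
                }
            ],
        }
    ],
}

# Name fragments that suggest a secret; a crude heuristic chosen for this sketch.
SECRET_HINTS = ("password", "secret", "private_key", "token")


def summarize(state: dict):
    """Return (file metadata, dependency edges, secret-looking attributes)."""
    meta = {key: state.get(key) for key in ("version", "terraform_version", "serial", "lineage")}
    edges, secrets = [], []
    for resource in state.get("resources", []):
        address = f'{resource["type"]}.{resource["name"]}'
        for instance in resource.get("instances", []):
            edges += [(address, dep) for dep in instance.get("dependencies", [])]
            secrets += [
                (address, name)
                for name in instance.get("attributes", {})
                if any(hint in name for hint in SECRET_HINTS)
            ]
    return meta, edges, secrets


meta, edges, secrets = summarize(sample_state)
print(meta)
print(edges)
print(secrets)
```

Note that the password sits in the attributes as an ordinary string, which is the reason the file has to be handled as a secret.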

State as a Performance Cache

State doubles as a cache. By default, plan refreshes every managed resource against the API before computing a diff, and state tells it exactly which objects to query. On large stacks that refresh dominates the run time, so teams often plan with -refresh=false or narrow the run with -target, and Terraform then treats the recorded attributes as the truth. On a small stack the difference is seconds; on a stack with thousands of resources it is the difference between a plan that returns in under a minute and one that hammers the AWS API for ten. The trade-off is that a stale cache can hide drift until the next refresh reconciles it.

State as the Source of What Exists

This is the point that bites people. Terraform decides what exists from state, not from your configuration files: plan diffs the config against state, and destroy removes whatever state records. Delete a resource block from your code and Terraform plans to destroy the object, because it is still in state. Run terraform state rm instead and the object stays running in AWS while Terraform simply forgets it exists. Removing from state and destroying are two different operations with opposite effects, and choosing the wrong one either orphans a resource or deletes one you meant to keep.
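A toy model in Python can make the contrast easier to hold onto. Here the `cloud` set stands in for AWS and the `state` dictionary for terraform.tfstate; the `state_rm` and `destroy` helpers are hypothetical and only imitate the effect of the real commands, and `aws_instance.worker` with ID `i-0def456` is invented for the example.

```python
# Toy model: `cloud` stands in for AWS, `state` for terraform.tfstate.
# Both helpers are hypothetical; they only imitate the effect of the real commands.

def state_rm(state: dict, address: str) -> None:
    """Forget the address; the real object is untouched."""
    state.pop(address)


def destroy(state: dict, cloud: set, address: str) -> None:
    """Delete the real object, then drop its record from state."""
    cloud.discard(state[address])
    state.pop(address)


cloud = {"i-0abc123", "i-0def456"}
state = {"aws_instance.web": "i-0abc123", "aws_instance.worker": "i-0def456"}

state_rm(state, "aws_instance.web")           # forgotten, still running
destroy(state, cloud, "aws_instance.worker")  # actually deleted

print("state:", state)
print("cloud:", cloud)
```

After both calls the state is empty, yet one instance is still running in the stand-in cloud: that leftover is the orphan that state rm produces.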

The Dangers

State is a liability as much as an asset. It holds secrets in plaintext, so committing terraform.tfstate to a shared or public git repo leaks every password and key it recorded. It is the single source of truth for what Terraform manages, so a corrupted or lost file means Terraform no longer knows what it owns — and rebuilding that knowledge by hand is painful. And local state on a laptop is a single file with no sharing and no durability: the local backend does take an OS-level lock, so two applies on the same machine can't corrupt it — but that lock can't coordinate teammates on other machines or a copy passed around in chat, and a wiped disk takes your only record with it.

Common Mistakes

Assuming the config alone defines reality, then being surprised that state rm leaves the resource running in AWS while Terraform forgets it.
Committing terraform.tfstate to a shared or public git repo, leaking every secret and ID it stores in plaintext.
Hand-editing the state file in a text editor and corrupting its JSON or its serial and lineage bookkeeping so the backend rejects the next write.
Sharing one local state file across machines (committing it to git or passing it around in chat) — the local backend's lock only coordinates processes on the same machine, so separate laptops clobber each other.
Deleting the state file to "start fresh" while resources still exist, orphaning real infrastructure that now has no owner.

Best Practices

Treat state as sensitive: never commit it to git, and store it in a remote backend encrypted at rest.
Use terraform state subcommands to inspect or modify state rather than editing the JSON by hand.
Know the difference cold — state rm makes Terraform forget a resource without deleting it; destroy deletes it — and pick deliberately every time.
Isolate state per environment so a mistake in dev can never reach into prod's state.
Back up state before any risky operation; a remote backend with bucket versioning gives you a recovery point for free.

Comparable tools

CloudFormation keeps equivalent state server-side in AWS, with no file to manage.
Pulumi stores state in its service or a backend.
Ansible is stateless, which is why it cannot compute a destroy.

Knowledge Check

Why does Terraform need a state file at all?

It maps each config address to the real resource ID so Terraform can compute a diff instead of recreating everything
It stores your AWS credentials so that you do not have to log in and authenticate yourself again on each separate run
It caches the downloaded provider binaries so that init does not have to re-download them every time
It is only ever needed for remote backends; local runs work perfectly fine without it

You run terraform state rm aws_instance.web. What happens to the EC2 instance?

It keeps running in AWS, but Terraform no longer tracks or manages it
It is destroyed in AWS, because removing the entry from state deletes the resource
It is recreated from the config block on the very next terraform apply
Nothing happens to it until you also remove the matching config block

Why is committing terraform.tfstate to a shared git repo a security problem?

State stores resolved attribute values, including secrets like passwords and keys, in plaintext
Git cannot diff JSON at all, so the state file silently corrupts itself on every single commit you make
The file embeds the provider binary, which is far too large to commit into git
Committing it deletes the live resources that the file references from AWS

Two engineers share one local terraform.tfstate by passing it around, and each runs apply from their own laptop. Why can this corrupt state?

The local backend's lock only coordinates processes on one machine, so neither laptop sees the other's lock and they clobber each other
The local backend never takes any lock at all, not even for two separate processes running on one machine
Terraform merges both engineers' changes cleanly afterward because the state file is strictly append-only
Nothing at all goes wrong here — copying the single state file back and forth by hand keeps both laptops perfectly in sync the entire time
