What State Is and Why It Exists
Terraform's state is the file that maps your configuration to the real-world objects it created — the bridge between aws_instance.web in your code and i-0abc123 in AWS. When you run apply, Terraform creates resources and writes down which real object answers to which config address. Every later plan reads that file first.
Without state, Terraform would have no memory. It could not know which real resource matches which block, whether something already exists, or what to destroy. Almost every Terraform behavior that confuses newcomers — "why did it want to recreate everything?", "why is the resource still in AWS after I deleted it from the code?" — traces back to a misunderstanding of state.
Why State Is Necessary
Your configuration describes resources by address — aws_instance.web, aws_s3_bucket.logs. AWS knows those objects only by their real IDs — i-0abc123, a bucket name. State is the lookup table between the two. With it, Terraform compares the recorded reality against your config and computes a precise diff: this attribute changed, this resource is new, this one should go. Without it, the only safe move would be to recreate everything from scratch on every run, because Terraform would have no way to know what it already built.
A trimmed state file makes the mapping concrete. The instance block in your config becomes a JSON record pinning the address to the real instance ID and its known attributes.
{
  "resources": [
    {
      "type": "aws_instance",
      "name": "web",
      "instances": [
        {
          "attributes": {
            "id": "i-0abc123",
            "instance_type": "t3.micro",
            "private_ip": "10.0.1.42"
          }
        }
      ]
    }
  ]
}
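To make the lookup-table idea tangible, here is a minimal Python sketch that reads a trimmed state document like the one above and builds the address-to-ID map. The helper name build_lookup is hypothetical, and the sketch assumes one instance per resource and no modules, which real state files do not guarantee.

```python
import json

# A trimmed state document, matching the example above.
STATE_JSON = """
{
  "resources": [
    {
      "type": "aws_instance",
      "name": "web",
      "instances": [
        {"attributes": {"id": "i-0abc123", "instance_type": "t3.micro"}}
      ]
    }
  ]
}
"""


def build_lookup(state: dict) -> dict:
    """Map each config address (type.name) to the real object's ID.

    Simplification: assumes one instance per resource and no modules.
    """
    lookup = {}
    for resource in state.get("resources", []):
        address = f"{resource['type']}.{resource['name']}"
        for instance in resource.get("instances", []):
            lookup[address] = instance["attributes"]["id"]
    return lookup


lookup = build_lookup(json.loads(STATE_JSON))
print(lookup)  # {'aws_instance.web': 'i-0abc123'}
```

The point of the sketch is only the shape of the data: config addresses on one side, provider IDs on the other, with state joining them.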
What State Stores
State records far more than IDs. For every managed resource it stores the full set of resolved attribute values, some of them sensitive, like a generated database password or a private key. It also stores the dependency edges Terraform inferred and bookkeeping metadata such as the state format version, the Terraform version that last wrote the file, and each resource's provider schema version. That last group is why a state file written by a newer Terraform can refuse to load in an older one: the format moved on.
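As a rough illustration of that extra bookkeeping, the Python sketch below pulls the metadata fields and the per-resource dependency lists out of a state document. The field names (terraform_version, serial, lineage, dependencies) follow the version 4 state format as I understand it; every value shown is made up, and the helper name summarize is hypothetical.

```python
# Hypothetical state contents; all values are invented for illustration.
STATE = {
    "version": 4,
    "terraform_version": "1.9.5",
    "serial": 12,
    "lineage": "00000000-0000-0000-0000-000000000000",
    "resources": [
        {
            "type": "aws_instance",
            "name": "web",
            "instances": [
                {
                    "attributes": {"id": "i-0abc123"},
                    "dependencies": ["aws_security_group.web"],
                }
            ],
        }
    ],
}


def summarize(state: dict) -> dict:
    """Collect the bookkeeping fields and each resource's dependency list."""
    dependencies = {}
    for resource in state.get("resources", []):
        address = f"{resource['type']}.{resource['name']}"
        for instance in resource.get("instances", []):
            dependencies[address] = instance.get("dependencies", [])
    return {
        "written_by": state.get("terraform_version"),
        "serial": state.get("serial"),
        "lineage": state.get("lineage"),
        "dependencies": dependencies,
    }


summary = summarize(STATE)
print(summary["written_by"], summary["serial"])
print(summary["dependencies"])
```

Reading these fields is harmless; changing them by hand is what gets a state file rejected.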
State as a Performance Cache
State doubles as a cache. By default, plan refreshes every managed resource against the provider API before computing a diff. Because the last-known attributes are already recorded, you can skip that step with -refresh=false, or narrow it with -target, and plan against the recorded values instead. On a small stack the difference is seconds; on a stack with thousands of resources it is the difference between a plan that returns in under a minute and one that hammers the AWS API for ten. The trade-off is that a stale cache can hide drift until the next refresh reconciles it.
State as the Source of What Exists
This is the point that bites people. Terraform decides what exists from state, not from your configuration files: plan compares your configuration against state, and destroy removes what state lists. Delete a resource block from your code and Terraform plans to destroy the object, because it is still in state. Run terraform state rm instead and the object stays running in AWS while Terraform simply forgets it exists. Removing from state and destroying are two different operations with opposite effects, and choosing the wrong one either orphans a resource or deletes one you meant to keep.
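The difference can be sketched with a deliberately simplified model in Python. This is not Terraform's actual planning algorithm, only set arithmetic over addresses, and the function name plan and the sample addresses are illustrative assumptions.

```python
def plan(config_addresses: set, state_addresses: set) -> dict:
    """Simplified model: classify addresses by comparing config with state."""
    return {
        "create": sorted(config_addresses - state_addresses),
        "destroy": sorted(state_addresses - config_addresses),
        "keep_or_update": sorted(config_addresses & state_addresses),
    }


state = {"aws_instance.web", "aws_s3_bucket.logs"}

# Case 1: delete the aws_instance.web block from the code.
# The address is still in state, so the model plans a destroy.
after_block_deleted = plan({"aws_s3_bucket.logs"}, state)

# Case 2: run `terraform state rm aws_instance.web` and also delete the block.
# Neither side mentions the address, so nothing is planned for it; the real
# instance keeps running, unmanaged.
after_state_rm = plan({"aws_s3_bucket.logs"}, state - {"aws_instance.web"})

# Case 3: run `terraform state rm` but leave the block in the code.
# The model sees a config address with no state entry and plans a create,
# which would launch a second instance next to the forgotten one.
after_state_rm_block_kept = plan(
    {"aws_instance.web", "aws_s3_bucket.logs"}, state - {"aws_instance.web"}
)

print(after_block_deleted)
print(after_state_rm)
print(after_state_rm_block_kept)
```

Case 3 is the one worth remembering: forgetting a resource while keeping its block does not adopt the existing object, it asks for a new one.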
The Dangers
State is a liability as much as an asset. It holds secrets in plaintext, so committing terraform.tfstate to a shared or public git repo leaks every password and key it recorded. It is the single source of truth for what Terraform manages, so a corrupted or lost file means Terraform no longer knows what it owns, and rebuilding that knowledge by hand is painful. Local state on a laptop is a single file with no sharing and no durability. The local backend does take an OS-level lock, so two applies on the same machine can't corrupt it, but that lock can't coordinate teammates on other machines or a copy passed around in chat, and a wiped disk takes your only record with it.
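To show how exposed those secrets are, here is a small Python sketch that walks a state document and reports attributes whose names look secret-bearing. The hint list and the helper name find_secret_like are assumptions for illustration, the sample values are invented, and a name-based heuristic like this will miss secrets stored under innocuous names.

```python
# Substrings that often appear in secret-bearing attribute names (assumption).
SECRET_HINTS = ("password", "secret", "private_key", "token")

# Hypothetical state contents; the values are invented for illustration.
STATE = {
    "resources": [
        {
            "type": "aws_db_instance",
            "name": "main",
            "instances": [
                {
                    "attributes": {
                        "id": "db-example",
                        "username": "admin",
                        "password": "example-not-a-real-password",
                    }
                }
            ],
        }
    ]
}


def find_secret_like(state: dict) -> list:
    """Return (address, attribute) pairs whose names hint at a secret."""
    findings = []
    for resource in state.get("resources", []):
        address = f"{resource['type']}.{resource['name']}"
        for instance in resource.get("instances", []):
            for key, value in instance.get("attributes", {}).items():
                looks_secret = any(hint in key.lower() for hint in SECRET_HINTS)
                if looks_secret and isinstance(value, str):
                    findings.append((address, key))
    return findings


findings = find_secret_like(STATE)
print(findings)  # [('aws_db_instance.main', 'password')]
```

Anything this scan can read, anyone with access to the file or the repository history can read too.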
- Assuming the config alone defines reality, then being surprised that state rm leaves the resource running in AWS while Terraform forgets it.
- Committing terraform.tfstate to a shared or public git repo, leaking every secret and ID it stores in plaintext.
- Hand-editing the state file in a text editor and corrupting its JSON or its serial and lineage bookkeeping so the backend rejects the next write.
- Sharing one local state file across machines (committing it to git or passing it around in chat) — the local backend's lock only coordinates processes on the same machine, so separate laptops clobber each other.
- Deleting the state file to "start fresh" while resources still exist, orphaning real infrastructure that now has no owner.
- Treat state as sensitive: never commit it to git, and store it in a remote backend encrypted at rest.
- Use terraform state subcommands to inspect or modify state rather than editing the JSON by hand.
- Know the difference cold: state rm makes Terraform forget a resource without deleting it; destroy deletes it. Pick deliberately every time.
- Isolate state per environment so a mistake in dev can never reach into prod's state.
- Back up state before any risky operation; a remote backend with bucket versioning gives you a recovery point for free.
Knowledge Check
Why does Terraform need a state file at all?
- It maps each config address to the real resource ID so Terraform can compute a diff instead of recreating everything
- It stores your AWS credentials so that you do not have to log in and authenticate yourself again on each separate run
- It caches the downloaded provider binaries so that init does not have to re-download them every time
- It is only ever needed for remote backends; local runs work perfectly fine without it
You run terraform state rm aws_instance.web. What happens to the EC2 instance?
- It keeps running in AWS, but Terraform no longer tracks or manages it
- It is destroyed in AWS, because removing the entry from state deletes the resource
- It is recreated from the config block on the very next terraform apply
- Nothing happens to it until you also remove the matching config block
Why is committing terraform.tfstate to a shared git repo a security problem?
- State stores resolved attribute values, including secrets like passwords and keys, in plaintext
- Git cannot diff JSON at all, so the state file silently corrupts itself on every single commit you make
- The file embeds the provider binary, which is far too large to commit into git
- Committing it deletes the live resources that the file references from AWS
Two engineers share one local terraform.tfstate by passing it around, and each runs apply from their own laptop. Why can this corrupt state?
- The local backend's lock only coordinates processes on one machine, so neither laptop sees the other's lock and they clobber each other
- The local backend never takes any lock at all, not even for two separate processes running on one machine
- Terraform merges both engineers' changes cleanly afterward because the state file is strictly append-only
- Nothing at all goes wrong here — copying the single state file back and forth by hand keeps both laptops perfectly in sync the entire time