Chapter 13: Advanced Patterns
Topic 78

Terraform with Ansible

Integration

Terraform provisions infrastructure; Ansible configures it. Real systems use both, with a clean handoff between them. Terraform creates the EC2 instances, VPC, and load balancer and outputs what Ansible needs to find them. Ansible then uses that information to install, configure, and manage the software on top. Each tool stays in its lane and neither reinvents the other.

The whole topic turns on one boundary, existence versus configuration, and one mechanism for crossing it: tags and dynamic inventory, not provisioners. Get the handoff right and the two tools compose into a pipeline that's far cleaner than either tool trying to do both jobs.

Figure: The Terraform-to-Ansible handoff. Terraform provisions and tags the instances, dynamic inventory discovers them by tag, and Ansible configures the hosts.

The Division of Labor

Terraform owns the existence and shape of infrastructure: the instances exist, the network is wired, the load balancer is in front, the security groups are correct. Ansible owns what runs on the machines: packages installed, config files rendered, services started and kept in their desired state. The line is whether the thing is a cloud resource (Terraform) or a property of the operating system and application running inside it (Ansible). Drawing that line explicitly is what keeps the two from fighting over the same concern.

The Handoff via Outputs

Terraform tells Ansible what it built through outputs. After apply, the instances' IPs, the autoscaling group's name, and the tags Terraform stamped on each resource are all available — and those are exactly the facts Ansible needs to target the right hosts. The most important of these is tags: Terraform tags every instance consistently, and that tag becomes the contract the configuration layer keys off.

Terraform tags instances and outputs what Ansible needs
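resource "aws_instance" "web" {
  count         = 3
  ami           = "ami-0abc123" # placeholder AMI ID; substitute a real one
  instance_type = "t3.micro"
  tags = {
    # the tag Ansible's dynamic inventory will group on
    Role = "web"
  }
}

# private IPs of every web instance, for scripts that want the list directly
output "web_private_ips" {
  value = aws_instance.web[*].private_ip
}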

The Role = "web" tag is the handoff point. Terraform guarantees every web instance carries it; Ansible will discover the hosts by that tag rather than by a hardcoded address. The IP output is there for any script that wants the list directly, but the tag is what keeps the integration correct as instances come and go.
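For a script that wants the IP list directly, the output can be read from terraform output -json. The following is a minimal Python sketch, assuming the JSON shape Terraform emits for outputs, where each output is wrapped in an object with a value field. The function name and the sample addresses are made up for illustration.

```python
import json


def web_private_ips(terraform_output_json):
    """Extract the web_private_ips list from `terraform output -json` text."""
    outputs = json.loads(terraform_output_json)
    # Each output is an object carrying "value" alongside metadata
    # such as "type" and "sensitive".
    return list(outputs["web_private_ips"]["value"])


# Sample text shaped like `terraform output -json`; the addresses are invented.
sample = """
{
  "web_private_ips": {
    "sensitive": false,
    "type": ["tuple", ["string", "string", "string"]],
    "value": ["10.0.1.10", "10.0.1.11", "10.0.1.12"]
  }
}
"""

ips = web_private_ips(sample)
print(ips)
```

In a real pipeline the text would likely come from running terraform output -json in the Terraform working directory, for example through subprocess.run with capture_output=True and text=True.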

Dynamic Inventory

A static Ansible inventory — a file listing host IPs — drifts the moment Terraform scales the group or replaces an instance. Ansible's AWS inventory plugin solves this by querying AWS at run time and grouping hosts by tag: every instance tagged Role = web lands in the web group automatically, no matter when Terraform created it. The inventory is generated from reality on every run, so it never goes stale.

aws_ec2.yml — Ansible dynamic inventory grouping by Terraform's tag
plugin: amazon.aws.aws_ec2
regions:
  - us-east-1
keyed_groups:
  # build an Ansible group from the Role tag Terraform set
  - key: tags.Role
    prefix: role

Point Ansible at this file instead of a static host list and it discovers the current fleet itself. A playbook that targets role_web configures exactly the instances Terraform tagged Role = web, whether there are three of them today or seven tomorrow.
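How the role_web group name falls out of the tag can be modeled in a few lines of Python. This is a simplified sketch of the grouping idea only, not the inventory plugin's implementation: it assumes the group name is the prefix, an underscore separator, and the tag value, and it identifies hosts by hypothetical instance IDs.

```python
from collections import defaultdict


def group_by_tag(instances, tag="Role", prefix="role", separator="_"):
    """Build group names from a tag, mimicking a keyed_groups entry on tags.Role."""
    groups = defaultdict(list)
    for inst in instances:
        value = inst.get("tags", {}).get(tag)
        if value is None:
            # An instance without the tag lands in no tag-based group,
            # so a playbook targeting role_web would never see it.
            continue
        groups[f"{prefix}{separator}{value}"].append(inst["id"])
    return dict(groups)


# Hypothetical fleet; the last instance is missing its Role tag.
fleet = [
    {"id": "i-aaa", "tags": {"Role": "web"}},
    {"id": "i-bbb", "tags": {"Role": "web"}},
    {"id": "i-ccc", "tags": {"Role": "db"}},
    {"id": "i-ddd", "tags": {}},
]

groups = group_by_tag(fleet)
print(groups)
```

The untagged instance appears in no group at all, which is why consistent tagging matters: the inventory cannot report a host it was never told to group.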

Orchestration Order

The sequence is fixed: provision with Terraform, then configure with Ansible. A pipeline runs terraform apply to create or update the infrastructure, then runs the Ansible playbook against the freshly discovered inventory. Because the inventory is dynamic, the Ansible step automatically sees whatever Terraform just produced: new instances are configured, and removed ones simply aren't in the inventory anymore. The two stages are decoupled enough that you can re-run Ansible to push a config change without touching infrastructure, and re-run Terraform to change infrastructure without forcing a full reconfigure.
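The ordering can be sketched as a small Python driver. This is one possible shape, not a prescribed pipeline: the playbook name site.yml is an assumption, and the runner is injectable so the sequence can be checked without executing either tool.

```python
import subprocess

# The two stages, in order. aws_ec2.yml is the inventory file from this
# topic; site.yml is a hypothetical playbook name.
STAGES = [
    ["terraform", "apply", "-auto-approve"],
    ["ansible-playbook", "-i", "aws_ec2.yml", "site.yml"],
]


def run_pipeline(run=subprocess.run, stages=STAGES):
    """Run each stage in order and stop at the first failure."""
    completed = []
    for cmd in stages:
        # With the default runner, check=True raises CalledProcessError on a
        # non-zero exit, so Ansible never runs after a failed apply.
        run(cmd, check=True)
        completed.append(cmd[0])
    return completed


# Dry run: record the commands instead of executing them.
recorded = []
order = run_pipeline(run=lambda cmd, check: recorded.append(cmd))
print(order)
```

Calling run_pipeline() with no arguments would execute both tools for real. Because the stages are separate commands, either one can also be run on its own, which is what keeps a config change from forcing an infrastructure apply.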

Why Not Provisioners

Terraform's remote-exec provisioner can run commands on an instance at create time, but it is the wrong tool for configuration. By default it fires only once, when the resource is created, so it never reconverges drift. What it does isn't tracked in state, so Terraform can't tell you what it did or detect when it needs redoing. And it makes apply brittle: by default a failed provisioner leaves the resource created but unconfigured, and Terraform marks it tainted so the next apply destroys and recreates it. Ansible is built for exactly this job: idempotent, re-runnable, and able to bring a host back to its desired state at any time. The handoff via tags and dynamic inventory is the right pattern precisely because it keeps configuration in the tool designed for it.
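The difference between run-once and idempotent can be seen in a toy Python model. This is an illustration of the concept only, not how Ansible is implemented: a host is modeled as a set of installed package names, and the function name is made up.

```python
def ensure_packages(host_state, desired):
    """Converge host_state toward desired; return True if anything changed."""
    missing = desired - host_state
    host_state |= missing  # install only what is absent (mutates the set in place)
    return bool(missing)


host = set()
desired = {"nginx"}

first = ensure_packages(host, desired)   # nothing installed yet, so it changes the host
second = ensure_packages(host, desired)  # already converged, so it is a no-op

host.discard("nginx")                    # drift: someone removes the package by hand
third = ensure_packages(host, desired)   # re-running repairs the drift

print(first, second, third)
```

A create-time provisioner corresponds to the first call only. The second and third calls are what a re-runnable configuration tool adds: safe repetition and recovery from drift.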

Common Mistakes
  • Cramming configuration into Terraform remote-exec provisioners instead of handing off to Ansible, losing idempotency, tracking, and the ability to reconverge drift.
  • Maintaining a static Ansible inventory that drifts from the instances Terraform actually created, instead of dynamic inventory keyed on a tag.
  • Coupling the two tools' runs so tightly that a config change forces an infrastructure apply, or an infrastructure change forces a full reconfigure.
  • Having both tools try to own the same concern — both managing the same config file — so they overwrite each other on alternating runs.
  • Tagging instances inconsistently, so the dynamic inventory misses hosts and Ansible silently skips configuring part of the fleet.

Best Practices
  • Keep the provisioning/configuration line clean: Terraform creates the infrastructure, Ansible configures what runs on it.
  • Hand off through Terraform outputs and Ansible dynamic inventory — discover hosts by tag rather than maintaining a static list.
  • Sequence the two in a pipeline: terraform apply first, then run Ansible against the freshly discovered inventory.
  • Bake what you can into machine images with Packer, and use Ansible only for what genuinely must be configured at runtime.
  • Tag every instance consistently so the dynamic inventory groups the whole fleet correctly and never silently skips a host.

Comparable tools
  • cloud-init / user-data for simpler boot-time configuration
  • Packer bakes images to shrink the configuration step
  • Chef / Puppet / SaltStack fill Ansible's role in the same handoff

Knowledge Check

How is the division of labor drawn between Terraform and Ansible?

  • Terraform owns the existence and shape of infrastructure; Ansible owns what runs on the machines
  • Terraform handles the production environment while Ansible is reserved for staging and lower environments
  • Ansible creates the cloud resources and Terraform installs the software packages on top of them
  • They both manage the exact same resources in parallel for redundancy and failover

How does the handoff from Terraform to Ansible work?

  • Terraform tags instances and exposes outputs; Ansible's dynamic inventory discovers those hosts by tag at run time
  • Terraform SSHes into each freshly created instance and runs the Ansible playbook itself through a remote-exec provisioner
  • Ansible reads Terraform's state file directly on disk to find the hosts and their addresses
  • They share a single combined configuration file that both tools read and write on every run

Why does dynamic inventory beat a static host list?

  • It queries AWS at run time and groups hosts by tag, so it never drifts as Terraform scales or replaces instances
  • It encrypts the discovered host IP addresses at rest so they can't leak from inventory
  • It runs the playbook faster by caching the resolved host list permanently so AWS is never queried again after the first run
  • It lets Ansible provision new instances that Terraform missed during the apply

Why is this handoff preferred over a Terraform remote-exec provisioner?

  • Ansible is idempotent, re-runnable, and reconverges drift, while remote-exec runs once at create time, isn't tracked, and makes apply brittle
  • remote-exec is fundamentally unable to connect over SSH to any instance, so it never actually works at all in practice
  • remote-exec costs extra money per invocation
  • Ansible stores its full machine configuration inside the Terraform state so the next plan can diff it like any other tracked resource attribute
