Topic 47

PKI, CAs, and Revocation

PKI

Public Key Infrastructure is the whole system that makes a certificate worth trusting: the Certificate Authorities that issue them, the trust stores that anchor the roots, and the revocation machinery that can invalidate a certificate before its natural expiry. Issuing certificates is the easy part. The hard problem PKI has never fully solved is revoking a compromised one fast enough to matter.

When a server's private key leaks, its certificate remains cryptographically valid until its not-after date, so the attacker can impersonate the server until then unless revocation reaches the client first. Two mechanisms, CRL and OCSP, were built to deliver that signal. Both scale poorly, and the industry's pragmatic answer turned out to be making certificates so short-lived that revocation barely matters.

Which revocation approach fits

Simple, offline-friendly, freshness within hours is fine → CRL

Need per-cert live status, accept a round trip and CA privacy leak → OCSP

Want fresh status with no client round trip or leak → OCSP stapling

Want revocation to stop mattering at all → Short-lived certs

The Trust Anchors

Every chain terminates at a root CA whose self-signed certificate is preinstalled in a trust store — the set Mozilla, Apple, Microsoft, and Google curate and ship with browsers and operating systems. Membership is a privilege governed by audits and the CA/Browser Forum baseline requirements; a CA that misissues badly enough gets distrusted and removed, as happened to Symantec's and DigiNotar's roots.

The trust decision is therefore not yours but the trust-store maintainers'. Your client trusts roughly 150 root CAs by default, and any of them can issue a certificate for any domain. That is PKI's structural weakness: the system is only as trustworthy as its least careful CA, which is what name constraints and Certificate Transparency exist to contain.

Revocation: CRL versus OCSP

A Certificate Revocation List (CRL) is a signed file the CA publishes listing the serial numbers of every certificate it has revoked. A client downloads the whole list and checks membership. The problem is size and freshness: a busy CA's CRL grows to megabytes, clients cache it for hours, and a certificate revoked five minutes ago is not on the copy the client already holds.
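The freshness problem can be illustrated with a toy model. This is a minimal sketch, not a real CRL parser: the `Crl` class, the serial numbers, and the timestamps are all hypothetical, and signature verification is left out entirely.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Crl:
    """Toy stand-in for a signed CRL: revoked serials plus a validity window."""
    revoked_serials: frozenset
    this_update: datetime
    next_update: datetime


def is_revoked(crl: Crl, serial: int, now: datetime) -> bool:
    """Membership check against whatever copy the client currently holds."""
    if now > crl.next_update:
        raise ValueError("cached CRL has expired; fetch a new one")
    return serial in crl.revoked_serials


t0 = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)

# The copy the client downloaded at t0 and will reuse for six hours.
cached = Crl(frozenset({1001, 1002}), t0, t0 + timedelta(hours=6))

# Five minutes later the CA revokes serial 2000 and publishes a new list.
published = Crl(
    frozenset({1001, 1002, 2000}),
    t0 + timedelta(minutes=5),
    t0 + timedelta(hours=6, minutes=5),
)

now = t0 + timedelta(minutes=10)
stale_answer = is_revoked(cached, 2000, now)     # cached copy predates the revocation
fresh_answer = is_revoked(published, 2000, now)  # only a re-download would see it
print(stale_answer, fresh_answer)
```

The cached copy is still inside its validity window, so the client has no reason to refetch, and the revoked certificate passes the check until the cache expires.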

OCSP (Online Certificate Status Protocol) trades the bulk download for a per-certificate query: the client asks the CA's responder "is serial X still good?" and gets a small signed answer. It is fresher but adds a network round trip to a third party on the connection's critical path, leaks to the CA which sites the user visits, and — the fatal flaw — clients soft-fail. If the responder is slow or down, browsers proceed as if the certificate were valid rather than block the page, which means an attacker who can block the OCSP query also defeats the check.

# query a cert's revocation status directly via OCSP
openssl ocsp -issuer chain.pem -cert server.pem \
  -url http://ocsp.example-ca.org -no_nonce
# Response verify OK
# server.pem: good       <- not revoked, this instant
#   This Update: ...  Next Update: ...
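The soft-fail behavior described above comes down to one branch in the client's policy. The sketch below is a simplified model, assuming a hypothetical `allow_connection` function and a `None` status to represent "no answer from the responder"; real browsers have more states and their own heuristics.

```python
from enum import Enum
from typing import Optional


class OcspStatus(Enum):
    GOOD = "good"
    REVOKED = "revoked"


def allow_connection(status: Optional[OcspStatus], hard_fail: bool) -> bool:
    """Decide whether to proceed. status is None when the responder gave no answer."""
    if status is OcspStatus.REVOKED:
        return False
    if status is OcspStatus.GOOD:
        return True
    # No answer: a soft-fail client proceeds, a hard-fail client blocks.
    return not hard_fail


# An attacker holding a revoked certificate simply drops the OCSP query.
blocked_query = None
soft_fail_result = allow_connection(blocked_query, hard_fail=False)
hard_fail_result = allow_connection(blocked_query, hard_fail=True)
print(soft_fail_result, hard_fail_result)
```

Hard-fail closes the hole but turns every responder outage into a site outage, which is the trade-off that pushed browsers toward soft-fail.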

OCSP Stapling and Short-Lived Certificates

OCSP stapling fixes the round trip and the privacy leak. The server itself queries the OCSP responder periodically, caches the signed "good" response, and staples it into the TLS handshake. The client gets a fresh, CA-signed status with zero extra connections and without telling the CA which user is visiting. The status is still only as current as the staple's refresh interval, but it removes the responder from every client's critical path.
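A rough model of the two sides of stapling, under stated assumptions: the `StapledResponse` class and both function names are hypothetical, the "refresh at the halfway point" rule is one plausible policy rather than a standard, and the CA's signature on the response is not modeled.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class StapledResponse:
    """Toy stand-in for a cached, CA-signed OCSP response."""
    status: str  # "good" or "revoked"
    this_update: datetime
    next_update: datetime


def needs_refresh(staple: StapledResponse, now: datetime) -> bool:
    """Server side: refetch once half the validity window has elapsed (assumed policy)."""
    halfway = staple.this_update + (staple.next_update - staple.this_update) / 2
    return now >= halfway


def client_accepts(staple: StapledResponse, now: datetime) -> bool:
    """Client side: the staple must say 'good' and still be inside its window."""
    return staple.status == "good" and staple.this_update <= now <= staple.next_update


t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
staple = StapledResponse("good", t0, t0 + timedelta(days=4))

day1 = t0 + timedelta(days=1)
day3 = t0 + timedelta(days=3)
day5 = t0 + timedelta(days=5)

print(needs_refresh(staple, day1), client_accepts(staple, day1))
print(needs_refresh(staple, day3), client_accepts(staple, day3))
print(client_accepts(staple, day5))
```

Refreshing well before `next_update` gives the server slack to ride out a responder outage without ever serving an expired staple.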

The deeper fix is short-lived certificates. If a certificate lives 90 days, or 7, or 1, the window in which a leaked key is useful shrinks to that lifetime, and revocation stops being load-bearing because the certificate expires before a CRL would have propagated anyway. Let's Encrypt's 90-day default and the move toward even shorter spans follow this strategy: rather than fix revocation, make certificates expire fast enough that you rarely need it. The catch is that it only works with automated renewal.
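The arithmetic behind that claim is simple enough to sketch. The example assumes the worst case where revocation never reaches the client, a hypothetical leak three days after issuance, and a key that is rotated at renewal; the dates are illustrative only.

```python
from datetime import date, timedelta


def exposure_days(issued: date, lifetime_days: int, leak_day: date) -> int:
    """Days a leaked key stays usable if no revocation signal ever arrives."""
    not_after = issued + timedelta(days=lifetime_days)
    return max((not_after - leak_day).days, 0)


issued = date(2024, 1, 1)
leak_day = issued + timedelta(days=3)  # assumed: the key leaks on day 3

# Compare a two-year certificate with 90-day and 7-day ones.
windows = {
    lifetime: exposure_days(issued, lifetime, leak_day)
    for lifetime in (730, 90, 7)
}
for lifetime, days in windows.items():
    print(f"{lifetime:>3}-day cert: attacker window {days} days")
```

The same leak costs 727 days of exposure on the two-year certificate and 4 days on the 7-day one, with no revocation infrastructure involved.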

CA Compromise and Containment

Because any trusted CA can sign for any domain, a single compromised or coerced CA is a global threat — DigiNotar's 2011 breach produced fraudulent Google certificates used to intercept Iranian users. Two mechanisms limit the blast radius. Name constraints restrict an intermediate to specific domains, so an internal CA scoped to corp.example.com cannot mint a valid certificate for a bank.
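The matching rule behind a DNS name constraint can be sketched as a label-wise suffix comparison, loosely following the RFC 5280 rule that a name satisfies the constraint if it is formed by adding zero or more labels to the left. The function name `within_subtree` is hypothetical, and real validators also handle excluded subtrees, other name types, and the full chain.

```python
def within_subtree(hostname: str, permitted: str) -> bool:
    """True if hostname equals the permitted name or is a subdomain of it."""
    host_labels = hostname.lower().rstrip(".").split(".")
    base_labels = permitted.lower().rstrip(".").split(".")
    if len(host_labels) < len(base_labels):
        return False
    # Compare whole labels, not raw string suffixes.
    return host_labels[-len(base_labels):] == base_labels


permitted = "corp.example.com"
for name in (
    "build.corp.example.com",
    "corp.example.com",
    "bank.example",
    "evilcorp.example.com",
):
    print(name, within_subtree(name, permitted))
```

Comparing labels rather than raw strings matters: a plain `endswith("corp.example.com")` would wrongly accept `evilcorp.example.com`.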

Certificate Transparency (CT) requires every publicly trusted certificate to be logged in append-only, publicly auditable logs, and browsers reject certificates that lack a proof of logging. A domain owner monitoring CT logs sees any certificate issued for their name, including a misissued one, typically within hours (logs promise inclusion within a maximum merge delay, commonly 24 hours), turning silent misissuance into a detectable, attributable event. CT does not prevent a bad CA; it guarantees you find out.
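The monitoring side reduces to comparing what the logs show against what you actually requested. This is a minimal sketch with made-up data: `LoggedCert`, the issuer names, and the serials are hypothetical, and fetching and verifying real log entries is out of scope.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoggedCert:
    """Hypothetical, simplified view of one certificate seen in a CT log."""
    dns_name: str
    issuer: str
    serial: str


def covers(name: str, domain: str) -> bool:
    """True if name is the watched domain or any name beneath it (label-wise)."""
    name_labels = name.lower().rstrip(".").split(".")
    domain_labels = domain.lower().rstrip(".").split(".")
    return name_labels[-len(domain_labels):] == domain_labels


def unexpected_certs(entries, watched_domains, known_serials):
    """Entries for a watched domain whose serial is not in our issuance inventory."""
    return [
        e for e in entries
        if any(covers(e.dns_name, d) for d in watched_domains)
        and e.serial not in known_serials
    ]


entries = [
    LoggedCert("www.example.com", "Expected CA", "0a01"),     # we requested this one
    LoggedCert("login.example.com", "Unfamiliar CA", "ff42"),  # nobody here asked for it
    LoggedCert("www.unrelated.test", "Unfamiliar CA", "ff43"),
    LoggedCert("notexample.com", "Unfamiliar CA", "ff44"),
]

alerts = unexpected_certs(entries, {"example.com"}, {"0a01"})
for cert in alerts:
    print(f"ALERT: {cert.dns_name} issued by {cert.issuer} (serial {cert.serial})")
```

Keying the inventory on serial numbers assumes you record every certificate you request; without that inventory, every log entry looks equally plausible.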

CRL vs OCSP vs Short-Lived Certs

CRL is a bulk signed list of revoked serials the client downloads and caches; simple but large and stale, with revocations taking hours to propagate. OCSP is a per-certificate live query — fresher but it adds a round trip, leaks browsing to the CA, and clients soft-fail when the responder is unreachable, so it often does not actually block.

Short-lived certificates sidestep both: a 7-to-90-day lifetime bounds the damage from a leaked key without any revocation lookup, which is why the industry moved this way. The cost is mandatory automated renewal — a short cert with manual renewal is just a scheduled outage.

Common Mistakes

Relying on OCSP or CRL as a hard security control. Browsers soft-fail when the responder is unreachable, so an attacker who blocks the revocation query also defeats the revocation, and the page loads anyway.
Issuing long-lived certificates with no fast revocation path. A two-year certificate whose key leaks stays abusable for the rest of its life if revocation soft-fails, with no way to truly pull it.
Not monitoring Certificate Transparency logs for your domains. A misissued certificate from any of the roughly 150 trusted CAs goes unnoticed without CT monitoring, and you learn about the impersonation from the incident instead.
Skipping name constraints on an internal CA. If an unconstrained private intermediate leaks, it can mint a certificate for any public domain, and every machine in your fleet that trusts it will accept that certificate, not just certificates for your internal names.
Treating the roughly 150 default root CAs as equally trustworthy. Any one of them can sign for your domain, so trust is only as strong as the weakest CA, and a single misissuance affects you regardless of which CA you chose.

Best Practices

Enable OCSP stapling on your servers so clients get a fresh, CA-signed status in the handshake without a third-party round trip or a privacy leak to the responder.
Adopt short-lived certificates (90 days or less) with automated renewal, so a leaked key has a bounded useful life and you do not depend on revocation propagating in time.
Monitor Certificate Transparency logs for every domain you own and alert on any certificate you did not request, turning misissuance into a same-day detection.
Apply name constraints to internal and intermediate CAs so a compromised one is scoped to specific domains and cannot impersonate arbitrary public sites.
Set OCSP Must-Staple on high-value certificates where supported, so a client refuses to connect without a valid staple instead of soft-failing past a missing revocation answer.

Comparable concepts: RPKI (routing's PKI, Topic 22); Certificate Transparency (misissuance audit); DNSSEC (signed DNS records)

Knowledge Check

Why does OCSP often fail to actually block a revoked certificate in practice?

Browsers soft-fail when the responder is unreachable, so blocking the query bypasses it
The OCSP response is unsigned, so an attacker can simply forge a 'good' reply
It must download the entire certificate revocation list, which is far too large to finish fetching in time
It only reports whether the certificate has expired, never whether it was revoked

What does OCSP stapling change about how a client learns a certificate's revocation status?

The server staples the signed status into the handshake, removing the client's round trip to the CA
It disables revocation checking entirely so the handshake completes faster
The server signs its own revocation status, which the client trusts directly
The client downloads the full revocation list just once and then caches it locally for all of its later connections

Why did the industry move toward short-lived certificates instead of better revocation?

A short lifetime bounds the damage from a leaked key, so revocation matters less
Shorter certificates are issued using stronger cryptographic algorithms than longer-lived ones
Frequent reissuance reduces the load on the certificate authority's responders
They remove the need for any renewal automation on the server side
