
TLS / HTTPS: the parts that break in production


Cert chains, renewals, mTLS, HSTS, and handshake debugging, explained from the operator’s side.

TLS / HTTPS Deep Dive

Cert chains, renewal, mTLS, HSTS, and debugging bad handshakes

Goal: help you run HTTPS like an operator instead of treating TLS as a magical black box that only matters when a cert is expiring or a browser starts throwing red screens.


TL;DR

  • A TLS deployment is mostly about five things: the certificate chain, hostname validation, renewal automation, policy headers like HSTS, and clean debugging when handshakes fail.
  • Your server should usually present the leaf certificate plus intermediate certificate(s). The root certificate is generally not sent because clients are expected to already trust it from their trust store.
  • Renewal should be fully automated. In 2026, the cleanest path is still ACME with monitoring, staged testing, and reload automation. If your ACME client supports ARI, use it; if not, renewing well before expiry is still the practical fallback.
  • mTLS is useful when the client itself must prove identity, especially for service-to-service, B2B APIs, and device fleets. It is usually the wrong default for general browser users.
  • HSTS tells browsers to stop using HTTP for a host and to stop allowing click-through on future certificate errors. That makes it powerful and easy to misuse.
  • Most "bad handshake" incidents come down to a short list: wrong cert, broken chain, SNI mismatch, protocol mismatch, missing client cert, or trust-store mismatch.

1) Start with the right mental model

TLS is not just "the lock icon." It is a protocol that lets a client and server:

  • agree on protocol settings
  • authenticate the server
  • optionally authenticate the client
  • derive shared session keys
  • encrypt application traffic afterward

For HTTPS, the rough flow is:

ClientHello
  - SNI: which hostname the client wants
  - ALPN: h2 / http/1.1 (h3 runs over QUIC and is negotiated in the QUIC handshake instead)
  - supported TLS versions and key shares

ServerHello
  - chosen version and cipher settings
  - certificate chain
  - proof of private-key possession

Optional:
  - client certificate request for mTLS

Finished
  - encrypted HTTP starts
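
To make the flow above concrete, here is a minimal sketch using Python's standard ssl module. The function names build_client_context and describe_handshake are hypothetical helpers, not part of any library, and the TLS 1.2 floor is an assumption you may want to change. It shows where SNI and ALPN are set on the client side and what can be read back once the handshake finishes.

```python
import socket
import ssl


def build_client_context() -> ssl.SSLContext:
    """Client context with certificate and hostname verification enabled."""
    ctx = ssl.create_default_context()            # uses the system trust store
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # assumption: refuse older versions
    ctx.set_alpn_protocols(["h2", "http/1.1"])    # ALPN offer sent in the ClientHello
    return ctx


def describe_handshake(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Connect, complete the handshake, and report what was negotiated."""
    ctx = build_client_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # server_hostname sets SNI and is also the name checked against the cert
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return {
                "tls_version": tls.version(),
                "cipher": tls.cipher()[0],
                "alpn": tls.selected_alpn_protocol(),
                "san": tls.getpeercert().get("subjectAltName", ()),
            }
```

Calling describe_handshake("example.com") needs network access. If verification fails, wrap_socket raises an ssl.SSLError subclass instead of returning.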

If you want the 60-second version first, see How SSL/TLS Works in 60 Seconds.


2) How certificate chains actually work

2.1 The chain is usually:

Leaf certificate -> Intermediate CA -> Root CA
  • The leaf cert is the certificate for your hostname, such as api.example.com.
  • The intermediate signs the leaf and is itself signed by the root or by another intermediate.
  • The root is the trust anchor already present in the client trust store.

2.2 What the server should send

Most web servers should send:

  • the leaf certificate
  • the intermediate certificate(s)

and usually not the root.

In practical terms, that usually means serving a full chain file, not just the leaf certificate by itself.

Why?

  • Clients are expected to already trust the root from their operating system or browser trust store.
  • Sending the root usually adds noise, not value.
  • The practical problem is almost always a missing intermediate, not a missing root.
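
A quick sanity check on a chain file is simply counting the PEM blocks in it. The sketch below is a rough illustration with hypothetical helper names. It does not parse certificates or verify signatures, and a single-certificate file can be legitimate when a private CA issues leaves directly from its root.

```python
import re

PEM_CERT = re.compile(
    r"-----BEGIN CERTIFICATE-----.+?-----END CERTIFICATE-----", re.DOTALL
)


def split_pem_chain(pem_text: str) -> list[str]:
    """Return each PEM certificate block found in the text, in file order."""
    return PEM_CERT.findall(pem_text)


def chain_file_warnings(pem_text: str) -> list[str]:
    """Flag the most common packaging mistakes by block count alone."""
    count = len(split_pem_chain(pem_text))
    if count == 0:
        return ["no certificates found"]
    if count == 1:
        return ["only one certificate: the intermediate is probably missing"]
    return []
```

Run it against the file your server is configured to serve, not against the file you believe it serves.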

2.3 What the client checks

When the client validates the server certificate, it checks things like:

  • the hostname is listed in the certificate SAN
  • the certificate is currently valid by time
  • the chain can be built to a trusted root
  • the signatures along the chain are valid
  • the certificate is appropriate for server authentication
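
The first two checks are easy to reason about in code. The sketch below is a simplified illustration, not a replacement for a real TLS library: it only handles DNS names, treats a wildcard as covering exactly one left-most label, and refuses wildcards directly under a top-level name. All function names are hypothetical.

```python
from datetime import datetime, timezone


def hostname_matches(pattern: str, hostname: str) -> bool:
    """Match one SAN DNS name against a hostname, allowing a left-most '*' label."""
    p_labels = pattern.lower().rstrip(".").split(".")
    h_labels = hostname.lower().rstrip(".").split(".")
    if len(p_labels) != len(h_labels):
        return False  # a wildcard covers exactly one label, never several
    if p_labels[0] == "*":
        # simplification: refuse patterns such as "*.com"
        return len(p_labels) > 2 and p_labels[1:] == h_labels[1:]
    return p_labels == h_labels


def covers_hostname(san_dns_names: list[str], hostname: str) -> bool:
    """True when any SAN entry covers the requested hostname."""
    return any(hostname_matches(name, hostname) for name in san_dns_names)


def is_time_valid(not_before: datetime, not_after: datetime, now: datetime) -> bool:
    """True when `now` falls inside the certificate validity window."""
    return not_before <= now <= not_after
```

Note that a certificate for *.example.com does not cover example.com itself, which is a frequent source of "wrong cert" reports.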

2.4 Common chain mistakes

Missing intermediate

One client works, another fails, and someone says "but it works in my browser."

That often means one client had the intermediate cached from an earlier connection, or fetched it through the certificate's AIA extension, while another client did neither. Your fix is still the same: serve the full chain correctly.

Wrong certificate on the right IP

If the wrong virtual host or listener answers, the client may see a valid certificate for the wrong hostname. This is often an SNI routing problem, not a CA problem.

Old pinned assumptions

If you pin specific intermediates or make assumptions about one exact issuance chain, you make renewals harder than they need to be. CAs can rotate intermediates and offer different valid chains over time.


3) Renewal: what a sane setup looks like in 2026

3.1 Renewal should be boring

The correct long-term goal is not "remember to renew certificates."

It is:

certificates renew automatically, validate correctly, deploy safely, reload cleanly, and alert only when automation fails

3.2 ACME is still the default

For public web certificates, ACME is still the obvious default. That means:

  • automated issuance
  • automated renewal
  • automated domain validation
  • less manual key and certificate handling

3.3 What to automate

Your renewal workflow should cover all of this:

  1. obtain or renew the certificate
  2. verify the right names were issued
  3. install the new leaf plus chain
  4. reload or rotate the terminating process
  5. confirm the new certificate is actually being served
  6. alert only if any of those steps fail
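
The six steps above can be sketched as an ordered pipeline that stops at the first failure. Everything here is a hypothetical skeleton: the step callables stand in for whatever your ACME client, deploy tooling, and external probe actually do, and alert stands in for your paging or notification hook.

```python
from typing import Callable

Step = tuple[str, Callable[[], bool]]


def run_renewal(steps: list[Step], alert: Callable[[str], None]) -> bool:
    """Run steps in order; stop and alert at the first one that fails."""
    for name, action in steps:
        try:
            ok = action()
        except Exception as exc:  # a crashing step counts as a failure
            alert(f"{name} raised {exc!r}")
            return False
        if not ok:
            alert(f"{name} failed")
            return False
    return True  # silent success: nothing to alert on
```

The design point is that later steps never run after an earlier failure, so a bad issuance cannot be installed and reloaded by accident.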

3.4 Timing in April 2026

Practical rule:

  • if your ACME client supports ACME Renewal Information (ARI), let the CA guide the renewal window
  • if it does not, renew comfortably before expiry instead of cutting it close

For Let's Encrypt specifically, their integration guidance still recommends checking ARI regularly, and the old "renew with plenty of runway" operational habit remains correct even when ARI is unavailable.
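
When ARI is not available, the fallback can be a simple lifetime-based rule. The sketch below assumes a threshold of one third of the certificate lifetime remaining. That fraction is an illustrative assumption, not a value taken from any CA's guidance, so tune it to your own risk tolerance.

```python
from datetime import datetime, timedelta, timezone


def renewal_due(
    not_before: datetime,
    not_after: datetime,
    now: datetime,
    remaining_fraction: float = 1 / 3,  # assumption: renew in the last third
) -> bool:
    """True once the remaining validity drops to the chosen fraction of the lifetime."""
    lifetime = not_after - not_before
    return (not_after - now) <= lifetime * remaining_fraction
```

Expressing the rule as a fraction rather than a fixed number of days keeps it sensible if certificate lifetimes get shorter.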

3.5 Renewal mistakes that cause outages

  • renewing successfully but forgetting to reload the proxy or load balancer
  • updating one edge node but not all of them
  • replacing the leaf cert but not the matching chain file
  • assuming the old chain will stay valid forever
  • testing only production and never using staging or dry-run renewal paths
  • waiting until the certificate is almost expired before discovering DNS or validation drift

3.6 A practical renewal checklist

  • schedule the renewal check to run frequently, for example twice a day; a well-behaved client only renews when the certificate is actually due
  • test renewals in staging
  • alert on days to expiry and also on renewal job failure
  • reload the actual terminating service
  • re-check the served certificate from the outside after deployment
  • avoid manual copy-paste certificate operations unless there is no alternative

4) mTLS: when the client must prove identity too

Normal TLS authenticates the server to the client.

mTLS adds client authentication, so the client also presents a certificate and proves possession of its private key.

4.1 Where mTLS fits well

  • service-to-service traffic
  • internal platform traffic
  • B2B APIs
  • IoT or device fleets
  • machine identities where browser login flows are irrelevant

4.2 Where mTLS is usually a bad default

  • ordinary consumer browser traffic
  • flows where users switch devices often and certificate distribution would be painful
  • products that really need user identity and authorization decisions more than device identity

4.3 What the server validates in mTLS

At a high level, the server checks:

  • was a client certificate presented?
  • does it chain to a trusted client CA or trust store?
  • is it valid by time?
  • is it the right kind of certificate for client auth?
  • does its identity map cleanly to the principal we expect?
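
As a rough illustration of those checks with Python's standard ssl module: the TLS library handles presence, chain, and validity when client verification is required, and your application still has to do the identity mapping. The function names, the optional file arguments, and the allow-list approach are assumptions made for this sketch.

```python
import ssl
from typing import Optional


def build_mtls_server_context(
    cert_file: Optional[str] = None,
    key_file: Optional[str] = None,
    client_ca_file: Optional[str] = None,
) -> ssl.SSLContext:
    """Server context that refuses handshakes without a verifiable client cert."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.verify_mode = ssl.CERT_REQUIRED  # client certificate is mandatory
    if cert_file and key_file:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    if client_ca_file:
        # only CAs loaded here are trusted to issue client certificates
        ctx.load_verify_locations(cafile=client_ca_file)
    return ctx


def principal_from_peercert(peercert: dict, allowed: set[str]) -> Optional[str]:
    """Map a verified cert (the dict from SSLSocket.getpeercert()) to a principal."""
    for kind, value in peercert.get("subjectAltName", ()):
        if kind in ("URI", "DNS") and value in allowed:
            return value
    return None  # verified certificate, but not an identity we expect
```

Keeping the mapping separate from chain validation makes the last bullet explicit: a certificate can be perfectly valid and still not belong to a principal you want to admit.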

4.4 Operational truths about mTLS

  • mTLS authenticates the client certificate holder, which might represent a device, workload, or organization rather than a human user
  • certificate issuance, distribution, rotation, and revocation become part of your product or platform design
  • once TLS terminates at an edge proxy, you need a trustworthy way to pass the verified client identity downstream

4.5 A subtle TLS 1.3 point

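TLS 1.3 improves privacy by encrypting certificates in the handshake, but it also removed renegotiation, which older stacks used to request a client certificate "reactively" after seeing the HTTP request. The replacement, post-handshake authentication, is optional, is not implemented by every client, and is prohibited over HTTP/2, so reactive client-certificate flows are not something you should assume work uniformly everywhere. If your endpoint needs client certificates, design around requesting them deliberately at connection establishment.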

4.6 Quick mTLS test

curl -v --cert client.pem --key client.key https://api.example.com

If that fails during handshake, suspect:

  • wrong client cert
  • wrong client key
  • client cert not chained to a CA the server trusts
  • missing client-auth EKU or wrong profile
  • endpoint not actually configured for the truststore you think it is

5) HSTS: what it really changes

HSTS tells browsers:

  • always use HTTPS for this host in the future
  • do not allow click-through on future certificate errors for that host

That second part is why it matters.

5.1 What HSTS does not do

HSTS does not magically fix TLS.

It does not:

  • repair an expired certificate
  • help clients that do not implement HSTS, which includes most non-browser HTTP clients
  • protect the first-ever visit before the browser has learned the policy, unless the domain is preloaded

5.2 What HSTS does do

Once a browser has learned a valid HSTS policy for a host, it will:

  • rewrite future HTTP attempts to HTTPS
  • refuse to offer the user a bypass path for certificate failures on that host

That is exactly why you should not enable aggressive HSTS before your HTTPS setup is truly solid.
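
The rewrite behavior can be modeled in a few lines. This is a simplified sketch of what a browser does, with hypothetical function names. It assumes the policy store is a plain dict mapping each known HSTS host to its includeSubDomains flag, and it ignores policy expiry.

```python
from urllib.parse import urlsplit, urlunsplit


def hsts_applies(host: str, policies: dict[str, bool]) -> bool:
    """policies maps a known HSTS host to its includeSubDomains flag."""
    host = host.lower().rstrip(".")
    if host in policies:
        return True
    labels = host.split(".")
    # a parent policy covers this host only when it set includeSubDomains
    return any(
        policies.get(".".join(labels[i:]), False) for i in range(1, len(labels))
    )


def upgrade_url(url: str, policies: dict[str, bool]) -> str:
    """Rewrite http:// to https:// when an HSTS policy covers the host."""
    parts = urlsplit(url)
    if parts.scheme == "http" and parts.hostname and hsts_applies(parts.hostname, policies):
        # an explicit :80 becomes the default HTTPS port; other ports are kept
        netloc = parts.hostname if parts.port == 80 else parts.netloc
        return urlunsplit(("https", netloc, parts.path, parts.query, parts.fragment))
    return url
```

The includeSubDomains branch is the part that surprises people: one header on the parent domain changes how every subdomain is reached.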

5.3 Safe rollout order

Good rollout pattern:

  1. make HTTPS correct everywhere
  2. start with HSTS on HTTPS responses
  3. use a smaller max-age first
  4. raise max-age in stages as confidence grows
  5. only then consider includeSubDomains
  6. only then consider preload

5.4 Preload is a commitment

If you want HSTS preload, the current submission requirements still include things like:

  • serving a valid certificate
  • redirecting HTTP to HTTPS on the same host if port 80 is in use
  • serving all subdomains over HTTPS
  • serving an HSTS header on the base domain with max-age of at least 31536000
  • including includeSubDomains
  • including preload

This is powerful, but it is also the fastest way to break forgotten subdomains, internal hosts, or "temporary" legacy exceptions.
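
The header portion of those requirements is easy to check mechanically before you submit anything. The sketch below uses hypothetical helper names and covers only the three header rules listed above. It says nothing about the certificate, redirect, or subdomain requirements.

```python
ONE_YEAR = 31536000  # minimum max-age from the list above, in seconds


def parse_hsts(header_value: str) -> dict:
    """Parse a Strict-Transport-Security value into a small policy dict."""
    policy = {"max_age": None, "include_subdomains": False, "preload": False}
    for raw in header_value.split(";"):
        name, _, value = raw.strip().partition("=")
        name = name.strip().lower()  # directive names are case-insensitive
        if name == "max-age":
            digits = value.strip().strip('"')
            policy["max_age"] = int(digits) if digits.isdigit() else None
        elif name == "includesubdomains":
            policy["include_subdomains"] = True
        elif name == "preload":
            policy["preload"] = True
    return policy


def header_meets_preload_rules(header_value: str) -> bool:
    """True when max-age, includeSubDomains, and preload all satisfy the list."""
    p = parse_hsts(header_value)
    return (
        p["max_age"] is not None
        and p["max_age"] >= ONE_YEAR
        and p["include_subdomains"]
        and p["preload"]
    )
```

A check like this fits well in CI, where it can catch a header that was quietly weakened by a proxy or framework change.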

5.5 HSTS examples

Reasonable first step:

Strict-Transport-Security: max-age=86400

Stronger long-term setting once you are sure:

Strict-Transport-Security: max-age=31536000; includeSubDomains

Preload candidate only after you mean it:

Strict-Transport-Security: max-age=31536000; includeSubDomains; preload

For header guidance beyond HSTS, see Security Headers Cheat Sheet.


6) Debugging bad handshakes: start with the likely failure class

Most TLS incidents become easier once you classify the failure first.

| Symptom | Usual cause | First check |
|---|---|---|
| Hostname mismatch | wrong cert or SNI misrouting | inspect the cert actually served |
| "unable to get local issuer certificate" | missing intermediate or trust mismatch | inspect the full chain |
| Works in browser, fails in curl | different trust stores or local CA assumptions | compare CA stores and verbose output |
| Only one edge IP fails | stale cert or chain on one node | test that exact IP |
| mTLS endpoint gives 400/403 or handshake alert | missing or untrusted client cert | retry with explicit client cert and key |
| Started failing right after renewal | reload gap or wrong deployed file | compare served cert vs on-disk cert |
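
Classification can also be done programmatically. The sketch below maps a handful of OpenSSL verification result codes, as exposed by Python's ssl.SSLCertVerificationError.verify_code, to the failure classes used in this section. The mapping is deliberately small, the labels are this article's wording rather than OpenSSL's, and the function names are hypothetical.

```python
import socket
import ssl

# OpenSSL X509 verification codes mapped to the failure classes used here
FAILURE_CLASSES = {
    10: "certificate expired",
    18: "self-signed leaf certificate",
    19: "self-signed certificate in chain (untrusted root)",
    20: "missing intermediate or trust-store mismatch",
    21: "missing intermediate or trust-store mismatch",
    62: "hostname mismatch (wrong cert or SNI misrouting)",
}


def classify_verify_code(code: int) -> str:
    """Translate a verification code into a failure class."""
    return FAILURE_CLASSES.get(code, "unclassified verification failure")


def probe(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Attempt a verified handshake and name the failure class, if any."""
    ctx = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with ctx.wrap_socket(sock, server_hostname=host):
                return "ok"
    except ssl.SSLCertVerificationError as exc:
        return classify_verify_code(exc.verify_code)
    except ssl.SSLError as exc:
        # protocol-level failures: version mismatch, missing client cert, and so on
        return f"handshake failure: {exc.reason}"
```

Connection-level errors such as timeouts or refused connections are left to propagate, since they are not TLS problems.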

6.1 Check what the server is actually serving

openssl s_client \
  -connect example.com:443 \
  -servername example.com \
  -showcerts \
  -status \
  -verify_return_error </dev/null

Look for:

  • the leaf certificate subject and SANs
  • the issuer chain
  • verification errors
  • whether OCSP stapling is present

6.2 Check the hostname and validity window

openssl s_client -connect example.com:443 -servername example.com </dev/null 2>/dev/null \
  | openssl x509 -noout -subject -issuer -dates -ext subjectAltName

This quickly answers:

  • is this the cert I expected?
  • is it expired?
  • does it cover the requested hostname?

6.3 Check what curl thinks

curl -vI https://example.com

This helps you see:

  • protocol negotiation
  • certificate verification failure details
  • whether curl is using the trust store you think it is

If you need a custom trust root for testing:

curl -v --cacert root-ca.pem https://example.com

6.4 Test one specific edge or load balancer node

curl -v --resolve example.com:443:203.0.113.10 https://example.com

If one IP fails and another works, you likely have a deployment consistency problem, not a CA problem.
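
The same per-node check can be scripted across every edge IP. This is a sketch under the assumption that you already have a list of node IPs. The function names are hypothetical, and verification is switched off on purpose so that a wrong or expired certificate can still be fingerprinted.

```python
import hashlib
import socket
import ssl
from collections import Counter


def served_cert_fingerprint(ip: str, hostname: str, port: int = 443) -> str:
    """SHA-256 of the leaf certificate one specific IP serves for `hostname`."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False       # inspection only: we want to see bad certs too
    ctx.verify_mode = ssl.CERT_NONE
    with socket.create_connection((ip, port), timeout=5) as sock:
        with ctx.wrap_socket(sock, server_hostname=hostname) as tls:  # SNI = hostname
            der = tls.getpeercert(binary_form=True)
    return hashlib.sha256(der).hexdigest()


def odd_nodes_out(fingerprints: dict[str, str]) -> list[str]:
    """Return the IPs whose fingerprint differs from the most common one."""
    if not fingerprints:
        return []
    majority, _ = Counter(fingerprints.values()).most_common(1)[0]
    return sorted(ip for ip, fp in fingerprints.items() if fp != majority)
```

Collect a fingerprint per IP with served_cert_fingerprint, then pass the resulting dict to odd_nodes_out. During a rolling deployment, a temporary split is expected and does not indicate a fault.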

6.5 Verify a chain offline

openssl verify -CAfile root.pem -untrusted intermediate.pem leaf.pem

That is useful when you want to answer:

  • does this chain validate at all?
  • which issuer is missing?
  • is the wrong intermediate being paired with the leaf?

6.6 Reproduce an mTLS failure deliberately

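curl -v https://api.example.com
curl -v --cert client.pem --key client.key https://api.example.com

The first command sends no client certificate. If the server requires one, many stacks will fail the handshake or return an immediate authorization-style error. The second command sends the certificate and key, so comparing the two runs shows whether client authentication is the failing step.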

6.7 Server-side things to inspect

When the client-side view is not enough, check:

  • which certificate file and key file are actually loaded
  • whether the process was reloaded after renewal
  • whether the edge and the origin use different trust stores
  • whether backend TLS uses the right SNI and hostname verification
  • whether clocks are badly skewed on clients, proxies, or servers
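
For the first two items, comparing the on-disk leaf with the served leaf settles the question quickly. The sketch below uses hypothetical helper names and assumes the on-disk file is a PEM chain with the leaf first. The served DER bytes can come from SSLSocket.getpeercert(binary_form=True).

```python
import hashlib
import ssl

BEGIN = "-----BEGIN CERTIFICATE-----"
END = "-----END CERTIFICATE-----"


def first_pem_block(pem_text: str) -> str:
    """Return the first certificate block, assumed to be the leaf."""
    start = pem_text.index(BEGIN)
    end = pem_text.index(END, start) + len(END)
    return pem_text[start:end]


def leaf_fingerprint_from_pem(pem_text: str) -> str:
    """SHA-256 over the DER bytes of the first certificate in a PEM file."""
    der = ssl.PEM_cert_to_DER_cert(first_pem_block(pem_text))
    return hashlib.sha256(der).hexdigest()


def reload_gap(on_disk_pem: str, served_der: bytes) -> bool:
    """True when the served leaf differs from the leaf on disk."""
    return leaf_fingerprint_from_pem(on_disk_pem) != hashlib.sha256(served_der).hexdigest()
```

A True result right after a renewal usually points at a missing reload or at a process reading a different path than the one the renewal job writes to.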

7) Common failure patterns and what they usually mean

7.1 Browser says certificate is invalid right after a deploy

Usually:

  • wrong cert deployed
  • old process still serving old files
  • missing intermediate on one edge node
  • hostname mismatch due to wrong listener or SNI route

7.2 Internal service call fails but public browser traffic is fine

Usually:

  • private CA not trusted by that runtime
  • internal client does stricter hostname or chain validation
  • backend TLS configuration forgot to set SNI

7.3 mTLS works in staging but not prod

Usually:

  • the production truststore does not contain the issuing client CA
  • the presented client certificate profile differs from what prod expects
  • the identity mapping logic from cert subject or SAN is inconsistent

7.4 HSTS made the incident feel worse

That is expected.

HSTS does not create the certificate problem, but it removes the browser's ability to ignore it for remembered hosts. That is the point of HSTS.


8) FAQ

Should I send the root certificate from my server?

Usually no. Send the leaf plus intermediate certificate(s). Clients normally already trust the root from their trust store.

Does HSTS help with expired certificates?

No. It makes browsers stricter about certificate failures once they have recorded an HSTS policy for the host, which is good for security but bad for sloppy operations.

Should every internal API use mTLS?

Not automatically. Use mTLS where machine identity really matters and you can operate certificate issuance and rotation cleanly. Do not add it everywhere just because it sounds more secure.

Why does renewal succeed but users still see the old cert?

Usually because the terminating process was not reloaded, one node did not update, or a CDN or load balancer is still serving a stale configuration.

Why does curl fail while the browser works?

Different trust stores, different enterprise roots, different chain-building behavior, or different local TLS backends. Compare verbose output before guessing.


9) Final recommendation

If you only keep one short list of operational rules in your head, keep this one:

  • serve the right chain
  • renew automatically
  • use mTLS only when client identity is actually the problem
  • roll out HSTS carefully
  • debug handshakes by isolating cert, trust, SNI, and client-auth failures one by one

That is still the cleanest way to run TLS and HTTPS in April 2026.