DKIM body hash failures — why they happen

You've set up DMARC, you're receiving aggregate reports, and you're seeing DKIM failures. Specifically, you're noticing a significant number of "body hash failures." This isn't just an academic exercise: a message with a failed DKIM signature can only pass DMARC through SPF alignment, so these failures can hurt deliverability and trust, especially if you're enforcing a p=reject or p=quarantine policy.

So, what exactly is a DKIM body hash failure, and why do they occur? Let's break it down.

What is DKIM and the Body Hash?

DKIM (DomainKeys Identified Mail) is an email authentication method designed to detect email spoofing. It allows the receiving mail server to check if an email that claims to come from a specific domain was indeed authorized by the owner of that domain.

Here's a simplified overview of how it works:

  1. Signing: When an email is sent, the sending mail server (or a signing service) canonicalizes the message body and calculates a cryptographic hash of it. This is the body hash.
  2. DKIM-Signature Header: The body hash is stored in the bh= tag of a DKIM-Signature header, along with other parameters (like the signing domain, selector, canonicalization algorithms, and the list of signed header fields). The signer then hashes the selected header fields together with the DKIM-Signature header itself (with an empty b= tag), signs that hash using the sender's private key, and stores the result in the b= tag.
  3. Key Retrieval: When the receiving mail server gets the email, it extracts the DKIM-Signature header and retrieves the public key from the sender's DNS records, based on the signing domain and selector.
  4. Body Verification: The receiver re-calculates the hash of the body as received, using the body canonicalization algorithm specified in the DKIM-Signature header, and compares it against the bh= (body hash) value. If they match, the body hash verification passes. If they don't, you have a DKIM body hash failure.
  5. Signature Verification: The receiver uses the public key to verify the b= signature over the signed header fields. Because bh= is part of the signed DKIM-Signature header, a valid signature also proves that the bh= value itself was not altered.

A body hash failure explicitly means that the content of the email body, as received, does not match the content that was originally signed.
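
The body check can be sketched in a few lines of Python. This is a minimal illustration rather than a full verifier: it assumes a SHA-256 signature (a=rsa-sha256), ignores the optional l= body length tag, and normalizes bare LF line endings to CRLF for convenience. The function names are made up for this example.

```python
import base64
import hashlib
import re


def canonicalize_body(body: bytes, algorithm: str = "simple") -> bytes:
    """Apply DKIM body canonicalization (RFC 6376, section 3.4) to a raw body."""
    # Convenience for this sketch: accept files saved with bare LF line endings.
    lines = body.replace(b"\r\n", b"\n").split(b"\n")
    if algorithm == "relaxed":
        # Collapse runs of spaces/tabs to one space, then drop whitespace at line ends.
        lines = [re.sub(rb"[ \t]+", b" ", line).rstrip(b" ") for line in lines]
    elif algorithm != "simple":
        raise ValueError(f"unknown canonicalization: {algorithm}")
    # Both algorithms ignore empty lines at the end of the body.
    while lines and lines[-1] == b"":
        lines.pop()
    canonical = b"".join(line + b"\r\n" for line in lines)
    # An empty body becomes a single CRLF under "simple" and stays empty under "relaxed".
    if not canonical and algorithm == "simple":
        canonical = b"\r\n"
    return canonical


def body_hash(body: bytes, algorithm: str = "simple") -> str:
    """Return the base64 value a verifier would compare against the bh= tag."""
    digest = hashlib.sha256(canonicalize_body(body, algorithm)).digest()
    return base64.b64encode(digest).decode("ascii")


def body_hash_matches(body: bytes, bh_tag: str, algorithm: str = "simple") -> bool:
    # The bh= value may be folded across header lines, so strip whitespace first.
    return body_hash(body, algorithm) == re.sub(r"\s+", "", bh_tag)


signed = b"Hi John,\r\nYour order #12345 has shipped.\r\n"
bh = body_hash(signed, "relaxed")
print(body_hash_matches(signed, bh, "relaxed"))                     # True
print(body_hash_matches(signed + b"-- footer\r\n", bh, "relaxed"))  # False
```

Any change that survives canonicalization, such as the appended footer line here, produces a different digest and therefore a body hash failure.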

Common Reasons for DKIM Body Hash Failures

Body hash failures typically stem from modifications to the email's body after it has been signed by the sender but before it reaches the recipient's mail server. Even seemingly minor changes can completely invalidate the hash.

1. Content Modification in Transit

This is by far the most common culprit. Many legitimate email infrastructure components can inadvertently alter email content.

  • Email Service Providers (ESPs) and Marketing Platforms: Many ESPs (e.g., Mailchimp, SendGrid, HubSpot) add tracking pixels, rewrite URLs for click tracking, or append unsubscribe links. If you sign an email before handing it off to an ESP, and then they make these modifications, your original DKIM signature will break.
  • Corporate Email Gateways/Relays: Internal mail servers, particularly in larger organizations, might add disclaimers, footers, legal notices, or even scan and modify attachments. These changes, no matter how small, will invalidate the body hash.
  • Anti-Spam/Anti-Virus Scanners: Some security solutions might modify the email body, for instance, by stripping potentially malicious content, adding warnings, or re-encoding parts of the message.
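  • Mailing List Managers: When an email passes through a mailing list, the manager often appends a footer, wraps the original message in a new MIME structure, or re-encodes the body. Each of these changes the body and breaks the body hash. Header changes such as an added Subject tag do not affect the body hash, but they can invalidate the signature over the signed header fields.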
  • Line Ending Canonicalization Issues: Email standards typically use CRLF (\r\n) for line endings. Systems might convert LF to CRLF or vice-versa, or even strip trailing whitespace. While DKIM canonicalization (simple vs. relaxed) is designed to mitigate some of this, mismatches can still occur.

Real-world Example 1: Corporate Gateway Adding a Disclaimer

Imagine your application sends an email, and your internal mail transfer agent (MTA) signs it with DKIM. However, your corporate outbound mail gateway is configured to automatically append a legal disclaimer to all outgoing emails.

Original signed body:

Hi John,
Your order #12345 has shipped.
Thanks,
Your Team

Body after gateway modification:

Hi John,
Your order #12345 has shipped.
Thanks,
Your Team

---------------------------------------------------
This email is confidential.
Please consider the environment before printing.

When the recipient's server verifies the DKIM signature, it will calculate the hash of the modified body. This hash will not match the bh= value in the DKIM-Signature header, which was calculated based on the original body. Result: body hash failure.
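
To make the mismatch concrete, the following sketch hashes both bodies from this example. It assumes a SHA-256 signature and writes the bodies with CRLF line endings and no trailing empty lines, so they are already in canonical form and no canonicalization step is needed. The variable and function names are illustrative only.

```python
import base64
import hashlib

# The body as signed by the internal MTA.
original = (
    b"Hi John,\r\n"
    b"Your order #12345 has shipped.\r\n"
    b"Thanks,\r\n"
    b"Your Team\r\n"
)

# The text the outbound gateway appends after signing.
disclaimer = (
    b"\r\n"
    b"---------------------------------------------------\r\n"
    b"This email is confidential.\r\n"
    b"Please consider the environment before printing.\r\n"
)
modified = original + disclaimer


def bh(body: bytes) -> str:
    """Base64-encoded SHA-256 digest, the format used in the bh= tag."""
    return base64.b64encode(hashlib.sha256(body).digest()).decode("ascii")


signed_bh = bh(original)     # value the signer placed in bh=
received_bh = bh(modified)   # value the verifier computes on receipt
print(signed_bh == received_bh)  # False
```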

To debug this, you'd need to:

  1. Obtain the raw source of the email as received by the recipient (e.g., from their mail client's "View Source" option).
  2. Compare it to the raw source of the email as it left your signing system (if you have access to your outbound mail logs or a test mailbox that receives an unmodified version).
  3. Look for discrepancies in the body content. Tools like diff on extracted body parts can highlight changes.

For example, if you save the raw email to email.eml, you might extract the body (ignoring headers) and look for added content.

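# Example of extracting the body (everything after the first empty line) from a raw email file
# Note: This is a simplification; it does not decode MIME parts, so encoded bodies are compared as raw text.
# The \r? lets the pattern match the empty line in files saved with CRLF line endings.
awk 'body { print } /^\r?$/ { body = 1 }' email.eml > email_body_received.txt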

# If you have an "original" body (e.g., from an internal system before modification)
# you can compare them:
diff email_body_original.txt email_body_received.txt

This diff command would quickly show you any added lines or modified characters.
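
Whitespace and line-ending changes are easy to miss in plain diff output. As a rough alternative, the sketch below splits each raw message at the first empty line and diffs the bodies using repr(), so trailing spaces and \r\n sequences are visible. The helper names and sample messages are made up for illustration.

```python
import difflib


def split_raw_body(raw: bytes) -> bytes:
    """Return everything after the first empty line of a raw message."""
    candidates = [(raw.find(sep), len(sep)) for sep in (b"\r\n\r\n", b"\n\n")]
    found = [(pos, size) for pos, size in candidates if pos != -1]
    if not found:
        return b""  # headers only, no body
    pos, size = min(found)  # earliest separator wins
    return raw[pos + size:]


def visible_diff(original: bytes, received: bytes) -> list[str]:
    """Diff two bodies line by line, with whitespace and line endings made visible."""
    a = [repr(line) for line in original.splitlines(keepends=True)]
    b = [repr(line) for line in received.splitlines(keepends=True)]
    return list(difflib.unified_diff(a, b, "signed", "received", lineterm=""))


# Hypothetical messages: the received copy gained one trailing space.
sent = b"Subject: hi\r\n\r\nHi John,\r\nThanks\r\n"
got = b"Subject: hi\r\n\r\nHi John, \r\nThanks\r\n"

for line in visible_diff(split_raw_body(sent), split_raw_body(got)):
    print(line)
```

A trailing space like this one would break a signature that uses simple body canonicalization, while relaxed canonicalization would ignore it.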

2. Incorrect Canonicalization

DKIM specifies two canonicalization algorithms for the body: simple and relaxed. The chosen algorithm is indicated in the c= tag of the DKIM-Signature header (e.g., c=relaxed/relaxed for header/body).
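
When inspecting a failing message, the first step is usually to read the c= tag. The sketch below shows one way to pull it out of a DKIM-Signature value, applying the defaults from RFC 6376: a missing c= tag means simple/simple, and a single value applies to the headers while the body falls back to simple. The function names and the sample header value are hypothetical.

```python
def parse_dkim_tags(header_value: str) -> dict[str, str]:
    """Split a DKIM-Signature header value into its tag=value pairs."""
    tags = {}
    for part in header_value.split(";"):
        # Split on the first "=" only, since base64 values may end in "=" padding.
        name, sep, value = part.partition("=")
        if sep:
            tags[name.strip()] = value.strip()
    return tags


def canonicalization(tags: dict[str, str]) -> tuple[str, str]:
    """Return (header_algorithm, body_algorithm) from the c= tag."""
    header_alg, _, body_alg = tags.get("c", "simple/simple").partition("/")
    # With a single value such as c=relaxed, the body algorithm defaults to "simple".
    return header_alg.strip(), (body_alg.strip() or "simple")


# Hypothetical header value, trimmed to the tags relevant here.
sig = "v=1; a=rsa-sha256; c=relaxed/simple; d=example.com; s=sel1"
print(canonicalization(parse_dkim_tags(sig)))  # ('relaxed', 'simple')
```

The second element of the result is the algorithm that governs the body hash.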

  • simple canonicalization: Very strict. It tolerates no changes to whitespace within the body; the only thing it ignores is empty lines at the very end of the body. Any modification