How I Built a Context-Aware OpenAI Spam Scanner Inside WHM MailScanner on cPanel/AlmaLinux

Jeff | Technical Articles & Notes

I host email for a lot of domains, which means I also host a lot of rubbish.

SpamAssassin does a good job. MailScanner does a good job. But there is a category of mail that sits in the awkward middle ground: not obviously clean, not obviously malicious, and not always handled as well as I’d like by static rules alone.

So I built a custom spam-scoring layer inside MailScanner which can use OpenAI’s API as a bounded second opinion, while still relying heavily on local rules, trusted sender logic, recipient-specific context, and hard short-circuits to avoid pointless API calls.

This is not “replace SpamAssassin with AI”.

That would be daft.

This is “keep the normal mail stack, then add a constrained extra decision layer for the murky stuff”.

The overall design

The scanner lives in MailScanner’s custom spam scanner hook, in:

GenericSpamScanner.pm

That hook returns two things to MailScanner:

  • a numeric score
  • a short one-line report

That means it can influence the final MailScanner spam decision without replacing SpamAssassin or hacking MailScanner to bits.

The flow is broadly:

  1. Mail arrives
  2. MailScanner processes it normally
  3. My custom scanner inspects it
  4. Local short-circuit rules run first
  5. Only if it is still worth asking does it call OpenAI
  6. The model returns a structured score and classification
  7. That gets converted into a bounded MailScanner score adjustment

The important word there is bounded. The model does not get to run wild.
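
As a rough illustration of that ordering, here is a minimal sketch rather than the live module. The check names, the keys in the message hash, the placeholder scores and the `ask_model` stub are all hypothetical. The only thing it is meant to show is that cheap local checks run in a fixed order and the remote call happens last, if at all.

```perl
use strict;
use warnings;

# Hypothetical local checks. Each returns (score, reason) to stop early,
# or an empty list to let the next check run.
my @local_checks = (
    sub { my ($m) = @_; $m->{from_localhost} ? (0, 'localhost_skip') : () },
    sub { my ($m) = @_; $m->{whitelisted}    ? (0, 'whitelist_skip') : () },
    sub { my ($m) = @_; $m->{bad_text_hit}   ? (5, 'bad_text')       : () },
);

# Stand-in for the remote call; a real scanner would call the API here.
sub ask_model { return (0.4, 'model_opinion') }

# Returns the same two things the hook hands back to MailScanner:
# a numeric score and a short one-line report.
sub scan_message {
    my ($msg) = @_;
    for my $check (@local_checks) {
        my ($score, $why) = $check->($msg);
        return ($score, "scanner: $why") if defined $why;
    }
    my ($score, $why) = ask_model($msg);
    return ($score, "scanner: $why");
}

my ($score, $report) = scan_message({ bad_text_hit => 1 });
print "$score $report\n";
```

Keeping the checks in an array makes the order explicit and easy to change without touching the fallback.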

Why I built it this way

Hosted mail is messy.

I usually do not know the senders. I only vaguely know the recipients. And the server handles mail for many unrelated domains, all with different normal patterns of communication.

A plumbing business, a holiday cottage business, a consultant, a local shop and a school do not receive the same sort of legitimate email. Treating them all the same is a good way to make worse decisions.

So I wanted something that could:

  • trust genuinely trusted mail hard
  • skip expensive lookups wherever possible
  • use per-recipient or per-domain context
  • score suspicious edge cases more intelligently
  • stay tightly constrained so one bad model call cannot wreck delivery

The key files and directories

I am deliberately not publishing every literal live path exactly as it exists on my server. There is no point giving the internet a neat little treasure map of my box. But these are the key locations in a setup like this, and they reflect the real structure closely enough to be useful.

MailScanner root:

/opt/mailscanner

Custom scanner location:

/opt/mailscanner/lib/custom/GenericSpamScanner.pm

Rules directory:

/opt/mailscanner/etc/rules

Main scanner config / secrets env file:

/opt/mailscanner/etc/secrets/openai.env

Fallback env file:

/etc/mailscanner/openai.env

Recipient-context profiles:

/opt/mailscanner/etc/secrets/recipient-context/*.env

Trusted authenticated sender whitelist:

/opt/mailscanner/etc/secrets/trusted.auth.whitelist.rules

Fast local bad-text rules:

/opt/mailscanner/etc/secrets/spam.badtext.rules

Standard MailScanner whitelist rules:

/opt/mailscanner/etc/rules/spam.whitelist.rules
/opt/mailscanner/etc/rules/spam.whitelist.rules.custom

Cooldown marker file for backing off OpenAI calls after failures:

/var/spool/mailscanner/openai.cooldown

Debug logging for recipient-context decisions:

/var/log/mailscanner/openai-recipient-context-debug.jsonl

Debug logging for schema parse failures:

/var/log/mailscanner/openai-schema-parse-fail.jsonl

Permissions matter

This is one of those gloriously boring details that causes ridiculous amounts of breakage.

MailScanner needs to be able to read the files it depends on. That sounds obvious, but if your service user cannot traverse the directory or read the file, your “clever” system will quietly become your “broken” system.

A sensible pattern is:

  • /opt/mailscanner/etc/rules: readable/traversable by the MailScanner runtime
  • /opt/mailscanner/etc/secrets/openai.env: readable by the MailScanner runtime but not world-readable
  • /opt/mailscanner/etc/secrets/recipient-context/: likewise readable but not broadly exposed

The general principle is simple:

  • directories must be traversable
  • files must be readable
  • secret files should not be world-readable
  • rule files should only be writable by the administrator
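
A quick way to check those principles for a secret file is sketched below. It is a minimal example, not part of the live scanner: the function name `secret_file_problems` is hypothetical, and it should be run as the MailScanner service user so that the readability test reflects what the scanner actually sees.

```perl
use strict;
use warnings;
use Fcntl qw(:mode);

# Returns a list of problems for a secret file; an empty list means it looks OK.
sub secret_file_problems {
    my ($path) = @_;
    my @st = stat($path) or return ("cannot stat $path: $!");
    my $mode = $st[2];
    my @problems;
    push @problems, "$path is not readable by this user" unless -r $path;
    push @problems, "$path is world-readable" if $mode & S_IROTH;
    push @problems, "$path is world-writable" if $mode & S_IWOTH;
    return @problems;
}

# Example: report anything wrong with the main env file.
print "$_\n" for secret_file_problems('/opt/mailscanner/etc/secrets/openai.env');
```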

In my case the live rules area has been locked down pretty tightly, and the scanner was built around that reality rather than assuming a nice loose default layout.

What happens before OpenAI is even considered

This is where the current live version became much better than the early versions.

The scanner does not just fling every message at an API and hope for the best. It has a pile of local short-circuit logic first.

1. Localhost skip

If a message is local infrastructure mail, there is usually no point paying to analyse it. So localhost-originated mail can be skipped outright.

2. From-domain skip list

Certain sender domains can be excluded from OpenAI checks entirely. That is useful for internal systems, relays, known automations and other traffic where an API call adds no value.

3. Trusted authenticated whitelist

This is one of the strongest trust paths.

If SPF and DKIM pass and align properly, and the sender matches a trusted authenticated whitelist rule, the scanner can force a strong legitimate score without any OpenAI call.

That matters because when a sender is both properly authenticated and explicitly trusted, I want that to count heavily.

4. Standard whitelist checks

The scanner also checks the existing MailScanner whitelist rules early. That avoids wasting API calls on mail that local policy has already decided is safe enough to skip.

5. Bad-text rules

There is also a fast local bad-text mechanism for recurring rubbish.

Sometimes you do not need an LLM. Sometimes you just need to say “if this horrible phrase or header pattern appears again, hammer it”.

And frankly that is fine.

6. Header-aware bad-text matching

The live scanner also supports header-aware bad-text rules, not just body text matching. So a rule can match text in the message headers as well as subject/body strings.

That is extremely handy for recurring spam runs with recognisable technical fingerprints.

7. Newsletter / mailing-list fast path

Authenticated newsletters and list mail can often be recognised without asking OpenAI anything. They may still be unwanted, but they are not usually the most interesting edge cases.
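
A minimal sketch of that kind of fast path follows. It assumes the headers have already been parsed into a hash with lower-cased names, which is a hypothetical shape, and it uses the standard List-Id, List-Unsubscribe and Precedence headers as signals. The live scanner's exact criteria may well differ.

```perl
use strict;
use warnings;

# $headers: hashref of lower-cased header names to values (hypothetical shape).
# $auth_ok: true if the message already passed authentication checks.
sub looks_like_list_mail {
    my ($headers, $auth_ok) = @_;
    return 0 unless $auth_ok;
    return 1 if defined $headers->{'list-id'};
    return 1 if defined $headers->{'list-unsubscribe'};
    return 1 if ($headers->{'precedence'} // '') =~ /^(?:bulk|list)$/i;
    return 0;
}
```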

8. Force-legit guardrail

There is also a protection path for messages that look very safe.

If mail is authenticated, has no URLs, no suspicious attachments and no obvious high-risk characteristics, the scanner can lean legitimate without making an API call.

That is important because the nastiest mistake in mail filtering is often not “spam got through”. It is “real mail got treated as junk”.
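
In sketch form, that guardrail is just a conservative all-conditions-must-hold test. The feature hash keys and the list of risky attachment extensions below are assumptions made for illustration, not the live values.

```perl
use strict;
use warnings;

# $feat: hypothetical feature hash built earlier in the scan.
# Returns 1 only when every "looks very safe" condition holds.
sub looks_force_legit {
    my ($feat) = @_;
    return 0 unless $feat->{spf_pass} && $feat->{dkim_pass};
    return 0 if @{ $feat->{urls} || [] };
    return 0 if grep { /\.(?:exe|js|scr|iso|html?)$/i } @{ $feat->{attachments} || [] };
    return 1;
}
```

Any single failed condition drops the message back into the normal path, so the guardrail can only ever skip an API call for mail that already looks boring.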

Recipient context files

This is one of my favourite features in the live version.

The scanner can load little per-domain or per-recipient context files from:

/opt/mailscanner/etc/secrets/recipient-context/*.env

These describe what is normal for particular hosted accounts, domains or recipients. They can include things like:

  • what sort of business the domain belongs to
  • what kind of mail is expected
  • which recipients should match exactly
  • weak prior context which should help classification without overriding obvious scam signals

That last part matters.

The scanner does not treat recipient context as absolute truth. It treats it as a weak prior. That is the right design. Context should help judgement, not excuse obvious nonsense.

A trimmed excerpt from the live code looks like this:

my $rcx_matches = _recipient_context_matches($to);
my %rcx_feat = _recipient_context_feat($rcx_matches);

And the profile loading is driven from a configurable path pattern:

my $paths = _cfg('AETHER_RECIPIENT_CONTEXT_FILES', '/opt/mailscanner/etc/secrets/recipient-context/*.env');

The point is simple: the same email can be normal for one recipient and suspicious for another. Context makes that visible.
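
To make the matching idea concrete, here is a minimal sketch that assumes the profiles have already been parsed into hashes. The profile fields (`name`, `domain`, `recipients`, `context`), the example addresses and the function name are all hypothetical, and the real `_recipient_context_matches` may behave differently.

```perl
use strict;
use warnings;

# Hypothetical in-memory profiles, as if already parsed from the *.env files.
my @profiles = (
    { name => 'plumber-domain', domain => 'example-plumbing.test',
      context => 'Small plumbing business; expects quotes, invoices, supplier mail.' },
    { name => 'plumber-accounts', recipients => ['accounts@example-plumbing.test'],
      context => 'Accounts mailbox; invoices and remittance advice are normal.' },
);

# Returns matching profiles for one recipient, exact-address matches first.
sub context_for_recipient {
    my ($rcpt) = @_;
    $rcpt = lc $rcpt;
    my ($domain) = $rcpt =~ /\@([^\@]+)$/;
    my (@exact, @by_domain);
    for my $p (@profiles) {
        if (grep { lc($_) eq $rcpt } @{ $p->{recipients} || [] }) {
            push @exact, $p;
        }
        elsif (defined $domain && lc($p->{domain} // '') eq $domain) {
            push @by_domain, $p;
        }
    }
    return (@exact, @by_domain);
}
```

Putting exact matches first lets a mailbox-specific profile take priority over the more general domain profile when both apply.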

How configuration is loaded

The scanner reads its configuration from env-style files, preferring the secrets location and falling back if needed.

A representative excerpt:

my @CFG_FILES = (
    '/opt/mailscanner/etc/secrets/openai.env',
    '/etc/mailscanner/openai.env',
);

That config holds things like:

  • the OpenAI endpoint
  • the model name
  • the API key
  • timeouts
  • input size caps
  • score limits
  • confidence handling
  • skip toggles
  • cooldown behaviour

In other words, the Perl module is not where I want lots of tuning values hard-coded forever.
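
A minimal sketch of that loading pattern is below. It assumes simple KEY=VALUE lines and a "first readable file wins" policy. The function names `load_env_config` and `cfg_get` are hypothetical stand-ins, and the live `_cfg` helper may merge files or handle quoting differently.

```perl
use strict;
use warnings;

# Parse KEY=VALUE lines from the first readable file in the list.
# Later files act only as a fallback.
sub load_env_config {
    my (@files) = @_;
    for my $file (@files) {
        open(my $fh, '<', $file) or next;
        my %cfg;
        while (my $line = <$fh>) {
            chomp $line;
            next if $line =~ /^\s*(?:#|$)/;    # skip comments and blank lines
            next unless $line =~ /^\s*([A-Za-z_][A-Za-z0-9_]*)\s*=\s*(.*?)\s*$/;
            my ($k, $v) = ($1, $2);
            $v =~ s/^(["'])(.*)\1$/$2/;        # strip one pair of matching quotes
            $cfg{$k} = $v;
        }
        close $fh;
        return \%cfg;
    }
    return {};
}

# Lookup with a default; empty values fall back to the default too.
sub cfg_get {
    my ($cfg, $key, $default) = @_;
    return (defined $cfg->{$key} && $cfg->{$key} ne '') ? $cfg->{$key} : $default;
}
```

Usage would look something like `my $cfg = load_env_config(@CFG_FILES);` followed by `cfg_get($cfg, 'AETHER_MAX_URLS', 25)`.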

What gets sent to OpenAI

I do not just throw an entire raw email blob at the API.

That would be wasteful, expensive and sloppy.

The scanner builds a compact prompt from selected features such as:

  • chosen headers
  • sender and recipient information
  • authentication results
  • URLs
  • trimmed body text
  • attachment metadata
  • recipient context, if available

The live code explicitly caps input size and URL count. For example:

my $input_max = int(_cfg('AETHER_INPUT_MAX_BYTES', 12000));

my $max_urls = int(_cfg('AETHER_MAX_URLS', 25));

That is deliberate. The aim is to give the model enough signal to be useful, without paying to send mountains of rubbish it does not need.
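
The capping itself can be sketched like this. The helper names `cap_urls` and `cap_bytes` are hypothetical, and treating the body as decoded text that is trimmed on a UTF-8 character boundary is an assumption. A scanner working on raw bytes could simply use `substr`.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Keep at most $max_urls distinct URLs, preserving first-seen order.
sub cap_urls {
    my ($max_urls, @urls) = @_;
    my (%seen, @kept);
    for my $u (@urls) {
        next if $seen{$u}++;
        last if @kept >= $max_urls;
        push @kept, $u;
    }
    return @kept;
}

# Trim text to at most $max_bytes of UTF-8 without splitting a character.
sub cap_bytes {
    my ($text, $max_bytes) = @_;
    my $bytes = encode('UTF-8', $text);
    return $text if length($bytes) <= $max_bytes;
    $bytes = substr($bytes, 0, $max_bytes);
    # FB_QUIET stops at a trailing partial character instead of dying.
    return decode('UTF-8', $bytes, Encode::FB_QUIET());
}
```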

What the OpenAI call returns

The model is asked for structured output rather than a fluffy essay.

That output is then converted into a MailScanner score contribution.

Again, the key point is that it is bounded.

It contributes to the decision. It does not become the decision.
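
Because the reply is structured, it can be validated strictly before it is allowed anywhere near a score. The sketch below assumes a hypothetical schema with `spam_score`, `classification` and `confidence` fields and a made-up list of allowed classifications. The live schema is not shown in this article, so treat every field name here as illustrative.

```perl
use strict;
use warnings;
use JSON::PP ();

# Hypothetical set of classifications the scanner is willing to accept.
my %ALLOWED = map { $_ => 1 } qw(legit spam phishing newsletter unsure);

# Returns a validated hashref, or undef if the reply is unusable.
sub parse_model_reply {
    my ($raw) = @_;
    my $data = eval { JSON::PP->new->utf8->decode($raw) };
    return undef unless ref($data) eq 'HASH';
    my ($s, $c, $conf) = @{$data}{qw(spam_score classification confidence)};
    return undef unless defined $s && !ref($s) && $s =~ /^-?\d+(?:\.\d+)?$/;
    return undef unless defined $c && !ref($c) && $ALLOWED{$c};
    # A missing or odd confidence is treated as zero rather than rejected.
    $conf = 0 unless defined $conf && !ref($conf)
        && $conf =~ /^\d+(?:\.\d+)?$/ && $conf <= 1;
    return { spam_score => $s + 0, classification => $c, confidence => $conf + 0 };
}
```

Anything that fails validation is simply ignored, which is the safe default for a layer that is only ever a second opinion.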

How the score is constrained

One of the most important design choices in a system like this is to limit how much influence the model gets.

If you let one model call swing the score too hard, eventually it will do something stupid at exactly the wrong time.

So the scanner maps the model’s output into a controlled MailScanner score range. During testing and cautious rollout I have deliberately kept that influence modest.

A simplified example of the sort of thing this looks like is:

# Scale the model's score, then clamp it to the range -1.0 to 1.0.
my $ms_score = $spam_score / 100.0;
$ms_score = 1.0 if $ms_score > 1.0;
$ms_score = -1.0 if $ms_score < -1.0;

That is a much saner live starting point than allowing the model to swing classification by a huge amount on day one.

Example: trusted authenticated whitelist short-circuit

This is one of the more important bits of logic in the scanner because it shows the philosophy of the whole design.

If the message is authenticated and matches trusted rules, I do not want to pay for an API opinion on whether it might somehow be dodgy anyway.

I want to trust it hard and move on.

The live code is built around that idea. In simplified form it looks like:

if (_truthy(_cfg('AETHER_TRUSTED_AUTH_ENABLE', '1')))
{
    my $twhy = _is_trusted_auth_whitelisted($hdrs, $from, $to);

    if ($twhy)
    {
        my $score = _num(_cfg('AETHER_TRUSTED_AUTH_SCORE', '-15'));
        return ($score, "AetherwebSpamScanner: trusted_auth_whitelist");
    }
}

That is one of the reasons the scanner works sensibly in production rather than behaving like an excitable intern.

Example: header-aware bad-text rule loading

The live version supports special prefixes for header-aware matching, which is genuinely useful for recurring campaigns with recognisable message structure.

A representative excerpt:

# Prefix format for header-aware matching:
#   header:<text>
#   hdr:<text>
# This includes headers in addition to subject/body.

if ($line =~ /^(?:header|hdr)\s*:\s*(.+)$/i) { ... }

That is a small feature, but a very practical one.
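
For illustration, a minimal sketch of loading and applying rules of that shape is below. It reuses the same prefix pattern, but the function names and the choice of case-insensitive substring matching are assumptions. The live scanner may match differently.

```perl
use strict;
use warnings;

# Split rule lines into plain rules (subject/body only) and header-aware
# rules (headers as well). Blank lines and # comments are skipped.
sub parse_badtext_rules {
    my (@lines) = @_;
    my (@plain, @with_headers);
    for my $line (@lines) {
        $line =~ s/^\s+|\s+$//g;
        next if $line eq '' || $line =~ /^#/;
        if ($line =~ /^(?:header|hdr)\s*:\s*(.+)$/i) {
            push @with_headers, lc $1;
        }
        else {
            push @plain, lc $line;
        }
    }
    return (\@plain, \@with_headers);
}

# Case-insensitive substring match. Returns the first rule that hits, or undef.
sub badtext_hit {
    my ($plain, $with_headers, $headers, $subject_body) = @_;
    my $sb  = lc $subject_body;
    my $all = lc "$headers\n$subject_body";
    for my $r (@$plain)        { return $r          if index($sb,  $r) >= 0 }
    for my $r (@$with_headers) { return "header:$r" if index($all, $r) >= 0 }
    return undef;
}
```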

Cooldown and failure handling

External APIs are external APIs. Sometimes they wobble, time out, return nonsense, or just have a funny five minutes.

The scanner therefore includes cooldown behaviour rather than blindly hammering the API over and over again if there is an upstream problem.

This is important for both cost control and operational sanity.

A representative location for the cooldown marker is:

/var/spool/mailscanner/openai.cooldown

That way transient trouble does not instantly become log spam and pointless repeated requests.
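
One simple way to implement a marker like that is to use the file's modification time as the clock. This is a minimal sketch under that assumption. The function names are hypothetical and the live scanner's cooldown logic may store more than a timestamp.

```perl
use strict;
use warnings;

# True while the marker file exists and is younger than $seconds.
sub in_cooldown {
    my ($marker, $seconds) = @_;
    my $mtime = (stat($marker))[9];
    return 0 unless defined $mtime;
    return (time() - $mtime) < $seconds ? 1 : 0;
}

# Create or refresh the marker after an API failure.
sub start_cooldown {
    my ($marker) = @_;
    open(my $fh, '>', $marker) or return 0;
    print {$fh} time(), "\n";
    close $fh;
    return 1;
}
```

The scanner would call `in_cooldown` before any API request and `start_cooldown` whenever a request times out or returns an error, so a wobbling upstream costs one failed call rather than hundreds.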

Logging and debugging

One thing I learned quickly is that giant diagnostic blobs do not belong in message headers.

Technically, yes, you can cram enormous chunks of debug text into headers.

Practically, it becomes horrible, encoded, ugly and annoying.

So the scanner keeps the MailScanner-visible report short and pushes deeper diagnostics to proper logs.

The recipient-context and schema-failure JSONL logs are especially useful because they are easy to grep, easy to analyse and do not turn every processed message into a landfill.
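
Writing JSONL is pleasantly dull: one JSON object per line, appended. A minimal sketch follows. The function name `log_jsonl` and the event fields are hypothetical, and the file lock is there on the assumption that several MailScanner child processes may write to the same log.

```perl
use strict;
use warnings;
use JSON::PP ();
use Fcntl qw(:flock);

# Append one event as a single JSON line. canonical() sorts the keys,
# which keeps lines stable and easy to compare.
sub log_jsonl {
    my ($path, $event) = @_;
    my $line = JSON::PP->new->utf8->canonical->encode({ ts => time(), %$event });
    open(my $fh, '>>', $path) or return 0;
    flock($fh, LOCK_EX);
    print {$fh} $line, "\n";
    close $fh;
    return 1;
}
```

Embedded newlines are escaped by the JSON encoder, so each event stays on one line and `grep` keeps working.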

Why this works better than a naive AI mail filter

Because it is not naive.

The useful part is not “there is AI in it”. The useful part is the surrounding discipline:

  • local rules first
  • trusted authentication respected
  • recipient context used carefully
  • known rubbish blocked cheaply
  • known good mail spared expensive lookups
  • the model’s influence tightly bounded

That is the difference between something genuinely practical and something that sounds clever until it ruins someone’s invoice email.

Would I recommend building it this way?

Yes, with one big caveat.

Do not hand too much power to the model.

If you treat it as an all-knowing judge, you are asking for trouble. If you treat it as a tightly boxed scoring assistant sitting alongside MailScanner and SpamAssassin, it becomes much more useful.

That is really the whole design philosophy of this system.

Final thoughts

The current live scanner is a lot more capable than the first rough version.

It now has:

  • context-aware recipient profiles
  • standard whitelist short-circuits
  • trusted authenticated whitelisting
  • header-aware bad-text rules
  • skip-domain logic
  • newsletter/list fast paths
  • force-legit guardrails
  • cooldown handling
  • structured debugging
  • bounded score contribution

Which is exactly how it should be.

The whole point is not to replace the traditional mail stack. It is to improve judgement on the awkward cases while staying fast, cheap and cautious everywhere else.

That balance matters.

Want help setting something similar up?

If you are running WHM MailScanner and want something similar built or adapted for your own server, feel free to contact me...

I can help with the architecture, the MailScanner integration, the rule layout, the OpenAI/API side, and the practical reality of getting it working sensibly on a live hosted mail server without doing anything daft.