Email obfuscation: What works in 2026?

(spencermortensen.com)

72 points | by jaden 4 hours ago

9 comments

  • ciroduran 21 minutes ago
    I stopped being concerned about email harvesting years ago; I simply leave my email address on my website. Spam handling is okay enough, I guess.

    But I like this review of techniques. Even the simplest ones are very effective, which surprised me.

  • fmajid 36 minutes ago
    I use SVG where I created a text object in Affinity Designer and converted it to curves so the SVG doesn't have text any more, just vectors for the glyphs of it. Seems to work pretty well at keeping spammers at bay.
  • bit1993 2 hours ago
    Good stuff, but I think the title should be "Email address obfuscation". Thank you for sharing, I guess, but spammers can now learn from this too (:
    • ghywertelling 19 minutes ago
      https://www.gregegan.net/

      Contact details: [any mailbox] [at] [the domain name of this web site]. Please don’t ask me to give interviews, sign books, appear on podcasts, attend conferences or conventions, or provide feedback or endorsements for works of fiction, scientific theories, or slabs of text disgorged by chatbots.

      I have no idea how to decipher this obfuscation.

  • dandersch 1 hour ago
    Very interesting. It seems that for his own email the author has opted for a combination of the CSS display: none technique and an XOR cipher:

      <span class="hidden email"><b>999a8f84898f98</b>aa<b>878b8386c4</b>999a8f84898f988785989e8f84998f84c4898587</span>
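
A minimal sketch of the XOR half of that idea, for readers who want to see the mechanics. It assumes a single-byte key applied to every byte and a hex-encoded result; the key value, the example address, and both function names are hypothetical and are not taken from the article. The display: none part (hidden decoy spans) is not modeled here.

```python
def xor_hex_encode(address: str, key: int) -> str:
    """XOR every byte of the address with a one-byte key and return hex text."""
    return bytes(b ^ key for b in address.encode("utf-8")).hex()


def xor_hex_decode(payload: str, key: int) -> str:
    """Reverse xor_hex_encode: XOR is its own inverse when the key is the same."""
    return bytes(b ^ key for b in bytes.fromhex(payload)).decode("utf-8")


KEY = 0x5A  # hypothetical key, chosen only for this illustration
encoded = xor_hex_encode("user@example.com", KEY)
decoded = xor_hex_decode(encoded, KEY)
```

On a real page the decoding step would run in the visitor's browser, so a harvester that does not execute scripts only ever sees the hex string.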
  • newscracker 2 hours ago
    > HTML entities are often decoded automatically by server-side libraries, which means that even the most basic harvesters can get your email addresses without any special effort. This technique should be worthless—and, yet, it still stops most harvesters.

    Anecdotal, but I’ve used HTML entities on a public static website for a long time, in a link whose href attribute uses mailto, and yet I’ve not seen any spam.

    I guess any spammer who uses some level of GenAI to process and extract email addresses would have a lot more success against all the methods listed in this article.
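
A rough sketch of the HTML entity technique discussed in this comment, and of why the quoted article calls it trivially reversible. The address and the helper name to_entities are hypothetical; html.unescape is from the Python standard library.

```python
import html


def to_entities(text: str) -> str:
    """Replace every character with its decimal HTML character reference."""
    return "".join(f"&#{ord(ch)};" for ch in text)


address = "user@example.com"  # hypothetical address
encoded = to_entities(address)

# The markup a static page would serve: no literal "@" appears in the source.
link = f'<a href="mailto:{encoded}">{encoded}</a>'

# Any harvester that decodes entities recovers the address in one call.
recovered = html.unescape(encoded)
```

A browser renders the link normally, while a scraper that pattern-matches on the raw source finds nothing that looks like an address unless it decodes entities first.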

    • ciroduran 20 minutes ago
      I wouldn't think it's very cost-effective to apply GenAI to extract email addresses.
  • _ache_ 2 hours ago
    I'm sorry, but that is not how email addresses are spammed in bulk.

    The data sources are the enormous data breaches that are more and more frequent. There is more incentive to collect more information on someone you already know something about than to spam an email address you don't even know is valid.

    The spam can also be much more effective, as it presents itself with personal information about the person being spammed.

    • curiousObject 1 hour ago
      The OP put those addresses on that web page, and only on that web page. Some addresses received spam.

      Edit: that’s not to deny that big data leaks are a serious problem

  • gfody 1 hour ago
    I filter out everything that does NOT include “+asdf” in the To: address.
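
One way such a rule could look, as a sketch only: it assumes the check is "some To: recipient has the tag in the local part of the address". The function name keep_message and the sample messages are hypothetical; the parsing uses the standard library email package.

```python
from email import message_from_string
from email.utils import getaddresses

REQUIRED_TAG = "+asdf"  # the tag mentioned in the comment


def keep_message(raw_message: str, tag: str = REQUIRED_TAG) -> bool:
    """Return True if any To: recipient carries the tag before the '@'."""
    msg = message_from_string(raw_message)
    recipients = getaddresses(msg.get_all("To", []))
    for _display_name, addr in recipients:
        # rpartition yields an empty local part when there is no '@'.
        local_part = addr.rpartition("@")[0]
        if tag in local_part:
            return True
    return False


wanted = "To: Me <me+asdf@example.com>\nSubject: hi\n\nbody\n"
unwanted = "To: me@example.com\nSubject: offer\n\nbody\n"
results = (keep_message(wanted), keep_message(unwanted))
```

Mail sent via Bcc, or through lists that rewrite the To: header, would fail this check, so a real rule would likely need an allowlist as well.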
  • jwr 15 minutes ago
    This is such a waste of effort. Your E-mail address is not and can't be a secret. It will get into spammer databases eventually, no matter what you do. You will spend a lot of effort doing all these fancy tricks, and eventually you will get spam anyway.

    Also, a note to those who make fancy "me+someservice@somedomain.com" addresses: make really sure you are in control and these work. Some services (including mine) will need to E-mail you one day, for example to tell you that your account will be deleted because of inactivity. If you don't receive that E-mail because of your fancy spam defenses, your account will be deleted. I've seen people hurt themselves like this and it makes me sad.

    On a constructive note: what works very well is spam filtering using LLMs. We have AI to help us with this problem today. I wrote an LLM despammer tool which processes my inbox via IMAP using a local LLM (for privacy reasons). I see >97% accuracy in my benchmarks on my (very difficult) testing corpus. It's nearly perfect in real life usage. I've tested many local models in the 4-32B range and the top practical choice is gpt-oss:20b (GGUF, I run it from LM Studio, MLX quantizations are worse) — not only does it perform very well, but it's also really fast.