Techie Nerd, Inc.



The Despaminator Spambot Vaccine

Protect Your Email Addresses from Spambots

Placing your email address on a web page results in more spam than any other activity:

http://www.cdt.org/speech/spam/030319spamreport.shtml
This is because spambots, programs that harvest email addresses from web pages, are now the main source of spammable email addresses.

Of course, you can avoid exposing your email address to spambots by not placing it anywhere on the web. For example, you may:

  • provide web site visitors with a feedback form to reach you, rather than a "mailto:" URL, or
  • provide a phone or fax number instead of an email address.

Often, though, you really do need to publish an email address. How can you make it accessible to people viewing your web site, but inaccessible to spambots?

Humans reading a web page differ from a spambot reading a page in several ways that can be exploited to make email addresses accessible to humans but inaccessible (or nearly inaccessible) to spambots:

  • humans have greater intelligence, and can participate in more complex interactions, than spambots,
  • humans can interpret graphical features of a web page that a spambot cannot,
  • whereas a spambot is just a simple, text-oriented HTML filter, humans normally use a feature-rich web browser that supports multimedia and plugins such as Flash and Java, and
  • unlike a spambot, a full-featured web browser can interpret JavaScript.

Intelligence and Interactivity

One could create an interactive challenge-and-response system that only a human could pass, but few humans would have the patience to endure it when all they want is an email address. We need a more convenient mechanism.

Graphical Features

Display the address as an image that a sighted human can read but a spambot cannot (blind visitors are shut out, too). The image can be a static graphic, or it can be generated dynamically by a server-side script (CGI, JSP, ASP, etc.) or a client-side program (Flash, Java). For example, here is a static graphic for an email address:

[Image: static graphic of the email address; alt text "readmeifyoucan"]

BaffleText is a more advanced form of this technique.

An alternative, text-oriented technique is to create a table that aligns the pieces of an email address so that a human can read it, but a spambot cannot. For example, the table:

<table border="0" cellspacing="0" cellpadding="0">
   <tr>
     <td>read</td><td>me@if</td><td>youca</td><td>n.com</td>
   </tr>
</table>

is rendered as:

readme@ifyoucan.com

Of course, neither the image nor text layout techniques can produce a clickable email link, so they are less than ideal even if they successfully hide the email address from the spambots. Worse, a slightly smarter spambot could decipher the email address embedded in the table layout, so that will not be a good long-term technique.
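To see why the table layout is fragile, consider the following minimal sketch of what a "slightly smarter" spambot might do. It is an illustration only: the function name harvestFromMarkup and the deliberately simple address pattern are assumptions, not a description of any real harvester.

```javascript
// Hypothetical harvester step: strip the markup, then look for
// address-shaped text in what remains.
function harvestFromMarkup(html) {
  // Remove every tag without inserting whitespace, so the contents of
  // adjacent table cells run together again.
  const text = html.replace(/<[^>]*>/g, "");
  // Deliberately simple address pattern (an assumption for this sketch).
  const pattern = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
  return text.match(pattern) || [];
}

const table = [
  '<table border="0" cellspacing="0" cellpadding="0">',
  "   <tr>",
  "     <td>read</td><td>me@if</td><td>youca</td><td>n.com</td>",
  "   </tr>",
  "</table>",
].join("\n");

console.log(harvestFromMarkup(table)); // [ 'readme@ifyoucan.com' ]
```

Because the cells sit side by side in the source, deleting the tags is enough to reassemble the address.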

Multimedia and Plugins

One could create an audio or video clip to convey one's email address, and a spambot would almost certainly be unable to harvest it. Unfortunately, the email address would not be clickable, so this technique is also less than ideal.

Both Flash and Java interpreters are built into most modern web browsers. When coded properly, they require only small downloads and can display clickable links that spambots are unlikely to harvest. The programmer should obfuscate the construction of the email address rather than hardcoding it directly, though. Otherwise, a slightly smarter spambot could download the Flash .swf or Java .class file and run a string search on it (for example, with a tool like the Unix "strings" utility) to extract its email address(es).
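The difference between a hardcoded and an obfuscated address can be sketched as follows. This is a toy model written in JavaScript rather than Flash or Java: printableRuns imitates a string search over a file's bytes, and the one-byte XOR key is a hypothetical obfuscation chosen only because it is short. The byte arrays are stand-ins for a compiled file, not real .swf or .class contents.

```javascript
// Collect runs of printable ASCII at least minLen bytes long, in the
// spirit of a string search over a binary file.
function printableRuns(bytes, minLen = 4) {
  const runs = [];
  let current = "";
  for (const b of bytes) {
    if (b >= 0x20 && b <= 0x7e) {
      current += String.fromCharCode(b);
      continue;
    }
    if (current.length >= minLen) runs.push(current);
    current = "";
  }
  if (current.length >= minLen) runs.push(current);
  return runs;
}

// Toy obfuscation, for illustration only: XOR every character code with
// a one-byte key. Applying the same key again decodes it.
function xorEncode(text, key) {
  return text.split("").map((ch) => ch.charCodeAt(0) ^ key);
}
function xorDecode(bytes, key) {
  return String.fromCharCode(...bytes.map((b) => b ^ key));
}

const KEY = 0x80; // hypothetical key; sets the high bit of each ASCII byte
const address = "readme@ifyoucan.com";

// Stand-ins for a compiled file: a few non-text bytes around the payload.
const hardcodedFile = [0x00, 0x01, ...xorEncode(address, 0), 0x00];
const obfuscatedFile = [0x00, 0x01, ...xorEncode(address, KEY), 0x00];

console.log(printableRuns(hardcodedFile));  // [ 'readme@ifyoucan.com' ]
console.log(printableRuns(obfuscatedFile)); // []
console.log(xorDecode(xorEncode(address, KEY), KEY)); // readme@ifyoucan.com
```

A single-byte XOR is trivial to undo once someone looks for it, so this only demonstrates the principle: the address should be assembled at run time rather than stored as readable text.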

One disadvantage of this approach is that it requires either Flash or Java programming skills. Another is that the user will notice a delay while the browser starts up the Java Virtual Machine. The main disadvantage, though, is that a web page with multiple email addresses will require multiple instantiations of the Flash or Java applet. The overhead could be substantial.

JavaScript and the Despaminator Spambot Vaccine

Editing HTML Files with JavaScript

Several web sites illustrate a simple JavaScript encoding for a "mailto:" URL that fools the current generation of spambots, yet still provides a clickable hyperlink for the user. For example, the email address can be encoded as:

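   <script language="JavaScript" type="text/javascript">
    // Keep the address in pieces so it never appears whole in the page source.
    var user = "readme";
    var site = "ifyoucan";
    var hld = "com";
    // Split "mailto:" too, so a simple text search does not find it.
    document.write("<a href=\"mai" + "lto:" + user + "@" + site +
                   "." + hld + "\">");
    document.write(user + "@" + site + "." + hld + "</a>");
   </script>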
   
This technique scales well for a web page with many email addresses, especially if the decoding JavaScript is packaged as a function in a .js file. Of course, it assumes that the user's browser supports JavaScript. While that is usually true for modern browsers, some desktop users disable JavaScript (for security reasons or to eliminate popups) and many wireless devices do not support it.
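Packaging the decoder as a function might look like the sketch below. The name mailtoLinkHtml is an assumption made for this example; it is not taken from any of the sites mentioned here.

```javascript
// Hypothetical helper that a shared .js file could define once.
function mailtoLinkHtml(user, site, tld) {
  // Assemble the address from its pieces at run time.
  const address = user + "@" + site + "." + tld;
  // Keep "mailto:" split in the source as well.
  return '<a href="mai' + 'lto:' + address + '">' + address + "</a>";
}

console.log(mailtoLinkHtml("readme", "ifyoucan", "com"));
// <a href="mailto:readme@ifyoucan.com">readme@ifyoucan.com</a>
```

In a page, each address would then need only one short call inside a script element, such as document.write(mailtoLinkHtml("readme", "ifyoucan", "com")).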

The examples at http://www.wonderwinds.com/JavaScripts/anti-spam.htm are notable for also showing how to add cc, bcc, subject, and body text to a "mailto:" URL. The JavaScript-based encoder at http://www.hiveware.com/enkoder_form.php obfuscates the email address text even better than the example above. Also see http://www.macdevcenter.com/pub/a/mac/2002/11/01/spam.html for a good overview of spam protection techniques.
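As a rough illustration of the extra "mailto:" fields, the sketch below appends cc, bcc, subject, and body values as a percent-encoded query string. The helper name buildMailto and the second address are hypothetical, and this is not the code from the sites listed above.

```javascript
// Hypothetical helper: build a "mailto:" URL with optional header fields.
function buildMailto(address, fields = {}) {
  const query = Object.entries(fields)
    // Skip fields that were left empty.
    .filter(([, value]) => value !== undefined && value !== "")
    // Percent-encode names and values; spaces become %20, not "+".
    .map(([name, value]) => encodeURIComponent(name) + "=" + encodeURIComponent(value))
    .join("&");
  return "mai" + "lto:" + address + (query ? "?" + query : "");
}

console.log(buildMailto("readme@ifyoucan.com", {
  cc: "other@ifyoucan.com", // hypothetical second address
  subject: "Hello there",
}));
// mailto:readme@ifyoucan.com?cc=other%40ifyoucan.com&subject=Hello%20there
```

The resulting string can be used wherever the earlier example writes a plain "mailto:" URL.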

The Despaminator Spambot Vaccine

The examples above require editing of the HTML files to protect email addresses from spammers. In contrast, the despaminator can protect all the email addresses in a static HTML web site without requiring any changes to its files. For an example, see the old CryoCare web site at: http://www.cryocare.org/ and view the HTML source to see what a spambot sees.

The despaminator acts like a proxy between the user and the web site. For each page of the web site it:

  1. converts each email address into JavaScript that a spambot will be unable to parse and
  2. edits all hyperlinks so that references to other pages of the same web site must pass through the despaminator, too.
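The two steps above can be sketched roughly as follows. This is not the despaminator's code (which is written in Perl); it is a simplified JavaScript illustration under stated assumptions: the CGI path in PROXY is invented, only whole "mailto:" anchors are handled rather than every email address, and the character-code encoding is just one possible way to hide an address from a text-oriented spambot.

```javascript
// Hypothetical CGI path and query format, assumed for this sketch.
const PROXY = "/cgi-bin/despaminator.cgi?page=";

// Step 1: replace each <a href="mailto:...">...</a> with a script that
// rebuilds the same link from character codes.
function encodeMailtoLinks(html) {
  const anchor = /<a\s+href="mailto:[^"]+"\s*>.*?<\/a>/gi;
  return html.replace(anchor, (whole) => {
    const codes = whole.split("").map((ch) => ch.charCodeAt(0)).join(",");
    return '<script type="text/javascript">document.write(String.fromCharCode(' +
      codes + "));<\/script>";
  });
}

// Step 2: route links to other pages of the same site through the proxy.
function rewriteLocalLinks(html) {
  return html.replace(/href="([^"]+)"/gi, (whole, target) => {
    // Leave absolute URLs, scheme-relative URLs, and in-page anchors alone.
    if (/^([a-z][a-z0-9+.-]*:|\/\/|#)/i.test(target)) return whole;
    return 'href="' + PROXY + encodeURIComponent(target) + '"';
  });
}

function despaminate(html) {
  // Encode addresses first, so no "mailto:" href is left to rewrite.
  return rewriteLocalLinks(encodeMailtoLinks(html));
}

const page = '<p><a href="about.html">About</a> ' +
  '<a href="mailto:readme@ifyoucan.com">readme@ifyoucan.com</a></p>';
console.log(despaminate(page));
```

In the output, the link to about.html points at the proxy path, and the address appears only as a list of numbers inside a script element.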

As mentioned above, the despaminator can accomplish this feat only for a static web site. The current version cannot act as a runtime filter for programmatically generated web pages, such as those with .jsp, .php, .asp, or .aspx extensions, since it reads files directly from the server rather than via (password-protected) HTTP requests. It also requires the web site's host to support user-defined CGI scripts. It has a few other technical limitations, as described in its documentation.

Free Download of the Perl source code for the Despaminator Spambot Vaccine.