Spam-free after resolving my Mollom problem


As a brief follow-up to my post about dealing with spam on Drupal, the problem I had with Mollom turned out to be purely in my site configuration. Apparently some of the configuration changed from the previous version I had been using, so as soon as I re-saved the settings it all started working correctly again! I'm happy to say I haven't seen any spam come in during the past ten days and Mollom is the only anti-spam engine I'm currently running.

Movies I'm anticipating for Summer 2010

Some of the upcoming movies I'm looking forward to for this Summer include:

  • April 16th: Kick-Ass - an unusual take on the superhero origin story.
  • May 7th: Iron Man 2 - does this need an explanation?
  • May 28th: Prince of Persia: The Sands of Time - I loved the original computer game from almost 20 years ago, and it has Jake "Donnie Darko" Gyllenhaal.
  • June 18th: Toy Story 3 - Toy Story 2 is one of Pixar's best, and it comes out on a family member's birthday, so we're all going to go see it!
  • July 9th: Predators - a pseudo-reboot of the franchise that sees a bunch of tough humans whisked away to be the game in the predators' favorite pastime. The trailer looks pretty good, and I'm a fan of the franchise (excluding AVP and AVP2).
  • July 16th: Inception - written & directed by the person behind the amazing reboot of the Batman franchise, the trailer has definitely grabbed my attention.
  • July 23rd: Salt - Angelina Jolie running around in an action flick. Say no more.
  • August 13th: Scott Pilgrim vs. the World - the latest movie from director Edgar Wright, the other brain behind "Shaun of the Dead" and "Hot Fuzz", looks like an unusual but light 90 minutes of escapism.

Of those there are only three I could see wanting to get on DVD later: Toy Story 3, Iron Man 2 and Inception. The rest I'd probably only be interested in seeing once, though obviously that might change once I get to see them. That's not as many as in other years, and definitely fewer of the big ones interest me this year, e.g. I have no interest in seeing The Expendables.

So, anything pique your interest? Are there other ones you're looking forward to?

Dealing with site spam on Drupal (updated)


A constant annoyance with managing a website today is the level of spam that comes in through comments, forum posts, contact requests, user registrations, etc, etc, etc... Not only can spam messages make your site look like crap, but if you have any sort of comment reply notification (as this site has) you can end up emailing spam to your visitors, which will turn off a LOT of people. There are times when you don't seem to be getting much and other times when it seems your site is being flooded with this junk - this week feels like the latter.

There are several ways of dealing with spam:

  1. Allow all content to be automatically posted and moderate it after the fact,
  2. Manually approve every piece of content from unknown sources or unrecognized users,
  3. Add a plugin / code that blocks content based on certain keywords, e.g. swear words, references to Star Trek, etc.
  4. Add a plugin that requires some sort of identification that the visitor is a legitimate person rather than an automated program, dubbed CAPTCHA ("Completely Automated Public Turing test to tell Computers and Humans Apart"),
  5. Add a plugin / code that uses advanced algorithms to try to automatically detect spam,
  6. Add a plugin / code that identifies spam using distributed user actions, e.g. someone in a foreign country, like Alaska, sees a message containing "Barney", "submarines", "campfires", "milkshakes" and "UFOs" and marks it as spam, and that knowledge then helps identify similar content on your site.

So, the above is all wonderful, but where do you start? The first option is messy as you end up with a lot of junk to deal with, and the second one halts the natural flow of conversations as everything must be approved. The third option is very limited - what if you *wanted* to discuss the effects of watching Barney-like dinosaur puppet TV shows on the reproductive cycle of goats? That conversation would be sure to cause a few messages to be blocked. So that leaves the more advanced solutions as the only viable options.
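To show why the keyword approach falls over so quickly, here's a rough sketch in Python. It isn't Drupal code or how any particular module works; the word list and the `is_blocked` name are made up purely for illustration.

```python
import re

# Hypothetical blocklist, borrowed from the joke keywords in the list above.
BLOCKED_WORDS = {"barney", "submarines", "milkshakes", "ufos"}


def is_blocked(message: str) -> bool:
    """Return True if the message contains any blocked keyword."""
    # Lowercase the text and split it into simple word tokens, so that
    # "Barney-like" yields the tokens "barney" and "like".
    words = set(re.findall(r"[a-z0-9']+", message.lower()))
    return not words.isdisjoint(BLOCKED_WORDS)


# Obvious junk gets caught...
print(is_blocked("Buy cheap milkshakes and UFOs now!!!"))
# ...but so does a perfectly legitimate question about goats.
print(is_blocked("How do Barney-like puppet shows affect goats?"))
# A message with none of the keywords passes straight through.
print(is_blocked("Nice post, thanks for sharing."))
```

The filter has no idea about context, so the legitimate question is treated exactly the same as the junk.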

For this site, which is built with the excellent content management system Drupal, I took a look at some different modules that cover some of these concepts. One in particular piqued my interest: Mollom, a service built by the creator of Drupal, Dries Buytaert. It's based on a combination of several of the above ideas and has a really good Drupal module available, so it seemed like a great solution and I gave it a spin.

So cut to a year later and the Mollom service has been working really well, leaving almost no spam. Unfortunately in the past ten days it has failed almost completely with thirty to almost one hundred spam messages getting through daily, which is obviously not what I want.

As a result of the influx of spam getting past Mollom I've changed over to using a service called reCAPTCHA (some details on Wikipedia) which provides a simpler though more reliable CAPTCHA. Installation on Drupal is super-simple: you just install the CAPTCHA dependency and then the reCAPTCHA module itself, sign up for the free reCAPTCHA service, do a little bit of configuration (admin/user/captcha) and then hopefully just forget about it.

I'll let you know how it goes.

UPDATE: Believe it or not, no sooner had I tweeted about this post than Dries himself responded! He explained that after upgrading to the latest version it was necessary to reconfigure the module, as it seems the settings structure changed. As a result I've switched back to Mollom to give it one last try. That said, I did suggest that an update script be added that leaves a message for the admin informing them of this. We'll see how it goes!

I'm not that guy


I wanted to share a bit of my philosophy with you.

In the name of work, in the name of my employer, in the name of finishing projects so certain managers can save face, I've done a lot of things, worked a lot of extra hours, pushed myself pretty close to my limits.  There have been times I've worked eighty, ninety, one-hundred hour weeks (103.5 is my record) to get projects completed in time, and worked two brick-n-mortar jobs where the only time I saw my family was for a few minutes in the morning before arriving home at 2am+.  I've gotten through it, we've gotten through it, learned and moved on.  But with all of this there's one thing I'm unwilling to do - extensive travel.

In 1998 I moved from Ireland to the USA to be with the woman I loved and had just married.  In 2000 I moved with her from New Hampshire to Florida because she needed to, and over the past ten years we've built a life for ourselves, have two amazing children (and a dog & two cats) and look forward to hopefully many years of happiness together.  I have a wonderful family, a lot to be grateful for - I am grateful for it and I want to be with my family.

While a large number of people world-over spend their careers driving & flying all over the country, nay world, I'm not that guy.  I'll sacrifice a lot in the name of my career, but I'd rather have a slightly less awesome job so I can spend more time with my family.

Disclaimer: I know a lot of people who do travel for their jobs, a lot of people I have immense respect for and do not think any less of them for it, but it's not a choice I can make.

Fix for Nodewords module's faulty canonical tag feature


Nodewords is a Drupal module that many people have come to love-to-hate - its SEO features are second to none, but a few buggy releases have left a sour taste with many developers.

A key problem with the current stable release is that the canonical URL support is a little faulty.  When you share content across multiple sites, or allow other sites to display your content via an RSS feed, search engines might find two copies of your content, one at your primary site and one elsewhere.  The canonical URL system was developed so that you could add a tag to your page to tell search engines "this URL is the official URL for this piece of content" - simple, and with CMS support, completely painless.
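For anyone who hasn't seen one, the tag in question is a `<link rel="canonical">` element in the page's head.  Here's a minimal sketch in Python of how such a tag could be built; the `canonical_tag` function and the example.com URL are made up for illustration and this is not how Nodewords itself does it.

```python
from html import escape


def canonical_tag(url: str) -> str:
    """Build a <link rel="canonical"> element for the given URL."""
    # Escape the URL so characters like & and " are safe inside the attribute.
    return f'<link rel="canonical" href="{escape(url, quote=True)}" />'


# Hypothetical URL, used only as an example.
print(canonical_tag("http://example.com/products/blue-widget"))
# prints: <link rel="canonical" href="http://example.com/products/blue-widget" />
```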

It's a standard feature in Drupal (using the built-in Path module) that you can create friendlier URLs for all of your pages, so instead of "" you can have "".  With a combination of modules (PathAuto, Path_Redirect, GlobalRedirect) you can set your site up to automatically create friendly URLs for all content to match specific structures (e.g. products always look like "products/product-name"), let you add aliases for common misspellings or when you rename a page ("products" instead of "product"), and automatically bounce the user over to the correct path regardless of which version they typed.

So taking those two together, the Nodewords module should be using the friendly URL alias to indicate the canonical URL.  Except it doesn't, by default the latest stable release just uses the internal "" format.  A ticket to fix this was added to and I provided a patch that gave the site administrator a simple option to decide whether to use the internal path or the alias, but so far the module maintainer has only said he intends to handle this via a completely different structure in his forthcoming rewrite.
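The idea behind that option can be sketched in a few lines of Python.  To be clear, this is not the patch itself (which is PHP inside the module); the alias table, the paths in it and the `canonical_path` function are all hypothetical.

```python
# Hypothetical alias table mapping an internal path to its friendly alias.
ALIASES = {"node/123": "products/blue-widget"}


def canonical_path(internal_path: str, use_alias: bool = True) -> str:
    """Pick the path to advertise as canonical.

    With use_alias on, prefer the friendly alias and fall back to the
    internal path when no alias exists; with it off, always return the
    internal path.
    """
    if use_alias:
        return ALIASES.get(internal_path, internal_path)
    return internal_path


print(canonical_path("node/123"))                   # friendly alias
print(canonical_path("node/123", use_alias=False))  # internal path
print(canonical_path("node/456"))                   # no alias, so internal path
```

Keeping it as a switch rather than hard-coding either behavior means a site that relies on the internal paths can carry on as before.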

While we at Bonnier Corp have been using a patched version of this module with the fix for several weeks, I thought it might help others to test it out and see which they prefer to use without having to deal with applying the patch themselves.  Towards that goal, here's a zip archive of the most recent v6.x-1.11 release with the patch applied so you can make your canonical URLs nice and friendly.  If it works for you, please chime in on the issue on


