Archive for February, 2008

Microsoft Antigen: Brain Dead Content Filter

Microsoft Antigen for SMTP found a message matching a filter. The message is currently Purged.
Message: “Re_ Pbl.spamhaus.org down_”
Filter name: “KEYWORD= profanity: piss”
Sent from: “Daryl C. W. O_Shea”
Folder: “SMTP Messages\Inbound”
Location: “psp/TRACYSV05”

Piss! Oh noes! The utter profanity of replying to someone who said “Am I blocked? Did I piss someone off?” is simply unacceptable.

I’m simply amazed at the number of rejections I get from users of Microsoft Antigen for SMTP (part of the Microsoft Forefront Security product family) based on single words that I learned during my years in Catholic schools. I think the Forefront Security product family has no forebrain. It’s no wonder most content-filter-based anti-spam products have such a bad rap.

February 22nd, 2008

US Spy Sat won’t be going on eBay

Reuters is reporting that the US Navy says it hit its failed spy sat three hours ago (at 22:26 EST). I guess it won’t be going on eBay as others had hoped.

February 21st, 2008

Mail::DKIM v0.29 slow? Upgrade to v0.30.1

My nightly SA mass-checks have been hanging up this week on a 1MB email (not sure how a 1MB message got in my mass-check corpus, but that’s not important). It turns out that it was Mail::DKIM v0.29 that was taking about 150 seconds to process the message, while the rest of SA was only taking about 10 seconds. Upgrading to Mail::DKIM v0.30.1 resolves the problem… the DKIM check is fast (I didn’t time it, probably under a second).

The speed-up may be due to Mark Martinec’s optimizations in v0.30. It could be that the optimization was just to not do the crypto on the body, though, since the message in question did not have a signature (the sender doesn’t sign mail).
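If you’re hitting the same hang, the fix is just a CPAN upgrade. A quick sketch (the version check and the test-mode timing run are generic CPAN/SpamAssassin usage, not anything specific to my setup; `big-message.eml` is a hypothetical file name):

```shell
# See which Mail::DKIM version is currently installed
perl -MMail::DKIM -e 'print "$Mail::DKIM::VERSION\n"'

# Upgrade to the latest release (0.30.1 or newer) from CPAN
cpan Mail::DKIM

# Optionally, time a large signed-or-unsigned message through
# SpamAssassin's test mode before and after the upgrade to confirm
# the DKIM check is no longer the bottleneck
time spamassassin -t < big-message.eml > /dev/null
```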

February 18th, 2008

Make sure your Hadoop cluster nodes are registered in DNS

One thing I’ve forgotten, twice now, to do before attempting to run a job on Hadoop clusters that I’ve set up in a hurry to demo something is to make sure that all of the nodes are registered in DNS, or at least have entries in their hosts files for every node (datanodes, the namenode, the master and all the slaves) in the cluster.

If you start out with the master not knowing what names the slaves’ IPs map to, the slaves won’t be able to connect to the master, even if you use IPs in the conf/slaves file. This seems silly to me, but that’s the way it is, at least as of 0.14.4. You’ll discover and fix this first. The slaves will then connect and the first level of links will start to work in the master’s web interface.

Now the TaskTrackers on the slave nodes will successfully run tasks and will probably complete the map stage. If the nodes have varying performance levels, or your data isn’t well distributed on your HDFS file system, the map stage may appear to hang (or repeat the same percentage(s) over and over). Even if you make it through the map stage, the reduce stage will fail to complete for the same reason if you didn’t also configure each of the slave nodes to know the names of all of the other slave nodes. As soon as a slave node needs data off of another datanode (for a map task, a reduce task, etc.) it’ll face the same problem it initially had in contacting the master node, only this time with the other slave nodes.

So… make sure that all machines involved in the cluster know the hostname (and its IP) of every other machine in the cluster. Configuring just the master to know all the slaves’ names/IPs, or just all of the slaves to know the master’s name/IP, will not work; you need both, even if you haven’t used a single hostname in either the conf/masters or conf/slaves files.
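Concretely, if you’re not using DNS, every node’s /etc/hosts ends up carrying an entry for every machine in the cluster, master and slaves alike. A sketch with made-up hostnames and addresses (nothing below is from a real cluster):

```shell
# /etc/hosts — identical on the master and on every slave
# (hostnames and IPs are hypothetical examples)
127.0.0.1     localhost
192.168.1.10  master   # namenode / jobtracker
192.168.1.11  slave1   # datanode / tasktracker
192.168.1.12  slave2
192.168.1.13  slave3
```

The point is the symmetry: the master needs the slave entries, and each slave needs the master’s entry plus every other slave’s, since tasks pull data directly from other datanodes.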

February 8th, 2008

