Mylos

Pointless referrer spamming

Q: What happens when you cross a mobster with a cell phone company?
A: Someone who makes you an offer you can’t understand.

The HTTP protocol used by web browsers specifies an optional Referer: (sic) header that allows them to tell the server where the link to a page came from. This was originally intended as a courtesy, so webmasters could ask people with obsolete links to update their pages, but it is also a valuable source of information for webmasters who can find out which sites link to them, and in most cases what keywords were used on a search engine. Unfortunately, spammers have found another well to poison on the Internet.

Over the past month, referrer spam on my site has graduated from nuisance to menace, and I am writing scripts that attempt to filter that dross automatically out of my web server log reports. In recent days, it seems most of the URLs spammers are pushing on me point to servers with names that aren’t even registered in the DNS. This seems completely asinine, even for spammers: why bother spamming someone without a profit motive? I was beginning to wonder whether this was just a form of vandalism like graffiti, but the situation is more devious than it appears at first glance.

Referrer spam is very hard to fight (although not quite as difficult as email spam). I am trying to combine a number of heuristics, including behavioral analysis (e.g. whether the purported browser is downloading my CSS files or not), WHOIS lookups, reverse lookups for the client IP address, and so on. Unfortunately, if any of these filtering methods become widespread, the spammers can easily apply countermeasures to make their requests look more legitimate. This looks like another long-haul arms race…
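Two of the heuristics above (whether the client ever downloads a stylesheet, and whether the advertised referrer host exists in the DNS) can be combined into a simple score. The following is only a rough sketch, not the scripts I actually run: the function names, the (client_ip, path, referrer) tuple format and the one-point-per-signal scoring are all assumptions made for illustration.

```python
import socket
from urllib.parse import urlparse


def host_resolves(host):
    """Return True if the host name has a DNS entry (requires network access)."""
    try:
        socket.gethostbyname(host)
        return True
    except (OSError, UnicodeError):
        return False


def spam_scores(requests, resolves=host_resolves):
    """Score each client that sent a Referer header; higher is more suspicious.

    `requests` is an iterable of (client_ip, path, referrer) tuples, a
    hypothetical pre-parsed form of the web server log.
    """
    fetched_css = set()   # clients that behaved like a real browser
    referrer_hosts = {}   # client_ip -> host names it claimed as referrers
    for client_ip, path, referrer in requests:
        if path.split("?", 1)[0].endswith(".css"):
            fetched_css.add(client_ip)
        if referrer and referrer != "-":
            try:
                host = urlparse(referrer).hostname
            except ValueError:
                host = None   # malformed URL, treated as unresolvable
            referrer_hosts.setdefault(client_ip, set()).add(host)

    scores = {}
    for client_ip, hosts in referrer_hosts.items():
        score = 0
        if client_ip not in fetched_css:
            score += 1    # never downloaded a stylesheet
        if any(host is None or not resolves(host) for host in hosts):
            score += 1    # at least one referrer host is not in the DNS
        scores[client_ip] = score
    return scores
```

The resolver is passed in as a parameter so the scoring can be tried against a canned list of host names without touching the network. A real filter would also want to cache DNS answers and allow for browsers that have the stylesheet cached.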

Pay for the razor, pay for the blades

King Gillette is famous for his invention of the disposable-blade razor, and the associated business model, “give away the razor, sell the blades”. This strategy was widely imitated, but it seems marketers have struck an even better one: why give away the razor when you can make the chumps pay for it?

There are a number of products, some high-tech and some not, where you actually pay handsomely for a device that is a doorstop without proprietary refills or service. Some examples:

  • In the US, most cell phones are either hard-wired to a specific service provider (CDMA) or SIM-locked (GSM). A consumers’ group is fighting in court to ban the practice, or at least limit its duration; it is either outlawed or strictly regulated in most other countries.

    Sure, the carrier is subsidizing the handset, but that is offset by extra profit margins in the contract. Once the contract’s minimum term is over, there is no justification whatsoever for maintaining the SIM lock. AT&T was one of the most egregious offenders, and it is not clear whether their policy will change after their takeover by Cingular.

    I suspect one of the big reasons for SIM lock is so carriers can charge extortionate international roaming charges, since without SIM lock, it would be cheaper to just pop in a prepaid SIM card in the country you are visiting. Actually, roaming charges are so overpriced that it is cheaper to just buy a new phone for the prepaid card and toss it away afterwards.

    There are real externality costs to society due to distortions in consumer behavior from carrier policies. Many people throw away their old cell phones when they change service or renew a contract, as the subsidy is only applicable towards a new phone purchase, never granted as a rebate to people opting to keep their older but perfectly serviceable phone. In California alone, 44,650 cell phones are discarded each day, usually ending up in landfill, at tremendous cost to the environment.

  • MP3.com founder Michael Robertson is suing Vonage for trying to extend the same despicable lock-in model to VoIP, with what he claims is deceptive advertising. Most commentators have rushed to Vonage’s defense — apparently, for many geeks the company can do no wrong, like Google. I have no such compunctions, as I have in the past received completely unsolicited spam from them, and thus as far as I am concerned, they fit in the “scum” category.

  • In a great illustration of the power of cognitive dissonance, TiVo is another company with rabid and uncritical fans. Originally, TiVo PVRs would remain somewhat functional even without the TiVo service. Sure, you would have to program shows manually, but that is no worse than most VCRs. Over successive software updates, TiVo have reduced their PVRs’ autonomy until they are now effectively useless without the service.

  • Inkjet printer manufacturers use all sorts of tricks to protect their racket, including putting in microchips designed to foil refilling or the use of third-party cartridges. Lexmark even tried to abuse the DMCA to prevent a competitor from selling reverse-engineered cartridge chips. All this so inkjet ink can remain the most expensive liquid, at significantly higher cost per milliliter than Chanel No. 5 or vintage Dom Perignon.

As in most cases the utility of the machine without the overpriced refills or service is nil, the fair market price for it should be zero. The Vonage/Linksys situation is a special case as the wireless router remains partially usable, albeit without VoIP features if you switch providers. But marketers will keep trying to have it both ways until consumers push back by implementing a zero-tolerance policy, akin to the “broken-window” theory of policing. Do not agree to pay for a cell phone from a carrier that refuses to unlock it after a reasonable amount of time. Refuse to purchase digital devices that require service from a specific vendor to function.

A reader-writer lock for Python

Python offers a number of useful synchronization primitives in the threading and Queue modules. One that is missing, however, is a simple reader-writer lock (RWLock). An RWLock allows improved concurrency over a simple mutex, and is useful for objects that have high read-to-write ratios like database caches.

Surprisingly, I haven’t been able to find any implementation of these semantics, so I rolled my own in a module rwlock.py to implement an RWLock class, along with lock promotion/demotion. Hopefully it can be added to the standard library threading module. This code is hereby placed in the public domain.

"""Simple reader-writer locks in Python
Many readers can hold the lock XOR one and only one writer"""
import threading

version = """$Id: 04-1.html,v 1.3 2006/12/05 17:45:12 majid Exp $"""

class RWLock:
  """
A simple reader-writer lock. Several readers can hold the lock
simultaneously, XOR one writer. Write locks have priority over reads to
prevent write starvation.
"""
  def __init__(self):
    self.rwlock = 0
    self.writers_waiting = 0
    self.monitor = threading.Lock()
    self.readers_ok = threading.Condition(self.monitor)
    self.writers_ok = threading.Condition(self.monitor)
  def acquire_read(self):
    """Acquire a read lock. Several threads can hold this type of lock.
It is exclusive with write locks."""
    self.monitor.acquire()
    while self.rwlock < 0 or self.writers_waiting:
      self.readers_ok.wait()
    self.rwlock += 1
    self.monitor.release()
  def acquire_write(self):
    """Acquire a write lock. Only one thread can hold this lock, and
only when no read locks are also held."""
    self.monitor.acquire()
    while self.rwlock != 0:
      self.writers_waiting += 1
      self.writers_ok.wait()
      self.writers_waiting -= 1
    self.rwlock = -1
    self.monitor.release()
  def promote(self):
    """Promote an already-acquired read lock to a write lock
    WARNING: it is very easy to deadlock with this method"""
    self.monitor.acquire()
    self.rwlock -= 1
    while self.rwlock != 0:
      self.writers_waiting += 1
      self.writers_ok.wait()
      self.writers_waiting -= 1
    self.rwlock = -1
    self.monitor.release()
  def demote(self):
    """Demote an already-acquired write lock to a read lock"""
    self.monitor.acquire()
    self.rwlock = 1
    self.readers_ok.notifyAll()
    self.monitor.release()
  def release(self):
    """Release a lock, whether read or write."""
    self.monitor.acquire()
    if self.rwlock < 0:
      self.rwlock = 0
    else:
      self.rwlock -= 1
    # Notify while still holding the monitor: both conditions share it,
    # so there is no need to release it and then re-acquire it.
    if self.writers_waiting:
      if self.rwlock == 0:
        # The lock is free, hand it to exactly one waiting writer
        self.writers_ok.notify()
    else:
      # No writer is queued, so every waiting reader may proceed
      self.readers_ok.notifyAll()
    self.monitor.release()

if __name__ == '__main__':
  # Demo: two readers, a reader that promotes, a writer that demotes,
  # and a late reader that has to wait behind the queued writer.
  import time
  rwl = RWLock()
  class Reader(threading.Thread):
    def run(self):
      print(self, 'start')
      rwl.acquire_read()
      print(self, 'acquired')
      time.sleep(5)
      print(self, 'stop')
      rwl.release()
  class Writer(threading.Thread):
    def run(self):
      print(self, 'start')
      rwl.acquire_write()
      print(self, 'acquired')
      time.sleep(10)
      print(self, 'stop')
      rwl.release()
  class ReaderWriter(threading.Thread):
    def run(self):
      print(self, 'start')
      rwl.acquire_read()
      print(self, 'acquired')
      time.sleep(5)
      rwl.promote()
      print(self, 'promoted')
      time.sleep(5)
      print(self, 'stop')
      rwl.release()
  class WriterReader(threading.Thread):
    def run(self):
      print(self, 'start')
      rwl.acquire_write()
      print(self, 'acquired')
      time.sleep(10)
      rwl.demote()
      print(self, 'demoted')
      time.sleep(10)
      print(self, 'stop')
      rwl.release()
  Reader().start()
  time.sleep(1)
  Reader().start()
  time.sleep(1)
  ReaderWriter().start()
  time.sleep(1)
  WriterReader().start()
  time.sleep(1)
  Reader().start()
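
To illustrate the database-cache use case mentioned in the introduction, here is a minimal Python 3 sketch of a read-mostly cache guarded by a reader-writer lock. It is not the module above: CompactRWLock and Cache are hypothetical names, the lock is a simplified variant built on a single condition variable with context managers, and it leaves out promotion and demotion entirely.

```python
import threading
from contextlib import contextmanager


class CompactRWLock:
    """Hypothetical compact variant: many readers XOR one writer.

    Waiting writers block new readers, to prevent write starvation.
    Not reentrant: a thread must not nest acquisitions.
    """
    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0          # threads currently holding a read lock
        self._writing = False      # True while a writer holds the lock
        self._writers_waiting = 0  # queued writers, used to hold back readers

    @contextmanager
    def reading(self):
        with self._cond:
            while self._writing or self._writers_waiting:
                self._cond.wait()
            self._readers += 1
        try:
            yield
        finally:
            with self._cond:
                self._readers -= 1
                self._cond.notify_all()

    @contextmanager
    def writing(self):
        with self._cond:
            self._writers_waiting += 1
            try:
                while self._writing or self._readers:
                    self._cond.wait()
            finally:
                self._writers_waiting -= 1
            self._writing = True
        try:
            yield
        finally:
            with self._cond:
                self._writing = False
                self._cond.notify_all()


class Cache:
    """Dictionary-backed cache: concurrent lookups, exclusive updates."""
    def __init__(self):
        self._lock = CompactRWLock()
        self._data = {}

    def get(self, key, default=None):
        with self._lock.reading():
            return self._data.get(key, default)

    def put(self, key, value):
        with self._lock.writing():
            self._data[key] = value
```

The context managers guarantee the lock is released even if the guarded code raises, which the explicit acquire/release style leaves to the caller. Using one condition variable for both readers and writers is simpler but wakes more threads than strictly necessary.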

The Malazan Book of the Fallen

Steven Erikson

Bantam Press (UK), ISBN 0553812173/0765310015, 0553813110, 0553813129, 0553813137, 0593046285. Buy online: Gardens of the Moon, Deadhouse Gates, Memories of Ice, House of Chains, Midnight Tides, The Bonehunters, Reaper’s Gale, Toll the Hounds, Dust of Dreams, The Crippled God

Fantasy, like Science Fiction, is a genre that gets scant respect, in spite of (and perhaps due to) its popular appeal. Literary critics require the turgid prose of a James Joyce or T.S. Eliot to feel a smug sense of superiority over the unwashed masses unable to appreciate pedantry for its own sake. It is true many fantasy novels are serialized hack work designed to be sold by the pound, but the better specimens of the genre are worthwhile reads, beginning with The Lord of the Rings, the book that started the modern phenomenon.

The problem with The Lord of the Rings is that it casts too wide a shadow, and the inevitable comparisons do not do justice to later authors’ originality. One of the weaknesses of LOTR is its reactionary social value system. Not surprisingly, an Oxford don like Tolkien did not break from the mental shackles of the English class system, still enduring today and much stronger in the early twentieth century. The books show strong dislike of people daring to rise beyond their station, and an uncritical approbation of monarchy.

Many authors have strived to portray grittily realistic worlds that eschew the simplistic good-versus-evil morality plays so beloved of religious fanatics and political extremists worldwide. It is important to note that this moral ambivalence, or more precisely the refusal to make hasty judgments on morality, is not a recent phenomenon. The Epic of Gilgamesh is the first epic, written roughly four thousand years ago in ancient Sumer, and the eponymous hero is depicted in the beginning of the tale as a tyrant. Among these non-manichean works, Stephen Donaldson’s Chronicles of Thomas Covenant, Unbeliever and Glen Cook’s Black Company stand out. To these major works, we must now add Steven Erikson’s Malazan Book of the Fallen.

This relatively recent series (one installment per year, 10 planned) is not yet famous in the United States. At the end of the Second World War, British and American publishers came to a non-compete agreement dividing the English-speaking world into respective turfs. Steven Erikson is a Canadian living in the United Kingdom. Five of the books in the Malazan series have already been published in the British publishers’ traditional market, while the first one has only now reached American bookstores. He is not the only author to suffer these delays: Iain M. Banks’s excellent Science Fiction and other novels also take several years to cross the Atlantic, unlike J. K. Rowling’s Harry Potter series, where sheer demand would probably cause parallel imports to overwhelm the tottering system. In the era of global e-commerce, it is easy to get around these anti-competitive measures by ordering from Canada, at the cost of higher shipping fees (shipping from Amazon UK is prohibitively expensive). Amazon Canada and Chapters are good sources.

The Malazan Book of the Fallen chronicles the legions of the Malazan Empire, strongly reminiscent of the Roman Empire, in a world ruled by magic where mortals routinely ascend to divinity and, conversely, gods are routinely killed or enslaved. This is not without precedent in world mythology; indeed, it is very similar to the beliefs of the Greeks. The Malazans soldier on against impossible odds in their efforts to establish good government in the place of squabbling feudalists, against the backdrop of cosmic struggles spanning hundreds of millennia. Only their discipline, adaptability, dogged tenacity and judicious use of sappers allow them to save the day (though with grievous losses). This is a conceit, of course, albeit a common one in Fantasy: every historical army eventually conformed to Brien’s First Law and outstripped its ability to succeed in spite of itself. The supposed benevolence of Malazan Imperial administration would also be a historical first: no empire in history has ever been truly benign. One has only to read Polybius, Flavius Josephus, or Cicero’s Verrines to realize just how rapacious and murderous the Roman Empire really was. The Mongol Empire was noted for its ghastly invention of pyramids of skulls. The British Empire perfected moral hypocrisy, genocide and continent-scale drug dealing, and invented the concentration/extermination camp.

Fantasy can be seen as an exercise in speculative metaphysics, and any metaphysics that allows for magic implicitly subscribes to some form of Idealism at its core, but only Borges, a great admirer of Schopenhauer, has truly approached it this way. In the real world, Idealism has led to unspeakable acts of mass murder through its offspring Marxism-Leninism and Nazism (interestingly, Schopenhauer, possibly influenced by Buddhism, predicted that Idealism would transform good intentions into evil deeds). Is there any reason to suppose a universe that has Idealism as its very essence, not merely the conjecture of philosophers, would escape the same consequences? Indeed, Erikson’s universe has seen its share of genocides, some ongoing. Erikson trained and practiced as an archaeologist, not a philosopher, and while he occasionally stumbles upon the idea that negation of magic would be a major ethical imperative (most noticeably in his invention of the Azath, a force that binds and neutralizes strong foci of magical power, and Otataral, a magic-negating ore resulting from reaction to the cataclysmic unleashing of magic), he does not (yet) make the most of it.

Glen Cook’s influence is clearly visible, and is acknowledged by the author, although Erikson’s world is much vaster in scope and more richly developed than Cook’s. The soldier-historian Duiker is clearly modeled on Black Company annalist (and later Captain) Croaker. The rough banter and grumbling of the Malazan legions would not feel out of place in a Black Company mess hall. The backdrop to the Malazan series, including the machinations of the gods and elder races, distinguishes it from the Black Company. Many of the most notable characters penned by Erikson are drawn from this back story and its criss-crossing story lines. It is hard to forget the warrior-mage-dragon Anomander Rake, leading his dying race in an effort to shake it from terminal ennui, or the cocky prehistoric T’lan Imass warriors who pledged themselves to an undead crusade against would-be tyrants.

The first four volumes in the series alternate two story lines, that of the beleaguered Malazan expeditionary force on the far-flung continent of Genabackis, and a brewing rebellion modeled on the Indian War of Independence of 1857 (sometimes incorrectly referred to as the “Sepoy Mutiny” by British Empire apologists). The fifth volume marks a break in continuity and tone. In some respects, notably the intrigues and market manipulations of financial mastermind Tehol Beddict, it reaches almost Pratchett-esque levels of comedy. A common trait of Fantasy series is that inspiration tends to flag with time and later volumes are pale shadows of the originals. This is particularly flagrant with Robert Jordan’s Wheel of Time, where basically nothing happens in over a thousand pages of the last volume. In comparison, the Malazan Book of the Fallen is very densely written. Little space is wasted on protracted narrative sequences or equivocating characters beyond what is necessary for character development. The action is gripping from cover to cover. All in all, a very promising series that ranks among the finest in the genre.

Update (2005-11-13):

I added links to volumes 2 and 3, now published in the US as well. Also check out some glimpses of the series’ future from Steven Erikson’s recent book tour.

Switching to Camino

I mentioned earlier that I had switched to Mozilla Firefox (then called Firebird) as my default web browser, from Mozilla (I still use Mozilla on Solaris). In the last few months, the Firefox bandwagon started becoming mainstream, probably due to exasperation with the continuing security holes in Microsoft’s Internet Explorer.

That said, I have also switched to the Mac at home, and Firefox on Mac OS X often feels like an afterthought. Several bugs have gone unfixed in the last three releases or so, even though patches have been submitted. I am not excessively fond of Safari, Apple’s default browser, and the ability to share profile data between my Windows machine at work and my Mac at home is a big benefit.

Two weeks ago, I tried Camino on my home machine. Camino is a derivative of Mozilla – it uses the same HTML rendering engine, but wraps it in a shell that leverages Apple’s technologies the way a cross-platform browser like Firefox or Mozilla can’t. Earlier versions had been unconvincing, but I switched for the 0.8.1 release. Firefox 1.0PR on the Mac is an unalloyed disaster, buggy and crash-prone, without any visible bug fixes (I switched back to 0.9.3 within a couple of hours), and that was probably the last straw.

The immediate benefits Camino brings me are the following:

  • Middle-clicking on a link opens it in a new tab, the way it does for Firefox on all platforms but the Mac.
  • Navigating through Web forms using the tab key works perfectly, whereas Firefox and Safari will only let you switch between text fields, but not pull-down menus, radio buttons or the like.
  • When minimizing windows using Exposé, there is no annoying Firefox or Mozilla ghost window cluttering the screen.

Of course, not all is perfect, and the migration entails these pitfalls:

  • I have Firefox set up so if I type a few words separated by spaces in the URL bar, it searches Google. This avoids the need for two text boxes, one for the URL and one for searching (the way Firefox does in its default configuration, or Safari), which are redundant and not as usable. Unfortunately Camino does not support this directly and pops up a modal dialog box complaining about the illegal URL format. Fortunately, Camino does support Mozilla’s excellent keywords feature, so I created a keyword “g” to handle Google queries.
  • Camino keeps bookmarks in an OS X-style XML plist format, rather than the standard bookmark format used by other Mozilla variants. This makes synchronizing bookmarks a little bit slower, as you have to use the import utility instead of simply copying a file over. Bookmark imports are not perfect, moreover, as they tend to drop separators.
  • The saved passwords are not interoperable, as Camino stores them in OS X’s Keychain manager instead of Mozilla’s encrypted database format (I don’t know if this means Camino and Safari can share passwords). I have started working on Python modules to read and decrypt the Mozilla files, however, and I have a low-priority password sync project on my back burner.
  • Camino doesn’t have the wealth of extensions Firefox does, but then again since they seem to break with every release of Firefox (and many don’t work well on the Mac), this is less of a disadvantage than it may seem at first glance.
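
For readers unfamiliar with the keywords feature mentioned in the first pitfall: a keyword is attached to a bookmark whose URL contains a %s placeholder, and whatever is typed after the keyword is substituted for it. The Python sketch below only mimics that expansion for illustration; the KEYWORDS table, the expand function and the exact Google search URL are assumptions, not anything Camino exposes.

```python
from urllib.parse import quote_plus

# Hypothetical keyword table: keyword -> bookmark URL with a %s placeholder
KEYWORDS = {"g": "http://www.google.com/search?q=%s"}


def expand(typed, keywords=KEYWORDS):
    """Expand 'keyword words...' typed in the URL bar into a full URL.

    Text that does not start with a known keyword is returned unchanged.
    """
    keyword, _, rest = typed.partition(" ")
    template = keywords.get(keyword)
    if template is None:
        return typed
    # URL-encode the query words before substituting them for %s
    return template.replace("%s", quote_plus(rest))
```

Typing "g reader writer lock" in the URL bar would thus load a Google search for those three words, at the cost of two extra keystrokes compared to my Firefox setup.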