Python

Street sweeping reminders in iCal

San Francisco sweeps streets twice a month in residential neighborhoods, and you will be fined if your car is parked on a street being swept. On my street, the schedule is the first and third Monday of each month, between 9am and 11am. I wanted to create reminders for myself in my calendar. Unfortunately, iCal does not have the ability to specify a recurring event with that definition.

No matter, Python to the rescue: the script below generates a year’s worth of events in iCal vCalendar format, each with a reminder 12 hours before the sweep. It does not correct for holidays; you will have to remove those yourself.

#!/usr/bin/python
"""Idora street sweeping calendar - 1st and 3rd Mondays of the month 9am-11am"""
import datetime
Monday = 0
one_day = datetime.timedelta(1)
today = datetime.date.today()
year = today.year
month = today.month

def output(day):
  print """
BEGIN:VEVENT
DTEND:%(end)s
SUMMARY:Idora street sweeping
DTSTART:%(start)s
BEGIN:VALARM
TRIGGER:-PT12H
ATTACH;VALUE=URI:Basso
ACTION:AUDIO
END:VALARM
END:VEVENT
""" % {
    'end': day.strftime('%Y%m%dT110000'),
    'start': day.strftime('%Y%m%dT090000')
    }

print """BEGIN:VCALENDAR
CALSCALE:GREGORIAN
VERSION:2.0"""

for i in range(12):
  day = datetime.date(year, month, 1)
  while day.weekday() != Monday:
    day += one_day
  output(day)
  output(day + 14 * one_day)
  month += 1
  if month > 12:
    month = 1
    year += 1

print "END:VCALENDAR"
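
The date arithmetic in the loop can be checked on its own. Here is a minimal sketch, written in Python 3 syntax unlike the script above; the helper name first_and_third_mondays is made up for illustration and is not part of the original script.

```python
import datetime

MONDAY = 0  # datetime.date.weekday() numbers Monday as 0


def first_and_third_mondays(year, month):
    """Return the first and third Mondays of the given month as dates."""
    day = datetime.date(year, month, 1)
    # Step forward from the 1st until we land on a Monday.
    while day.weekday() != MONDAY:
        day += datetime.timedelta(days=1)
    # The third Monday is always exactly two weeks after the first.
    return day, day + datetime.timedelta(days=14)


if __name__ == "__main__":
    first, third = first_and_third_mondays(2013, 11)
    print(first.isoformat(), third.isoformat())
```

Because the first Monday falls on the 7th at the latest, adding 14 days never spills into the next month.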

Scanning your iTunes library for DRM-infested books

Tor, the leading publisher for Science Fiction and Fantasy books, announced they would be doing away with DRM in their eBooks. The product pages for their books on iBooks now mention “At the publisher’s request, this title is being sold without Digital Rights Management software (DRM) applied”. I figured it would be a good idea to uncripple the many Tor eBooks I have in my collection.

I wrote a quick little Python script to scan my growing iBooks library for books that could be updated. The procedure is to delete the book from both iTunes and iPads, then download it anew (restarting iTunes is also needed after deleting). Apple keeps track of your purchases and will not charge you again.

#!/usr/bin/env python
import sys, os.path, glob, zipfile, platform, xml.etree.ElementTree

# publishers who have forsaken DRM
good = ['Tom Doherty']

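# compare versions numerically: as strings, '10.10' sorts before '10.8'
mac_version = tuple(int(x) for x in platform.mac_ver()[0].split('.')[:2] if x.isdigit())
if mac_version >= (10, 9):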
  bookdir = os.path.expanduser('~/Library/Containers/com.apple.BKAgentService/Data/Documents/iBooks')
else:
  bookdir = os.path.expanduser('~/Music/iTunes/iTunes Music/Books')

os.chdir(bookdir)

ok =  '\033[1;32mDRM-free    \033[0m'
bad = '\033[1;31mDRM-infested\033[0m'

count = 0
salvageable = 0

def extract(meta):
  creator = ''
  title = ''
  status = ''
  pub = ''
  et = xml.etree.ElementTree.fromstring(meta)
  try:
    creator = et.findall('*{http://purl.org/dc/elements/1.1/}creator')
    creator = creator[0].text
    title = et.findall('*{http://purl.org/dc/elements/1.1/}title')
    title = title[0].text
  except IndexError:
    assert '!DOCTYPE plist' in meta
    next_tag = None
    for e in et[0].iter():
      if e.tag == 'key' and e.text in ('artistName', 'itemName'):
        next_tag = e.text
        continue
      if next_tag == 'artistName':
        creator = e.text
        next_tag = None
        continue
      elif next_tag == 'itemName':
        title = e.text
        next_tag = None
        continue
  pub = [x for x in good if x in meta]
  return creator, title, pub

def find_meta(file_list, opener):
  for m in file_list:
    if m.endswith('.opf') or m == 'iTunesMetadata.plist':
      meta = opener(m).read()
      return extract(meta)
  
for fn in glob.glob('*/*.epub'):
  status = ok
  suffix = ''
  if os.path.isdir(fn):
    suffix = '(directory)'
    if os.path.exists(fn + '/META-INF/encryption.xml'):
      status = bad
      count += 1
    meta = find_meta(os.listdir(fn), lambda x: open(fn + '/' + x))
  else:
    z = zipfile.ZipFile(fn)
    try:
      i = z.getinfo('META-INF/encryption.xml')
      status = bad
      count += 1  # count zipped books too, not only unpacked directories
    except KeyError:
      pass
    meta = find_meta(z.namelist(), z.open)
    z.close()
  creator, title, pub = meta
  print status, fn, suffix
  print '\t', creator
  print '\t', title
  if status == bad and pub:
    print '\t\033[1;32mThis is published by', pub[0],
    print 'and could be re-downloaded DRM-free\033[0m'
    salvageable += 1

print count, 'books are DRM-infested'
print salvageable, 'could be cured'
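
The test the script relies on is simply whether the book contains a META-INF/encryption.xml entry. Below is a minimal sketch of that check for zipped ePubs, in Python 3 syntax; the function names has_encryption_manifest and make_epub are hypothetical, and the archives are built in memory purely for illustration. Treat the result as a heuristic: the same file can also be present for other reasons, such as font obfuscation.

```python
import io
import zipfile

ENCRYPTION_ENTRY = 'META-INF/encryption.xml'


def has_encryption_manifest(epub_file):
    """Return True if the zipped ePub contains META-INF/encryption.xml."""
    with zipfile.ZipFile(epub_file) as z:
        return ENCRYPTION_ENTRY in z.namelist()


def make_epub(names):
    """Build a throwaway in-memory zip with the given entry names."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, 'w') as z:
        for name in names:
            z.writestr(name, 'placeholder')
    buf.seek(0)
    return buf


if __name__ == "__main__":
    plain = make_epub(['mimetype', 'content.opf'])
    locked = make_epub(['mimetype', 'content.opf', ENCRYPTION_ENTRY])
    print(has_encryption_manifest(plain), has_encryption_manifest(locked))
```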

Unfortunately, it seems the DRM-stripping is still a work in progress. Out of the Wheel of Time series, for instance, only the first one is now DRM-free on the iBooks store.

undr ~>drmbooks.py
DRM-free     Books/0083D0AEC37E08453347DD12B1C6F980.epub
    Greg Bear
    Blood Music
DRM-free     Books/09178837756A4DFF8347EC377345A37B.epub
    Heinz Wittenbrink
    RSS and Atom
DRM-free     Books/0AD752E995042C7E12F11917AB58C6B8.epub
    Wes McKinney
    Python for Data Analysis
DRM-free     Books/14BDC66A99E878EC232FFAFA73B341EF.epub
    Fritz Leiber
    Swords and Deviltry-Fafhrd and the Gray Mouser-Book1
DRM-free     Books/15A1D7FE9B7D815C6FBE1A9A77D7143E.epub
    Glen Cook
    A Fortress in Shadow
DRM-free     Books/1793F9DE1319B96FDE7E36EB8A1BC961.epub
    Scalzi, John
    Old Man’s War
DRM-free     Books/1D08BE221E8BC8F2A371EFEDE55029AC.epub
    Ben Fry
    Visualizing Data
DRM-free     Books/24D6EC36CDEA0C1E8612CC61A89EA098.epub
    None
    Node Cookbook
DRM-free     Books/29DA285F0051C431BD8BA3D1AEC5EAA6.epub
    Fritz Leiber
    The Swords of Lankhmar: Fafhrd and the Gray Mouser-Book 5
DRM-free     Books/2E88CD68DFD8408CD0E7C0ACB1E78714.epub
    Glen Cook
    A Cruel Wind: A Chronicle of the Dread Empire
DRM-free     Books/32996A9995040064818BAE4DFB66E92F.epub
    Kelly Link
    Magic for Beginners
DRM-free     Books/34D3CD13D47E5FEBC6DCF7EF011113BD.epub
    David Drake
    Lord of the Isles
DRM-infested Books/357298432.epub
    Iain M. Banks
    The Player of Games
DRM-infested Books/357311036.epub
    Iain M. Banks
    Use of Weapons
DRM-infested Books/357377857.epub
    Iain M. Banks
    Against a Dark Background
DRM-infested Books/357396585.epub
    Brent Weeks
    Night Angel: The Complete Trilogy
DRM-infested Books/357657026.epub
    Iain M. Banks
    Transition
DRM-infested Books/357658374.epub
    Po Bronson
    NurtureShock: New Thinking About Children
DRM-infested Books/357662058.epub
    Iain M. Banks
    Consider Phlebas
DRM-infested Books/357669769.epub
    Iain M. Banks
    Matter
DRM-infested Books/357914731.epub
    Herbert, Frank
    Dune Messiah
DRM-infested Books/357918110.epub
    Dalrymple, William
    City of Djinns
DRM-infested Books/357923567.epub
    Patrick Rothfuss
    The Name of the Wind
DRM-infested Books/357929995.epub
    Herbert, Frank
    Dune (40th Anniversary Edition)
DRM-infested Books/357969577.epub
    Herbert, Frank
    God Emperor of Dune
DRM-infested Books/357987322.epub
    Herbert, Frank
    Children of Dune
DRM-infested Books/357994537.epub
    Herbert, Frank
    Heretics of Dune
DRM-infested Books/357994652.epub
    William Dalrymple
    White Mughals: Love and Betrayal in Eighteenth-Century India
DRM-infested Books/357996119.epub
    Stross, Charles
    Wireless
DRM-infested Books/360601506.epub
    Ursula K. Le Guin
    The Dispossessed
DRM-infested Books/360609519.epub
    Greg Egan
    Schild’s Ladder
DRM-infested Books/360627712.epub
    Raymond E. Feist
    Rides a Dread Legion
DRM-infested Books/360627930.epub
    Neal Stephenson
    Anathem
DRM-infested Books/360628773.epub
    Raymond E. Feist
    At the Gates of Darkness
DRM-infested Books/360641088.epub
    Mihaly Csikszentmihalyi
    Flow
DRM-free     Books/361491495.epub
    Basil Hall Chamberlain
    Aino Folk-Tales
DRM-free     Books/361494664.epub
    Poul William Anderson
    Industrial Revolution
DRM-free     Books/361523763.epub
    Lafcadio Hearn
    The Romance of the Milky Way / And Other Studies & Stories
DRM-free     Books/361527545.epub
    Saki
    When William Came
DRM-free     Books/361539032.epub
    Saki
    The Chronicles of Clovis
DRM-free     Books/361557387.epub
    Saki
    Reginald in Russia and other sketches
DRM-free     Books/361557834.epub
    Sir Arthur Conan Doyle
    The Adventure of the Dying Detective
DRM-free     Books/361559391.epub
    Sir Arthur Conan Doyle
    The Valley of Fear
DRM-free     Books/361560694.epub
    Lafcadio Hearn
    Chita: a Memory of Last Island
DRM-free     Books/361561399.epub
    Confucius
    The Analects of Confucius (from the Chinese Classics)
DRM-free     Books/361562678.epub
    Saki
    The Toys of Peace, and other papers
DRM-free     Books/361562764.epub
    Sir Arthur Conan Doyle
    The Memoirs of Sherlock Holmes
DRM-free     Books/361564075.epub
    Poul William Anderson
    The Burning Bridge
DRM-free     Books/361564898.epub
    Henry David Thoreau
    Walden
DRM-free     Books/361565201.epub
    Saki
    Beasts and Super-Beasts
DRM-free     Books/361565806.epub
    Isaac Asimov
    Youth
DRM-free     Books/361572327.epub
    Lafcadio Hearn
    Kokoro / Japanese Inner Life Hints
DRM-free     Books/361573126.epub
    Sir Arthur Conan Doyle
    Through the Magic Door
DRM-free     Books/361575882.epub
    Sir Arthur Conan Doyle
    Tales of Terror and Mystery
DRM-free     Books/361578744.epub
    Lafcadio Hearn
    In Ghostly Japan
DRM-free     Books/361588265.epub
    Lafcadio Hearn
    Books and Habits from the Lectures of Lafcadio Hearn
DRM-free     Books/361673695.epub
    E. C. Babbitt
    More Jataka Tales
DRM-free     Books/361686559.epub
    Poul William Anderson
    Security
DRM-free     Books/361713863.epub
    Sir Arthur Conan Doyle
    The Return of Sherlock Holmes
DRM-free     Books/361721797.epub
    Sir Arthur Conan Doyle
    The Adventure of the Cardboard Box
DRM-free     Books/361725352.epub
    Saki
    The Unbearable Bassington
DRM-free     Books/361725959.epub
    Sir Arthur Conan Doyle
    The Adventure of Wisteria Lodge
DRM-free     Books/361726975.epub
    Saki
    Reginald
DRM-free     Books/361727237.epub
    Sir Arthur Conan Doyle
    The Adventure of the Red Circle
DRM-free     Books/361732286.epub
    Poul William Anderson
    The Valor of Cappen Varra
DRM-free     Books/361736043.epub
    Lafcadio Hearn
    Kwaidan: Stories and Studies of Strange Things
DRM-free     Books/361741563.epub
    Lafcadio Hearn
    Japan: an Attempt at Interpretation
DRM-free     Books/361743007.epub
    Ambrose Bierce
    The Devil’s Dictionary
DRM-free     Books/361743953.epub
    Sir Arthur Conan Doyle
    The Adventure of the Devil’s Foot
DRM-free     Books/361744178.epub
    Sir Arthur Conan Doyle
    His Last Bow
DRM-free     Books/361745602.epub
    Poul William Anderson
    The Sensitive Man
DRM-infested Books/362435686.epub
    Ansary, Tamim
    Destiny Disrupted
DRM-infested Books/366773380.epub
    Esslemont, Ian C. C.
    Return of the Crimson Guard
    This is published by Tom Doherty and could be re-downloaded DRM-free
DRM-infested Books/373338999.epub
    Jordan, Robert
    The Path of Daggers
    This is published by Tom Doherty and could be re-downloaded DRM-free
DRM-free     Books/375554215.epub
    Orson Scott Card
    The Lost Gate
DRM-infested Books/376217648.epub
    Steven Erikson
    Reaper’s Gale
    This is published by Tom Doherty and could be re-downloaded DRM-free
DRM-infested Books/376227359.epub
    Jordan, Robert
    Winter’s Heart
    This is published by Tom Doherty and could be re-downloaded DRM-free
DRM-infested Books/376227401.epub
    Jordan, Robert
    Crossroads of Twilight
    This is published by Tom Doherty and could be re-downloaded DRM-free
DRM-infested Books/376227406.epub
    Jordan, Robert
    Knife of Dreams
    This is published by Tom Doherty and could be re-downloaded DRM-free
DRM-infested Books/376227409.epub
    Sanderson, Brandon
    The Gathering Storm
    This is published by Tom Doherty and could be re-downloaded DRM-free
DRM-infested Books/376227423.epub
    Jordan, Robert
    New Spring
    This is published by Tom Doherty and could be re-downloaded DRM-free
DRM-free     Books/376231110.epub
    Cook, Glen
    Surrender to the Will of the Night
DRM-infested Books/376231528.epub
    Robert Jordan and Brandon Sanderson
    Towers of Midnight
    This is published by Tom Doherty and could be re-downloaded DRM-free
DRM-infested Books/378317076.epub
    Jordan, Robert
    A Crown of Swords
    This is published by Tom Doherty and could be re-downloaded DRM-free
DRM-infested Books/378317808.epub
    Robert Jordan
    Lord of Chaos
    This is published by Tom Doherty and could be re-downloaded DRM-free
DRM-free     Books/379451459.epub
    Le Guin, Ursula K.
    Word for World is Forest, The
DRM-free     Books/37EC6E895E6BD70BEB48D9F1553D608E.epub
    Eben Hewitt
    Cassandra: The Definitive Guide
DRM-free     Books/380490608.epub
    Jordan, Robert
    The Eye of the World
DRM-free     Books/380494444.epub
    Asimov, Isaac
    The End of Eternity
DRM-infested Books/381497257.epub
    Harold McGee
    On Food and Cooking, The Science and Lore of the Kitchen
DRM-infested Books/381622032.epub
    IAIN M. BANKS
    Look to Windward
DRM-infested Books/381683084.epub
    Ursula K. Le Guin
    Tehanu
DRM-infested Books/381935940.epub
    Richard Adams
    Watership Down
DRM-infested Books/382674388.epub
    Steven Erikson
    Dust of Dreams
    This is published by Tom Doherty and could be re-downloaded DRM-free
DRM-infested Books/383912791.epub
    Steven Erikson
    Bauchelain and Korbal Broach
    This is published by Tom Doherty and could be re-downloaded DRM-free
DRM-infested Books/385975662.epub
    Jordan, Robert
    The Dragon Reborn
    This is published by Tom Doherty and could be re-downloaded DRM-free
DRM-infested Books/385981104.epub
    Steven Erikson
    The Bonehunters
    This is published by Tom Doherty and could be re-downloaded DRM-free
DRM-infested Books/385981116.epub
    Steven Erikson
    Midnight Tides
DRM-free     Books/385982966.epub
    Brust, Steven
    To Reign in Hell
DRM-infested Books/385987858.epub
    Steven Erikson
    Gardens of the Moon
    This is published by Tom Doherty and could be re-downloaded DRM-free
DRM-infested Books/385989170.epub
    Steven Erikson
    House of Chains
DRM-infested Books/385992628.epub
    Jordan, Robert
    The Fires of Heaven
    This is published by Tom Doherty and could be re-downloaded DRM-free
DRM-infested Books/385992927.epub
    Steven Erikson
    Toll the Hounds
    This is published by Tom Doherty and could be re-downloaded DRM-free
DRM-infested Books/385992930.epub
    Jordan, Robert
    The Great Hunt
    This is published by Tom Doherty and could be re-downloaded DRM-free
DRM-infested Books/385998417.epub
    Jordan, Robert
    The Shadow Rising
    This is published by Tom Doherty and could be re-downloaded DRM-free
DRM-infested Books/386016540.epub
    Esslemont, Ian C. C.
    Night of Knives
    This is published by Tom Doherty and could be re-downloaded DRM-free
DRM-infested Books/388403394.epub
    Steven Erikson
    The Crippled God
DRM-infested Books/389191300.epub
    Iain M. Banks
    Surface Detail
DRM-infested Books/390877859.epub
    Loewen, James W.
    Lies My Teacher Told Me
DRM-infested Books/393310992.epub
    Erikson, Steven
    Memories of Ice
DRM-infested Books/394745271.epub
    Steven Erikson
    Deadhouse Gates
    This is published by Tom Doherty and could be re-downloaded DRM-free
DRM-free     Books/394745833.epub
    Walton, Jo
    Among Others
DRM-free     Books/395536306.epub
    Sir Arthur Conan Doyle
    The Adventures of Sherlock Holmes
DRM-free     Books/395537209.epub
    Sir Arthur Conan Doyle
    The Sign of the Four
DRM-free     Books/395539542.epub
    Sir Arthur Conan Doyle
    A Study in Scarlet
DRM-free     Books/395540660.epub
    Sir Arthur Conan Doyle
    The Hound of the Baskervilles
DRM-free     Books/395686685.epub
    Dante Alighieri
    Divine Comedy, Longfellow’s Translation, Complete
DRM-free     Books/395688318.epub
    Edgar Rice Burroughs
    A Princess of Mars
DRM-free     Books/395688375.epub
    Lafcadio Hearn
    Glimpses of an Unfamiliar Japan / First Series
DRM-infested Books/395926792.epub
    Fukuyama, Francis
    Origins of Political Order
DRM-infested Books/396269736.epub
    Herbert, Frank
    Chapterhouse: Dune
DRM-infested Books/398283114.epub
    Rothfuss, Patrick
    The Wise Man’s Fear
DRM-free     Books/3A5FBC58E821CFDF15C8C4E85657481E.epub
    Jon Hicks
    The Icon Handbook
DRM-free     Books/410943153.epub
    Edwin A. Abbott (A Square)
    Flatland: A Romance of Many Dimensions
DRM-free     Books/413463878.epub
    Brust, Steven
    Tiassa
DRM-free     Books/418293515.epub
    Heinlein, Robert A.
    Glory Road
DRM-infested Books/419950945.epub
    Isaac Asimov
    Foundation
DRM-infested Books/419950970.epub
    Isaac Asimov
    Foundation and Empire
DRM-infested Books/419950976.epub
    Isaac Asimov
    Second Foundation
DRM-infested Books/419968238.epub
    Scott Lynch
    The lies of Locke Lamora
DRM-infested Books/419968784.epub
    Scott Lynch
    Red Seas Under Red Skies
DRM-infested Books/420037362.epub
    Kim Stanley Robinson
    The Years of Rice and Salt
DRM-infested Books/420281728.epub
    Richard Wiseman
    59 Seconds: Think a Little, Change a Lot
DRM-infested Books/420445771.epub
    Isaac Asimov
    Foundation’s Edge
DRM-infested Books/420446058.epub
    Isaac Asimov
    Foundation and Earth
DRM-infested Books/420725428.epub
    Mike Resnick
    Kirinyaga: A Fable of Utopia
DRM-infested Books/421025353.epub
    William Dalrymple
    The Last Mughal
DRM-free     Books/421124117.epub
    Brust, Steven
    The Desecrator
DRM-infested Books/422530144.epub
    Max Barry
    Machine Man
DRM-free     Books/422718511.epub
    Apple Inc.
    Mac Integration Basics
DRM-free     Books/426914658.epub
    Brust, Steven
    Five Hundred Years After
DRM-free     Books/428235697.epub
    Vinge, Vernor
    A Fire Upon The Deep
DRM-infested Books/429173089.epub
    Ursula K. Le Guin
    The Other Wind
DRM-infested Books/429173713.epub
    Ursula K. Le Guin
    Tales from Earthsea
DRM-infested Books/429699133.epub
    Rajaniemi, Hannu
    The Quantum Thief
    This is published by Tom Doherty and could be re-downloaded DRM-free
DRM-infested Books/431617578.epub
    Walter Isaacson
    Steve Jobs
DRM-infested Books/432519291.epub
    Susan Weinschenk
    100 Things: Every Designer Needs to Know About People
DRM-infested Books/434509014.epub
    Stross, Charles
    Rule 34
DRM-free     Books/434522188.epub
    Larry Niven, Jerry Pournelle
    The Mote In God’s Eye
DRM-free     Books/434811509.epub
    Asher, Neal
    Cowl
DRM-infested Books/436646026.epub
    Neal Stephenson
    Reamde
DRM-infested Books/436691174.epub
    Julia Child
    Mastering the Art of French Cooking
DRM-infested Books/443149884.epub
    Daniel Kahneman
    Thinking, Fast and Slow
DRM-infested Books/446155927.epub
    Esslemont, Ian C. C.
    Stonewielder
    This is published by Tom Doherty and could be re-downloaded DRM-free
DRM-free     Books/447591195.epub
    Asher, Neal
    The Skinner
DRM-infested Books/454252718.epub
    William B. Norton
    The Internet Peering Playbook: Connecting to the Core of the Internet
DRM-infested Books/455525627.epub
    Amar Chitra Katha
    Birbal The Genius
DRM-infested Books/458461362.epub
    Pamela Druckerman
    Bringing Up Bebe
DRM-free     Books/45B90418E467D479DCDDF23B932C648C.epub
    Douglas Crockford
    JavaScript: The Good Parts
DRM-infested Books/460822066.epub (directory)
    Scott Lynch
    The Republic of Thieves
DRM-free     Books/465679A557523FDB836005CF4BB9380E.epub
    Scott Berkun
    Mindfire
DRM-infested Books/479594044.epub
    Saladin Ahmed
    Throne of the Crescent Moon
DRM-infested Books/479717436.epub
    David Crist
    The Twilight War: The Secret History of America’s Thirty-Year Conflict with Iran
DRM-infested Books/479771801.epub
    William Dalrymple
    In Xanadu
DRM-infested Books/489957500.epub
    Bruce Schneier
    Liars and Outliers
DRM-infested Books/491186678.epub
    James Blish
    Cities in Flight
DRM-infested Books/491668459.epub
    Neal Asher
    Shadow of the Scorpion
DRM-infested Books/491669284.epub
    Glen Cook
    A Matter of Time
DRM-infested Books/491669288.epub
    Glen Cook
    Darkwar
DRM-free     Books/492199230.epub
    Frederik Pohl
    The Tunnel Under the World
DRM-free     Books/492199569.epub
    Frederik Pohl
    The Knights of Arthur
DRM-infested Books/494939678.epub
    Esslemont, Ian C. C.
    Orb Sceptre Throne
    This is published by Tom Doherty and could be re-downloaded DRM-free
DRM-infested Books/498634992.epub
    Charles Stross
    The Apocalypse Codex
DRM-infested Books/499392787.epub
    Daniel Suarez
    Kill Decision
DRM-free     Books/4AE7DDA54BEEFCD65157927546B18063.epub
    Roberto Ierusalimschy
    Programming in Lua 2ed
DRM-free     Books/4DCFF682728B765A1CE221F3D7C21536.epub
    Glen Cook
    Starfishers Volume 3: Stars’ End
DRM-free     Books/501278407.epub
    Colette
    Chéri
DRM-free     Books/501758197.epub
    David Brin
    Existence
DRM-infested Books/501758516.epub
    Scalzi, John
    Redshirts
    This is published by Tom Doherty and could be re-downloaded DRM-free
DRM-infested Books/503019669.epub
    J.R.R. Tolkien
    The Lord of the Rings
DRM-infested Books/503153300.epub
    J.R.R. Tolkien
    Tales from the Perilous Realm
DRM-infested Books/503154129.epub
    J. R. R. Tolkien and Christopher Tolkien
    The Book of Lost Tales, Part One
DRM-infested Books/503154991.epub
    J. R. R. Tolkien and Christopher Tolkien
    The Book of Lost Tales, Part Two
DRM-infested Books/503155327.epub
    J.R.R. Tolkien
    The Children of Húrin
DRM-infested Books/503163148.epub
    J.R.R. Tolkien
    The Hobbit Deluxe
DRM-infested Books/503164678.epub
    J.R.R. Tolkien
    The Silmarillion
DRM-infested Books/503167303.epub
    J.R.R. Tolkien
    Unfinished Tales of Númenor and Middle-earth
DRM-infested Books/504209078.epub
    Isaac Asimov
    Prelude to Foundation
DRM-infested Books/504371982.epub
    Iain M. Banks
    The Hydrogen Sonata
DRM-free     Books/511060740.epub
    Frederik Pohl
    The Hated
DRM-free     Books/511143617.epub
    Frederik Pohl
    The Day of the Boomer Dukes
DRM-free     Books/511252896.epub
    Frederik Pohl
    Pythias
DRM-free     Books/513357605.epub
    Hannu Rajaniemi
    The Fractal Prince
DRM-free     Books/513868CF0AA46D293EE27F74BC399760.epub
    Jerry Pournelle
    West of Honor
DRM-infested Books/520233773.epub
    Nate Silver
    The Signal and the Noise
DRM-free     Books/520897403.epub
    Hugh Howey
    Wool Omnibus
DRM-free     Books/525170910.epub
    Cory Doctorow and Charles Stross
    The Rapture of the Nerds
DRM-infested Books/526136048.epub (directory)
    Hetty van de Rijt & Frans Plooij
    The Wonder Weeks
DRM-infested Books/529020127.epub
    Neal Asher
    The Departure
DRM-infested Books/529424632.epub
    Guy Gavriel Kay
    The Lions of Al-Rassan
DRM-infested Books/536312376.epub
    Glen Cook
    Garrett for Hire
DRM-free     Books/537023027.epub
    Erikson, Steven
    Forge of Darkness
DRM-infested Books/541673159.epub
    Ursula K. Le Guin
    The Tombs of Atuan
DRM-infested Books/541673162.epub
    Ursula K. Le Guin
    The Farthest Shore
DRM-infested Books/546126326.epub
    Jan Morris
    HAV
DRM-infested Books/551241606.epub
    Ursula K. Le Guin
    A Wizard of Earthsea
DRM-infested Books/551567785.epub
    Murray R. Spiegel, PhD
    Schaum’s Outline Mathematical Handbook of Formulas and Tables, Fourth Edition
DRM-infested Books/551747038.epub
    Chris Hedges and Joe Sacco
    Days of Destruction Days of Revolt v2b
DRM-infested Books/552144691.epub
    Tamim Ansary
    Games without Rules
DRM-free     Books/553878102.epub
    Charles Stross
    A Tall Tail
DRM-infested Books/563408849.epub
    Iain Banks
    Stonemouth
DRM-infested Books/568731449.epub
    William Dalrymple
    Return of a King: The Battle for Afghanistan, 1839-42
DRM-infested Books/569232538.epub
    Neil Gaiman
    The Ocean at the End of the Lane
DRM-free     Books/571678152.epub
    Glen Cook
    The Return of the Black Company
DRM-free     Books/571678945.epub
    Glen Cook
    The Many Deaths of the Black Company
DRM-free     Books/573656304.epub
    Glen Cook
    Chronicles of the Black Company
DRM-free     Books/573656441.epub
    Glen Cook
    The Books of the South
DRM-free     Books/576233114.epub
    Jack Vance
    Demon Princes
DRM-infested Books/578851675.epub
    Zilpha Keatley Snyder
    Below The Root
DRM-infested Books/580642602.epub
    Max Barry
    Lexicon
DRM-infested Books/588794444.epub
    Charles Stross
    Neptune’s Brood
DRM-free     Books/5CCB889F91585202637A3C5FBFD8409F.epub
    Glen Cook
    Starfishers-The Starfishers Trilogy Volume II
DRM-infested Books/600938002.epub
    Gardner Dozois
    The Year’s Best Science Fiction: Thirtieth Annual Collection
DRM-free     Books/606232503.epub
    Ian C. Esslemont
    Blood and Bone
DRM-free     Books/610920977.epub
    Robert Jordan and Brandon Sanderson
    A Memory of Light
DRM-free     Books/6122FF30560866BD75257E6CCC264371.epub
    Fritz Leiber
    Swords and Ice Magic-Fafhrd and the Gray Mouser-Book 6
DRM-infested Books/619483561.epub
    Raymond E. Feist
    Magician’s End
DRM-free     Books/622837311.epub
    Le Comte De  Lautréamont
    Les chants de Maldoror
DRM-free     Books/62497334A4189B1D00E9BDFD95724E2E.epub
    Fritz Leiber
    The Knight and Knave of Swords-Fafhrd and the Gray Mouser-Book 7
DRM-infested Books/645571245.epub
    Iain Banks
    The Quarry
DRM-free     Books/647688922.epub
    Steven Brust and Skyler White
    The Incrementalists
DRM-infested Books/651331715.epub
    Susan Crawford
    Captive Audience
DRM-infested Books/654456347.epub
    Iain Banks
    The Wasp Factory
DRM-infested Books/662310218.epub (directory)
    Various Authors
    Star Wars: Empire Volume 3 – The Imperial Perspective
DRM-infested Books/662310219.epub
    Paul Gulacy
    Star Wars: Crimson Empire
DRM-infested Books/664297397.epub (directory)
    Various Authors
    Star Wars: Empire, Vol. 4: The Heart of the Rebellion
DRM-infested Books/664343525.epub (directory)
    Paul Chadwick, Doug Wheatley & Tomás Giorello
    Star Wars: Empire, Vol. 2: Darklighter
DRM-infested Books/664894567.epub
    John Ostrander
    Star Wars: Dawn of the Jedi Volume 1—Force Storm
DRM-infested Books/664910993.epub (directory)
    Scott Allie, Ryan Benjamin & Brian Horton
    Star Wars: Empire Vol. 1
DRM-free     Books/666772E4C21E2017E71C10F1990840BC.epub
    Pieter Hintjens
    ZeroMQ - Connecting your Code
DRM-infested Books/674224604.epub (directory)
    Various Authors
    Star Wars: Empire, Vol. 5: Allies and Adversaries
DRM-infested Books/674226603.epub (directory)
    Thomas Andrews, Scott Allie & Various Authors
    Star Wars: Empire, Vol. 6: In the Shadows of Their Fathers
DRM-free     Books/697938901.epub
    Charles Stross
    Equoid: A Laundry Novella
DRM-free     Books/6BBD61A64A46FFFF4AE7C5FCEB9CFCEE.epub
    David Drake
    Balefires
DRM-free     Books/71C79A253586282ABE73C2975237EB08.epub
    Glen Cook
    Sung in Blood
DRM-free     Books/73BC5DD0E51611BDC359CBAB48CC203F.epub (directory)
    Crane, Stephen
    The Red Badge of Courage
DRM-free     Books/7A0D1AEE343638A8A3769DABB90CD4CB.epub
    Clay A. Johnson
    The Information Diet
DRM-free     Books/7B46914B55481714DED3D711288978FA.epub
    Glen Cook
    Shadowline-The Starfishers Trilogy I
DRM-free     Books/809DCC75FE60E749458CB27636EE6777.epub
    Mercedes Lackey
    The Secret World Chronicle
DRM-free     Books/8350F5DFECCD6AA88714400FDF4F6831.epub
    François de La Rochefoucauld
    Réflexions ou Sentences et Maximes Morales
DRM-free     Books/853BD1C65277D276BA09E04EBFEB73EF.epub
    None
    PostgreSQL 9 Administration Cookbook
DRM-free     Books/85907063B2351E91C6E7F5052C090BFD.epub
    None
    PostgreSQL 9.0 High Performance
DRM-free     Books/8663E7BF735C06318FF450532A67F1C2.epub
    Glen Cook
    Passage at Arms
DRM-free     Books/8690D69995483C5D2DF6AD38BE53C1D7.epub
    Paolo Bacigalupi
    The Windup Girl - Second Electronic Edition
DRM-free     Books/8993920EF1C0E8FE8CB47A20BD955F53.epub
    Fritz Leiber
    Swords Against Death-Fafhrd and Gray Mouser-Book 2
DRM-free     Books/8DC6ED0A99161EBB516E370A60ED1121.epub
    Jonathan Zdziarski
    Hacking and Securing iOS Applications
DRM-free     Books/8DD0D22CD05B0B8D52DF9DB93FC8616B.epub
    Fritz Leiber
    Swords in the Mist-Fafhrd And the Gray Mouser-Book 3
DRM-free     Books/912CB8B110736684549EAC4FC36665AB.epub
    Neil Gaiman and Dave McKean
    Signal to Noise
DRM-free     Books/93061FDFD6EEC01FD8CE4295A049C97C.epub
    Cory Doctorow
    Homeland
DRM-free     Books/9592D5001632B94D1FEFC98B3A40E049.epub
    Kelly Link
    Stranger Things Happen
DRM-free     Books/990CA5799084407151488A7C563DF269.epub
    Tom Hughes-Croucher
    Node: Up and Running
DRM-free     Books/9CCCA9E217602948B84BE9A7A21C2753.epub
    Mike Resnick
    Birthright: The Book of Man
DRM-free     Books/9DA2C8D7E941C37FA83192E5849AA850.epub
    Q. Ethan McCallum
    Parallel R
DRM-free     Books/A7DB557515983DB67F74E2BE351FC319.epub
    Glen Cook
    Reap the East Wind
DRM-free     Books/B18AB85267B815E540C7E370F8D97726.epub
    Lars George
    HBase: The Definitive Guide
DRM-free     Books/B1A586D24AE5F2450010F8664F8E059D.epub
    Cory Doctorow
    Pirate Cinema
DRM-free     Books/B562B5D74F8677C946B0A6F81EB34F1B.epub
    Gotthold Ephraim Lessing
    Nathan the Wise; a dramatic poem in five acts
DRM-free     Books/BE42D426242836AA171539B7415732E6.epub
    Glen Cook
    The Swordbearer
DRM-free     Books/BE80AD93781746E45CF95607A7BEE687.epub
    Charlie Stross
    Bit Rot
DRM-free     Books/C39FBC59673713A57D404D62BE85C4DC.epub
    Lauren Beukes
    Zoo City
DRM-free     Books/C81CBBD470EFDBC62359BDAD12FDF551.epub
    Thomas Hobbes
    Leviathan
DRM-free     Books/D17DC1A83BD90AFB24055A01831BFAB4.epub
    Glen Cook
    The Dragon Never Sleeps
DRM-free     Books/DA8FBAB386EC503A0EF21E481D214521.epub
    David Flanagan
    JavaScript: The Definitive Guide
DRM-free     Books/DBAD0DE7A4305B90F6AC33C673FE68E8.epub
    Glen Cook
    A Path to Coldness of Heart
DRM-free     Books/DBCCB54B140D1158250B24B7E3E81B63.epub
    Scott Chacon
    Pro Git
DRM-free     Books/DF6C56F9A8719294B13BA91BED5E1667.epub
    Mike Resnick
    Ivory
DRM-free     Books/E8D76FDD351E68D92AE4A3F3AEA7EC6A.epub
    Paolo Bacigalupi
    Pump Six and Other Stories
DRM-free     Books/EEE65A68378C2510E18DF24CD767AC9A.epub
    Fritz Leiber
    Swords Against Wizardry-Fafhrd and the Gray Mouser-Book 4
DRM-free     Books/F3BC14466A16C96CCD8FFE00DCCF8147.epub
    Glen Cook
    An Empire Unacquainted With Defeat
DRM-free     Books/F534835595041374B814D151A633E69D.epub
    Peter Watts
    Blindsight
DRM-free     Books/F8BF760284ADC800B290DCA6D8EA7EF2.epub
    Ben Klemens
    21st Century C
DRM-free     Books/FE8C57863E40B98CB732FEE4BFDB60BB.epub
    Glen Cook
    An Ill Fate Marshalling
8 books are DRM-infested
26 could be cured

Update (2013-11-06):

OS X 10.9 Mavericks and the new iBooks app changed the location of the iBooks directory, so I changed my script accordingly (and made it adjust depending on which OS version you have). The file names have also changed and no longer embed author and title, so I am extracting them from the XML metadata files.
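
For the .opf case, the extraction boils down to looking up the Dublin Core creator and title elements. Here is a minimal sketch in Python 3 syntax; the function name opf_creator_title and the sample document are made up for illustration, and real .opf files carry far more metadata than this.

```python
import xml.etree.ElementTree as ET

# Dublin Core namespace, the same one the script searches for.
DC = '{http://purl.org/dc/elements/1.1/}'

# Hand-written sample, not taken from a real book file.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Blood Music</dc:title>
    <dc:creator>Greg Bear</dc:creator>
  </metadata>
</package>"""


def opf_creator_title(opf_text):
    """Return (creator, title) from OPF metadata; None where missing."""
    root = ET.fromstring(opf_text)
    # './/' searches at any depth below the root element.
    creator = root.findtext('.//' + DC + 'creator')
    title = root.findtext('.//' + DC + 'title')
    return creator, title


if __name__ == "__main__":
    print(opf_creator_title(SAMPLE_OPF))
```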

Clearing custom crop aspect ratios in Lightroom

Lightroom’s crop tool allows you to constrain the aspect ratio to a proportion of your choice, e.g. to 4:3, defaulting to the same aspect ratio as the original. The last 5 or so custom crop aspect ratios are saved, but a minor annoyance is that you cannot clear the list.

Python on the Mac and SQLite to the rescue: this simple script (lraspect.zip) will reset them. If your Lightroom catalog does not use the default name or location, you will need to edit the script. To run it, quit Lightroom first, then run the script. It will back up your catalog for you just in case.

Needless to say, I cannot be held liable if this script corrupts your catalog or eats your dog (who ate your homework); use at your own risk.

#!/usr/bin/python
import sys, os, sqlite3

# edit this to point to your LR3 catalog if you do not use the default location
lrcat = os.path.expanduser('~/Pictures/Lightroom/Lightroom 3 Catalog.lrcat')

os.system('cp -i "%s" "%s.bak"' % (lrcat, lrcat))
db = sqlite3.connect(lrcat)
c = db.cursor()
c.execute("""select value from Adobe_variablesTable
where name='Adobe_customCropAspects'""")
crops = c.fetchone()[0]
print 'aspect ratios:', crops
c.execute("""update Adobe_variablesTable
set value='{}'
where name='Adobe_customCropAspects'""")
db.commit()
print 'Custom crop aspect ratios reset successfully'
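
For readers on Python 3, here is a minimal sketch of the same reset wrapped in a function. It assumes the same table and variable names as the script above. The function name reset_custom_crops is hypothetical, and shutil.copy2 replaces the cp -i call, so an existing backup is overwritten without prompting.

```python
import shutil
import sqlite3

def reset_custom_crops(catalog_path):
    """Back up the catalog, then clear the saved custom crop aspect ratios.

    Returns the previous value (or None if the variable was not found).
    """
    # Unlike 'cp -i', copy2 silently overwrites an existing .bak file.
    shutil.copy2(catalog_path, catalog_path + '.bak')
    db = sqlite3.connect(catalog_path)
    try:
        row = db.execute(
            "select value from Adobe_variablesTable where name = ?",
            ('Adobe_customCropAspects',)).fetchone()
        db.execute(
            "update Adobe_variablesTable set value = '{}' where name = ?",
            ('Adobe_customCropAspects',))
        db.commit()
    finally:
        db.close()
    return row[0] if row else None
```

As with the original, quit Lightroom before calling it on a real catalog.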

Just enough Weave

Note: I am keeping this code around for historical purposes, but it has not worked since Weave 1.0 RC2. I created this because Mozilla’s public sync servers were initially quite unreliable, but they have remedied the situation and performance problems are a thing of the past. I also learned the inner workings of Weave/Firefox Sync in the process, and am satisfied as to the security of the system. Since I no longer use Firefox myself, I do not expect to ever revive this project. Feel free to take it over, otherwise you are best served by using Mozilla’s cloud.

Like most of my readers, I use multiple computers: my Mac Pro at home, my MacBook Air when on the road, 3 desktop PCs at work, a number of virtual machines, and so on. I have Firefox installed on all of them. The Mozilla Weave extension allows me to sync bookmarks, passwords et al between them. Weave encrypts this data before uploading it to the server, but I do not like to rely on third-party web services for mission-critical functions (my Mozilla server was down last Monday, for instance, due to the surge of traffic from people returning to work and performing a full sync against 0.5). Through Weave 0.5, I ran my own instance of the Mozilla public Weave server version 0.3. Unfortunately, Weave 0.6 requires server version 0.5 and I had to upgrade.

The open-source Weave server is implemented in PHP. It doesn’t require Apache compiled with mod_dav as early versions did (I prefer to run nginx), but it is still a fairly gnarly piece of code that is anything but plug-and-play. Somehow I had managed to get version 0.3 running on my home server, but no amount of blundering around got me to a usable state with 0.5. I ended up deciding to implement a minimalist Weave server in Python, as it seemed less painful than continuing to struggle with the Mozilla spaghetti code, which confusingly features multiple pieces of code that appear to do exactly the same thing in three different places. Famous last words…

Three days of hacking later, I managed to get it working. 200 or so lines of Python code replaced approximately 12,000 lines of PHP. Of course, I am not trying to reproduce an entire public cloud infrastructure like Mozilla’s, just enough for my own needs, using the “simplest thing that works” principle. Interestingly, the Mozilla code includes a vestigial Python reference implementation of a Weave server for testing purposes. It does not seem to have been working for a while, though. I used it as a starting point but ended up rewriting almost everything. Here are the simplifying hypotheses:

  • My weave server is meant for a single user (my wife prefers Safari)
  • It does not implement authentication, logging or SSL encryption: it is meant to be used behind an nginx (or Apache) reverse proxy that will perform these functions.
  • It has no configuration file. There are just three variables to set at the top of the source file.
  • It does not implement the full server protocol, just the parts that are actually used by the extension today.
  • More controversially, it does not even implement persistence, keeping all data in RAM instead. Python running on Solaris is very reliable, and the expected uptime of the server is likely months on end. If the server fails, the Firefoxes will just have to perform a full sync and reconciliation. Fortunately, that has been much improved in Weave 0.6, so the cost is minimal. This could even be construed as a security feature, since there is no data on disk to be misplaced. It would take catastrophically losing all my browsers simultaneously to risk data loss. Short of California falling into the ocean, that’s not going to happen, and if it does, I probably have more pressing concerns…

The code could be extended fairly easily to lift these hypotheses, e.g. adding persistence or multiple user support using SQLite, PostgreSQL or MySQL.
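
As an illustration of the persistence extension, here is a minimal sketch that snapshots the in-memory collections dictionary to SQLite and reloads it. It is not part of the server: the table name records and the functions save_collections and load_collections are hypothetical, and it assumes (as in the server code) that every stored value is a string. It uses Python 3 syntax.

```python
import sqlite3

SCHEMA = """create table if not exists records (
  collection text not null,
  id text not null,
  payload text not null,
  primary key (collection, id))"""

def save_collections(db_path, collections):
    """Replace the on-disk snapshot with {collection: {id: payload}}."""
    db = sqlite3.connect(db_path)
    try:
        db.execute(SCHEMA)
        db.execute("delete from records")
        db.executemany(
            "insert into records values (?, ?, ?)",
            [(col, key, payload)
             for col, items in collections.items()
             for key, payload in items.items()])
        db.commit()
    finally:
        db.close()

def load_collections(db_path):
    """Rebuild the nested dictionary from the snapshot (empty if none)."""
    db = sqlite3.connect(db_path)
    try:
        db.execute(SCHEMA)
        result = {}
        for col, key, payload in db.execute(
                "select collection, id, payload from records"):
            result.setdefault(col, {})[key] = payload
    finally:
        db.close()
    return result
```

A server using this would call load_collections once at startup and save_collections periodically or on shutdown. Writing on every PUT would be simpler to reason about but slower.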

Here is the server itself, weave_server.py:

#!/usr/local/bin/python
"""
  Based on tools/scripts/weave_server.py from
  http://hg.mozilla.org/labs/weave/

  do the Simplest Thing That Can Work: just enough to get by with Weave 0.6
  - SSL, authentication and logging are done by nginx or other reverse proxy
  - no persistence, in case of process failure do a full resync
  - only one user. If you need more, create multiple instances on different
    ports and use rewrite rules to route traffic to the right one
"""

import sys, time, logging, socket, urlparse, httplib, pprint
try:
  import simplejson as json
except ImportError:
  import json
import wsgiref.simple_server

URL_BASE = 'https://your.server.name/'
#BIND_IP = ''
BIND_IP = '127.0.0.1'
DEFAULT_PORT = 8000

class HttpResponse:
  def __init__(self, code, content='', content_type='text/plain'):
    self.status = '%s %s' % (code, httplib.responses.get(code, ''))
    self.headers = [('Content-type', content_type),
                    ('X-Weave-Timestamp', str(timestamp()))]
    self.content = content or self.status

def JsonResponse(value):
  return HttpResponse(httplib.OK, value, content_type='application/json')

class HttpRequest:
  def __init__(self, environ):
    self.environ = environ
    content_length = environ.get('CONTENT_LENGTH')
    if content_length:
      stream = environ['wsgi.input']
      self.contents = stream.read(int(content_length))
    else:
      self.contents = ''

def timestamp():
  # Weave rounds to 2 digits and so must we, otherwise rounding errors will
  # influence the "newer" and "older" modifiers
  return round(time.time(), 2)

class WeaveApp():
  """WSGI app for the Weave server"""
  def __init__(self):
    self.collections = {}

  def url_base(self):
    """XXX should derive this automagically from self.request.environ"""
    return URL_BASE

  def ts_col(self, col):
    self.collections.setdefault('timestamps', {})[col] = str(timestamp())

  def parse_url(self, path):
    if not path.startswith('/0.5/') and not path.startswith('/1.0/'):
      return
    command, args = path.split('/', 4)[3:]
    return command, args

  def opts_test(self, opts):
    if 'older' in opts:
      return float(opts['older'][0]).__ge__
    elif 'newer' in opts:
      return float(opts['newer'][0]).__le__
    else:
      return lambda x: True

  # HTTP method handlers

  def _handle_PUT(self, path, environ):
    command, args = self.parse_url(path)
    col, key = args.split('/', 1)
    assert command == 'storage'
    val = self.request.contents
    if val[0] == '{':
      val = json.loads(val)
      val['modified'] = timestamp()
      val = json.dumps(val, sort_keys=True)
    self.collections.setdefault(col, {})[key] = val
    self.ts_col(col)
    return HttpResponse(httplib.OK)

  def _handle_POST(self, path, environ):
    try:
      status = httplib.NOT_FOUND
      if path.startswith('/0.5/') or path.startswith('/1.0/'):
        command, args = self.parse_url(path)
        col = args.split('/')[0]
        vals = json.loads(self.request.contents)
        for val in vals:
          val['modified'] = timestamp()
          self.collections.setdefault(col, {})[val['id']] = json.dumps(val)
        self.ts_col(col)
        status = httplib.OK
    finally:
      return HttpResponse(status)

  def _handle_DELETE(self, path, environ):
    assert path.startswith('/0.5/') or path.startswith('/1.0/')
    response = HttpResponse(httplib.OK)
    if path.endswith('/storage/0'):
      self.collections.clear()
    elif path.startswith('/0.5/') or path.startswith('/1.0/'):
      command, args = self.parse_url(path)
      col, key = args.split('/', 1)
      if not key:
        opts = urlparse.parse_qs(environ['QUERY_STRING'])
        test = self.opts_test(opts)
        col = self.collections.setdefault(col, {})
        for key in col.keys():
          if test(json.loads(col[key]).get('modified', 0)):
            logging.info('DELETE %s key %s' % (path, key))
            del col[key]
      else:
        try:
          del self.collections[col][key]
        except KeyError:
          return HttpResponse(httplib.NOT_FOUND)
    return response

  def _handle_GET(self, path, environ):
    if path.startswith('/0.5/') or path.startswith('/1.0/'):
      command, args = self.parse_url(path)
      return self.handle_storage(command, args, path, environ)
    elif path.startswith('/1/'):
      return HttpResponse(httplib.OK, self.url_base())
    elif path.startswith('/state'):
      return HttpResponse(httplib.OK, pprint.pformat(self.collections))
    else:
      return HttpResponse(httplib.NOT_FOUND)

  def handle_storage(self, command, args, path, environ):
    if command == 'info':
      if args == 'collections':
        return JsonResponse(json.dumps(self.collections.get('timestamps', {})))
    if command == 'storage':
      if '/' in args:
        col, key = args.split('/')
      else:
        col, key = args, None
      try:
        if not key: # list output requested
          opts = urlparse.parse_qs(environ['QUERY_STRING'])
          test = self.opts_test(opts)
          result = []
          for val in self.collections.setdefault(col, {}).itervalues():
            val = json.loads(val)
            if test(val.get('modified', 0)):
              result.append(val)
          result = sorted(result,
                          key=lambda val: (val.get('sortindex'),
                                           val.get('modified')),
                          reverse=True)
          if 'limit' in opts:
            result = result[:int(opts['limit'][0])]
          logging.info('result set len = %d' % len(result))
          if 'application/newlines' in environ.get('HTTP_ACCEPT', ''):
            value = '\n'.join(json.dumps(val) for val in result)
            return HttpResponse(httplib.OK, value,
                                content_type='application/text')
          else:
            return JsonResponse(json.dumps(result))
        else:
          return JsonResponse(self.collections.setdefault(col, {})[key])
      except KeyError:
        if not key: raise
        return HttpResponse(httplib.NOT_FOUND, '"record not found"',
                            content_type='application/json')

  def __process_handler(self, handler):
    path = self.request.environ['PATH_INFO']
    response = handler(path, self.request.environ)
    return response

  def __call__(self, environ, start_response):
    """Main WSGI application method"""

    self.request = HttpRequest(environ)
    method = '_handle_%s' % environ['REQUEST_METHOD']

    # See if we have a method called '_handle_METHOD', where
    # METHOD is the name of the HTTP method to call.  If we do,
    # then call it.
    if hasattr(self, method):
      handler = getattr(self, method)
      response = self.__process_handler(handler)
    else:
      response = HttpResponse(httplib.METHOD_NOT_ALLOWED,
                              'Method %s is not yet implemented.' % method)

    start_response(response.status, response.headers)
    return [response.content]

class NoLogging(wsgiref.simple_server.WSGIRequestHandler):
  def log_request(self, *args):
    pass

if __name__ == '__main__':
  socket.setdefaulttimeout(300)
  if '-v' in sys.argv:
    logging.basicConfig(level=logging.DEBUG)
    handler_class = wsgiref.simple_server.WSGIRequestHandler
  else:
    logging.basicConfig(level=logging.ERROR)
    handler_class = NoLogging
  logging.info('Serving on port %d.' % DEFAULT_PORT)
  app = WeaveApp()
  httpd = wsgiref.simple_server.make_server(BIND_IP, DEFAULT_PORT, app,
                                            handler_class=handler_class)
  httpd.serve_forever()
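
The older and newer modifiers handled by opts_test in the server rely on bound comparison methods of float. The following standalone sketch shows that both comparisons are inclusive. It uses Python 3 syntax, with the method lifted out of the class as a plain function purely for illustration, and the sample query strings are made up.

```python
from urllib.parse import parse_qs

def opts_test(opts):
    """Return a predicate on a record's 'modified' timestamp.

    opts is a parsed query string, e.g. {'newer': ['100.5']}.
    """
    if 'older' in opts:
        # float(x).__ge__(m) evaluates x >= m: keep records modified at or before x
        return float(opts['older'][0]).__ge__
    elif 'newer' in opts:
        # float(x).__le__(m) evaluates x <= m: keep records modified at or after x
        return float(opts['newer'][0]).__le__
    else:
        return lambda modified: True

newer = opts_test(parse_qs('newer=100.5&limit=10'))
older = opts_test(parse_qs('older=100.5'))
print(newer(100.49), newer(100.5), older(100.5), older(100.51))
```

Because the boundary value itself passes both tests, the server has to round timestamps the same way the client does, as the comment in timestamp() notes.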

Here is the relevant fragment from my nginx configuration file:

# Mozilla Weave
location /0.5 {
  auth_basic            "Weave";
  auth_basic_user_file  /home/majid/web/conf/htpasswd.weave;
  proxy_pass            http://localhost:8000;
  proxy_set_header      Host $http_host;
}
location /1.0 {
  auth_basic            "Weave";
  auth_basic_user_file  /home/majid/web/conf/htpasswd.weave;
  proxy_pass            http://localhost:8000;
  proxy_set_header      Host $http_host;
}
location /1/ {
  auth_basic            "Weave";
  auth_basic_user_file  /home/majid/web/conf/htpasswd.weave;
  proxy_pass            http://localhost:8000;
  proxy_set_header      Host $http_host;
}

This code is hereby released into the public domain. You are welcome to use it as you wish. Just keep in mind that since it is reverse-engineered, it may well break with future releases of the Weave extension, or if Mozilla changes the server protocol.

Update (2009-10-03):

I implemented some minor changes for compatibility with Weave 0.7. The diff with the previous version is as follows:

--- weave_server.py~	Thu Sep  3 17:46:44 2009
+++ weave_server.py	Sat Oct  3 02:59:19 2009
@@ -65,8 +65,7 @@
     command, args = path.split('/', 4)[3:]
     return command, args

-  def opts_test(self, environ):
-    opts = urlparse.parse_qs(environ['QUERY_STRING'])
+  def opts_test(self, opts):
     if 'older' in opts:
       return float(opts['older'][0]).__ge__
     elif 'newer' in opts:
@@ -92,7 +91,7 @@
   def _handle_POST(self, path, environ):
     try:
       status = httplib.NOT_FOUND
-      if path.startswith('/0.5/') and path.endswith('/'):
+      if path.startswith('/0.5/'):
         command, args = self.parse_url(path)
         col = args.split('/')[0]
         vals = json.loads(self.request.contents)
@@ -113,7 +112,8 @@
       command, args = self.parse_url(path)
       col, key = args.split('/', 1)
       if not key:
-        test = self.opts_test(environ)
+        opts = urlparse.parse_qs(environ['QUERY_STRING'])
+        test = self.opts_test(opts)
         col = self.collections.setdefault(col, {})
         for key in col.keys():
           if test(json.loads(col[key]).get('modified', 0)):
@@ -142,10 +142,14 @@
       if args == 'collections':
         return JsonResponse(json.dumps(self.collections.get('timestamps', {})))
     if command == 'storage':
-      col, key = args.split('/')
+      if '/' in args:
+        col, key = args.split('/')
+      else:
+        col, key = args, None
       try:
         if not key: # list output requested
-          test = self.opts_test(environ)
+          opts = urlparse.parse_qs(environ['QUERY_STRING'])
+          test = self.opts_test(opts)
           result = []
           for val in self.collections.setdefault(col, {}).itervalues():
             val = json.loads(val)
@@ -155,6 +159,8 @@
                           key=lambda val: (val.get('sortindex'),
                                            val.get('modified')),
                           reverse=True)
+          if 'limit' in opts:
+            result = result[:int(opts['limit'][0])]
           logging.info('result set len = %d' % len(result))
           if 'application/newlines' in environ.get('HTTP_ACCEPT', ''):
             value = '\n'.join(json.dumps(val) for val in result)

Update (2009-11-17):

Weave 1.0b1 uses 1.0 as the protocol version string instead of 0.5 but is otherwise unchanged. I updated the script and nginx configuration accordingly.

Inserting graphviz diagrams in a CVStrac wiki

CVStrac is an amazing productivity booster for any software development group. This simple tool, built around a SQLite database (indeed, by the author of SQLite) combines a bug-tracking database, a CVS browser and a wiki. The three components are fully cross-referenced and build off the strengths of each other. You can handle almost all aspects of the software development process in it, and since it is built on an open database with a radically simple schema, it is trivial to extend. I use CVStrac for Temboz to track bugs, but also to trace changes in the code base to requirements or to bugs, and last but not least, the wiki makes documentation a snap.

For historical reasons, my company uses TWiki for its wiki needs. We configured Apache with mod_rewrite so that the wiki links from CVStrac lead to the corresponding TWiki entry instead of the one in CVStrac itself, which is unused. TWiki is very messy (not surprising, as it is written in Perl), but it has a number of good features like excellent search (it even handles stemming) and a directed graph plug-in that makes it easy to design complex graphs using Bell Labs’ graphviz, without having to deal with the tedious pixel-pushing of GUI tools like Visio or OmniGraffle. The plug-in makes it easy to document UML or E-R graphs, document software dependencies, map process flows and the like.

CVStrac 2.0 introduced extensibility in the wiki syntax via external programs. This allowed me to implement similar functionality in the CVStrac native wiki. To use it, you need to:

  1. Download the Python script dot.py and install it somewhere in your path. The sole dependency is graphviz itself, as well as either pysqlite2 or the built-in version bundled with Python 2.5.
  2. Create a custom wiki markup in the CVStrac setup, of type “Program Block”, with the formatter command-line:
    path/dot.py --db CVStrac_database_file --name '%m'
  3. Insert the graphs using standard dot syntax, bracketed between CVStrac {dot} and {enddot} tags.

As an example of the plugin at work, here is the graph corresponding to this markup:
{dot}
digraph sw_dependencies {
style=bold;
dpi=72;

temboz [fontcolor=white,style=filled,shape=box,fillcolor=red];
python [fontcolor=white,style=filled,fillcolor=blue];
cheetah [fontcolor=white,style=filled,fillcolor=blue];
sqlite [fontcolor=white,style=filled,fillcolor=blue];

temboz -> cheetah -> python;
temboz -> python -> sqlite -> gawk;
temboz -> cvstrac -> sqlite;
python -> readline;
python -> db4;
python -> openssl;
python -> tk -> tcl;

cvstrac -> "dot.py" -> graphviz -> tk;
"dot.py" -> python;
"dot.py" -> sqlite;
graphviz -> gdpng;
graphviz -> fontconfig -> freetype2;
fontconfig -> expat;
graphviz -> perl;
graphviz -> python;
gdpng -> libpng -> zlib;
gdpng -> freetype2;
}
{enddot}

Dot
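
The dot.py script itself is not shown here. As a rough idea of the rendering step such a formatter needs, the sketch below pipes dot source to the Graphviz dot command, which reads a graph on standard input and writes the image to standard output when given a -T format option. The function names are hypothetical, how the resulting image is stored and linked from the wiki is not covered, and the code is Python 3.

```python
import subprocess

def dot_command(fmt='png'):
    """Command line for Graphviz dot: graph on stdin, image on stdout."""
    return ['dot', '-T' + fmt]

def render_dot(source, fmt='png'):
    """Render dot source text to image bytes; requires Graphviz on the PATH."""
    result = subprocess.run(dot_command(fmt),
                            input=source.encode('utf-8'),
                            stdout=subprocess.PIPE,
                            check=True)
    return result.stdout
```

For example, render_dot('digraph g { temboz -> python; }') should return the bytes of a PNG image on a machine where Graphviz is installed.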

Another useful plug-in I wrote for CVStrac highlights source code in the CVS browser using the Pygments library. Simply download pygmentize.py and install it under Setup/Diff & Filter Programs/File Filter, using the string _pathto/pygmentize.py %F. Here is an example of Pygments applied to pygmentize.py itself:

#!/usr/bin/env python
# $Log: pygmentize.py,v $
# Revision 1.3  2007/07/04 19:54:26  majid
# cope with Unicode characters in source
#
# Revision 1.2  2006/12/23 03:51:03  majid
# import pygments.lexers and pygments.formatters explicitly due to Pygments 0.6
#
# Revision 1.1  2006/12/05 20:19:57  majid
# Initial revision
#
"""
CVStrac plugin to Pygmentize source code
"""
import sys, pygments, pygments.lexers, pygments.formatters

def main():
  assert len(sys.argv) == 2
  block = sys.stdin.read()
  try:
    lexer = pygments.lexers.get_lexer_for_filename(sys.argv[1])
    block = pygments.highlight(
      block, lexer, pygments.formatters.HtmlFormatter(
      style='colorful', linenos=True, full=True))
  except ValueError:
    pass
  print unicode(block).encode('ascii', 'xmlcharrefreplace')

if __name__ == '__main__':
  main()

A Python driver for the Symbol CS 1504 bar code scanner

One of my cousins works for Symbol, the world’s largest bar code reader manufacturer. The fashionable action today is in RFID, but the humble bar code is relatively untapped at the consumer level. The unexpected success of Delicious Library shows people want to manage their collection of books, CDs and DVDs, and as with businesses, scanning bar codes is the fastest and least error-prone way to do so. Delicious Library supports scanning bar codes with an Apple iSight camera, but you have to wonder how reliable that is.

If you want something more reliable, you need a dedicated bar code scanner. They come in a bewildering array of sizes and shapes, from thin wands to pistol-like models or flat ones like those used at your supermarket checkout counter. For some reason, the bar code scanner world seems stuck in the era of serial ports (or worse, PS/2 keyboard wedges), but USB models are available, starting at $70 or so. They emulate a keyboard – when you scan a bar code, they will type in the code (as printed on the label), character by character so as to not overwhelm the application, and follow with a carriage return, which means they can work with almost anything from terminal-based applications to web pages. Ingeniously, most will allow you to program the reader’s settings using a booklet of special bar codes that perform changes like enabling or disabling ISBN decoding, and so on.

The problem with tethered bar code readers is that they are not very convenient if you are trying to catalog items on a bookshelf or read in UPC codes in a supermarket. Symbol has a unit buried deep inside its product catalog, the CS 1504 consumer scanner. This tiny unit (shown below with a canister of 35mm film for size comparison) can be worn on a key chain, although I would worry about damaging the plastic window. Most bar code readers are hulking beasts in comparison. It has a laser bar code scanner: just align the line it projects with the bar code and it will chirp once it has read and memorized the code. The memory capacity is up to 150 bar code scans with timestamps, or 300 without timestamps. The 4 silver button batteries (included) are rated for 5000 scans. AAA batteries would have been preferable, but I guess the unit would not be so compact. In any case, it is clear this scanner was not intended for heavy-duty commercial inventory tracking purposes.

I bought one to simplify the process of listing books with BookCrossing (even though their site is not optimized for bar code readers), but you have other interesting uses like finding out more about your daily purchases such as nutritional information or whether the company behind them engages in objectionable business practices. I can also imagine sticking preprinted bar-coded asset tracking tags on inventory (e.g. computers in the case of an IT department), and keeping track of them with this gizmo. People who sell a lot of books or used records through Amazon.com can also benefit as Amazon has a bulk listing service to which you can upload a file with barcodes. An interesting related service is the free UPC database.

Symbol CS 1504
You can order the scanner in either serial ($100) or USB ($110) versions, significantly cheaper than the competition like Intelliscanner (and much smaller to boot). I highly recommend the USB version, even if you have a serial port today: serial ports seem to be going the way of the dodo and your next computer may not have one. The USB version costs slightly more, but that’s because they include a USB-Serial adapter, and you can’t get one retailing for a mere $10. The one shipped with my unit is the newer PN50 cable which uses a Prolific 2303 chipset rather than the older Digi adapter. Wonder of wonders, they even have a Mac OS X driver available.

The scanner ships without any software. Symbol mostly sells through integrators to corporations that buy hundreds or thousands of bar code scanners for inventory or point of sale purposes, and they are not really geared to be a direct to consumer business with all the customer support hassles that entails. There are a number of programs available, mostly for Windows, but they don’t seem to have that much by way of functionality to justify their high prices, often as expensive as the scanner itself.

Symbol does make available an SDK to access the scanner, including complete documentation of the protocol used for the device. While you do have to register, they do not make you jump through the ridiculous hoops required to access the Photoshop plug-in SDK or the Canon RAW decoding SDK. The supplied libraries are Windows-only, however, so I wrote a Python script that works on both Windows and Mac OS X (and probably most UNIX implementations as well, although you will have to use a serial port). The only dependency is the pySerial module.

By default, it will set the clock on the scanner, retrieve the recorded bar codes, correct the timestamps for any drift between the CS 1504’s internal clock and that of the host computer, and if successful clear the unit’s memory and dump the acquired bar codes in CSV format to standard output. The script will also decode ISBN codes (the CS 1504 does not appear to do this by itself in its default configuration). As it is written in Python, it can easily be extended, although it is probably easier to work off the CSV file.
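
To illustrate the ISBN decoding step, here is a minimal sketch, assuming the scanner reports a book's bar code as a 13-digit EAN with the 978 “Bookland” prefix. The function name ean13_to_isbn10 is hypothetical and this is not the code from cs1504.py. It drops the prefix and the EAN check digit, then recomputes the ISBN-10 check digit (weights 10 down to 2, modulo 11).

```python
def ean13_to_isbn10(ean):
    """Convert a 978-prefixed 13-digit EAN string to a 10-character ISBN.

    Returns None when the input is not a 978-prefixed 13-digit code.
    """
    if len(ean) != 13 or not ean.isdigit() or not ean.startswith('978'):
        return None
    body = ean[3:12]  # drop the 978 prefix and the trailing EAN check digit
    # ISBN-10 check digit: weights 10..2 over the nine body digits, modulo 11
    total = sum(weight * int(digit)
                for weight, digit in zip(range(10, 1, -1), body))
    check = (11 - total % 11) % 11
    return body + ('X' if check == 10 else str(check))

print(ean13_to_isbn10('9781892391193'))
```

The EAN check digit of the input is not validated here; a stricter version would verify it before converting.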

The only configuration you have to do is set the serial port to use at the top of the script (it should do the right thing on a Mac using the Prolific driver, and the Windows driver seems to always use COM8 but I have no way of knowing if this is by design or coincidence). The program is still very rough, especially as concerns error recovery, and I appreciate any feedback.

A sample session follows:

ormag ~>python cs1504.py > barcodes.csv
Using device /dev/cu.usbserial...  connected
serial# 000100000003be95
SW version NBRIKAAE
reading clock for drift
clock drift 0:00:01.309451
resetting scanner clock... done
reading barcodes... done (2 read)
clearing barcodes... done
powering down... done

ormag ~>cat barcodes.csv
UPCA,034571575179,2006-03-27 01:08:48
ISBN,1892391198,2006-03-27 01:08:52
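
The clock drift correction reported in the session above amounts to simple date arithmetic. The sketch below shows one way to apply it, assuming drift is defined as host time minus scanner time. That sign convention, the function name correct_timestamps and the sample values are illustrative and not taken from cs1504.py.

```python
import datetime

def correct_timestamps(scans, scanner_now, host_now):
    """Shift scanner timestamps by the measured clock drift.

    scans is a list of (symbology, code, timestamp) tuples, where each
    timestamp was recorded by the scanner's internal clock.
    """
    drift = host_now - scanner_now  # assumed convention: host minus scanner
    return [(symbology, code, ts + drift) for symbology, code, ts in scans]

# Made-up values: the scanner clock runs two seconds behind the host.
scanner_now = datetime.datetime(2006, 3, 27, 1, 10, 0)
host_now = datetime.datetime(2006, 3, 27, 1, 10, 2)
scans = [('UPCA', '034571575179', datetime.datetime(2006, 3, 27, 1, 8, 46))]
print(correct_timestamps(scans, scanner_now, host_now))
```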

Update (2006-07-21):

At the prompting of some Windows users, I made a slightly modified version, win_cs1504.py, that will copy the barcodes to the clipboard, and also insert the symbology, barcode and timestamp starting on the first free line in the active Excel spreadsheet (creating one if necessary).

Update (2007-01-20):

Just to make it clear: I hereby place this code in the public domain.

Update (2009-11-06):

For Windows users, I have put up videos describing how to install the Prolific USB to serial driver, Python and requisite extensions, and how to use the program itself.

Update (2012-07-05):

I moved the script over to GitHub. Please file bug reports and enhancement requests there. Fatherhood and a startup don’t leave me much time to maintain this, so I make no promises, but this should allow people who make fixes to contribute them back (or fork).

A reader-writer lock for Python

Python offers a number of useful synchronization primitives in the threading and Queue modules. One that is missing, however, is a simple reader-writer lock (RWLock). An RWLock allows improved concurrency over a simple mutex, and is useful for objects that have high read-to-write ratios like database caches.

Surprisingly, I haven’t been able to find any implementation of these semantics, so I rolled my own in a module rwlock.py to implement a RWLock class, along with lock promotion/demotion. Hopefully it can be added to the standard library threading module. This code is hereby placed in the public domain.

"""Simple reader-writer locks in Python
Many readers can hold the lock XOR one and only one writer"""
import threading

version = """$Id: 04-1.html,v 1.3 2006/12/05 17:45:12 majid Exp $"""

class RWLock:
  """
A simple reader-writer lock. Several readers can hold the lock
simultaneously, XOR one writer. Write locks have priority over reads to
prevent write starvation.
"""
  def __init__(self):
    self.rwlock = 0
    self.writers_waiting = 0
    self.monitor = threading.Lock()
    self.readers_ok = threading.Condition(self.monitor)
    self.writers_ok = threading.Condition(self.monitor)
  def acquire_read(self):
    """Acquire a read lock. Several threads can hold this type of lock.
It is exclusive with write locks."""
    self.monitor.acquire()
    while self.rwlock < 0 or self.writers_waiting:
      self.readers_ok.wait()
    self.rwlock += 1
    self.monitor.release()
  def acquire_write(self):
    """Acquire a write lock. Only one thread can hold this lock, and
only when no read locks are also held."""
    self.monitor.acquire()
    while self.rwlock != 0:
      self.writers_waiting += 1
      self.writers_ok.wait()
      self.writers_waiting -= 1
    self.rwlock = -1
    self.monitor.release()
  def promote(self):
    """Promote an already-acquired read lock to a write lock
    WARNING: it is very easy to deadlock with this method"""
    self.monitor.acquire()
    self.rwlock -= 1
    while self.rwlock != 0:
      self.writers_waiting += 1
      self.writers_ok.wait()
      self.writers_waiting -= 1
    self.rwlock = -1
    self.monitor.release()
  def demote(self):
    """Demote an already-acquired write lock to a read lock"""
    self.monitor.acquire()
    self.rwlock = 1
    self.readers_ok.notifyAll()
    self.monitor.release()
  def release(self):
    """Release a lock, whether read or write."""
    self.monitor.acquire()
    if self.rwlock < 0:
      self.rwlock = 0
    else:
      self.rwlock -= 1
    wake_writers = self.writers_waiting and self.rwlock == 0
    wake_readers = self.writers_waiting == 0
    self.monitor.release()
    if wake_writers:
      self.writers_ok.acquire()
      self.writers_ok.notify()
      self.writers_ok.release()
    elif wake_readers:
      self.readers_ok.acquire()
      self.readers_ok.notifyAll()
      self.readers_ok.release()

if __name__ == '__main__':
  import time
  rwl = RWLock()
  class Reader(threading.Thread):
    def run(self):
      print self, 'start'
      rwl.acquire_read()
      print self, 'acquired'
      time.sleep(5)
      print self, 'stop'
      rwl.release()
  class Writer(threading.Thread):
    def run(self):
      print self, 'start'
      rwl.acquire_write()
      print self, 'acquired'
      time.sleep(10)
      print self, 'stop'
      rwl.release()
  class ReaderWriter(threading.Thread):
    def run(self):
      print self, 'start'
      rwl.acquire_read()
      print self, 'acquired'
      time.sleep(5)
      rwl.promote()
      print self, 'promoted'
      time.sleep(5)
      print self, 'stop'
      rwl.release()
  class WriterReader(threading.Thread):
    def run(self):
      print self, 'start'
      rwl.acquire_write()
      print self, 'acquired'
      time.sleep(10)
      print self, 'demoted'
      rwl.demote()
      time.sleep(10)
      print self, 'stop'
      rwl.release()
  Reader().start()
  time.sleep(1)
  Reader().start()
  time.sleep(1)
  ReaderWriter().start()
  time.sleep(1)
  WriterReader().start()
  time.sleep(1)
  Reader().start()
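
For readers on Python 3, here is a compact sketch of the same semantics (many readers XOR one writer, with waiting writers blocking new readers) exposed through context managers. It is a simplified variant rather than the rwlock.py module above: it uses a single condition variable, omits promotion and demotion, and the method names reading and writing are hypothetical.

```python
import threading
from contextlib import contextmanager

class RWLock:
    """Many readers XOR one writer; waiting writers block new readers."""
    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0           # number of active readers
        self._writer = False        # True while a writer holds the lock
        self._writers_waiting = 0   # writers blocked in writing()

    @contextmanager
    def reading(self):
        with self._cond:
            while self._writer or self._writers_waiting:
                self._cond.wait()
            self._readers += 1
        try:
            yield
        finally:
            with self._cond:
                self._readers -= 1
                self._cond.notify_all()

    @contextmanager
    def writing(self):
        with self._cond:
            self._writers_waiting += 1
            try:
                while self._writer or self._readers:
                    self._cond.wait()
            finally:
                self._writers_waiting -= 1
            self._writer = True
        try:
            yield
        finally:
            with self._cond:
                self._writer = False
                self._cond.notify_all()

# Usage: a read-mostly cache protected by the lock.
cache = {}
lock = RWLock()

def lookup(key):
    with lock.reading():
        return cache.get(key)

def store(key, value):
    with lock.writing():
        cache[key] = value
```

Waking every waiter on each release is wasteful compared with the two condition variables used above, but it keeps the sketch short. As with promote(), a thread that already holds a read lock must not try to acquire the lock again while a writer is waiting, or it will deadlock.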

Threadframe: multithreaded stack frame extraction for Python

Note: threadframe is obsolete. Python 2.5 and later include a function sys._current_frames() that does the same thing. Threadframe is only useful for Python 2.2 through 2.4.

Rationale

I was encountering deadlocks in a multi-threaded CORBA server (implemented using omniORB). Debugging using GDB gave me too low-level information, and what I needed was an equivalent of the GDB command “info threads”. There was no such facility available from within Python’s standard library, so I rolled my own.

David Beazley added advanced debugging functions to the Python interpreter, and they have been folded into the 2.2 release.

I used these hooks to build a debugging module that is useful when you are looking for deadlocks in a multithreaded application. It basically has a single function that will return a list of the stack frames for all Python interpreter threads in the process.

In Python 2.3, Guido van Rossum added the thread ID to the interpreter state structure, which allows us to produce a dictionary mapping thread IDs to frames.

This functionality is now integrated in Python 2.5’s batteries-included sys._current_frames() function.
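
For readers on a modern interpreter, here is a minimal sketch of the same idea built on sys._current_frames(). It is written for Python 3 and is not the threadframe code; the helper names dump_all_stacks and worker are made up for this illustration.

```python
import sys
import threading
import traceback


def dump_all_stacks():
    """Return a dict mapping thread ID to that thread's formatted stack."""
    stacks = {}
    # sys._current_frames() maps each thread ID to its topmost stack frame
    for thread_id, frame in sys._current_frames().items():
        stacks[thread_id] = ''.join(traceback.format_stack(frame))
    return stacks


def worker(stop):
    # Block until told to stop, simulating a thread stuck on a lock
    stop.wait()


if __name__ == '__main__':
    stop = threading.Event()
    t = threading.Thread(target=worker, args=(stop,))
    t.start()
    for thread_id, stack in dump_all_stacks().items():
        print('Thread %d' % thread_id)
        print(stack)
    stop.set()
    t.join()
```

In a deadlocked server, calling dump_all_stacks() from a signal handler or a debugging thread shows where each thread is blocked.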

Of course, I disclaim any liability if this code should crash your system, erase your homework, eat your dog (who also ate your homework) or otherwise have any undesirable effect.

Building and installing

Python 2.2 or later is required. Thread ID to frame dictionary extraction is only available in Python 2.3 and later, and will generate a NotImplementedError if used from 2.2.

Download the source tarball threadframe-0.2.tar.gz. You can build it using the Makefile or directly with the setup.py script. I have built and tested this only on Solaris 8/x86 and Windows 2000, but the code should be pretty portable. There is a small test program test.py that illustrates how to use this module to dump stack frames of all the Python interpreter threads. A sample run is available for your perusal.

For Windows users, I have pre-compiled binaries available, built using Mingw32 and GCC 2.95.2. Just copy the file threadframe.pyd to any location in your Python path and you should be able to run the test script test.py.

Windows binaries
Python version Download
2.2.1 threadframe.pyd
2.3.4 threadframe.pyd
2.4.x threadframe.pyd

License

This code is licensed under the same terms as Python itself.

Change history

Release 0.2 (2004-06-10)

Distutils based setup.py contributed by Bob Ippolito. Bob also noticed that thread_id was added to the Python interpreter state, and contributed a patch to get a dictionary mapping thread_ids to frames instead of a list.

Release 0.1 (2002-10-11)

Initial release for Python 2.2: threadframe-0.1.tar.gz

The Temboz RSS aggregator

2013-03-14: Google’s announcement that their Reader service will be discontinued has spurred interest in Temboz. This software is not dead; in fact I use it daily, but I have not made an official release in a long time. You should use the version from GitHub instead. There are currently a number of bugs which can lead to Temboz locking up and requiring a restart. I am planning on completing my long overdue overhaul before Google’s July deadline.


Introduction

Temboz is an RSS aggregator. It is inspired by FeedOnFeeds (web-based personal aggregator), Google News (two column layout) and TiVo (thumbs up and down). I have been using FeedOnFeeds for some time now, but that software seems to have stopped evolving, and I had a number of optimizations to the user experience I wanted to make.

Features

Already implemented:

  • Multithreaded, download feeds in parallel.
  • Built-in web server.
  • Two-column user interface for better readability and information density. Automatic reflow using CSS.
  • Ratings system for articles
  • Real-time hunter-gatherer user interface: items flagged with a “Thumbs down” disappear immediately off the screen (using Dynamic HTML), making room for new articles. No laborious flagging of items as in FeedOnFeeds.
  • Filtering entries (using Python syntax, e.g. 'Salon' in feed_title and title == "King Kaufman's Sports Daily", or simply by selecting keywords/phrases and hitting “Thumbs down”).
  • Ability to generate an RSS feed from “Thumbs Up” articles, which is what makes Temboz a true aggregator, not just a reader.
  • Ad filtering
  • Automatic garbage collection: every day between 3AM and 4AM, uninteresting articles (by default those older than 7 days) are purged of their contents (but not metadata such as titles, permalinks or timestamps) to keep the database size manageable. After 6 months (by default), they are deleted altogether.
  • Automatic database backups daily (immediately after garbage collection)
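
To give a feel for filtering with Python expressions, here is a minimal sketch written for Python 3. It is not Temboz’s actual implementation: make_filter is a made-up helper, and the only field names assumed are feed_title and title from the example in the list above. Note that eval is not a security sandbox, so rules like this are only suitable when you write them yourself.

```python
def make_filter(expression):
    """Compile a filter rule once and return a predicate over items."""
    code = compile(expression, '<filter rule>', 'eval')

    def matches(item):
        # The item's fields are exposed to the rule as plain variable names
        return bool(eval(code, {'__builtins__': {}}, dict(item)))

    return matches


rule = make_filter(
    "'Salon' in feed_title and title == \"King Kaufman's Sports Daily\"")

sports = {'feed_title': 'Salon', 'title': "King Kaufman's Sports Daily"}
politics = {'feed_title': 'Salon', 'title': 'War Room'}

print(rule(sports))    # this item would be filtered out
print(rule(politics))  # this item would be kept
```

Compiling the expression once up front means a rule with a syntax error is rejected when it is created rather than each time an article is checked.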

On the to do list:

  • Write better documentation
  • Handle permanent HTTP redirects for feed XML URLs
  • Automatic pacing of feed polling intervals using the average and standard deviation of observed feed item inter-arrival times, to reduce bandwidth usage and load for both client and server. Most feeds should be polled on a daily rather than hourly interval (e.g. my own, since I update once a week on average), but the mechanisms for a feed to indicate its polling rate preferences are quite inconsistent from one flavor of RSS/Atom to another.
  • “Survivor mode” – vote feeds that no longer perform off the aggregator based on relevance statistics.
  • Ability to cluster together articles (I tried a heuristic of looking for common URLs they are all pointing to, but this didn’t work well in practice).
  • Portability to Windows, distribution as a standalone package.
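
The polling-interval item above could be approached along the following lines. This is a hypothetical sketch for Python 3, not code from Temboz: the function name, the “mean minus one standard deviation” rule and the floor, ceiling and default values are all assumptions chosen for illustration.

```python
import statistics


def suggest_poll_interval(arrival_times, floor=1800.0, ceiling=86400.0,
                          default=3600.0):
    """Suggest seconds between polls from item arrival timestamps (seconds)."""
    times = sorted(arrival_times)
    # Inter-arrival times between consecutive items
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    if len(gaps) < 2:
        return default  # not enough history to estimate a spread
    mean = statistics.mean(gaps)
    spread = statistics.stdev(gaps)
    # Poll somewhat sooner than the average gap when arrivals are irregular
    candidate = mean - spread
    # Never poll more often than the floor or less often than the ceiling
    return min(ceiling, max(floor, candidate))


DAY = 86400
weekly_feed = [i * 7 * DAY for i in range(5)]
busy_feed = [i * 600 for i in range(10)]
print(suggest_poll_interval(weekly_feed))  # clamped to the daily ceiling
print(suggest_poll_interval(busy_feed))    # clamped to the half-hour floor
```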

History

I have been using it successfully for well over a year. It still has rough edges, with some administration functions only doable using the SQLite command-line utility. Here is a screen shot showing the reader user interface. The article highlighted in yellow was given a “Thumbs Up”. You can also see the user interface at work in a view of the last 50 articles I flagged as “thumbs up” among the feeds I read.

Screen shots

Click on a screen shot thumbnail for a full-sized version

The first screen shot shows the article reading interface, using a two-column layout. Clicking on the “Thumbs down” icon makes the article disappear, bringing a new one in its place (if available). Clicking on the “Thumbs up” icon highlights it in yellow and flags it as interesting in the database.

The feed summary page shows statistics on feeds, starting with feeds with unread articles, then by alphabetical order. Feeds can be sorted based on other metrics. You have the option of “catching up” with a feed (marking all the articles as read). Feeds with errors are highlighted in red (not shown).

Clicking on the “details” link for a feed brings up this page, which allows you to change title or feed URL, and shows the RSS or Atom fields accessible for filtering.

Feeds can be filtered using Python expressions.


Known bugs

You can check outstanding bug reports, change requests and more at the public CVStrac site.

Credits

Temboz is written in Python, and leverages Mark Pilgrim’s Ultra-liberal feed parser, SQLite 2.x and Cheetah.

Download

You can download the current version: temboz-0.8.tar.gz. I welcome any feedback you may have, especially concerning improvements to installation.

The CVS version is far ahead of 0.8 in features. I have not yet had the time to test and document the migration procedure from 0.8 to 1.0, but if you are a new Temboz user I strongly advise you to get a nightly CVS snapshot instead (they are what I run on my own server): temboz-CVS.tar.gz or temboz-CVS.zip.

Updates

For news on Temboz, please subscribe to the RSS feed.

Temboz has a CVStrac where you can submit bug reports or change requests, and a Wiki, where all future documentation will ultimately reside.

Post scriptum

The name “Temboz” is a reference to Malima Temboz, “The mountain that walks”, an elephant whose tormented spirit is the object of Mike Resnick’s excellent SF novel, Ivory.

Data mining Outlook for fun and profit

For a few years now, I have owned the domain name majid.fm. Dot-fm stands for the Federated States of Micronesia, a micro-state in the Pacific Ocean, and they market their domain names to FM radio stations. Those are also my initials. Unfortunately, the registration fees are quite expensive ($200 every two years), and the domain is redundant now that I have acquired majid.info and majid.org (majid.com is reserved by a Malaysian cybersquatter who is demanding a couple thousand dollars for it – I may be vain, but not that vain). I have decided to let the domain lapse when it expires on April 1st.

I used the majid-dot-FM domain for my emails, and set it up so emails sent to anything @majid.fm would be sent to my primary mailbox fazal@majid.fm. For instance, if I registered with Dell, I would give them the email address dell@majid.fm. This was helpful in tracing where I got my email from, and blacklisting companies that started spamming me (they shall remain nameless to protect the guilty yet litigious).

Unfortunately, spammers and some worms attempt dictionary attacks by trying all possible combinations like jim@majid.fm, smith@majid.fm, and so on. My spam filter would catch some, but not all of them, and it would be a terrible hassle. I do not want to have an auto-responder send emails back to people who email me at the old address, as this would at best flood innocent people whose addresses spammers are impersonating, and at worst actually give my new address to the spammers.

My solution to this dilemma is to produce a Python script that scans through all the emails in my Outlook personal folder (PST) files of archived emails and flags all those who sent me an email, so that I can then manually send them a change of address notification (or in the case of websites and online stores, update my contact info online).

Simply using Outlook’s advanced search function will not work, as in many cases the To: header is set to something other than the address the email is delivered to, such as undisclosed-recipients, or the sender’s address when they send the email to multiple Bcc: recipients (the proper way to proceed when you want to send an email to multiple recipients without giving everyone in the list the email addresses of the other recipients). I actually have to sift through the raw message headers to see the envelope destination address.

Here is a simplified version of olmine.py, the script I used. It requires Python 2.x with the win32all extensions, and Outlook 2000 with the Collaboration Data Objects (CDO) option installed (this is not the default). CDO is required to access the full headers. Of course, this script can be useful for all sorts of social network analysis fun on your own Outlook files, or more prosaically to generate a whitelist of email addresses for your spam filter.

import re, win32com.client

srcs = {}
dsts = {}
pairs = {}

# regular expression that scans for valid email addresses in the headers
m_re = re.compile(r'[-A-Za-z0-9.,_]*@majid\.fm')
# regular expression that strips out headers that can cause false positives
strip_re = re.compile(r'(Message-Id:.*$|In-Reply-To:.*$|References:.*$)',
                      re.IGNORECASE | re.MULTILINE)

def dump_folder(folder):
  """Iterate recursively over the given folder and its subfolders"""
  print '-' * 72
  print folder.Name
  print '-' * 72
  for i in range(1, folder.Messages.Count + 1):
    try:
      # PR_SENDER_EMAIL_ADDRESS
      _from = folder.Messages[i].Fields[0x0C1F001F].Value
      # PR_TRANSPORT_MESSAGE_HEADERS
      headers = folder.Messages[i].Fields[0x7d001e].Value
    except:
      # ignore non-email objects like contacts or calendar entries
      continue
    stripped_headers = strip_re.sub('', headers)
    for _to in m_re.findall(stripped_headers):
      srcs[_from] = srcs.get(_from, 0) + 1
      dsts[_to] = dsts.get(_to, 0) + 1
      if (_from, _to) not in pairs:
        print _from, '->', _to
      pairs[_from, _to] = pairs.get((_from, _to), 0) + 1
  # recurse
  for i in range(1, folder.Folders.Count + 1):
    dump_folder(folder.Folders[i])

# connect to Outlook via CDO
cdo = win32com.client.Dispatch('MAPI.Session')
cdo.Logon()
# iterate over all the open PST files
for i in range(1, cdo.InfoStores.Count + 1):
  store = cdo.InfoStores[i]
  root = store.RootFolder
  m = root.Messages
  store.ID
  print '#' * 72
  print store.Name
  print '#' * 72
  dump_folder(root)
cdo.Logoff()
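
To see what the two regular expressions do without needing Outlook, here is a small self-contained sketch written for Python 3. The headers are made up for illustration; only the two patterns are taken from the script above.

```python
import re

# Same patterns as in the script above
m_re = re.compile(r'[-A-Za-z0-9.,_]*@majid\.fm')
strip_re = re.compile(r'(Message-Id:.*$|In-Reply-To:.*$|References:.*$)',
                      re.IGNORECASE | re.MULTILINE)

# Made-up headers: the To: line is useless, the envelope address only
# appears in the Received: line, and In-Reply-To: holds a message ID
# that merely looks like an address.
sample_headers = """Received: from mail.example.com by mx.example.net
  for <dell@majid.fm>; Mon, 1 Mar 2004 10:00:00 -0800
From: Newsletter <news@example.com>
To: undisclosed-recipients:;
In-Reply-To: <20040228.6789@majid.fm>
Subject: Special offers
"""

unfiltered = m_re.findall(sample_headers)
found = m_re.findall(strip_re.sub('', sample_headers))
print(unfiltered)  # includes the message ID, a false positive
print(found)       # only the envelope address remains
```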

Debugging DCOracle2 applications

DCOracle2 is the Oracle interface module for Python I use most often. It is advertised as “beta”, but quite suitable for production use, aside from a few minor rough edges. There are a few others, most notably cx_Oracle, but I can’t vouch for them.

Debugging applications that make use of DCOracle2 can be challenging, as with any database environment, especially in a multi-threaded server context. I have developed a small utility module to aid in development. When it is imported, it will automatically trace all database calls made through DCOracle2, including arguments such as bind variables. More interestingly, it will also automatically run EXPLAIN PLAN on queries taking longer than 2 seconds (by default), to aid in tuning SQL statements. As a side bonus, if run by itself, it provides a (very basic) SQL shell that does offer command-line history and editing, something Oracle hasn’t managed to provide in SQL*Plus in almost 30 years 🙂

This code works with Python 2.2 and DCOracle2 1.1 and 1.3 beta. It will not work with Python 2.1 and earlier.

The latest version of the module file can be downloaded here: debug_ora.py, as well as the RCS repository debug_ora.py,v for those who care about this kind of stuff.

An example run of the module:

% python debug_ora.py scott/tiger@repos
SQL> select ename, job, dname from emp, dept where emp.deptno=dept.deptno;
SQL: Oct-03-2003 17:32:39:897
select ename, job, dname from emp, dept where emp.deptno=dept.deptno
ARG: () {}
SQL: !!!!!!!!!!!!!!!! slow query, time = 0.0 sec
SQL: !!!!!!!!!!!!!!!! execution plan follows
000      SELECT STATEMENT Optimizer=CHOOSE
001        NESTED LOOPS
002 001      TABLE ACCESS (FULL) ON EMP
003 001      TABLE ACCESS (BY INDEX ROWID) ON DEPT
004 003        INDEX (UNIQUE SCAN) ON PK_DEPT

ENAME  JOB       DNAME
------ --------- ----------
SMITH  CLERK     RESEARCH
ALLEN  SALESMAN  SALES
WARD   SALESMAN  SALES
JONES  MANAGER   RESEARCH
MARTIN SALESMAN  SALES
BLAKE  MANAGER   SALES
CLARK  MANAGER   ACCOUNTING
SCOTT  ANALYST   RESEARCH
KING   PRESIDENT ACCOUNTING
TURNER SALESMAN  SALES
ADAMS  CLERK     RESEARCH
JAMES  CLERK     SALES
FORD   ANALYST   RESEARCH
MILLER CLERK     ACCOUNTING
SQL>
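
The tracing technique itself is not specific to Oracle. Here is a minimal sketch written for Python 3, with the standard library’s sqlite3 module standing in for DCOracle2 and SQLite’s EXPLAIN QUERY PLAN standing in for EXPLAIN PLAN. It is not the code of debug_ora.py; the TracingCursor class and its parameters are made up for this illustration.

```python
import sqlite3
import time


class TracingCursor:
    """Wrap a DB-API cursor, logging each statement and explaining slow ones."""

    def __init__(self, cursor, threshold=2.0, log=print):
        self._cursor = cursor
        self._threshold = threshold  # seconds before a query counts as slow
        self._log = log

    def execute(self, sql, params=()):
        self._log('SQL: %s' % sql)
        self._log('ARG: %r' % (params,))
        start = time.perf_counter()
        result = self._cursor.execute(sql, params)
        elapsed = time.perf_counter() - start
        if elapsed >= self._threshold:
            self._log('SQL: slow query, time = %.1f sec' % elapsed)
            # Use a separate cursor so the pending results are not discarded
            plan = self._cursor.connection.execute(
                'EXPLAIN QUERY PLAN ' + sql, params)
            for row in plan.fetchall():
                self._log('PLAN: %r' % (row,))
        return result

    def __getattr__(self, name):
        # Delegate everything else (fetchall, description...) to the cursor
        return getattr(self._cursor, name)


if __name__ == '__main__':
    conn = sqlite3.connect(':memory:')
    conn.execute('create table emp (ename text, deptno integer)')
    conn.execute("insert into emp values ('SMITH', 20)")
    # A threshold of 0 forces the plan to be shown for every query
    cur = TracingCursor(conn.cursor(), threshold=0.0)
    print(cur.execute('select ename from emp where deptno = ?',
                      (20,)).fetchall())
```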

Obtaining tracebacks on other threads than the current thread

Note: this entry was superseded and is maintained only for historical purposes. Among others, the restriction of not being able to find the stack frame for a specific thread has been lifted with changes in Python 2.3.

David Beazley added advanced debugging functions to the Python interpreter, and they have been folded into the 2.2 release.

I used these hooks to build a debugging module that is useful when you are looking for deadlocks in a multithreaded application. It basically has a single function that will return a list of the stack frames for all Python interpreter threads in the process.

Unfortunately, I was unable to find a way to get a stack frame for a specific thread (either by the thread ID or using threading Thread objects), as Python does not save the thread ID in its thread state.

Of course, I disclaim any liability if this code should crash your system, erase your homework, eat your dog (who also ate your homework) or otherwise have any undesirable effect.

Building and installing

Download threadframe-0.1.tar.gz. You can use the Makefile. I’ve built and tested this only on Solaris 8/x86 and Windows 2000, but the code should be pretty portable. There is a small test program test.py that illustrates how to use this module to dump stack frames of all the Python interpreter threads. A sample run is available for your perusal.

For Windows users, a pre-compiled binary for the standard Python 2.2.1 distribution is available: threadframe.pyd. Just copy this file to any location in your Python path and you should be able to run the test script test.py.

Objects are aristotelician

One of the unquestioned assumptions behind object-oriented programming is that objects are instances of a class, and thus implicitly stay that way. This is akin to the philosophical concept of nature, as in an invariant quality of something, that cannot be changed:

But is there any one thus intended by nature to be a slave, and for whom such a condition is expedient and right, or rather is not all slavery a violation of nature?

There is no difficulty in answering this question, on grounds both of reason and of fact. For that some should rule and others be ruled is a thing not only necessary, but expedient; from the hour of their birth, some are marked out for subjection, others for rule.

Again, the male is by nature superior, and the female inferior; and the one rules, and the other is ruled; this principle, of necessity, extends to all mankind.

It is clear, then, that some men are by nature free, and others slaves, and that for these latter slavery is both expedient and right.

Aristotle, Politics I, 5 (emphasis mine)

Needless to say, this concept is reactionary. One may well object that given slavery’s omnipresence in antiquity, even a great philosopher such as Aristotle could not be entirely free of the prejudices of his time. This conveniently ignores the fact Aristotle was a pupil of Plato, himself a disgruntled aristocrat who collaborated with Spartans when they overthrew Athenian democracy after the Peloponnesian war, and is arguably one of the theoretical founders of the totalitarian state. I would say it is rather the presumed greatness of Aristotle that should be reexamined, but I digress. For more on this subject, read Karl Popper’s The Open Society and its Enemies – Volume 1, The Spell of Plato.

Thus, OOP carries within it the conservatism of Plato and Aristotle, people who resented how the young Athenian democracy had usurped the aristocracy’s natural (in their eyes) right to rule over others. This is not just an academic consideration. Computer programmers influence society, especially those who work for governmental information systems, and if you consider the Sapir-Whorf hypothesis, the language they use affects the way they think.

This is why I like Python’s ability to morph an object from one class to another:

Python 2.2.1 (#1, Apr 18 2002, 13:06:27)
[GCC 2.95.3 20010315 (release)] on sunos5
Type "help", "copyright", "credits" or "license" for more information.
>>> class Slave:
...     def whip(self):
...             return 'Yes, master'
...
>>> class Freeman:
...     def whip(self):
...             return 'Die, fascist scum!'
...
>>> man = Slave()
>>> man.whip()
'Yes, master'
>>> man.__class__ = Freeman
>>> man.whip()
'Die, fascist scum!'

Using Wake-on-LAN with Python

Most modern PCs and Macintoshes feature Wake-on-LAN. This feature, originally called “Magic packet” (PDF) by AMD, allows you to start a PC remotely by sending a specially formed “magic packet” to its Ethernet interface. On Macs running OS X, Wake-on-LAN seems to work only when the Mac is in sleep mode, not when it is completely turned off. The original intent was to allow administrators to boot PCs remotely to run backups, but with the spread of DSL, there are other uses.

For instance, I have a low-noise Solaris machine running 24/7 at my home (angband.majid.fm), and when I need to access my (noisy) home PC, I just log on to that machine via SSH, wake up the PC and then log on remotely using pcAnywhere. The same works with my iMac G4.

Here is a very simple Python script that starts a machine with a given MAC address:

import socket
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.sendto('\xff'*6 + '\x00\x02\xb3\x07\xb6\xd1'*16, ('192.168.1.255', 80))

It will start the machine with MAC address 00:02:B3:07:B6:D1 on the subnet 192.168.1/24 by sending a Wake-on-LAN magic packet to the subnet-directed broadcast IP address.

Update (2003-12-05):

Now that you have woken your Mac, how do you send it back to sleep? Read this article to find out.

Update (2006-03-19):

On certain versions of Linux, you may get a “permission denied” error message because you are trying to send a packet to a broadcast address. The following code should work:

import socket
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
s.sendto('\xff'*6 + '\x00\x02\xb3\x07\xb6\xd1'*16, ('192.168.1.255', 80))
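
The snippets above are written for Python 2 and hard-code the MAC address. Here is a sketch of the same technique for Python 3, where the packet has to be built from bytes. The function names build_magic_packet and wake are made up for this illustration; the default broadcast address and port are simply the ones used above.

```python
import socket


def build_magic_packet(mac):
    """Six 0xFF bytes followed by the target MAC address repeated 16 times."""
    hex_digits = mac.replace(':', '').replace('-', '')
    if len(hex_digits) != 12:
        raise ValueError('expected a 48-bit MAC address, got %r' % mac)
    # bytes.fromhex raises ValueError if the string is not valid hexadecimal
    return b'\xff' * 6 + bytes.fromhex(hex_digits) * 16


def wake(mac, broadcast='192.168.1.255', port=80):
    """Send the magic packet for mac to the given broadcast address."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Needed on some systems before sending to a broadcast address
        s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        s.sendto(build_magic_packet(mac), (broadcast, port))
    finally:
        s.close()
```

Calling wake('00:02:B3:07:B6:D1') should then be equivalent to the one-liner above.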

Python used to defend human rights

Patrick Ball is the author of a book called Making the Case describing how information technology and notably databases of human rights abuse reports can yield statistical evidence of wrongdoing by specific individuals (say, policemen).

He testified at Slobodan Milosevic’s trial in The Hague. Apparently, the processing was done in Python, see page 2 of this Wired article for more details.