Web

If WordPress updates hang on a 64-bit OS

The WordPress instance running this site was no longer able to automatically update plugins (and presumably not the core either) after I upgraded from a 32-bit to a sparkling fresh 64-bit PHP install at Joyent. It would start the update, and show a spinning logo and then just hang.

After much debugging, I found out the problem is that the class-pclzip.php that is responsible for unzipping was failing silently with the message:

Downloading update from http://downloads.wordpress.org/plugin/yet-another-related-posts-plugin.3.5.2.zip

Unpacking the update…

Abort class-pclzip.php : Missing zlib extensions

This isn’t terribly helpful, but digging in, it turns out that class depends on the PHP zlib module. On 64-bit operating systems (more precisely, operating systems with 64-bit large file support enabled), zlib.h #defines gzopen to be gzopen64. PHP does not protect itself adequately against this, so the PHP function gzopen gets renamed gzopen64 as well, thus throwing class-pclzip.php for a loop, along with a number of other systems like PEAR.

Fixing this requires recompiling PHP. Ubuntu Karmic includes a work-around, but I run Solaris and build from source, so I contributed a patch filed under bug #53829.

Automattic should probably patch class-pclzip.php to deal with gzopen/gzopen64 as there are a great many broken PHP installs out there (the PHP bug has been open for over a year and a half without what I would consider an acceptable solution), and it is surprisingly difficult to find a solution online. I guess a great many WordPress installs are still 32-bit, which is kind of sad.

Changing the WordPress table prefix

This may be of use to people experiencing the dreaded “You do not have sufficient permissions to access this page.” message when trying to reach WordPress’ admin page, even when logging in as a proper administrator. WordPress embeds the table prefix in 4 different locations:

  • The wp-config.php file
  • The name of the tables
  • The name of user metadata keys
  • The name of blog options

Thus if you want to change the prefix, you have to:

  1. Edit wp-config.php to change the prefix
  2. Rename your tables
  3. Rename your user metadata
  4. Rename your site options

Missing steps 1 or 2 will cause WordPress to not find the tables, and it will go through the initial install process again.

Missing step 3 will cause the account to lose its roles, and thus not be authorized to do much besides read public posts.

Missing step 4 is more insidious, as it destroys the option wp_user_roles, the link between roles and capabilities, and thus even if your account is an administrator, it is no longer authorized for anything.

It feels quite clunky to embed the database prefix in column values, not just tables, just like WordPress’ insistence on converting relative links to absolute links. The former makes moving tables around (e.g. when consolidating multiple blogs on a single MySQL database) harder than necessary. The latter makes moving a blog around in a site’s URL hierarchy break internal links. I suppose there are security reasons underlying Automattic’s design choice, but security by obscurity of the WordPress table prefix is hardly a foolproof measure.

If you are renaming the tables, say, from the default prefix wp to foo, the MySQL commands necessary for steps 2–4 would be the following:

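-- Steps 3 and 4: rename the prefix embedded in metadata keys and option names.
-- The backslash escapes "_", which LIKE otherwise treats as a single-character wildcard.
UPDATE wp_usermeta SET meta_key=REPLACE(meta_key, 'wp_', 'foo_')
WHERE meta_key LIKE 'wp\_%';
UPDATE wp_options SET option_name=REPLACE(option_name, 'wp_', 'foo_')
WHERE option_name LIKE 'wp\_%';
-- Step 2: rename the tables.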
ALTER TABLE wp_commentmeta RENAME TO foo_commentmeta;
ALTER TABLE wp_comments RENAME TO foo_comments;
ALTER TABLE wp_links RENAME TO foo_links;
ALTER TABLE wp_options RENAME TO foo_options;
ALTER TABLE wp_postmeta RENAME TO foo_postmeta;
ALTER TABLE wp_posts RENAME TO foo_posts;
ALTER TABLE wp_redirection_groups RENAME TO foo_redirection_groups;
ALTER TABLE wp_redirection_items RENAME TO foo_redirection_items;
ALTER TABLE wp_redirection_logs RENAME TO foo_redirection_logs;
ALTER TABLE wp_redirection_modules RENAME TO foo_redirection_modules;
ALTER TABLE wp_term_relationships RENAME TO foo_term_relationships;
ALTER TABLE wp_term_taxonomy RENAME TO foo_term_taxonomy;
ALTER TABLE wp_terms RENAME TO foo_terms;
ALTER TABLE wp_usermeta RENAME TO foo_usermeta;
ALTER TABLE wp_users RENAME TO foo_users;
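If your install has a different set of plugin tables, typing the statements by hand gets tedious. The following is a minimal Python sketch that generates the same kind of statements as above; the function name prefix_migration_sql and its arguments are my own invention for illustration, and the list of tables is something you have to supply yourself (for instance from SHOW TABLES). It only prints SQL, so review the output before feeding it to MySQL.

```python
def prefix_migration_sql(old, new, tables):
    """Build the MySQL statements for steps 2-4 of a prefix change.

    old and new include the trailing underscore (e.g. "wp_", "foo_");
    tables lists table names without their prefix (e.g. "posts").
    """
    # In a LIKE pattern "_" matches any single character, so escape it.
    like = old.replace("_", "\\_") + "%"
    statements = []
    # Steps 3 and 4: rewrite the prefix embedded in column values,
    # while the tables still carry their old names.
    for table, column in (("usermeta", "meta_key"), ("options", "option_name")):
        statements.append(
            f"UPDATE {old}{table} SET {column}=REPLACE({column}, '{old}', '{new}') "
            f"WHERE {column} LIKE '{like}';"
        )
    # Step 2: rename every table, plugin tables included.
    for table in tables:
        statements.append(f"ALTER TABLE {old}{table} RENAME TO {new}{table};")
    return statements


if __name__ == "__main__":
    core = ["commentmeta", "comments", "links", "options", "postmeta", "posts",
            "term_relationships", "term_taxonomy", "terms", "usermeta", "users"]
    print("\n".join(prefix_migration_sql("wp_", "foo_", core)))
```

Step 1, editing wp-config.php, still has to be done by hand.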

Incensed at Mozilla

One of the greatest features in the WebKit-based browsers (Apple’s Safari and Google Chrome) is WebSQLdatabase, the ability for a web site to store information in a SQLite database on your browser accessible via JavaScript. This allows web developers to build database-enabled applications that run entirely in the browser, without requiring a server. This is very useful for mobile devices, which in the US enjoy flaky network connectivity at best. One very handsome example is the iPad-optimized Every Time Zone webapp.

SQLite is probably the most important open-source project you have never heard of. It is a simple, streamlined and efficient embedded database. Firefox stores its bookmarks in it. Google distributes its database of phishing sites in that format. Sun’s industrial-strength Solaris operating system stores the list of services it runs on boot in it—if it were to fail, a server would be crippled so that is a pretty strong vote of confidence. Adobe Lightroom and Apple’s Aperture use it to store their database, as do most Mac applications that use the CoreData framework, and many iPhone apps. In other words, it is robust and proven mission-critical software that is widely yet invisibly deployed.

WebSQLdatabase basically makes the power of SQLite available to web developers trying to build apps that work offline, especially on mobile devices. No good deed goes unpunished, and the Mozilla Foundation teamed up with unlikely bedfellow Microsoft to torpedo formal adoption of WebSQLdatabase as a web standard, on spurious grounds, and pushed an alternate standard called IndexedDB instead. To quote the Chromium team:

Q: Why this over WebSQLDatabase?

A: Microsoft and Mozilla have made it very clear they will not implement SQL in the browser.  If you want to argue this is silly, talk to them, not me.

IndexedDB is several steps backwards. Instead of using powerful, expressive and mature SQL technology, it uses a verbose JavaScript B-tree API that is a throwback to the 1960s bad old days of hierarchical databases and ISAM, and requires a lot more work from the developer, for no good reason. To add injury to insult, Firefox 4’s implementation of IndexedDB is actually built on top of SQLite. The end result will be that web developers will need to build a SQL emulation library on top of IndexedDB to restore the SQLite functionality deliberately crippled by IndexedDB. If there is one constant in software engineering, it is that multiple layers add brittleness and impair performance.

Of course, both Mozilla and Microsoft are irrelevant on mobiles, where WebKit has essentially won the day, so why should this matter? Microsoft has always been a hindrance to the development of the web, since they have to protect the Windows API from competition by increasingly capable webapps, but I cannot understand Mozilla’s attitude, except possibly knee-jerk not-invented-here syndrome and petulance at being upstaged by WebKit. WebSQLdatabase is not perfect: to reach its full potential, it needs an automatic replication and sync facility between the local database and the website’s own database. Even so, it is light years ahead of IndexedDB in terms of power and productivity.

I am so irritated by Mozilla’s attitude that after 10 years of using Mozilla-based browsers, I switched today from Firefox to Chrome as my primary browser. Migrating was surprisingly easy. Key functionality like bookmark keywords, AdBlock, FlashBlock, a developer console, and the ability to whitelist domains for cookies all have equivalents on Chrome. The main regressions are the lack of bookmark tags and the fact that Chrome’s sync options are not yet equivalent to Weave’s. At some point I will need to roll my own password syncing facility (Chrome stores its passwords in the OS X keychain, which is also used by Safari and Camino).

RapidSSL 1 – GoDaddy 0

My new company’s website uses SSL. I ordered an “extended validation” certificate from GoDaddy, instead of my usual CA, RapidSSL/GeoTrust, because GoDaddy’s EV certificates were cheap. EV certificates are security theater more than anything else, so I probably should not have bothered.

Immediately after switching from my earlier “snake oil” self-signed test certificate to the production certificate, I saw SSL errors on Google Chrome for Mac and Safari for Mac, i.e. the two browsers that use OS X’s built-in crypto and certificate store. I suppose I should have tested the certificate on another server before going live, but I trusted GoDaddy (they are my DNS registrars, and competent, if garish).

Big mistake.

I called their tech support hotline, which is incredibly grating because of the verbose phone tree that keeps trying to push add-ons (I guess it is consistent with the monstrosity that is their home page).

After a while, I got a first-level tech. He asked whether I saw the certificate error on Google Chrome for Windows. At that point, I was irate enough to use a four-letter word. Our customers are Android mobile app developers. A significant chunk of them use Macs, and almost none (less than 5%) use IE, so know-nothing “All the world is IE” demographics are not exactly applicable.

After about half an hour of getting the run-around and escalating to level 2, with my business partner Michael getting progressively more anxious in the background, the level 1 CSR told me the level 2 one couldn’t reproduce the problem (I reproduced it on three different Macs in two different locations). I gave them an ultimatum: fix it within 10 minutes or I would switch. At this point, the L1 CSR told me he had exhausted all his options, but I could call their “RA” department, and offered to transfer me. Inevitably, the call transfer failed.

I dialed their SSL number, and in parallel started the certificate application process on RapidSSL. RapidSSL offered a free competitive upgrade, I tried it, and within 3 minutes I had a fresh, functional certificate, valid for 3 years, all for free and in less time than it takes to listen to GoDaddy’s obnoxious phone tree (all about “we pride ourselves in customer service” and other Orwellian corporate babble).

I then called GoDaddy’s billing department to get a refund. Surprisingly, the process was very fast and smooth. I guess it is well-trod.

The moral of the story: GoDaddy—bad. RapidSSL—good.

Update (2012-08-26)

I switched my DNS business from GoDaddy to Gandi.net in December 2011 after Bob Parsons’ despicable elephant-hunting stunt.

Scientific papers now citing blogs?

In my misspent youth I spent about a year as a visiting scholar researching wavelets under Raphy Coifman’s supervision at Yale’s small but excellent Mathematics Department. Professor Coifman was head of a department that also featured Benoît Mandelbrot (of fractals fame), the late Walter Feit (as in the Feit-Thompson theorem), and Fields medalist Gregory Margulis. He was kind enough to credit me on a published paper, even though he did all the work, reverse delegation in action. That paper had modest success and was cited, so I can claim (not with a straight face) to be a published mathematician.

While perusing my blog’s web analytics referrer report, I was surprised to find out my article on the Nikon D70’s not-so-raw RAW format is actually cited in a serious scientific paper on human vision. We keep hearing about students getting flunking grades for citing Wikipedia. Are blogs really considered more authoritative?

The citation uses the old URL for the blog entry. When I migrated to WordPress at the end of 2009, I took great pains to provide redirects whenever possible and avoid broken links. Many bloggers don’t have the time or expertise to do this, and simply leave dangling permalinks around. If quoting blogs is to become standard practice, authors should probably provide some sort of fallback mechanism like linking to Archive.org, but dead-tree journals do not have this capability. Absent that, linkrot may spread to an entire new category of documents, scientific papers.

Update (2011-07-18):

Here’s another paper (PDF) referencing the same article. What’s next, CiteSeer?

Securing WordPress

WordPress has been getting a lot of bad press the last few days, as a worm is out in the wild exploiting a security vulnerability. This is leading to somewhat unfair comparisons with Windows, and thoughtful articles from John Gruber and Maciej Ceglowski.

To be sure, the ease of programming in PHP leads a great many people to contribute to projects, who may not have the experience or security awareness they should. This is not helped by poorly designed features in PHP that were enabled by default in previous versions, and cannot always be disabled outright due to legacy compatibility concerns, reminiscent of the persistent security woes due to the C standard library’s insecure old string processing facilities.

For many users, migrating away from WordPress may not be a practical option. My recommendations would be:

  • Reduce your exposure by exporting a static HTML version of your site, as suggested by Maciej. This is really only simple if you use a non-default permalink structure that does not use question mark characters in URLs, like that used by the SEO plugins. Otherwise you would need quite a bit of mod_rewrite jiggery-pokery to get it to work. In any case, this will also disable quite a bit of functionality on your site, such as comments.
  • If you are an Apache user, install modsecurity, a truly outstanding Apache module that acts as a firewall of sorts and will inspect requests for suspicious behavior like SQL injection attempts and malformed requests. Configuring modsecurity is not for the faint of heart, but there are some papers online like this one by Daniel Cuthbert (PDF) that walk you through this. This is probably the single most significant thing you can do to make your WordPress blog safer.
  • Practice security in depth: keep regular backups of both your WordPress directory and database, so you can recover in case of attack, and if possible run WordPress in an isolated account with minimal privileges.

Mozilla Weave

Mozilla Weave is a project of the Mozilla Labs to build synchronization of bookmarks, tabs, passwords and so on between multiple instances of the Firefox browser. It used to be a private beta, but with the release of version 0.4 recently, it has been opened up to the general public.

Where version 0.2 was pretty rough, 0.4 actually works quite well, even if it is not yet feature complete. Bookmarks and passwords are handled just fine. Furthermore, you can set up your own server; all that is needed is PHP. Previous versions required WebDAV support, and the WebDAV module in nginx is not functional enough for Weave (or anything else, for that matter).

The first synchronization is painfully slow, but once it is done, later synchronizations are essentially instant. When combined with the Awesome bar’s tagging components, it has completely supplanted Del.icio.us for my bookmarking needs (I never liked the rewritten user interface).

Thomas Pink weave cufflinks

Amusingly, I came across these cufflinks at Thomas Pink in San Francisco last Friday — they are the mirror image of the Weave logo.


Below are the relevant sections of my nginx config.

php.ini

magic_quotes_gpc = Off
session.auto_start = 0
file_uploads = On
error_reporting = E_ALL & ~E_NOTICE
allow_url_include = Off
allow_url_fopen = Off
session.use_only_cookies = 1
session.cookie_httponly = 1
expose_php = Off
display_errors = Off
register_globals = Off
disable_functions = phpinfo
error_log = /home/majid/web/logs/php_error_log

nginx.conf

root /home/majid/web/html;
location ~ \.php$ {
  auth_basic		"gondwana";
  auth_basic_user_file	/home/majid/web/conf/htpasswd;
  fastcgi_pass		127.0.0.1:8888;
  fastcgi_index		index.php;
  fastcgi_param		SCRIPT_FILENAME  /home/majid/web/html$fastcgi_script_name;
  include		/home/majid/web/conf/fastcgi.conf;
}
# Mozilla Weave
rewrite ^/weave/admin$	/weave/admin.php;
location /0.3/api {
  return		404;
}
location /0.3/user {
  fastcgi_pass		127.0.0.1:8888;
  fastcgi_index		index.php;
  include		/home/majid/web/conf/fastcgi.conf;
  fastcgi_param		SCRIPT_FILENAME	/home/majid/web/html/weave/index.php;
  fastcgi_param		SCRIPT_NAME	/home/majid/web/html/weave/index.php;
  if ( $request_uri ~ "/0.3/user/([^?]*)" ) {
    set $path_info	/$1;
  }
  fastcgi_param		PATH_INFO	$path_info;
}

fastcgi.conf

fastcgi_param  GATEWAY_INTERFACE  CGI/1.1;
fastcgi_param  SERVER_SOFTWARE    nginx;

fastcgi_param  QUERY_STRING       $query_string;
fastcgi_param  REQUEST_METHOD     $request_method;
fastcgi_param  CONTENT_TYPE       $content_type;
fastcgi_param  CONTENT_LENGTH     $content_length;

fastcgi_param  SCRIPT_NAME        $fastcgi_script_name;
fastcgi_param  REQUEST_URI        $request_uri;
fastcgi_param  DOCUMENT_URI       $document_uri;
fastcgi_param  DOCUMENT_ROOT      $document_root;
fastcgi_param  SERVER_PROTOCOL    $server_protocol;

fastcgi_param  REMOTE_ADDR        $remote_addr;
fastcgi_param  REMOTE_PORT        $remote_port;
fastcgi_param  SERVER_ADDR        $server_addr;
fastcgi_param  SERVER_PORT        $server_port;
fastcgi_param  SERVER_NAME        $server_name;

@font-face embedding

I updated my wife’s home page to use embedded fonts (in this case the Fonthead GoodDog typeface for headings) with the @font-face CSS primitive. With the introduction of Firefox 3.5, all the major browsers now support embedded typography.

As usual, Microsoft had to do its proprietary thing in Internet Exploder and devised a crackpot font format called EOT (Embedded OpenType), ostensibly at font foundries’ request, with weak DRM-like metadata that allows the font supplier to restrict which sites the font can be used on. Microsoft has an incredibly convoluted tool called WEFT (Web Embedded Font Tool) to do this, but I used the open-source and incredibly easy to use ttf2eot tool instead. The only hitch in this case was that this tool takes a TrueType TTF font as input, and GoodDog is a (PostScript-ish) OpenType OTF instead. Fortunately, TypeTool can do the conversion.

We finally have semi-decent typography on the web without having to embed images (bad for page load times or accessibility) or the even worse sIFR hacks using the noxious Adobe Flash. The only question remains whether type foundries will follow. Fonthead has enlightened licensing policies for GoodDog (free for up to 5 sites, no insistence on DRM). Typeface design is a painstaking craft and designers certainly deserve what they charge for their fonts, but I hope the typographic industry does not follow the RIAA in its self-destructive crusade against its own customers.

Update (2011-03-03):

One option for hassle-free embedded font licensing is TypeKit. It does require JavaScript in the browser to work, unlike a pure CSS solution like the one I used, but the convenience can’t be beat. We use it on Apsalar’s public website.

Feedburner down again

I just tried unsuccessfully to subscribe to a feed hosted by the annoying bozos at FeedBurner. From my Temboz feed error counters, it seems FB feeds have been failing with 503 errors for at least the last 5 hours or so, par for the course.

Just another reason why outsourcing vital services to the cloud is not always a good strategy.

gondwana ~>GET -eUS http://feeds.feedburner.com/Fooducate
GET http://feeds.feedburner.com/Fooducate
User-Agent: lwp-request/1.39

GET http://feeds.feedburner.com/Fooducate --> 503 Service Unavailable
Connection: close
Server: NS_6.1
Content-Length: 62
Client-Date: Thu, 30 Apr 2009 01:31:18 GMT
Client-Peer: 66.150.96.119:80

<HTML>
<HEAD><TITLE>An Error Occurred</TITLE></HEAD>
<BODY>
<H1>An Error Occurred</h1>
503 Service Unavailable
</BODY>
</HTML>

Update (2009-04-30):

It is possible the problem lies with my ISP (although I could replicate it at work as well). I can ping FB from my Joyent accelerator but not from home where my Temboz instance runs.

A work-around is to use the newer Google server feeds2.feedburner.com instead. For Temboz, all you need to do is run sqlite3 rss.db and the command:

update fm_feeds
set feed_xml=replace(feed_xml, 'feeds.feedburner.com', 'feeds2.feedburner.com')
where feed_xml like 'http://feeds.feedburner%';

The importance of short iteration feedback cycles

I blog at best once or twice a month on my regular low-intensity blog, which runs my home-grown Mylos software, but am surprising myself by blogging on an almost daily schedule with this WordPress-based blog. Mylos is batch-based: you edit a post, run the script to regenerate the static pages, review, edit and iterate. It takes a minute to regenerate the entire site.

This is a similar effect to using an interpreted language like Python or PHP vs. a compiled language like C or Java. Even though I am more comfortable editing in Emacs (used by Mylos) than in a browser window, the short cycle between edit and preview in WordPress makes for a more satisfying experience and encourages me to blog more freely.

I suspect I will end up importing my Mylos weblog into WordPress, once I figure out how to address some niggling differences in functionality, such as the way images or attachments are handled, and how to use nginx as a caching reverse proxy in front of WordPress for performance reasons.

Amazon wishlist optimizer

I wrote a script several months ago to go through an Amazon wish list and find the combination of items that will best fit within a given budget. Given that the Christmas holiday shopping season seems to have started before Thanksgiving, it seemed topical to release it.

It used the Amazon Web Services API, which is a complete crock (among other failings, it will consistently not return the Amazon.com price for an item, even when explicitly instructed to do so). It does not look like Amazon pays any particular attention to the bug reports I filed. I just gave up on the API and re-implemented it the old-fashioned way, by “scraping” Amazon’s regular (and most definitely not XML-compliant) HTML pages.

It is still very much work in progress, but already somewhat useful. You can use it directly by stuffing your wish list ID in the URL (or using the form below):

[Form fields: Wish list ID, Amount]

A better way is to drag and drop the highlighted Amazon optimizer bookmarklet link (version 6 as of 2007-05-08) to your browser’s toolbar. You can then browse through Amazon, and once you have found the wish list you are looking for, click on the bookmarklet to open the optimizer in a new window (or tab). By default, it will try and fit a budget of $100 (my decadent tastes are showing, are they not?), but you can change that amount and experiment with different budgets. Surprisingly often, it will find an exact fit. Otherwise, it will try to find the closest match under the budget with as little left over as possible.
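For the curious, finding the best fit is a variant of the classic subset-sum problem. The sketch below is not the optimizer’s actual code, just one simple way to solve it in Python under my own assumptions: prices are whole cents, each item is bought at most once, and the function name best_fit is made up for this illustration.

```python
def best_fit(prices_cents, budget_cents):
    """Pick item indexes whose total is as close to the budget as possible
    without exceeding it (each item bought at most once)."""
    # reachable maps an achievable total to one list of item indexes producing it.
    reachable = {0: []}
    for index, price in enumerate(prices_cents):
        # Iterate over a snapshot so each item is used at most once.
        for total, chosen in list(reachable.items()):
            new_total = total + price
            if new_total <= budget_cents and new_total not in reachable:
                reachable[new_total] = chosen + [index]
    best = max(reachable)
    return best, reachable[best]


if __name__ == "__main__":
    # Hypothetical wish list prices, fitted to the default $100 budget.
    total, items = best_fit([4599, 2999, 1999, 5401], 10000)
    print(total, items)
```

The number of distinct totals is bounded by the budget in cents, so this stays fast even for long wish lists; it keeps only one combination per total, so ties are resolved arbitrarily.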

There are many caveats. The wishlist optimizer only works for public Amazon.com (US) wish lists. There does not seem to be an easy way to buy multiple items for somebody else’s wish list in one step, although I am working on it, so you will have to go through the wish list and add the items by hand. Shipping costs and wish list priorities are currently not taken into account. Sometimes Amazon will not show a price straight away but instead require you to click on a link; the optimizer will decline to play these marketer’s games and just skip those products.

Be patient: Amazon.com is rather slow right now, and it seems they did not learn the lessons of their poor performance towards the end of last year. One of my coworkers ran the optimizer through an acid test with his wife’s 13-page wish list, and it took well over a minute and a half to fetch the list, let alone optimize it. One can only imagine how bad it will get when the Christmas shopping season begins in earnest. To mitigate this somewhat, I have added caching: the script will only hit Amazon once per hour for any given wish list. As it works by scraping the web site rather than using the buggy and unreliable Amazon Web Services API, there is a real risk it will stop working if Amazon blocks my server’s IP or if they radically change their wish list UI (they would do better to add additional machines and load-balancers, but that would be too logical).
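The once-per-hour caching can be sketched in a few lines of Python. This is an illustration rather than the script’s real code: the class name HourlyCache is hypothetical, fetch stands in for whatever function downloads and parses a wish list, and the cache lives in memory only.

```python
import time


class HourlyCache:
    """Remember one fetched result per wish list ID for a fixed time."""

    def __init__(self, fetch, ttl_seconds=3600, clock=time.time):
        self._fetch = fetch        # function called on a cache miss
        self._ttl = ttl_seconds    # how long a result stays fresh
        self._clock = clock        # injectable so the cache can be tested
        self._entries = {}         # wish list ID -> (time fetched, result)

    def get(self, wishlist_id):
        now = self._clock()
        entry = self._entries.get(wishlist_id)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]        # still fresh, do not hit the site again
        result = self._fetch(wishlist_id)
        self._entries[wishlist_id] = (now, result)
        return result
```

A cache like this trades freshness for politeness: prices may be up to an hour stale, but a popular wish list costs the remote site only one fetch per hour.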

Update (2005-12-02):

Predictably, Amazon changed their form (they changed the form name from edit-items to editItems) and broke not only the wishlist optimizer, but also the bookmarklet. I fixed this and upgraded to the scraping module BeautifulSoup, but you will need to use the revised bookmarklet above to make it work again.

Update (2010-04-27):

The script has been broken for quite a while, but I fixed it and it should work again.

The Temboz RSS aggregator

2013-03-14: Google’s announcement that their Reader service will be discontinued has spurred interest in Temboz. This software is not dead, in fact I use it daily, but I have not made an official release in a long time. You should use the version from GitHub instead. There are currently a number of bugs which can lead to Temboz locking up and requiring a restart. I am planning on completing my long overdue overhaul before Google’s July deadline.


Introduction

Temboz is an RSS aggregator. It is inspired by FeedOnFeeds (web-based personal aggregator), Google News (two column layout) and TiVo (thumbs up and down). I have been using FeedOnFeeds for some time now, but that software seems to have stopped evolving, and I had a number of optimizations to the user experience I wanted to make.

Features

Already implemented:

  • Multithreaded, download feeds in parallel.
  • Built-in web server.
  • Two-column user interface for better readability and information density. Automatic reflow using CSS.
  • Ratings system for articles
  • Real-time hunter-gatherer user interface: items flagged with a “Thumbs down” disappear immediately off the screen (using Dynamic HTML), making room for new articles. No laborious flagging of items as in FeedOnFeeds.
  • Filtering entries (using Python syntax, e.g. 'Salon' in feed_title and title == "King Kaufman's Sports Daily", or simply by selecting keywords/phrases and hitting “Thumbs down”).
  • Ability to generate an RSS feed from “Thumbs Up” articles, which is why Temboz would be a true aggregator, not just a reader.
  • Ad filtering
  • Automatic garbage collection: every day between 3AM and 4AM, uninteresting articles (by default those older than 7 days) are purged of their contents (but not metadata such as titles, permalinks or timestamps) to keep the database size manageable. After 6 months (by default), they are deleted altogether
  • Automatic database backups daily (immediately after garbage collection)
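To give an idea of how filtering with Python expressions can work, here is a minimal sketch. It is not lifted from Temboz: the function name matches_filter is hypothetical, and I am assuming an article is represented as a dictionary whose keys (feed_title, title and so on) become variables in the expression.

```python
def matches_filter(expression, item):
    """Evaluate a Python filter expression against an item's fields."""
    # Compile in "eval" mode so statements (import, assignment) are rejected.
    code = compile(expression, "<filter>", "eval")
    # An empty __builtins__ keeps casual mistakes away from open(),
    # __import__() and friends; it is not a real sandbox.
    return bool(eval(code, {"__builtins__": {}}, dict(item)))


if __name__ == "__main__":
    rule = "'Salon' in feed_title and title == \"King Kaufman's Sports Daily\""
    article = {"feed_title": "Salon", "title": "King Kaufman's Sports Daily"}
    print(matches_filter(rule, article))
```

Since the expressions are full Python, this only makes sense for a personal aggregator where the person writing the rules is also the person running the server.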

On the to do list:

  • Write better documentation
  • Handle permanent HTTP redirects for feed XML URLs
  • Automatic pacing of feed polling intervals using the average and standard deviation of observed feed item inter-arrival times, to reduce bandwidth usage and load for both client and server. Most feeds should be polled on a daily rather than hourly interval (e.g. my own, since I update once a week on average), but the mechanisms for a feed to indicate its polling rate preferences are quite inconsistent from one flavor of RSS/Atom to another.
  • “Survivor mode” – vote feeds that no longer perform off the aggregator based on relevance statistics.
  • Ability to cluster together articles (I tried a heuristic of looking for common URLs they are all pointing to, but this didn’t work well in practice).
  • Portability to Windows, distribution as a standalone package.
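As a rough idea of what the automatic pacing item could look like, here is a Python sketch. Nothing like it exists in Temboz yet; the formula (mean gap minus k standard deviations, clamped between an hour and a day), the function name polling_interval and all the default values are assumptions of mine, chosen only to make the idea concrete.

```python
import statistics


def polling_interval(arrival_times, k=1.0, floor=3600, ceiling=86400):
    """Suggest seconds between polls from item arrival timestamps (in seconds)."""
    times = sorted(arrival_times)
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    if len(gaps) < 2:
        return ceiling  # too little history to estimate anything, poll daily
    mean = statistics.mean(gaps)
    spread = statistics.stdev(gaps)
    # Poll a little ahead of the typical gap, more so for irregular feeds,
    # but never more often than the floor or less often than the ceiling.
    return int(min(ceiling, max(floor, mean - k * spread)))
```

With these defaults a weekly blog is polled once a day, while a feed that posts every couple of hours is polled at about that rhythm.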

History

I have been using it successfully for well over a year. It still has rough edges, with some administration functions only doable using the SQLite command-line utility. Here is a screen shot showing the reader user interface. The article highlighted in yellow was given a “Thumbs Up”. You can also see the user interface at work in a view of the last 50 articles I flagged as “thumbs up” among the feeds I read.

Screen shots

Click on a screen shot thumbnail for a full-sized version

The first screen shot shows the article reading interface, using a two-column layout. Clicking on the “Thumbs down” icon makes the article disappear, bringing a new one in its place (if available). Clicking on the “Thumbs up” icon highlights it in yellow and flags it as interesting in the database.

view items

The feed summary page shows statistics on feeds, starting with feeds with unread articles, then by alphabetical order. Feeds can be sorted based on other metrics. You have the option of “catching up” with a feed (marking all the articles as read). Feeds with errors are highlighted in red (not shown).

view feeds

Clicking on the “details” link for a feed brings up this page, which allows you to change title or feed URL, and shows the RSS or Atom fields accessible for filtering.

feed details

Feeds can be filtered using Python expressions.

filtering rules

Known bugs

You can check outstanding bug reports, change requests and more at the public CVStrac site.

Credits

Temboz is written in Python, and leverages Mark Pilgrim’s Ultra-liberal feed parser, SQLite 2.x and Cheetah.

Download

You can download the current version: temboz-0.8.tar.gz. I welcome any feedback you may have, especially as concerns improving installation.

The CVS version is far ahead of 0.8 in features. I have not yet had the time to test and document the migration procedure from 0.8 to 1.0, but if you are a new Temboz user I strongly advise you to get a nightly CVS snapshot instead (they are what I run on my own server): temboz-CVS.tar.gz or temboz-CVS.zip.

Updates

For news on Temboz, please subscribe to the RSS feed.

Temboz has a CVStrac where you can submit bug reports or change requests, and a Wiki, where all future documentation will ultimately reside.

Post scriptum

The name “Temboz” is a reference to Malima Temboz, “The mountain that walks”, an elephant whose tormented spirit is the object of Mike Resnick’s excellent SF novel, Ivory.

Mylos

I switched to WordPress at the end of 2009 for the reasons expressed elsewhere and this entry is here for historical purposes only.

Mylos is my home-grown weblog management software. I wrote my first web pages by hand in Emacs and RCS in 1993, but stopped maintaining them in 1996 or so. I only restarted one with Radio last year. After a year of weblogging, however, I find I am frustrated by the limitations of Radio as well as its web-based user interface (I am one of those rare people who prefer command-line user interfaces and non-WYSIWYG HTML editors). I guess I could have extended Radio using UserLand’s Frontier language it is implemented in, but I have no interest in learning yet another oddball scripting language.

I decided in April 2003 to roll my own system, implemented in Python. In my career at various ISPs, I had to kill home-grown content-management system (CMS) projects gone awry, and I was certainly aware that these projects have a tendency to go overboard. Still, it has taken me three months of (very) part-time work to get the system to a point where it generates usable pages and imports my legacy pages from Radio without a hitch.

The implemented requirements for Mylos are:

  • Migration of my existing Radio weblog entries and stories (done, but not in an entirely generic fashion, as it is theme-dependent)
  • All pages are static HTML, no requirements for CGI scripts, PHP, databases or the like
  • Implemented and extensible in Python
  • Separation of content and presentation using themes (based on Webware Python Server Pages and CSS)
  • Support for navigational hierarchy
  • Articles are stored as regular files on the filesystem where they can be edited using conventional tools if necessary, no need for proprietary databases
  • Extensible article metadata
  • Atom 1.0 syndication, with separate feeds for subcategories
  • Use only relative URLs in hyperlinks to allow easy relocation
  • Automatic entry HTML cleanup for XHTML compliance
  • A CSS-based layout where the blogroll doesn’t wrap around short bodies (e.g. on permalink pages for short articles).
  • Reasonable defaults, e.g. don’t try to create a weblog entry for an image that is colocated with an article, just copy it
  • Built-in multithreaded external link validation.
  • Automatic URL remapping (/mylos/ becomes relative to the Mylos root, relative URLs in an entry are automatically prefixed in containers like home pages).
  • Ability to review an article before publishing
  • Lynx compliance
  • Automatically cache external images in weblog entries in case they disappear (but do not use them as such due to potential copyright issues)
  • Set robots meta tag so only permalinks are indexed and cached by search engines, for better relevance to search engine users (albeit at the cost of lower rankings for the home page).
  • Sophisticated image galleries fully integrated with the navigation
  • Automatic code fragment colorization using Pygments
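The relative-URL requirement is easy to illustrate. The sketch below is not Mylos code, and the function name relative_href and the file names are made up; it assumes both pages are identified by their path from the site root.

```python
import posixpath


def relative_href(from_page, to_target):
    """Return the relative URL leading from one site page to another.

    Both arguments are paths from the site root, e.g. "2003/07/mylos.html".
    """
    # Links are resolved against the directory containing the source page.
    start = posixpath.dirname(from_page) or "."
    return posixpath.relpath(to_target, start=start)


if __name__ == "__main__":
    print(relative_href("2003/07/mylos.html", "css/site.css"))
```

Because no link mentions the host name or the site’s position in the URL hierarchy, the generated tree can be moved to another directory or server without rewriting any page.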

These features are planned but not yet implemented:

  • Keyword index.
  • Enhanced support for books via Allconsuming and Amazon.
  • Automated dependency tracking to re-render only the pages affected by a change (via SCons)
  • Multi-threaded rendering (via SCons)
  • Automatically add height, width and alt tags to img tags
  • Auto abbreviation glossary as tooltip help using tags
  • Typographically clean results, as done by SmartyPants
  • Feedback loop via on-page comments
  • Notification of new comments by email
  • Ability to promote a weblog entry to a story if it reaches critical mass

These features are “blue-sky”, don’t hold your breath for them:

  • Updates by email
  • User-submitted ratings for articles
  • Support for multilingual weblogs

Features that are not planned at all (anti-requirements) include:

  • Synchronization or upload to server – rsync does this far better
  • Text editor – use $VISUAL or $EDITOR, whether Emacs, vi, or whatever
  • Web user interface – Radio’s web interface has very poor usability in my personal opinion, and this is due to the fact it is web-based, not any fault of UserLand’s
  • RSS 1.0 – RDF seems like an exercise in intellectual masturbation
  • Blogger API or similar – although someone else could certainly write a bridge in Python if needed

The software is currently not in a state where it can be used by anyone else. I am not sure if there is any demand for such a tool in any case. If so, I would certainly consider documenting it better and making a SourceForge project out of it.

By the way, the system is named “Mylos” after a city in the magnificent illustrated series “Les Cités Obscures” by Belgian architects and writers Schuiten and Peeters, more specifically L’Enfant Penchée.

Cover for L'Enfant Penchée