Mylos

Aperture internals

Last in a series:

  1. First impressions
  2. Asset management
  3. Under the hood: file format internals

This article was never completed because I switched to Lightroom and lost interest. What was done may still be of interest to Aperture users, although the data model has probably changed since 1.0.

Aperture stores its library as a bundle with the extension .aplibrary. This is a concept inherited from NeXTstep, where an entire directory that has the bundle bit set is handled as if it were a single file. A much more elegant system than Mac OS Classic’s data and resource forks.

Inside the bundle, there is a directory Aperture.aplib which contains the metadata for the library in a file Library.apdb. This file is actually a SQLite3 database. SQLite is an excellent, lightweight open-source embedded relational database engine. Sun uses SQLite 2 as the central repository for SMF, the next-generation service management facility that controls booting the Solaris operating system and its automatic fault recovery, a strong vote of confidence by Sun in SQLite’s suitability for mission-critical use. SQLite is also one of the underlying data storage mechanisms used by Apple’s over-engineered Core Data framework.

You don’t have to use Core Data to go through the database; the /usr/bin/sqlite3 command-line utility is perfectly fine for this purpose. Warning: using sqlite3 to access Aperture’s data directly is obviously unsupported by Apple, and should not be done on a mission-critical library. At the very least, make sure Aperture is not running.
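For scripted access, the same precaution can be built in by opening the database read-only. The following is a minimal sketch using Python's standard sqlite3 module; the helper names open_read_only and list_tables are mine, not anything Aperture or Apple provides.

```python
import sqlite3
from pathlib import Path


def open_read_only(db_path):
    """Open an SQLite file in read-only mode so that stray writes fail."""
    # as_uri() percent-encodes spaces, e.g. in "Aperture Library.aplibrary".
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)


def list_tables(conn):
    """Return the table names, roughly what the shell's .tables command shows."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]
```

With the library path used in the session below, this would be called as open_read_only(Path.home() / "Pictures/Aperture Library.aplibrary/Aperture.aplib/Library.apdb"). Working on a copy of the library is safer still.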

ormag ~/Pictures/Aperture Library.aplibrary/Aperture.aplib>sqlite3 Library.apdb
SQLite version 3.1.3
Enter ".help" for instructions
sqlite> .tables
ZRKARCHIVE             ZRKIMAGEADJUSTMENT     ZRKVERSION
ZRKARCHIVERECORD       ZRKMASTER              Z_10VERSIONS
ZRKARCHIVEVOLUME       ZRKPERSISTENTALBUM     Z_METADATA
ZRKFILE                ZRKPROPERTYIDENTIFIER  Z_PRIMARYKEY
ZRKFOLDER              ZRKSEARCHABLEPROPERTY
sqlite> .schema z_metadata
ormag ~/Pictures/Aperture Library.aplibrary/Aperture.aplib>sqlite3 Library.apdb
SQLite version 3.3.7
Enter ".help" for instructions
sqlite> .tables
ZRKARCHIVE             ZRKKEYWORD             ZRKVOLUME
ZRKARCHIVERECORD       ZRKMASTER              Z_11VERSIONS
ZRKARCHIVEVOLUME       ZRKPERSISTENTALBUM     Z_9VERSIONS
ZRKFILE                ZRKPROPERTYIDENTIFIER  Z_METADATA
ZRKFOLDER              ZRKSEARCHABLEPROPERTY  Z_PRIMARYKEY
ZRKIMAGEADJUSTMENT     ZRKVERSION
sqlite> .schema zrkfile
CREATE TABLE ZRKFILE ( Z_ENT INTEGER, Z_PK INTEGER PRIMARY KEY, Z_OPT INTEGER, ZASSHOTNEUTRALY FLOAT, ZFILECREATIONDATE TIMESTAMP, ZIMAGEPATH VARCHAR, ZFILESIZE INTEGER, ZUUID VARCHAR, ZPERMISSIONS INTEGER, ZNAME VARCHAR, ZFILEISREFERENCE INTEGER, ZTYPE VARCHAR, ZFILEMODIFICATIONDATE TIMESTAMP, ZASSHOTNEUTRALX FLOAT, ZFILEALIASDATA BLOB, ZSUBTYPE VARCHAR, ZCHECKSUM VARCHAR, ZPROJECTUUIDCHANGEDATE TIMESTAMP, ZCREATEDATE TIMESTAMP, ZISFILEPROXY INTEGER, ZDATELASTSAVEDINDATABASE TIMESTAMP, ZISMISSING INTEGER, ZVERSIONNAME VARCHAR, ZISTRULYRAW INTEGER, ZPROJECTUUID VARCHAR, ZEXTENSION VARCHAR, ZISORIGINALFILE INTEGER, ZISEXTERNALLYEDITABLE INTEGER, ZFILEVOLUME INTEGER, ZMASTER INTEGER );
CREATE INDEX ZRKFILE_ZCHECKSUM_INDEX ON ZRKFILE (ZCHECKSUM);
CREATE INDEX ZRKFILE_ZCREATEDATE_INDEX ON ZRKFILE (ZCREATEDATE);
CREATE INDEX ZRKFILE_ZFILECREATIONDATE_INDEX ON ZRKFILE (ZFILECREATIONDATE);
CREATE INDEX ZRKFILE_ZFILEMODIFICATIONDATE_INDEX ON ZRKFILE (ZFILEMODIFICATIONDATE);
CREATE INDEX ZRKFILE_ZFILESIZE_INDEX ON ZRKFILE (ZFILESIZE);
CREATE INDEX ZRKFILE_ZFILEVOLUME_INDEX ON ZRKFILE (ZFILEVOLUME);
CREATE INDEX ZRKFILE_ZISEXTERNALLYEDITABLE_INDEX ON ZRKFILE (ZISEXTERNALLYEDITABLE);
CREATE INDEX ZRKFILE_ZMASTER_INDEX ON ZRKFILE (ZMASTER);
CREATE INDEX ZRKFILE_ZNAME_INDEX ON ZRKFILE (ZNAME);
CREATE INDEX ZRKFILE_ZPROJECTUUIDCHANGEDATE_INDEX ON ZRKFILE (ZPROJECTUUIDCHANGEDATE);
CREATE INDEX ZRKFILE_ZUUID_INDEX ON ZRKFILE (ZUUID);
sqlite> .schema z_metadata
CREATE TABLE Z_METADATA (Z_VERSION INTEGER PRIMARY KEY, Z_UUID VARCHAR(255), Z_PLIST BLOB);
sqlite> .schema z_primarykey
CREATE TABLE Z_PRIMARYKEY (Z_ENT INTEGER PRIMARY KEY, Z_NAME VARCHAR, Z_SUPER INTEGER, Z_MAX INTEGER);
sqlite> select * from z_primarykey;
1|RKArchive|0|1
2|RKArchiveRecord|0|0
3|RKArchiveVolume|0|1
4|RKFile|0|2604
5|RKFolder|0|23
6|RKProject|5|0
7|RKProjectSubfolder|5|0
8|RKImageAdjustment|0|1086
9|RKKeyword|0|758
10|RKMaster|0|2604
11|RKPersistentAlbum|0|99
12|RKPropertyIdentifier|0|119
13|RKSearchableProperty|0|84191
14|RKVersion|0|2606
15|RKVolume|0|0
sqlite>

One useful command is .dump, which will dump the entire database in the form of the SQL commands required to recreate it. Even with a single-photo library, this generates many pages of output.
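Python's sqlite3 module offers a rough equivalent of .dump through Connection.iterdump(). The sketch below is only an illustration: dump_sql is a name I made up, and the in-memory table reuses the Z_PRIMARYKEY schema and one row from the session above instead of touching a real library.

```python
import sqlite3


def dump_sql(conn):
    """Return the SQL text that would recreate the whole database."""
    return "\n".join(conn.iterdump())


# Tiny stand-in database, built from the Z_PRIMARYKEY schema shown above.
demo = sqlite3.connect(":memory:")
demo.execute(
    "CREATE TABLE Z_PRIMARYKEY (Z_ENT INTEGER PRIMARY KEY, "
    "Z_NAME VARCHAR, Z_SUPER INTEGER, Z_MAX INTEGER)"
)
demo.execute("INSERT INTO Z_PRIMARYKEY VALUES (4, 'RKFile', 0, 2604)")
demo.commit()

print(dump_sql(demo))
```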

Here is my attempt to reverse engineer the Aperture 1.5.2 data model. Core Data, like all object-relational mappers (ORMs), leaves much to be desired from the relational perspective. The fact that SQLite foreign key constraints are not enforced (and not even set by Core Data) doesn’t help. Click on the diagram below to expand it.

Aperture 1.5.1. data model

All tables are linked to Z_PRIMARYKEY which implements a form of inheritance using the column Z_ENT to identify classes. The only table that seems to use this today is ZRKFOLDER, where the rows can have a Z_ENT of 5 (Folder), 6 (Project) or 7 (ProjectSubfolder). For clarity, I have omitted the links between all tables and Z_PRIMARYKEY.
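The Z_ENT mechanism can be explored with two small queries. This is a sketch under assumptions: it relies only on the Z_PRIMARYKEY columns shown in the session above, assumes ZRKFOLDER carries a Z_ENT column like ZRKFILE does, and treats a Z_SUPER of 0 as "no parent", which is my reading of the data and not documented behavior. The function names are hypothetical.

```python
import sqlite3


def entity_map(conn):
    """Map each Z_ENT id to (entity name, parent entity name or None)."""
    rows = conn.execute(
        "SELECT Z_ENT, Z_NAME, Z_SUPER FROM Z_PRIMARYKEY"
    ).fetchall()
    names = {ent: name for ent, name, _ in rows}
    # Z_SUPER = 0 matches no entity, so names.get() yields None for root classes.
    return {ent: (name, names.get(sup)) for ent, name, sup in rows}


def count_by_entity(conn, table):
    """Count the rows of a table per concrete class, resolving Z_ENT to a name."""
    # Table names cannot be bound as SQL parameters; only pass trusted names.
    sql = (
        f"SELECT k.Z_NAME, COUNT(*) FROM {table} AS t "
        "JOIN Z_PRIMARYKEY AS k ON k.Z_ENT = t.Z_ENT "
        "GROUP BY k.Z_NAME ORDER BY k.Z_NAME"
    )
    return dict(conn.execute(sql).fetchall())
```

Calling count_by_entity(conn, "ZRKFOLDER") on a real library should show how many rows are plain folders, projects and project subfolders.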

ZRKIMAGEADJUSTMENT looks like the table that records the transformations that turn a master image into a version, Live Image style.

Opening up Aperture

Apple introduced Aperture, its professional workflow management application, at the Photo Plus trade show in New York on 2005-10-19. Initial speculation was that Apple had finally decided to bring its deteriorating relationship with Adobe to a head and release a Photoshop competitor. This was quickly dispelled and its true positioning as a high-end workflow application for professional digital photographers became more apparent. This is not just a fig leaf to appease Adobe: Aperture does indeed lack most of the functionality of Photoshop (or even Elements). It is a Bridge-killer of sorts, not that the sluggish, resource-hogging piece of bugware that is Bridge could really be qualified as “alive”. The simplest description of Aperture is that it is iPhoto for professional photographers.

I received my copy today, and will be putting it through its paces in a series of longer articles over the coming weeks, with this article as the central nexus:

  1. First impressions
  2. Asset management
  3. Under the hood: file format internals

Amazon wishlist optimizer

I wrote a script several months ago to go through an Amazon wish list and find the combination of items that will best fit within a given budget. Given that the Christmas holiday shopping season seems to have started before Thanksgiving, it seemed topical to release it.

It used the Amazon Web Services API, which is a complete crock (among other failings, it will consistently not return the Amazon.com price for an item, even when explicitly instructed to do so). It does not look like Amazon pays any particular attention to the bug reports I filed. I just gave up on the API and re-implemented it the old-fashioned way, by “scraping” Amazon’s regular (and most definitely not XML-compliant) HTML pages.

It is still very much a work in progress, but already somewhat useful. You can use it directly by stuffing your wish list ID in the URL.

A better way is to drag and drop the highlighted Amazon optimizer bookmarklet link (version 6 as of 2007-05-08) to your browser’s toolbar. You can then browse through Amazon, and once you have found the wish list you are looking for, click on the bookmarklet to open the optimizer in a new window (or tab). By default, it will try and fit a budget of $100 (my decadent tastes are showing, are they not?), but you can change that amount and experiment with different budgets. Surprisingly often, it will find an exact fit. Otherwise, it will try to find the closest match under the budget with as little left over as possible.
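Finding the combination that best fits a budget is a subset-sum problem. The sketch below shows one standard way to solve it with dynamic programming over reachable totals; it is not the script's actual code, and best_fit, its parameters and the use of integer cents are my own assumptions.

```python
def best_fit(prices_cents, budget_cents):
    """Pick items whose total is as close to the budget as possible without exceeding it.

    Prices are integer cents to avoid floating-point rounding.
    Returns (total, chosen_indices).
    """
    # reachable[total] = indices of one subset of items that adds up to total
    reachable = {0: []}
    for index, price in enumerate(prices_cents):
        # Iterate over a snapshot so each item is used at most once.
        for total, chosen in list(reachable.items()):
            new_total = total + price
            if new_total <= budget_cents and new_total not in reachable:
                reachable[new_total] = chosen + [index]
    best = max(reachable)
    return best, reachable[best]


total, picked = best_fit([4599, 2999, 2402, 1250, 5400], 10000)
print(total, picked)
```

The work grows with the number of items times the number of distinct reachable totals, which stays manageable for a $100 budget counted in cents.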

There are many caveats. The wishlist optimizer only works for public Amazon.com (US) wish lists. There does not seem to be an easy way to buy multiple items for somebody else’s wish list in one step, although I am working on it, so you will have to go through the wish list and add the items by hand. Shipping costs and wish list priorities are currently not taken into account. Sometimes Amazon will not show a price straight away but will instead require you to click on a link; the optimizer declines to play these marketers’ games and just skips those products.

Be patient: Amazon.com is rather slow right now. It seems they did not learn the lessons of their poor performance towards the end of last year. One of my coworkers ran the optimizer through an acid test with his wife’s 13-page wish list, and it took well over a minute and a half to fetch the list, let alone optimize it. One can only imagine how bad it will get when the Christmas shopping season begins in earnest. To mitigate this somewhat, I have added caching: the script will only hit Amazon once per hour for any given wish list. As it works by scraping the web site rather than using the buggy and unreliable Amazon Web Services API, there is a real risk it will stop working if Amazon blocks my server’s IP or if they radically change their wish list UI (they would do better to add additional machines and load-balancers, but that would be too logical).
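The once-per-hour rule amounts to a small time-based cache keyed by wish list ID. Here is a minimal sketch of the idea, not the script's actual implementation: HourlyCache, the fetch callback and the injectable clock are all hypothetical names.

```python
import time


class HourlyCache:
    """Cache fetched results per key and refetch only after max_age seconds."""

    def __init__(self, fetch, max_age=3600, clock=time.time):
        self._fetch = fetch      # hypothetical function: wish list ID -> page data
        self._max_age = max_age  # one hour by default
        self._clock = clock      # injectable so the expiry logic can be tested
        self._entries = {}       # key -> (time fetched, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._max_age:
            return entry[1]      # still fresh: do not hit the remote site again
        value = self._fetch(key)
        self._entries[key] = (now, value)
        return value
```

This version keeps entries in memory only; a script that runs once per web request would need to persist them, for example on disk, for the cache to be of any use.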

Update (2005-12-02):

Predictably, Amazon changed their form (they changed the form name from edit-items to editItems) and broke not only the wishlist optimizer, but also the bookmarklet. I fixed this and upgraded to the scraping module BeautifulSoup, but you will need to use the revised bookmarklet above to make it work again.

Update (2010-04-27):

The script has been broken for quite a while, but I fixed it and it should work again.

A book signing with Steven Erikson

I reviewed the Malazan Book of the Fallen last year; it is one of the very finest fantasy series, in my opinion. I met Steven Erikson today during a book signing at Borderlands Books in San Francisco. Sadly, there were enough people in the audience who had not read all of the first five volumes that he read from Memories of Ice rather than from the final manuscript of the sixth volume, The Bonehunters (due out in February 2006), which he carries with him on his Palm PDA.

Tor Books has acquired the rights to the series for the US market. They have already published the first three volumes, and are expected to catch up with the British publishers by the eighth or so. The cover art on the Bantam British edition is better, though. The publishing industry has an adage, “mugs sell mags”, and the US covers have more figurative illustrations, sometimes unhappily so, as with the slightly cheesy cover of Memories of Ice.

Erikson described the genesis of the series and Malazan universe in a series of literary role-playing games with fellow archeologist Ian Cameron Esslemont, author of Night of Knives, a novel set in the same universe and also the first in a series. He mentioned he is also working on a series of six novelettes featuring the psychopathic necromancers Bauchelain and Korbal Broach (whom he managed to work into Memories of Ice), from the point of view of their long-suffering manservant. The novelettes will be, in Erikson’s own words, “more over the top”. The first two, Blood Follows and The Healthy Dead, have already been published (even if Amazon incorrectly claims the latter is not available yet), and the third one is coming shortly.

When asked whether he was planning on extending the series beyond the planned ten volumes, he mentioned he had the outline of all ten almost from the very beginning (keep in mind it took him 8 years to get Gardens of the Moon published, and that only happened after he moved to England). There is still a lot of room for spontaneity: as he puts it, if the author is bored when writing the actual books because he put too much effort into preparatory notes, the readers are likely to be bored as well. Erikson also committed to giving “payback” to his readers for sticking with the story (sounds ominous, doesn’t it?), with some snide remarks referring to Robert Jordan’s ever-lengthening Wheel of Time series. The anecdote he mentioned was that of a 75-year-old woman who was asking a bookseller when the next installment by Jordan would be published, because she was afraid she might die before that series was completed… In all fairness, Jordan has announced the next volume will be the last, bringing closure to long-suffering fans.

I asked him about the whole extinction of magic as a moral imperative angle, and he indicated the later volumes in the decalogue would bear on the issue. He also said he is in no way endorsing imperialism (Deadhouse Gates is inspired in part by events from the British Empire’s oppression of India and Afghanistan). I also mentioned how difficult I found the abrupt transition introduced by volume 5, Midnight Tides. He agreed, but said it was required by the 10-volume story arc, and postponing it would only have made things worse. Among other matters, we will read more of the Forkrul Assail, whom he describes as the nastiest of the four founding races.

As a final note, I have been to book signings with Raymond Feist, Robert Jordan and Steven Erikson, and I am always amazed by the inconsiderate people who come with cartons full of books to sign, presumably to make them more collectible and valuable. The value in these events is in meeting the authors and interacting with them, not in giving them tennis elbow for financial gain.

Interesting factoids

Harper’s Magazine, a left-leaning (by American standards) literary gazette, is fairly insipid, but it publishes amusing tidbits in each issue known as Harper’s Index. In a similar vein, here are some surprising bits I have read recently.

  • All 9 members of China’s Politburo are engineers. Source: IEEE Spectrum
  • Western Europe has a population and GDP comparable to the United States, but it has 42% of the world’s WiFi hotspots, compared to 26% in the US. Source: Informa Telecoms and Media.
  • Medical doctors’ median income in the US is $200,000. Malpractice insurance premiums, though often maligned, have a median of only $11,000. Source: Paul von Hippel, Ohio State University.
  • “Administrative costs” represent 19% to 24% of the cost of health care in the US, compared to about 10% in most OECD countries. Source: University of Maine.
  • The French universal medical coverage, despite being rife with abuse and fraud by people who would flunk the means-test for state coverage, costs about 1.4 billion euros per year, slightly under 0.1% of GDP, with approximately 5 million people covered, and health care in general represents about 9% of GDP. Of course, as health insurance is mandatory for all salaried workers, only the unemployed lack coverage in the first place, so the cost of universal coverage in the US would be higher as a proportion of GDP. The French medical system was rated first in the world for general health care by the WHO’s last survey in 2000, so it is not a question of skimping on the quality of care as in the UK.
  • The US spends 15% of its GDP on health care. If that were lowered by 10%, by bringing administrative costs in line with Europe or Canada, the savings would easily cover universal insurance for all Americans.
  • The Philippines and India are respectively ranked the No. 4 and No. 5 destinations for international telephone calls from the US. India hardly registered in 1991. Source: Telegeography.
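The health care arithmetic in the list above can be checked in a few lines. This sketch uses only the percentages quoted there and reads "lowered by 10%" as a 10% relative cut in health spending, which is my interpretation; the function name is hypothetical.

```python
from fractions import Fraction


def savings_share_of_gdp(health_share, admin_now, admin_target):
    """GDP share saved if administrative overhead drops from admin_now to admin_target.

    health_share is health spending as a fraction of GDP; the two admin
    values are fractions of health spending.
    """
    return (admin_now - admin_target) * health_share


health = Fraction(15, 100)  # US health spending as a share of GDP
target = Fraction(10, 100)  # approximate administrative share in most OECD countries
low = savings_share_of_gdp(health, Fraction(19, 100), target)
high = savings_share_of_gdp(health, Fraction(24, 100), target)
flat_cut = health * Fraction(10, 100)  # a flat 10% reduction in health spending

print(f"{float(low):.2%} to {float(high):.2%} of GDP; flat 10% cut: {float(flat_cut):.2%}")
```

Either way the savings come to between 1.35% and 2.1% of GDP, more than ten times the roughly 0.1% of GDP quoted for the French universal coverage, although as noted above the US cost would be proportionally higher.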