Photo

Trimming the fat from JPEGs

I use Adobe Photoshop CS2 on my Mac as my primary photo editor. Adobe recently announced that the Intel-native port of Photoshop would have to wait for the next release, CS3, tentatively scheduled for Spring 2007. This ridiculously long delay is a serious sticking point for Photoshop users, especially those who jumped on the MacBook Pro to finally get an Apple laptop with decent performance, as Photoshop under Rosetta emulation will run at G4 speeds or lower on the new machines.

This nonchalance is not a very smart move on Adobe’s part, as it will certainly drive many to explore Apple’s Aperture as an alternative, or be more receptive to newcomers like LightZone. I know Aperture and Photoshop are not fully equivalent, but Aperture does take care of a significant proportion of a digital photographer’s needs, and it becomes an attractive alternative when combined with Apple’s recent $200 price reduction for release 1.1 and their liberal license terms (you can install it on multiple machines as long as you are the only user of those copies, so you only need to buy a single license even if, like me, you have both a desktop and a laptop).

There is a growing disaffection with Adobe among artists of late. Their anti-competitive merger with Macromedia is leading to complacency. Adobe’s CEO, Bruce Chizen, is also emphasizing corporate customers for the bloatware that is Acrobat as the focus for Adobe, and the demotion of the graphics apps shows. Recent releases of Photoshop have been rather ho-hum, and it is starting to accrete the same kind of cruft as Acrobat (to paraphrase Borges, each release makes you regret the previous one). Hopefully Thomas Knoll can staunch this worrisome trend.

Adobe is touting its XMP metadata platform. XMP is derived from the obnoxious RDF format, a solution in search of a problem if there ever was one. RDF files are as far from human-readable as an XML-based format can get, and they introduce considerable bloat. If the Atom people had not taken the RDF cruft out of their syndication format, I would refuse to use it.

I always scan slides and negatives at maximum bit depth and resolution, back up the raw scans to a 1TB external disk array, then apply tonal corrections and spot dust. One bizarre side effect of XMP is that if I take a 16-bit TIFF straight from the slide scanner, then apply curves and reduce it to 8 bits, the bit depth is not updated somewhere in the XMP metadata Photoshop “helpfully” embedded in the TIFF, and Bridge incorrectly shows the file as 16-bit. The only way to find out is to open the file (Photoshop will show the correct bit depth in the title bar) or to look at the file size.

This bug is incredibly annoying, and the only work-around I have found so far is to run ImageMagick’s convert utility with the -strip option to remove the offending XMP metadata. I did not pay the princely price for the full version of Photoshop to be required to use open-source software as a stop-gap in my workflow.
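
For reference, the invocation is a one-liner (the filenames here are hypothetical):

ormag ~>convert scan.tif -strip scan_stripped.tif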

Photoshop will embed XMP metadata and other cruft in JPEG files if you use the “Save As…” command. In Photoshop 7, all that extra baggage actually triggered a bug in IE that would break its ability to display images. You have to use the “Save for Web…” command (actually part of ImageReady) to save files in a usable form. Another example of poor fit-and-finish in Adobe’s software: “Save for Web” will not automatically convert images in AdobeRGB or other color profiles to the Web’s implicit sRGB, so if you forget to do that conversion as a prior step, the colors in the resulting image will be off.

“Save for Web” will also strip EXIF tags, which are unnecessary baggage for web graphics (and can actually be a privacy threat). While researching the Fotonotes image annotation scheme, I opened one of my “Save for Web” JPEGs in a hex editor, and I was surprised to see literal strings like “Ducky” and “Adobe” (apparently the ImageReady developers have an obsession with rubber duckies). Photoshop is clearly still embedding some useless metadata in these files, even though it is not supposed to. The overhead is about 1–2%. In most cases this costs no extra disk space, since files occupy whole disk blocks whether the last block is full or not, but it does waste network bandwidth, as every unnecessary byte still has to travel over the wire.

I wrote jpegstrip.c, a short C program to strip Photoshop’s unnecessary tags and other optional JPEG “markers” from JPEG files, like the optional “restart” markers that allow a JPEG decoder to recover if the data was corrupted (it’s not really a file format’s job to mitigate corruption; that belongs to TCP or the filesystem). The Independent JPEG Group’s jpegtran -copy none actually increased the size of the test file I gave it, so it wasn’t going to cut it. jpegstrip is crude and probably breaks in a number of situations: it is the result of a couple of hours’ hacking and reading the bare minimum of the JPEG specification required to get it working. The user interface is also pretty crude: it takes the input file on standard input, spits out the stripped JPEG on standard output, and prints diagnostics on standard error (configurable at compile time).
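
For the curious, here is a minimal sketch of the marker-walking approach, written for illustration only; it is not the actual jpegstrip.c source and is every bit as crude. It drops APPn and COM metadata segments plus the DRI segment from the header, then elides RST0–RST7 markers from the entropy-coded scan. It handles baseline JPEGs only and does no real error recovery.

/* stripmarkers.c: illustrative sketch, NOT the original jpegstrip.c.
   Drops APPn/COM/DRI segments and RST0-7 restart markers from a
   baseline JPEG read on stdin; stripped JPEG goes to stdout. */
#include <stdio.h>
#include <stdlib.h>

static long skipped = 0;

static int get(void)
{
    int c = getchar();
    if (c == EOF) {
        fprintf(stderr, "unexpected end of file\n");
        exit(1);
    }
    return c;
}

int main(void)
{
    int c, marker, len, i;

    if (get() != 0xFF || get() != 0xD8) {       /* SOI must come first */
        fprintf(stderr, "not a JPEG file\n");
        return 1;
    }
    putchar(0xFF); putchar(0xD8);

    for (;;) {                                  /* header segment loop */
        while ((c = get()) != 0xFF)             /* seek the next marker */
            ;
        do marker = get(); while (marker == 0xFF);  /* skip fill bytes */
        if (marker == 0xDA)                     /* SOS: scan data follows */
            break;
        len = (get() << 8) | get();             /* length includes itself */
        if ((marker >= 0xE0 && marker <= 0xEF)  /* APPn metadata */
            || marker == 0xFE                   /* COM comment */
            || marker == 0xDD) {                /* DRI restart interval */
            for (i = 2; i < len; i++)
                get();                          /* discard the payload */
            skipped += 2 + len;
        } else {                                /* pass everything else */
            putchar(0xFF); putchar(marker);
            putchar(len >> 8); putchar(len & 0xFF);
            for (i = 2; i < len; i++)
                putchar(get());
        }
    }
    putchar(0xFF); putchar(0xDA);               /* copy the SOS header */
    len = (get() << 8) | get();
    putchar(len >> 8); putchar(len & 0xFF);
    for (i = 2; i < len; i++)
        putchar(get());
    for (;;) {                                  /* entropy-coded data */
        c = get();
        if (c != 0xFF) {
            putchar(c);
        } else {
            marker = get();
            if (marker >= 0xD0 && marker <= 0xD7) {
                skipped += 2;                   /* drop RSTn markers */
            } else if (marker == 0xD9) {        /* EOI: done */
                putchar(0xFF); putchar(0xD9);
                break;
            } else {                            /* stuffed 0xFF00 etc. */
                putchar(0xFF); putchar(marker);
            }
        }
    }
    fprintf(stderr, "skipped=%ld bytes\n", skipped);
    return 0;
}

The transcript below shows the real jpegstrip in action.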

ormag ~/Projects/jpegstrip>gcc -O3 -Wall -o jpegstrip jpegstrip.c
ormag ~/Projects/jpegstrip>./jpegstrip < test.jpg > test_strip.jpg
in=2822 bytes, skipped=35 bytes, out=2787 bytes, saved 1.24%
ormag ~/Projects/jpegstrip>jpegtran -copy none test.jpg > test_jpegtran.jpg
ormag ~/Projects/jpegstrip>jpegtran -restart 1 test.jpg > test_restart.jpg
ormag ~/Projects/jpegstrip>gcc -O3 -Wall -DDEBUG=2 -o jpegstrip jpegstrip.c
ormag ~/Projects/jpegstrip>./jpegstrip < test_restart.jpg > test_restrip.jpg
skipped marker 0xffdd (4 bytes)
skipped restart marker 0xffd0 (2 bytes)
skipped restart marker 0xffd1 (2 bytes)
skipped restart marker 0xffd2 (2 bytes)
skipped restart marker 0xffd3 (2 bytes)
skipped restart marker 0xffd4 (2 bytes)
skipped restart marker 0xffd5 (2 bytes)
skipped restart marker 0xffd6 (2 bytes)
skipped restart marker 0xffd7 (2 bytes)
skipped restart marker 0xffd0 (2 bytes)
in=3168 bytes, skipped=24 bytes, out=3144 bytes, saved 0.76%
ormag ~/Projects/jpegstrip>ls -l *.jpg
-rw-r--r--   1 majid  majid  2822 Apr 22 23:17 test.jpg
-rw-r--r--   1 majid  majid  3131 Apr 22 23:26 test_jpegtran.jpg
-rw-r--r--   1 majid  majid  3168 Apr 22 23:26 test_restart.jpg
-rw-r--r--   1 majid  majid  3144 Apr 22 23:27 test_restrip.jpg
-rw-r--r--   1 majid  majid  2787 Apr 22 23:26 test_strip.jpg

Update (2006-04-24):

Reader “Kam” reports that jhead offers JPEG stripping with the -purejpg option, and much more. Jhead can also strip the mostly useless preview thumbnails, but it does not strip out restart markers.
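
For reference, jhead rewrites the file in place:

ormag ~>jhead -purejpg test.jpg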

Another one bites the dust

After a brief period of 100% digital shooting in 1999–2001, I went back to primarily shooting film, both black & white and color slides. I process my B&W film at home, but my apartment is too small for a darkroom to make prints, nor do I have a room dark enough, so I rent time at a shared darkroom. I used to go to the Focus Gallery in Russian Hill, but when I called to book a slot about a month ago, the owner informed me he was shutting down his darkroom rental business and relocating. He did recommend a suitable replacement, which actually has nicer, brand-new facilities, albeit in a less pleasant neighborhood. Learning new equipment and procedures was still an annoyance.

Color is much harder than B&W, and requires toxic chemicals. I shoot slides, which use the E-6 process, not the C-41 process used for the more common color negative film. For the last five years, I had been going to ChromeWorks, a mom-and-pop lab on Bryant Street, in San Francisco’s closest equivalent to New York’s photo district. The only thing they did was E-6 film processing, and they did it exceedingly well, with superlative customer service and quite reasonable rates. When I went there today to hand them a roll for processing, I discovered they had closed down two months ago, apparently a mere week after I last went there.

I ended up giving my roll to the New Lab, another pro lab a few blocks away, which is apparently the last E-6 lab in San Francisco (I had used their services before for color negative film, which I almost never use apart from the excellent Fuji Natura 1600).

Needless to say, these developments are not encouraging for a film enthusiast.

Update (2007-12-14):

There is at least one other E-6 lab in San Francisco, Fotodepo (1063 Market @ 7th). They cater mostly to Academy of Art students and are not a pro lab by any means (I have never seen a more cluttered and untidy lab). In any case, they are more expensive than the New Lab, if more conveniently located.

Update (2009-08-27):

The New Lab itself closed as well a few months ago. I now use Light Waves instead.

Shoebox review

For a very long time, the only reason I still used a Windows PC at home (apart from games, of course) was my reliance on IMatch. IMatch is a very powerful image cataloguing database program (a software category also known as Digital Asset Management). The thing that sets IMatch apart from most of its competition is its incredibly powerful category system, which essentially puts the full power of set theory at your fingertips.

Most other asset management programs either pay perfunctory attention to keywords, or require huge amounts of labor to set up, which is part of the cost of doing business for a stock photo agency, but not for an individual. The online photo sharing site Flickr popularized an equivalent system, tagging, which has the advantage of spanning multiple users (you will never get many users to agree on a common classification schema for anything; tags are a reasonable compromise).

Unfortunately, IMatch is not available on the Mac. Canto Cumulus is cross-platform and has recently introduced something similar to IMatch’s categories, but it is expensive, and has an obscenely slow image import process (it took more than 30 hours to process 5000 or so photos from my collection on my dual-2GHz PowerMac G5 with 5.5GB of RAM!). Even Aperture is not that slow… I managed to kludge a transfer from IMatch to Cumulus using IMatch’s export functions and jury-rigging category import in Cumulus by reverse-engineering one of their data formats.

Cumulus is very clunky compared to IMatch (it does have the edge in some functions like client-server network capabilities for workgroups), and I had resigned myself to using it, until I stumbled upon Shoebox (thanks to Rui Carmo’s Tao of Mac). Shoebox (no relation to Kodak’s discontinued photo database bearing the same name) offers close to all the power of IMatch, with a much smoother and more usable interface to boot (IMatch is not particularly difficult if you limit yourself to its core functionality, but it does have a sometimes overwhelming array of menus and options).

screenshot

Andrew Zamler-Carhart, the programmer behind Shoebox, is very responsive to customer feedback, just like Mario Westphal, the author of IMatch. Andrew actually implemented a Cumulus importer just for me, so moving to Shoebox was a snap (and much faster than the initial import into Cumulus). That in itself is a good sign that there will always be a place in the software world for the individual programmer, even in the world of “shrinkwrap software”, especially since the distribution efficiencies of the Internet have lowered the barrier to entry.

Shoebox is a Mac app through and through, with an attention to detail that shows. It makes excellent use of space on larger monitors like mine (click on the screenshot to see it at full resolution) or on dual-monitor setups, and image categorization is both streamlined and productive. As an example, Shoebox fully supports using the keyboard to quickly classify images by typing the first few letters of a category name, with auto-completion, without requiring you to shift focus to a specific text box (this non-modal keyboard synergy is quite rare in the Macintosh world). It can also export categories to Spotlight keywords so your images can be searched with Spotlight. I won’t describe the user interface, since Kavasoft has an excellent guided tour.

No application is perfect, and there are a few minor issues or missing features. Shoebox does not know how to deal with XMP, limiting possible synergies with Adobe Photoshop and the many other applications that support XMP like the upcoming Lightroom. It would also benefit from improved RAW support – my Canon Digital Rebel XT CR2 thumbnails are not auto-rotated, for instance, but the blame for that probably lies with Apple. The application icon somehow invariably reminds me of In-n-Out burgers. The earlier versions of Shoebox had some stability problems when I first experimented with them, but the last two have been quite solid.

I haven’t started my own list of the top ten “must have” Macintosh applications, but Shoebox certainly makes the cut. If you are a Mac user and photographer, you owe it to yourself to try it and see how it can make your digital photo library emerge from chaos. I used to say IMatch was the best image database bar none, but nowadays I must add the qualification “for Windows”, and Shoebox is the new king across all platforms.

Aperture: first impressions

First in a series:

  1. First impressions
  2. Asset management
  3. Under the hood: file format internals

Cost and hardware requirements

The first thing you notice about Aperture, even before you buy it, is its hefty hardware requirements. I had to upgrade the video card on my PowerMac G5 (dual 2GHz, 5.5GB RAM) to an ATI X800XT, as the stock nVidia GeForce FX 5200 simply doesn’t make the cut.

Aperture costs $500, not far from the price of the full Photoshop CS2. Clearly, this product is meant for professionals, just like Final Cut Pro. The pricing is not out of line with similar programs like Capture One PRO, but it is rather steep for the advanced amateurs who have flocked to DSLRs and the RAW format. Hopefully, Apple will release a more reasonably priced “Express” version much as they did with Final Cut Express.

File management

Like its sibling iPhoto, Aperture makes the annoying assumption that it will manage your photo collection under its own file hierarchy. It does not play nice and share with other applications, apart from the built-in Photoshop integration, which merely turns Photoshop into an editor, with Aperture calling all the shots and keeping both image files and metadata databases firmly to itself. Photoshop integration does not seem to extend to XMP interoperability, for instance.

This is a major design flaw that will need to be addressed in future versions. Most pros use a battery of tools in their workflow, and expect those tools to cooperate using the photographer’s directory structure, not one imposed by the tool. Assuming one Aperture to rule them all is likely to be too confining, and the roach-motel-like nature of Aperture libraries is going to cause major problems down the road. Copying huge picture files around is also very inefficient; HFS supports symbolic and hard links, so there is no reason to physically copy files. This scheme also renders Aperture close to useless in a networked environment like a magazine or advertising agency, where media files are typically stored on a shared SAN, e.g. on XServe RAID boxes using Fibre Channel.

Fortunately, the file layout is relatively easy to reverse-engineer, and it is probably just a question of time until third-party scripts become available to synchronize an Aperture library with regular Pictures folders and XMP sidecar metadata files, or with other asset management and metadata databases like Extensis Portfolio or even (shudder) Canto Cumulus. Apple should not make us jump through these hoops; the purpose of workflow is to boost productivity, not hinder it. In any case, Apple is apparently hinting Aperture can output sidecar files, at least according to PDN’s first look article.

Performance

Aperture is not the lightning-fast RAW converter we have been dreaming of. Importing RAW files is quite a sluggish affair, taking 2 minutes 15 seconds to import a 665MB folder with 86 Canon Rebel XT CR2 RAW files (under 5MB per second). In comparison, Bridge takes about a minute to generate thumbnails and previews for the same images. The comparison is not entirely fair, as Aperture’s import process yields high-resolution previews that allow you to magnify the image to see actual pixels with the loupe tool, whereas Bridge’s previews are medium resolution at best. The CPU utilization on my dual G5 is far from pegged, however, which suggests the import process was not particularly tuned for SMP or multi-core systems, and may not even take full advantage of OS X’s multithreading. Aperture will work with other formats like scanned TIFFs as well (import times are even slower, though).

Once import is complete, viewing the files is very smooth and fast. The built-in loupe tool is particularly addictive, and very natural for anyone who has worked with a real loupe on a real light table. A cute visual effect (QuickTime, 6MB) has the loupe flip over as you reach the edges of the screen. The loupe will also magnify the thumbnails, although there is a pause while the thumbnail’s preview data is read into memory.

Workflow innovations

Aperture has two very interesting concepts: stacks and versions. Stacks group multiple images together as one. It is very common for a photographer to take several very similar photos: think of bracketed exposures, a sports photographer shooting a fast action sequence at 8 frames per second, or a VR photographer making a series of 360-degree shots for use in an immersive panorama. Aperture’s stacks allow you to manage these related images as a single unit. It is even capable of identifying candidates for a stack automatically using timestamps, as sketched below.
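
Apple does not document the auto-stacking heuristic, but the obvious approach is to sort frames by capture time and start a new stack whenever the gap between consecutive frames exceeds a threshold. A hypothetical illustration, with hard-coded epoch timestamps standing in for EXIF capture times:

/* Hypothetical sketch of timestamp-based auto-stacking; Aperture's
   actual heuristic is not documented. A real tool would read the
   capture times from EXIF rather than hard-code them. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define MAX_GAP 3   /* seconds between frames to stay in one stack */

static int cmp(const void *a, const void *b)
{
    time_t x = *(const time_t *)a, y = *(const time_t *)b;
    return (x > y) - (x < y);
}

int main(void)
{
    /* stand-in capture times, in seconds */
    time_t shots[] = { 1000, 1001, 1002, 1050, 1051, 1200 };
    size_t n = sizeof shots / sizeof *shots;
    size_t i;
    int stack = 1;

    qsort(shots, n, sizeof *shots, cmp);    /* order by capture time */
    printf("stack %d: %ld", stack, (long)shots[0]);
    for (i = 1; i < n; i++) {
        if (shots[i] - shots[i - 1] > MAX_GAP)   /* big gap: new stack */
            printf("\nstack %d: %ld", ++stack, (long)shots[i]);
        else
            printf(" %ld", (long)shots[i]);
    }
    putchar('\n');
    return 0;
}

Run on the sample times above, this yields three stacks: the burst at 1000–1002, the pair at 1050–1051, and the lone frame at 1200.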

Stacks

This article is a work in progress, and this section has yet to be fleshed out.

Versions

Versions are a concept clearly drawn from the world of software configuration management systems like CVS or Visual SourceSafe. Aperture does not touch the original image; adjustments like changing the color balance simply record, in the metadata database, the series of operations required to achieve the new version of the image, just as CVS stores only the diffs between versions of a file to save space. This suggests Apple plans future versions of Aperture with shared image repositories, as most modern systems work that way: a shared central repository and individual copies for each user, with a check-in/check-out mechanism and conflict resolution.

The parameters for a transform take a trifling amount of memory, and the photographer can experiment to his heart’s content with multiple variants. Photoshop now has equivalent functionality with the introduction of layer comps in CS, but they still feel like a bolted-on feature rather than integral to the product.

In the early nineties, French firm FITS introduced a groundbreaking program named Live Picture to compete with Photoshop. Its chief claim to fame was that it could deal with huge images very efficiently, because it recorded the operations as a sequence of mathematical transforms rather than the result, in a way eerily reminiscent of PostScript. The transforms were only applied as needed at the display resolution, thus avoiding the need to apply them to the full-resolution image until the final output was required, while still managing to deal with zooming adequately. The software was promising, but the transformation logging technique limited the types of operations that could be performed on images, and due in part to its high price and specialized scope, the product died a slow and painful death. In its current incarnation, it lives on, after a fashion, in the moribund Flashpix image format and a web graphics server program imaginatively named ImageServer.

Chained transforms are a very elegant approach compared to the brute force of Photoshop: in Aperture, a finished image is represented as a master image plus a series of transformations applied to it to achieve each version, much like Photoshop keeps a history of changes made to the image in memory (but not in the final disk file). Since Aperture’s transforms are fairly simple, they can be executed in real time by modern graphics cards that support Core Image.
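
To make the idea concrete, here is a toy model of a version; the names and structure are invented for illustration (Aperture’s real tables are examined in the internals article below). A version is just a reference to the master plus an ordered list of small parameter records, and pixels are only computed at render time:

/* Illustrative sketch only: a "version" stored as a master reference
   plus an ordered list of parameter records, applied at render time.
   The names are invented; this is not Aperture's actual data model. */
#include <stdio.h>

typedef enum { EXPOSURE, WHITE_BALANCE, CROP } op_kind;

typedef struct {
    op_kind kind;
    double  p1, p2;          /* operation parameters, e.g. EV or temp/tint */
} adjustment;

typedef struct {
    const char *master_path; /* the untouched original */
    adjustment  ops[16];     /* recorded edits, a few bytes each */
    int         nops;
} version;

/* a real renderer would apply the chain to pixels at the requested
   display resolution; here we just print what would happen */
static void render(const version *v)
{
    int i;
    printf("decode %s\n", v->master_path);
    for (i = 0; i < v->nops; i++)
        printf("apply op %d (%.2f, %.2f)\n",
               v->ops[i].kind, v->ops[i].p1, v->ops[i].p2);
}

int main(void)
{
    version v = { "IMG_0042.CR2",
        { { EXPOSURE, 0.7, 0 }, { WHITE_BALANCE, 5200, 10 } }, 2 };
    render(&v);   /* the master stays pristine; the edits are just data */
    return 0;
}

The point is that creating ten variants of an image costs a few hundred bytes of metadata, not ten copies of a multi-megabyte file.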

Target market

Keep in mind there are two types of pro photographers: those who divide photographers into two groups, and the others… More seriously:

  1. Those who try to build up a portfolio of images over their career, where royalties and residual rights will provide them with financial support when they retire. Most fine art, landscape or nature photographers are in this category, and photojournalists can be assimilated to it as well (except that in their case the employer owns the rights to the archive).
  2. Those who do work-for-hire. Wedding photographers, event photographers, product, catalog, advertising and industrial photographers fit in this category.

The first type needs a digital asset management database to retrieve and market their images more effectively, and a distribution channel. Most farm out that representation work to the likes of Corbis or Getty Images.

The second type will work intensely on a project, take a lot of frames and show many variants to a client for approval. Once the project is done, it is archived, probably never to be used again, and they move on to the next one. In most cases, the rights to the images remain with those who commissioned them, not the photographer, and thus there is no incentive to spend much effort in organizing them with extensive metadata beyond what is required for the project. These users need a production workflow tool that will streamline the editing and client approval process, the latter mostly performed via email or password-protected websites nowadays.

Aperture’s projects (flat, not nested hierarchically, and thus not all that scalable) and vaults (archives where projects go when they are no longer needed or finished) are a clear indication it is intended mostly for the second type of photographer. Apple did not convey the specialized focus of the product forcefully enough, but the blogosphere’s buzz machine bears much of the blame for raising unwarranted expectations about what the product is about.

Aperture is good for one thing: letting wedding photographers and the like go through the editing process (as in sorting through slides on a light table, not retouching an individual image) as efficiently as possible. Most wedding pros simply cannot afford the time to individually edit each picture beyond white balance, tonal adjustments, cropping and sharpening, and Aperture’s built-in tools are perfectly adequate for that.

That is also why the import process is so slow: all this prep work is required to make the actual viewing, side-by-side comparison, sorting and manipulation as smooth, interactive and responsive as possible, to avoid disrupting the photographer’s “flow”. The goal is a very smooth virtual light table, not a filing cabinet: a place where you can move slides around, group them in stacks, toy with versions, and compare them side by side on dual 30-inch Cinema Displays. The user experience was clearly designed around what the average art director (with his or her loupe almost surgically implanted) is familiar and comfortable with.

The positioning as “iPhoto Pro” obscures the fact that Aperture is sub-optimal for managing or indexing large archives with rich metadata. For the first type of pro photographer (which is also what the average advanced amateur will relate to), a digital asset management database like the excellent Kavasoft Shoebox or the more pedestrian but cross-platform and workgroup-oriented Extensis Portfolio and Canto Cumulus will better fit their needs, with Adobe Bridge being probably sufficient for browsing and reviewing/editing freshly imported photos (when it is not keeling over and crashing, that is).

Conclusion

Aperture is clearly a 1.0 product. It shows promise, but is likely to disappoint as the reality cannot match the hype that developed in the month or so between the announcement and the release. The problem is that there is a Cult of the Mac that raises unrealistic expectations of anything coming out of Cupertino.

The hype around Aperture was certainly immense; I am sure Apple was as surprised as anyone by how positive the response was (after all, they released Aperture at a relatively obscure pro photo show). They are probably furiously revising plans for the next release right now. I consider Aperture 1.0 more a statement of direction than a finished product.

Aperture internals

Last in a series:

  1. First impressions
  2. Asset management
  3. Under the hood: file format internals

This article was never completed because I switched to Lightroom and lost interest. What was done may be of interest to Aperture users, although the data model has probably changed since 1.0.

Aperture stores its library as a bundle with the extension .aplibrary. This is a concept inherited from NeXTstep, where an entire directory that has the bundle bit set is handled as if it were a single file. A much more elegant system than Mac OS Classic’s data and resource forks.

Inside the bundle, there is a directory Aperture.aplib which contains the metadata for the library in a file Library.apdb. This file is actually a SQLite3 database. SQLite is an excellent, lightweight open-source embedded relational database engine. Sun uses SQLite 2 as the central repository for SMF, the next-generation service management facility that controls booting the Solaris operating system and its automatic fault recovery, a strong vote of confidence in SQLite’s suitability for mission-critical use. SQLite is also one of the underlying data storage mechanisms used by Apple’s over-engineered Core Data framework.

You don’t have to use Core Data to go through the database; the /usr/bin/sqlite3 command-line utility is perfectly fine for this purpose. Warning: using sqlite3 to access Aperture’s data directly is obviously unsupported by Apple, and should not be done on a mission-critical library. At the very least, make sure Aperture is not running.

ormag ~/Pictures/Aperture Library.aplibrary/Aperture.aplib>sqlite3 Library.apdb
SQLite version 3.1.3
Enter ".help" for instructions
sqlite> .tables
ZRKARCHIVE             ZRKIMAGEADJUSTMENT     ZRKVERSION
ZRKARCHIVERECORD       ZRKMASTER              Z_10VERSIONS
ZRKARCHIVEVOLUME       ZRKPERSISTENTALBUM     Z_METADATA
ZRKFILE                ZRKPROPERTYIDENTIFIER  Z_PRIMARYKEY
ZRKFOLDER              ZRKSEARCHABLEPROPERTY

Revisiting the library under a later Aperture release (note the newer SQLite version as well) shows a revised schema:

ormag ~/Pictures/Aperture Library.aplibrary/Aperture.aplib>sqlite3 Library.apdb
SQLite version 3.3.7
Enter ".help" for instructions
sqlite> .tables
ZRKARCHIVE             ZRKKEYWORD             ZRKVOLUME
ZRKARCHIVERECORD       ZRKMASTER              Z_11VERSIONS
ZRKARCHIVEVOLUME       ZRKPERSISTENTALBUM     Z_9VERSIONS
ZRKFILE                ZRKPROPERTYIDENTIFIER  Z_METADATA
ZRKFOLDER              ZRKSEARCHABLEPROPERTY  Z_PRIMARYKEY
ZRKIMAGEADJUSTMENT     ZRKVERSION
sqlite> .schema zrkfile
CREATE TABLE ZRKFILE ( Z_ENT INTEGER, Z_PK INTEGER PRIMARY KEY, Z_OPT INTEGER, ZASSHOTNEUTRALY FLOAT, ZFILECREATIONDATE TIMESTAMP, ZIMAGEPATH VARCHAR, ZFILESIZE INTEGER, ZUUID VARCHAR, ZPERMISSIONS INTEGER, ZNAME VARCHAR, ZFILEISREFERENCE INTEGER, ZTYPE VARCHAR, ZFILEMODIFICATIONDATE TIMESTAMP, ZASSHOTNEUTRALX FLOAT, ZFILEALIASDATA BLOB, ZSUBTYPE VARCHAR, ZCHECKSUM VARCHAR, ZPROJECTUUIDCHANGEDATE TIMESTAMP, ZCREATEDATE TIMESTAMP, ZISFILEPROXY INTEGER, ZDATELASTSAVEDINDATABASE TIMESTAMP, ZISMISSING INTEGER, ZVERSIONNAME VARCHAR, ZISTRULYRAW INTEGER, ZPROJECTUUID VARCHAR, ZEXTENSION VARCHAR, ZISORIGINALFILE INTEGER, ZISEXTERNALLYEDITABLE INTEGER, ZFILEVOLUME INTEGER, ZMASTER INTEGER );
CREATE INDEX ZRKFILE_ZCHECKSUM_INDEX ON ZRKFILE (ZCHECKSUM);
CREATE INDEX ZRKFILE_ZCREATEDATE_INDEX ON ZRKFILE (ZCREATEDATE);
CREATE INDEX ZRKFILE_ZFILECREATIONDATE_INDEX ON ZRKFILE (ZFILECREATIONDATE);
CREATE INDEX ZRKFILE_ZFILEMODIFICATIONDATE_INDEX ON ZRKFILE (ZFILEMODIFICATIONDATE);
CREATE INDEX ZRKFILE_ZFILESIZE_INDEX ON ZRKFILE (ZFILESIZE);
CREATE INDEX ZRKFILE_ZFILEVOLUME_INDEX ON ZRKFILE (ZFILEVOLUME);
CREATE INDEX ZRKFILE_ZISEXTERNALLYEDITABLE_INDEX ON ZRKFILE (ZISEXTERNALLYEDITABLE);
CREATE INDEX ZRKFILE_ZMASTER_INDEX ON ZRKFILE (ZMASTER);
CREATE INDEX ZRKFILE_ZNAME_INDEX ON ZRKFILE (ZNAME);
CREATE INDEX ZRKFILE_ZPROJECTUUIDCHANGEDATE_INDEX ON ZRKFILE (ZPROJECTUUIDCHANGEDATE);
CREATE INDEX ZRKFILE_ZUUID_INDEX ON ZRKFILE (ZUUID);
sqlite> .schema z_metadata
CREATE TABLE Z_METADATA (Z_VERSION INTEGER PRIMARY KEY, Z_UUID VARCHAR(255), Z_PLIST BLOB);
sqlite> .schema z_primarykey
CREATE TABLE Z_PRIMARYKEY (Z_ENT INTEGER PRIMARY KEY, Z_NAME VARCHAR, Z_SUPER INTEGER, Z_MAX INTEGER);
sqlite> select * from z_primarykey;
1|RKArchive|0|1
2|RKArchiveRecord|0|0
3|RKArchiveVolume|0|1
4|RKFile|0|2604
5|RKFolder|0|23
6|RKProject|5|0
7|RKProjectSubfolder|5|0
8|RKImageAdjustment|0|1086
9|RKKeyword|0|758
10|RKMaster|0|2604
11|RKPersistentAlbum|0|99
12|RKPropertyIdentifier|0|119
13|RKSearchableProperty|0|84191
14|RKVersion|0|2606
15|RKVolume|0|0
sqlite>

One useful command is .dump, which will dump the entire database in the form of the SQL commands required to recreate it. Even with a single-photo library, this generates many pages of output.
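
Since the .schema output above spells out ZRKFILE’s columns, ad-hoc queries are straightforward. For instance, to list the five largest master files (output omitted here):

sqlite> select ZNAME, ZFILESIZE, ZIMAGEPATH from ZRKFILE order by ZFILESIZE desc limit 5;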

Here is my attempt to reverse-engineer the Aperture 1.5.2 data model. Core Data, like all object-relational mappers (ORMs), leaves much to be desired from the relational perspective. The fact that SQLite foreign-key constraints are not enforced (and not even declared by Core Data) doesn’t help. Click on the diagram below to expand it.

Aperture 1.5.1 data model

All tables are linked to Z_PRIMARYKEY, which implements a form of inheritance using the column Z_ENT to identify classes. The only table that seems to use this today is ZRKFOLDER, where rows can have a Z_ENT of 5 (Folder), 6 (Project) or 7 (ProjectSubfolder). For clarity, I have omitted the links between the tables and Z_PRIMARYKEY in the diagram.

ZRKIMAGEADJUSTMENT looks like the table that records the transformations that turn a master image into a version, Live Picture-style.