Mac

Batch-changing PDF files to open in Preview on Mac OS X 10.4

Adobe Acrobat 7 is a necessary evil for some operations such as OCR, rotating or deleting pages (although as an editor PDFPen is cheaper and more usable). Acrobat has grown into a sumo of an application, with incredibly sluggish load times. If you peek under the hood, you will notice it actually has a full copy of MySQL embedded in it. I am not fond of MySQL, but it is not so bad a database as to account for all the bloat in Acrobat. Who knows what else lurks in that package?

At any rate, Acrobat is to be used as a last resort, and most definitely not the viewer of choice. Apple’s own Preview.app is far snappier and more pleasant to use (no annoying Yahoo toolbars or spurious JavaScript warnings when you have the good sense to disable a potential security hole). Acrobat insists on taking over ownership of a file once you save it, however. You could change the HFS creator code for each file by hand, but this is where Spotlight’s efficiently indexed metadata database shows synergy with OS X’s underlying UNIX scripting magic. The following one-liner will clear the creator code for all PDF files on your system so they open with Preview instead of Acrobat. It can easily be added to a crontab or launchd script.

mdfind 'kMDItemCreator=="Adobe Acrobat*"'|tr "\012" "\0"|xargs -0 -n 1 /Developer/Tools/SetFile -c ''

If you simply want to strip the creator application from all PDF documents altogether, this script will do the trick:

mdfind 'kMDItemKind="Adobe PDF document" and kMDItemCreator!=""'|tr "\012" "\0"|xargs -0 -n 1 /Developer/Tools/SetFile -c ''

mdfind queries the Spotlight metadata for all files with a creator that starts with “Adobe Acrobat”. xargs reads arguments from standard input and passes them as command-line arguments to the SetFile utility (part of the OS X Developer Tools). tr replaces line feeds with ASCII 0 NUL characters used as argument separators when xargs is invoked with the -0 option, so embedded spaces in filenames are passed unscathed. I have tested this with Unicode file names, and they are handled correctly as well.
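
To see why the tr step matters, here is a minimal dry-run sketch that runs without Spotlight or the Developer Tools. The helper names (sample_paths, show_args, with_nul, without_nul) and the sample paths are made up for illustration; printf stands in for mdfind, and a small sh -c command stands in for SetFile so that each argument it receives is printed in brackets.

```shell
# Sample paths, one per line, standing in for mdfind's output.
sample_paths() {
    printf '%s\n' '/tmp/My Scans/tax return 2004.pdf' '/tmp/plain.pdf'
}

# Run xargs with the given options and print every argument the
# target command receives, one per line, wrapped in brackets.
show_args() {
    xargs "$@" sh -c 'printf "[%s]\n" "$@"' sh
}

# Line feeds converted to NUL, xargs told to split on NUL only:
# each path arrives as a single argument.
with_nul() { sample_paths | tr '\012' '\0' | show_args -0 -n 1; }

# Default xargs splitting: blanks inside a path break it into pieces.
without_nul() { sample_paths | show_args -n 1; }

with_nul
without_nul
```

The first call prints two bracketed paths, the second prints five fragments, which is exactly the breakage the -0 option avoids.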

Thanks to Spotlight, the whole process takes only a couple of seconds on my machine, something that would not be possible if it had to scan through 300GB’s worth of files. It also goes to show that Spotlight’s apparent sluggishness when you use it from the GUI stems entirely from the overhead of the GUI struggling to progressively display search results as they are returned by the metadata index, not from the underlying database engine. There is no justification for Google searches of the entire Internet being faster than a local desktop search.

Of course, the technique can be generalized to other ownership changes for contested file formats like HTML, JPEG or TIFF.
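
As a rough sketch of that generalization, the query can be parameterized by file type and by the application that grabbed ownership. The function names (build_query, clear_creator) and the DRY_RUN switch are hypothetical, and I am assuming that matching on kMDItemContentType with a uniform type identifier such as public.jpeg, and that the creator string of a Photoshop-saved file starts with "Adobe Photoshop", behave as expected on your system; check with mdls on a sample file first. By default the script only prints what it would search for.

```shell
# Build a Spotlight query for files of one content type whose creator
# starts with a given prefix. Both values are assumptions to verify.
build_query() {
    uti=$1
    creator_prefix=$2
    printf 'kMDItemContentType == "%s" && kMDItemCreator == "%s*"' \
        "$uti" "$creator_prefix"
}

# Clear the creator code on every match. With DRY_RUN=1 (the default)
# nothing is changed and the query is printed instead.
clear_creator() {
    query=$(build_query "$1" "$2")
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf 'would run: mdfind %s\n' "$query"
    else
        mdfind "$query" | tr '\012' '\0' |
            xargs -0 -n 1 /Developer/Tools/SetFile -c ''
    fi
}

clear_creator public.jpeg 'Adobe Photoshop'
```

Running it with DRY_RUN=0 performs the actual change, so it is worth trying the printed query by hand in mdfind before doing that.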

Switch complete

Today I have officially completed my switch from Windows to the Mac for my personal stuff. The last remaining tie that bound me to the boys from Redmond was my Canon DR-2080C document scanner, which does not have drivers for Mac OS X. I have now replaced it with a Fujitsu fi-5110EOX2 and the excellent translated Japanese driver for OS X (shame on Fujitsu for not releasing the English version, despite their presence at MacWorld 2005).

My ultimate goal is the elusive paperless office. I would like to scan to PDF and shred all my paperwork, apart from that required for legal reasons like pay stubs, invoices and important documents like diplomas and property titles. Electronic documents are easier to file, faster to retrieve, backed up more reliably for disaster recovery purposes, and take far less bulk. The fact I had to reluctantly boot up my PC to scan in batches runs counter to the streamlined workflow advocated by efficiency experts like David Allen of Getting Things Done fame.

My first computer was an Apple II, the first computer I bought with my own money was a Macintosh Plus, but as I became a hard-core UNIX user, the limitations of the old Mac OS became all too apparent. I once had to write a program in Think C using low-level disk I/O primitives to do a simple task that could be handled by a shell one-liner, to batch rename TeX font files as System 6 did not have a shell or AppleScript, and I could not afford MPW back then. I decided I would not go back until Apple came up with an OS with robust UNIX underpinnings like NeXTstep. When OS X was first released, I bought an iBook, which was followed by an iMac G4 (ordered the very day it was announced, and which now serves as a video conferencing terminal at my parents’ place). The Macs were intended to be auxiliary machines, my primary home computers being a fast Windows box used for digital photography and games, and a dual-processor Solaris machine.

The PCs were in my bedroom, the iMac in my living room, as it is an elegant artifact that does not look out of place there. It also wakes up from sleep instantly, making it the ideal machine for quick web or email use. I began to notice I was increasingly using the Mac for real work, and started to evince an almost physical reluctance to boot up the Windows machine, even though the iMac was significantly slower. The noise and unreliability of the PC probably accounted for much of this (flaky ATX power supplies and motherboards). I had also never managed to get Adobe Premiere running smoothly to edit video, and iMovie+iDVD just work.

The logical next step was to upgrade to a full-featured PowerMac, which I did in June of last year. It thus took me over a year to complete the migration, despite being very knowledgeable about both platforms. The hardest items to switch were not the usual suspects like Microsoft Office — Office is fully supported on the Mac and I have a license, even though I never use it and haven’t even bothered to reinstall it after upgrading to Tiger. No, the two hardest applications to switch from were IMatch and the document scanning. Kavasoft’s amazing Shoebox has all the power of IMatch but a far superior user interface (I will post a review Real Soon Now).

What use is left for the PC? For now, games, although even this will disappear in Spring of 2006 when the HDTV-capable PlayStation 3 is released.

Ripping your CD library and building a home network

Since I moved six years ago, I keep my CDs in binders (four of them, plus one for DVDs and two for CD-ROMs) and the jewel cases in storage. I just finished ripping the first binder’s worth, about 250 CDs and SACDs in iTunes. The bulk of the time spent is actually in cleaning up inconsistent CDDB metadata and locating scans of the cover art. As I mentioned earlier, I am ripping to Apple’s lossless encoding, which is a lossless zip-style compression of the 16-bit, 44.1kHz stereo PCM CD audio stream. There is no loss of quality and my iTunes library should now be a bit-for-bit exact copy of my CD collection (or at least the third or so I have already ripped).

iTunes status

Because there is no loss of quality, I won’t have to go through the effort again, whereas if I had ripped to a lossy format like MP3 or AAC, I would need to do so again to play on my HiFi setup or if the level of compression was too high. Hard drives are cheap, and storing 250GB of music is no longer the daunting prospect it was a few years ago. Lossy formats like MP3 take detail away, rather than introducing noise, and thus it is not immediately obvious just how much damage was done, but side-by-side listening makes it clear. I always find it very amusing to read people nit-picking about subtle details of audiophile gear, and then basing their subjective judgment on testing with MP3s or (even worse) video game soundtracks played over a PC sound card.
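
For a sense of scale, the raw data rate of CD audio follows directly from its format: sample rate times bytes per sample times channels. The sketch below works that out in shell arithmetic; the function names are mine, and the figures are for uncompressed PCM, so a lossless encoder will come in somewhat under them by an amount that depends on the music.

```shell
# Bytes per second of uncompressed PCM audio:
# sample rate (Hz) x bits per sample / 8 x number of channels.
pcm_bytes_per_second() {
    echo $(( $1 * $2 / 8 * $3 ))
}

# Megabytes (10^6 bytes) taken by a given number of minutes of CD audio.
cd_megabytes() {
    rate=$(pcm_bytes_per_second 44100 16 2)
    echo $(( rate * 60 * $1 / 1000000 ))
}

pcm_bytes_per_second 44100 16 2
cd_megabytes 60
```

This prints 176400 bytes per second and 635 megabytes for an hour of music, an upper bound per disc before lossless compression.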

Due to electromagnetic interference, a PC chassis is the last place you want to put quality analog audio circuitry. The way to go is to hook up a PC or Mac’s Toslink optical or SPDIF coaxial digital audio output to an external digital to analog converter (“DAC”, such as the one built into every home theater surround receiver). This situation is reminiscent of high-end CD players, where the laser pickup mechanism (“transport”) is in a different box from the DAC to improve quality. I do most of my listening from my Mac connected by Toslink to my Yamaha home theater, or from Sennheiser HD-650 headphones connected to a Headroom DAC and headphone amp (via USB).

There are alternatives to a direct connection, such as a Squeezebox, Apple AirPort Express, or one of the lesser devices that allow you to stream music from your computer to your amplifier via a wired or wireless local area network. WiFi may be fashionable, but I don’t recommend using it for streaming audio or video because the jitter introduced by interference degrades sound quality.

Streaming sound over the house is of course the first step in building a home network. Market studies show most people use a home network only to share Internet access or a printer between several computers, and they haven’t yet reached even this first stage. One of the most obvious uses for home networking is remote monitoring using a webcam, but this hasn’t been marketed effectively. Remote control of TiVo is another non-contrived application, much appreciated by their users.

Given the overwhelming amount of gadgetry that clutters my tiny San Francisco-size apartment, it is natural I have a home network, entirely wired, although I do have WiFi for visitors. In an idle moment, I mapped it (PDF) as a practical exercise in using OmniGraffle instead of Visio. One conclusion I drew from a cursory analysis of it is that all the networking gear combined did not amount to as much in value as my headphones alone. Home networking as a category is not going to dominate consumer electronics anytime soon…

MacWorld SF roundup

I work a mere four blocks away from the Moscone Center, where the annual MacWorld SF trade show is held, so naturally I just drift there during my lunch break, possibly extended… Here is a list of strange and wonderful things I saw during the show, and that might have been overlooked by the more mainstream sites:

iLugger

The iLugger is a carrying case for the iMac G5 (it fits both the 17″ and 20″ models). Most laptops are always connected to the mains and seldom used as real mobile devices, and an iMac G5 will give significantly better performance at 2/3 the price of a PowerBook. Interestingly, the company making it is a blimp manufacturer, clearly a case of someone scratching their own itch.

Epson R-D1

Not a new product, but I got to handle an Epson R-D1, a limited edition Leica M compatible rangefinder digital camera (the only one of its kind) based on a Voigtländer-Cosina Bessa R2 body. I shot a few samples with a 50mm Summicron and Noctilux, and the resulting pictures are remarkably clear and sharp. Noise levels at ISO 800 are significantly better than my Canon EOS 10D, no small feat, and given a rangefinder’s 2-3 stop advantage over an SLR, this looks like an ideal available-light camera.

The Bessa R2 has a relatively short rangefinder base length, which reduces its focusing accuracy compared to a Leica. The hardest lens to focus is the Noctilux-M 50mm f/1.0 (yes, you read that right, the fastest production lens in the world), due to its very shallow depth of field at low aperture, as shown in the picture to the left. I took it with a Noctilux (ISO 200, f/1.0, 1/125) at close to its minimum distance of 1 meter, and focusing accuracy seems adequate… Click on the image for the full-size JPEG with EXIF metadata (not including the manually set aperture and focus, of course). For comparison purposes, here is the corresponding JPEG I shot yesterday (ISO 800, Summicron-M 50mm f/2, 1/30, aperture unrecorded, probably f/4).

The gentleman portrayed is an Epson representative who was given the charge of watching over this $3000 camera (apparently his only task). The sight of me pawing over it might explain his expression…

I won’t duplicate Luminous Landscape’s review, and didn’t have that much time to play with the R-D1 in any case. Build quality is good, as good as the 10D at least. It does not have the satisfying heft of my Leica MP, nor its superlative 0.85x viewfinder, but then again what does? Some retro touches like the dials are an affectation, as well as the manually cocked shutter. The shutter cocking lever does not have to advance film, and its short travel feels somewhat odd.

X-Rite Pulse ColorElite

X-Rite, a maker of color calibration hardware, was demonstrating its Pulse ColorElite bundle, resulting from its acquisition of the color management software vendor Monaco Systems. This package allows you to calibrate with precision the color characteristics of a monitor, scanner, digital camera and printer, for consistent, professional-grade color management. It goes well beyond simple and now relatively inexpensive monitor calibration colorimeters, by also using a spectrophotometer (an instrument that measures light across the visible spectrum, wavelength by wavelength), and the price is correspondingly higher. The market-leading product is the GretagMacbeth Eye-One Photo. X-Rite has clearly replicated the Eye-One package, but at a slightly lower price, and with some nice touches that significantly improve usability. The Eye-One spectrophotometer (which is used both for calibrating monitors and prints, a GretagMacbeth patented technology) is reportedly more accurate, however (3nm vs. 20nm). The Pulse bundle retails for $1300, the Eye-One for $1400.

FrogPad

The FrogPad is a small one-handed Bluetooth keyboard designed to be used with PDAs or smartphones, but it can also be used with a Mac or PC as it follows the standard Bluetooth Human Interface Device (HID) profile. You can hold it in one hand and type with the other. I don’t know how long it takes to get used to it, but at any rate they are offering $50 off the regular price of $179 if you use the code Apple50. They also had a mockup of a folding version in cloth, for use in wearable computing.

Interwrite Bluetooth tablet

CalComp used to make high-end tablets and digitizers for architects, engineers and artists. The tablet market is pretty much monopolized by Wacom, nowadays, but CalComp is still around (after being bought out by GTCO). They were demonstrating a Bluetooth tablet for use by teachers in a classroom setting (although I am not sure how many cash-strapped school districts can afford the $800 device).

JetPhoto

There was a cluster of small Chinese companies exhibiting. One of the more interesting was Atomix, a company that makes JetPhoto, a digital photo asset management database, similar to Canto Cumulus or Extensis Portfolio. Apparently, their forte is the integration of GPS metadata and the image database: you can do geographical selections on a map to find photos. It also had many export functions with a comprehensive database of cell phones and PDAs to export photos to. Unfortunately, the current version does not support sophisticated hierarchical, set-oriented categories, the one feature in IMatch I find absolutely vital.

The program looked impressively polished for a first version, and is available free to download for now. This is yet another illustration of how the Chinese are rapidly advancing up the value chain, and American firms could be in for a nasty surprise if they maintain the complacent belief high-end jobs are their birthright and only unqualified manufacturing jobs or menial IT tasks are vulnerable to Chinese (or Indian) competition.

Fujitsu ScanSnap

One of the few things I still use my Windows game console PC for is to drive my Canon DR-2080C document scanner. This small machine, the size of a compact fax machine, can scan to PDF 20 pages per minute (and it can scan both sides simultaneously). It is intended for corporate document management, but is also very useful to tame the paper tiger by batch-converting invoices, bills and so on to purely electronic form, in a way that is not practical using a flatbed scanner.

It seems Fujitsu is bringing that functionality to the Mac with the similarly specified ScanSnap fi-5110EOX. The scanner is driven with a bundled version of Adobe Acrobat 6.0. I can well see this becoming popular in small businesses run on Macs, although the Fujitsu reps on the stand implied they were here to gather potential customer feedback to make a stronger case for enhanced Mac support with their management and accelerate the release of Mac drivers for it.

My office PBX is actually a PC-based CTI unit made by Altigen, and voice mails left for me are automatically forwarded to me as WAV attachments in an email. That has major usability benefits – I can set email rules to drop voice mails when the attachment is too small (usually someone who hung up on the voice mail prompt), or fast forward and rewind during message replay. This feature is addictive – voice mail still sucks compared to email (disk hogging, not searchable or quickly scannable), but being liberated from excruciatingly slow voice-driven user interfaces, replete with unnecessarily deliberate and verbose prompting, makes it somewhat bearable.

I did not have this kind of functionality at home, however. It is possible VoIP devices will offer it at some point, but that does not seem to be the case in low-end home VoIP for now. I tried experimenting with the open-source Asterisk PBX, but did not have the time to pursue this, and in any case I’d rather not have to install a dedicated Linux machine at home just for this purpose (my home network runs on Solaris/x86, thank you very much).

Fortunately, Ovolab, an Italian company based near Milan, has introduced Phlink, just what I was searching for, and I actually bought one on the spot. It is a small USB telephony attachment that plugs into a phone line and turns your Mac into a sophisticated CTI voice-mail system. It is fully scriptable using AppleScript and supports Caller ID. I have yet to use it extensively (the hardest part, interestingly, is bringing a phone cord close enough to my Mac).

Switching to Camino

I mentioned earlier that I had switched to Mozilla Firefox (then called Firebird) as my default web browser, from Mozilla (I still use Mozilla on Solaris). In the last few months, the Firefox bandwagon started becoming mainstream, probably due to exasperation with the continuing security holes in Microsoft’s Internet Explorer.

That said, I have also switched to the Mac at home, and Firefox on Mac OS X often feels like an afterthought. Several bugs have gone unfixed in the last three releases or so, even though patches have been submitted. I am not excessively fond of Safari, Apple’s default browser, and the ability to share profile data between my Windows machine at work and my Mac at home is a big benefit.

Two weeks ago, I tried Camino on my home machine. Camino is a derivative of Mozilla – it uses the same HTML rendering engine, but wraps it in a shell that leverages Apple’s technologies the way a cross-platform browser like Firefox or Mozilla can’t. Earlier versions had been unconvincing, but I switched for the 0.8.1 release. Firefox 1.0PR on the Mac is an unalloyed disaster, buggy and crash-prone, without any visible bug fixes (I switched back to 0.9.3 within a couple of hours), and that was probably the last straw.

The immediate benefits Camino brings me are the following:

  • Middle-clicking on a link opens it in a new tab, the way it does for Firefox on all platforms but the Mac
  • Navigating through Web forms using the tab key works perfectly, whereas Firefox and Safari will only let you switch between text fields, but not pull-down menus, radio buttons or the like.
  • When minimizing windows using Exposé, there is no annoying Firefox or Mozilla ghost window cluttering the screen.

Of course, not all is perfect, and the migration entails these pitfalls:

  • I have Firefox set up so if I type a few words separated by spaces in the URL bar, it searches Google. This avoids the need for two text boxes, one for the URL and one for searching (the way Firefox does in its default configuration, or Safari), which are redundant and not as usable. Unfortunately Camino does not support this directly and pops up a modal dialog box complaining about the illegal URL format. Fortunately, Camino does support Mozilla’s excellent keywords feature, so I created a keyword “g” to handle Google queries.
  • Camino keeps bookmarks in an OS X-style XML plist format, rather than the standard bookmark format used by other Mozilla variants. This makes synchronizing bookmarks a little bit slower, as you have to use the import utility instead of simply copying a file over. Bookmark imports are not perfect, moreover, as they tend to drop separators.
  • The saved passwords are not interoperable, as Camino stores them in OS X’s Keychain manager instead of Mozilla’s encrypted database format (I don’t know if this means Camino and Safari can share passwords). I have started working on Python modules to read and decrypt the Mozilla files, however, and I have a low-priority password sync project on my back burner.
  • Camino doesn’t have the wealth of extensions Firefox does, but then again since they seem to break with every release of Firefox (and many don’t work well on the Mac), this is less of a disadvantage than it may seem at first glance.
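
For readers unfamiliar with the keyword workaround in the first pitfall above: a Mozilla-style bookmark keyword is attached to a bookmark whose URL contains a %s placeholder, and whatever you type after the keyword is substituted for it. The sketch below only imitates that substitution in shell to show the idea. expand_keyword is a made-up helper, the Google URL is the one I would assume for the bookmark, and real browsers URL-escape the whole query whereas this version only turns spaces into plus signs.

```shell
# Imitate bookmark keyword expansion: replace the %s placeholder in a
# bookmark URL with the words typed after the keyword.
expand_keyword() {
    template=$1                  # bookmark URL containing %s
    shift
    # Join the remaining words and turn spaces into '+' (simplified
    # escaping, an assumption; browsers escape other characters too).
    query=$(printf '%s' "$*" | tr ' ' '+')
    prefix=${template%%'%s'*}    # text before the placeholder
    suffix=${template#*'%s'}     # text after the placeholder
    printf '%s%s%s\n' "$prefix" "$query" "$suffix"
}

# Typing "g mac os x" in the URL bar would correspond to:
expand_keyword 'http://www.google.com/search?q=%s' mac os x
```

The output is http://www.google.com/search?q=mac+os+x, which is the page the “g” keyword takes you to.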