Mylos

Transfer complete. Or is it?

I finally completed my CD ripping project and now have lossless copies of all my CDs (and the CD-audio layer of my SACDs) on my Mac.

iTunes status bar

As I mentioned before, the bulk of the work is tagging the music with correct metadata and locating cover art, since the majority of my CD jewel cases and booklets are moldering in a cellar in France. (Amazon is helpful, especially now that it allows users to upload their own scans of cover art.) Doug’s AppleScripts for iTunes make short work of normalizing CDDB metadata, such as fixing entries where people stuffed the composer name in the title or vice versa.

iTunes scripts menu

I wrote my own scripts to tackle these common operations:

  1. Strip numbers from titles. That’s the “Track #” field’s job. This script requires the Satimage AppleScript Regex OSAX plug-in to work.
  2. Renumber a selection sequentially, so I can split a CD into its constituent parts and renumber them independently from each other or the original CD track order.
  3. Strip prefix strings from titles.
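The scripts themselves are AppleScript and are not reproduced here. As a rough illustration of the logic only, here is a minimal Python sketch of the three operations. The regular expression, the function names and the (number, count, title) tuples are all assumptions made for this example, not what the Satimage-based script actually uses.

```python
import re

# Assumed pattern: a leading track number followed by a separator or a space,
# e.g. "01 Allegro", "3. Adagio" or "12 - Finale". It will also eat a number
# that belongs to the title ("99 Luftballons"), so review the results.
LEADING_NUMBER = re.compile(r"^\s*\d+(?:\s*[-._)]\s*|\s+)")


def strip_number(title):
    """Remove a leading track number from a title, leaving the rest intact."""
    stripped = LEADING_NUMBER.sub("", title, count=1)
    # Keep the original title if stripping would leave nothing at all.
    return stripped if stripped else title


def strip_prefix(title, prefix):
    """Remove a fixed prefix string, e.g. a work name repeated on every track."""
    if title.startswith(prefix):
        return title[len(prefix):].lstrip()
    return title


def renumber(titles, start=1):
    """Number a selection sequentially, in its current order.

    Returns (track number, track count, title) tuples.
    """
    total = len(titles)
    return [(start + i, total, title) for i, title in enumerate(titles)]
```

Calling `renumber` on just the tracks of one work gives that work its own 1..n numbering, independent of the original CD order.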

This does not mean I am finished, however. About 3/4 of the way through, I realized iTunes is far from perfect at extracting CD audio. For various reasons related to how the Red Book CD audio format was designed without computers in mind, it is very hard to get a perfect, repeatable rip from one attempt to the next. iTunes has an “error-correction” option that seems not to have any effect. For reliable ripping, you have to use specialized programs like EAC on Windows or a cdparanoia-based program like Max on OS X. This complicates the workflow as Max is slightly buggy, and nowhere near as good at managing metadata as iTunes is, so the one-step import in iTunes becomes a less streamlined affair:

  1. Rip the CD to AIFF in Max
  2. Import the AIFF into iTunes
  3. Number the tracks (very important!) using my “Renumber tracks” script
  4. Convert to Apple Lossless
  5. Copy the metadata from CDDB using Doug Adams’ Copy info tracks to tracks script.
  6. Add album cover art and mention the track was ripped with Max
  7. Back up to another hard drive!

The good thing is, now that I have collected the metadata and cover art, I can re-rip trouble tracks with clicks or pops, and copy the metadata in one step using Doug’s action, so re-ripping won’t be as much of a hassle as the first time. The next step is to convert everything to FLAC so I have a non-proprietary library that works with SlimServer on my Solaris home server.

If you are not as obsessive about your music metadata as I am, the process will be much easier if you just use whatever CDDB supplies you. In any case, remember, just say No! to DRM-infested lossy-compressed tracks from the iTunes Music Store.

Gates opening new Vistas

Bill Gates announced yesterday he is progressively going to disengage himself from day-to-day participation in Microsoft over the next two years, to concentrate exclusively on his foundation. However questionable Microsoft’s business practices may be, they are no worse than Standard Oil’s. The Rockefellers or Carnegie bought social respectability by endowing institutions for the already comfortable. No matter what the IRS may claim, donating to places like Harvard in exchange for naming rights does not qualify as charity in my book.

In contrast, Gates’ humanitarian work has been remarkable — his money is comforting the truly afflicted of this world, like sufferers of leprosy or malaria. His example is highly unlikely to be emulated by Silicon Valley’s skinflint tycoons (Larry Ellison, Steve Jobs, I’m looking at you). The latter conveniently convinced themselves their wealth is due entirely to their own efforts, never to luck, or government funding in the case of the Internet moguls. This leads to the self-serving belief that they are absolved of any obligation to society or to those less fortunate (in both senses of the term).

This decision is not entirely unexpected. Microsoft has been floundering for the last several years, and has accrued severe managerial bloat, something the ruthless and paranoid Bill Gates circa 1995 would never have allowed to continue. There is a remarkable dearth of insightful commentary on the announcement. My take is that the harrowing and humiliating process of the DoJ anti-trust trial proved cathartic and led him to review his priorities, even if the lawsuit itself ended up with an ineffective slap on the wrist.

Some equally interesting reading coming out of Redmond: Broken Windows Theory, an article by a Microsoft project manager on the back story behind the Windows Vista delay, with some really interesting metrics. Apparently Vista takes no less than 24 hours to compile on a fast dual-processor PC. It has 50 levels of dependencies, 50 million lines of code (one metric I personally find meaningless, as you can get more done in one line of Python than in a hundred lines of C/C++). His conclusion is that due to its scale, Vista could simply be structurally unmanageable. Certainly, the supporting infrastructure, as in automation tools, code and dependency analysis, project management et al. ought to be a project in itself of the same scope as, say, Microsoft Word.

When I worked at France Télécom in the late nineties, they were reeling from the near total failure of Frégate, a half-billion-dollar “billing system of the future” project (another interesting metric: two-thirds of billing systems projects worldwide end in failure). The grapevine even devised a unit of measurement, the Potteau, after an eponymous Ingénieur Général (a typically French title with roots in the military engineering side of the civil service) involved in the project. One potteau equals one man-century. It is deemed the unit beyond which any software project is doomed to failure.

Vista involved 2000 developers over 5 years. That’s 10,000 man-years, or 100 potteaux.
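The conversion is simple enough to spell out. The sketch below assumes, as a simplification, that all 2000 developers worked on the project for the full five years; the function name is made up for this example.

```python
MAN_YEARS_PER_POTTEAU = 100  # one potteau = one man-century, as defined above


def potteaux(developers, years):
    """Convert a headcount sustained over a number of years into potteaux."""
    return developers * years / MAN_YEARS_PER_POTTEAU


# 2000 developers x 5 years = 10,000 man-years
print(potteaux(2000, 5))  # prints 100.0
```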

Spare the strap, spoil the camera

There are many ways to carry a camera. Most are supplied with a neck strap (and there is a non-slip shoulder equivalent, the UPstrap). Wearing a camera around the neck gets tiresome really quickly, makes you look like a goofy tourist, and potentially attracts the undesirable attention of thieves and would-be muggers.

I usually carry my camera discreetly inside a shoulder bag. A regular bag, mind you, not one of those obesely over-padded camera bags that are so bulky as to preclude walking around with them. You still need something to secure the camera, to prevent it from slipping from your grasp and falling onto the hard pavement.

For pocket cameras, the wrist strap usually supplied will do just fine. You can get a tighter fit by attaching a cord lock (Google comes up with a bewildering variety of them) and reduce the risk of the lanyard slipping off your wrist. For some reason, only Contax had the sense to supply lanyards with a built-in cord lock.

For larger cameras, you need a hand strap. They are very common with camcorders, but unfortunately, very few camera manufacturers think of offering them as an option, or even provide bottom eyelets to make attaching them convenient. You have to hunt for third-party accessories and attach them using the tripod screw mount at the bottom of the camera.

For some time, I have mounted a cheap Sunpak hand strap on my Rebel XT. It does the job, but the plastic tripod mount is flimsy and unscrews all too easily, and the vinyl is not very pleasant to the touch. Another issue is that it precludes the use of an Arca-Swiss type quick-release plate. About a year ago, I wrote to Acratech, the people who make my ballhead and the QR plate on my Rebel XT, to suggest they drill an eyelet in the plate to allow mounting a strap, but never got a reply back.

Sunpak wrist strap

I recently found out that Markins, a Korean maker of fine photographic ballheads, apparently took out a patent on the idea and sells leather hand straps to go with some of their QR plates. Despite the princely price, I immediately ordered a set.

You have to unwind the strap to thread it through the eyelets on the camera and the QR plate, and back through the leather knuckle guard. This is fiendishly difficult to do if you don’t know the trick to it: wrap the tip of the strap in packing tape to produce a leader, and cut to a taper with scissors to ease insertion.

making a leader

threading through the eyelet

threading through the leather guard

front view

rear view

This strap works because the Rebel XT has a protruding hand grip. For a camera like the Leica MP, which does not have an ample grip (unless you attach an accessory grip), I use a sturdy strap liberated from my father’s old 8mm movie camera.

Tripod mount wrist strap on a Leica MP

If you don’t have one of these lying around, you can always try one of Gordy Coale’s wrist straps, or if they lack snob appeal, Artisan & Artist makes ridiculously fancy (and expensive) ones for Japanese Leica fetishists.

Update (2022-11-24):

I use a Peak Design hand strap on my Nikon Z7. It attaches to a standard Peak Design anchor at the bottom (in this case, attached to a RRS QR plate) and has a gate clip strap at the top that goes through a slot-type eyelet (or in this case a triangular split ring).

Put whiny computers to work

I have noticed a trend lately of computers making an annoying whining sound when they are running at low utilization. This happens with my Dual G5 PowerMac, the Dells I ordered 18 months ago for my staff (before we ditched Dell for HP due to the former’s refusal to sell desktops powered with AMD64 chips instead of the inferior Intel parts), and I am starting to notice it with my MacBook Pro when in a really quiet room.

These machines emit an incredibly annoying high-pitched whine when idling, one that disappears when you increase the CPU load (I usually run openssl speed on the Macs). Probably the fan falls into some oscillating pattern because no hysteresis was put into the speed control firmware. It looks like these machines were tested under full load, but not under light load, which just happens to be the most common situation for computers. The short-term work-around is to generate artificial load by running openssl in an infinite loop, or to use a distributed computing program like Folding@Home.
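For illustration, the infinite-loop work-around could look something like the Python sketch below. Restricting the benchmark to a single algorithm (sha256) and the `passes` and `run` parameters are assumptions added for this example, the latter two mainly so the loop can be bounded or tried out without actually running openssl.

```python
import subprocess

# "openssl speed" is the benchmark mentioned above; limiting it to one
# algorithm (sha256 here) is an assumption that keeps each pass short.
LOAD_COMMAND = ["openssl", "speed", "sha256"]


def generate_load(passes=None, run=subprocess.run):
    """Run the benchmark repeatedly; passes=None loops until interrupted."""
    done = 0
    while passes is None or done < passes:
        # Discard the benchmark report: the point is only to keep the CPU busy.
        run(LOAD_COMMAND, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        done += 1
    return done
```

Calling `generate_load()` with no arguments keeps one core busy until you press Ctrl-C.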

Load-testing is a good thing, but idle-testing is a good idea as well…

Taming the paper tiger

A colleague was asking for some simple advice about all-in-one printer/copier/fax devices and got instead a rambling lecture on my paper workflow. There is no reason the Internet should be exempted from my long-winded rants, so here goes, an excruciatingly detailed description of my paper workflow. It shares the same general outline as my digital photography workflow, with a few twists.

Formats

The paperless office is what I am striving for. Digital files are easier to protect than paper from fire or theft, and you can carry them with you everywhere on a Flash memory stick. As for file formats, you don’t want to be locked in, so you should either use TIFF or PDF, both of which have open-source readers and are unlikely to disappear anytime soon, unlike Microsoft’s proprietary lock-in format of the day.

TIFF is easier to retouch in an image editing program, but:

  1. Few programs cope correctly with multi-page TIFFs.
  2. PDF allows you to combine a bitmap layer, for an exact facsimile, with a searchable OCR text layer for retrieval; TIFF does not.
  3. TIFF is inefficient for vector documents, e.g. receipts printed from a web page.
  4. TIFF lacks many of the amenities built into PDF, a format expressly designed as a digital replacement for paper.

Generating PDFs from web pages or office documents is as simple as printing: Mac OS X offers this feature out of the box, and on Windows you can print to PostScript and use Ghostscript to convert the PS to PDF.
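As a sketch of the Ghostscript step, the Python helper below builds and runs the conversion command using Ghostscript's pdfwrite device. The function names and file names are made up for this example, and the executable name depends on your installation (typically gs on UNIX and OS X, gswin32c or gswin64c on Windows).

```python
import subprocess


def ps_to_pdf_command(ps_path, pdf_path, gs="gs"):
    """Build the Ghostscript command that converts a PostScript file to PDF."""
    return [
        gs,                      # e.g. "gswin32c" or "gswin64c" on Windows
        "-dBATCH", "-dNOPAUSE",  # process the file and exit without prompting
        "-sDEVICE=pdfwrite",     # Ghostscript's PDF output device
        "-sOutputFile=" + pdf_path,
        ps_path,
    ]


def ps_to_pdf(ps_path, pdf_path, gs="gs"):
    """Run the conversion, raising CalledProcessError if Ghostscript fails."""
    subprocess.run(ps_to_pdf_command(ps_path, pdf_path, gs), check=True)
```

For example, `ps_to_pdf("receipt.ps", "receipt.pdf")` would convert a receipt printed to a PostScript file.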

Please note the bloated Acrobat Reader is not a must-have to view PDFs: Mac OS X’s Preview does a much better job, and on Windows Foxit Reader is a perfectly serviceable alternative that easily fits on a USB Flash stick. UNIX users have Ghostscript and the numerous UI wrappers that make paging and zooming easy.

Acquisition

You should process incoming mail as soon as you receive it, and not let it build up. If you have a backlog, set it aside and start your new system, applicable to all new snail mail. That way the situation does not degrade further, and you can revisit old mail later.

Junk mail that could lead to identity theft (e.g. credit card solicitations) should be shredded or, even better, burnt (assuming your local environmental regulations permit this). If you get a powerful enough shredder, it can swallow the entire envelope without even forcing you to open it. Of course, you should only consider a cross-cut shredder. Junk mail that does not contain identifiable information should be recycled. When in doubt, shred. Everything else should be scanned.

Forget about flatbed scanners, what you want is a sheet-fed batch document scanner. It should support duplex mode, i.e. be capable of scanning both sides of a sheet of paper in a single pass. For Mac users Fujitsu ScanSnap is pretty much the only game in town, and for Windows users I recommend the Canon DR-2050C (the ScanSnap is available in a Windows version, but the Canon has a more reliable paper feed less prone to double-feeding). Either will quickly scan a sheaf of paperwork to a PDF file at 15–20 pages per minute.

Filing

Paper is a paradox: it is the most intuitive medium to deal with in the short-term, but also the most unwieldy and unmanageable over time. As soon as you layer two sheets into a pile, you have lost the fluidity that is paper’s essential strength. Shuffling through a pile takes an ever increasing amount of time as the pile grows.

For this reason, you want to organize your filing plan in the digital domain as much as possible. Many experts set up elaborate filing plans with color-coded manila folders and will wax lyrical about the benefits of ball-bearing sliding file cabinets. In the real world, few people have the room to store a full-fledged file cabinet.

The simplest form of filing is a chronological file. You don’t even need file folders — I just toss my mail in a letter tray after I scan it. At the end of each month, I dump the accumulated mail into a 6″x9″ clasp envelope (depending on how much mail you receive, you may need bigger envelopes), and label it with the year and month. In all likelihood, you will never access these documents again, so there is no point in arranging them more finely than that. This filing arrangement takes next to no effort and is very compact – you can keep a year’s worth in the same space as a half dozen suspended file folders, as can be seen with 9 months’ worth of mail in the photo below (the CD jewel case is for scale).

Monthly files

There are some sensitive documents you should still file the old-fashioned way for legal reasons, such as birth certificates, diplomas, property titles, tax returns and so on. You should still scan them to have a backup in case of fire.

Date stamping

As you may have to retrieve the paper original for a scanned document, it is important to date stamp every page (or at least the first page) of any mail you receive. I use a Dymo Datemark, a Rube Goldberg-esque contraption that has a rubber ribbon with embossed characters running around an ink roller and a small moving hammer that strikes when the right numeral passes by. All you really need is a month resolution so you know which envelope to fetch, thus an ordinary month-year rubber stamp would do as well. Ideally you would have software to insert a digital date stamp directly in the document, but I have not found any yet. A tip: stamp your document diagonally so the time stamp stands out from the horizontal text.
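Short of stamping the page image itself, one cheap substitute is to carry the date in the file name of the scan. That is a different technique from the in-document stamp wished for above, and the naming scheme and function names in this Python sketch are assumptions for the sake of the example.

```python
import datetime


def envelope_label(received):
    """Year-month label of the envelope holding mail received on this date."""
    return received.strftime("%Y-%m")


def stamped_filename(received, description):
    """Prefix a scan's file name with the date the mail was received."""
    return "{} {}.pdf".format(received.isoformat(), description)


received = datetime.date(2006, 6, 17)
print(stamped_filename(received, "phone bill"))  # 2006-06-17 phone bill.pdf
print(envelope_label(received))                  # 2006-06
```

Since the names start with an ISO date, sorting the scans by name also sorts them chronologically, and the first seven characters tell you which envelope to fetch.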

Management

Much as it pains me to admit it, Adobe Acrobat (supplied with the Fujitsu ScanSnap) is the most straightforward way to manage PDF files on Windows, e.g. merge multiple files together, insert new pages, annotate documents and so on. Through web capture OCR, it can create an invisible text layer that makes the PDF searchable with Spotlight. There are alternatives, such as Foxit PDF Page Organizer or PaperPort on Windows, and PDFPen on OS X. Since Leopard, Apple’s Preview app has included most of the PDF editing functionality required, so I take great pains to ensure my Macs are untainted by Acrobat (e.g. unselecting it when installing CS3). See also my article on resetting the creator code for PDF files on OS X so they are opened by Preview for viewing.

Encryption

If you are storing a backup of your personal papers at work or on a public service like Google’s rumored Gdrive, you don’t want third-parties to access your confidential information. Similarly, you don’t want to be exposed to identity theft if you lose a USB Flash stick with the data on it. The solution is simple: encryption.

There are many encryption packages available. Most probably have back doors for the NSA, but your threat model is the ID fraudster rummaging through your trash for backup DVDs or discarded bank statements, not the government. I use OpenSSL’s built-in encryption utility as it is cross-platform and easily scripted (I compiled a Windows executable for myself, and it is small enough to be stored on a Flash card). Mac and UNIX computers have it preinstalled, of course; run man enc for more details.

To encrypt a file using 256-bit AES, you would use the command:

openssl enc -aes-256-cbc -in somefile.pdf -out somefile.pdf.aes

To decrypt it, you would issue the command:

openssl enc -d -aes-256-cbc -in somefile.pdf.aes -out somefile.pdf

OpenSSL will prompt you for the password, but you can also supply it as a command-line argument, e.g. in a script.
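For scripted use, openssl enc accepts the password through its -pass option, which takes sources such as pass:secret (the literal password on the command line, visible to anyone who can list processes), env:VAR (an environment variable) or file:path. Here is a minimal Python sketch using the environment variable form; the variable name SCAN_PASSWORD and the function names are made up for this example.

```python
import subprocess


def enc_command(src, dst, decrypt=False, password_var="SCAN_PASSWORD"):
    """Build the openssl enc command line for 256-bit AES in CBC mode.

    The password is read from an environment variable (SCAN_PASSWORD is a
    hypothetical name) rather than being typed at the prompt.
    """
    cmd = ["openssl", "enc"]
    if decrypt:
        cmd.append("-d")
    cmd += ["-aes-256-cbc", "-in", src, "-out", dst,
            "-pass", "env:" + password_var]
    return cmd


def run_enc(src, dst, decrypt=False):
    """Encrypt or decrypt one file, raising CalledProcessError on failure."""
    subprocess.run(enc_command(src, dst, decrypt), check=True)
```

With the variable set in the calling shell, `run_enc("somefile.pdf", "somefile.pdf.aes")` mirrors the encryption command above and `run_enc("somefile.pdf.aes", "somefile.pdf", decrypt=True)` the decryption one.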

Backup

Backing up scanned documents is no different than backing up photos (apart from the encryption requirements), so I will just link to my previous essay on the subject or my current backup scheme. In addition to my external Firewire hard drive rotation scheme, I have a script that does an incremental encryption of modified files using OpenSSL, and then uploads the encrypted files to my office computer using rsync.
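The script itself is not shown here, but the general shape of such an incremental scheme can be sketched in Python as follows. Everything in it is an assumption made for illustration: the directory layout, the .aes suffix, the SCAN_PASSWORD environment variable, the rsync options and the remote destination are not taken from the actual script.

```python
import os
import subprocess


def files_to_encrypt(plain_dir, encrypted_dir, suffix=".aes"):
    """Return (source, target) pairs for files that are new, or modified
    since their encrypted copy was last written."""
    pending = []
    for root, _dirs, files in os.walk(plain_dir):
        for name in sorted(files):
            src = os.path.join(root, name)
            rel = os.path.relpath(src, plain_dir)
            dst = os.path.join(encrypted_dir, rel + suffix)
            # Encrypt when there is no encrypted copy yet, or it is older.
            if (not os.path.exists(dst)
                    or os.path.getmtime(dst) < os.path.getmtime(src)):
                pending.append((src, dst))
    return pending


def rsync_command(encrypted_dir, remote):
    """Build an rsync command that mirrors the encrypted tree to a remote."""
    # The trailing separator makes rsync copy the directory's contents.
    return ["rsync", "-az", os.path.join(encrypted_dir, ""), remote]


def backup(plain_dir, encrypted_dir, remote):
    """Encrypt what changed, then upload the encrypted tree."""
    for src, dst in files_to_encrypt(plain_dir, encrypted_dir):
        target_dir = os.path.dirname(dst)
        if target_dir:
            os.makedirs(target_dir, exist_ok=True)
        # Password comes from a hypothetical SCAN_PASSWORD environment variable.
        subprocess.run(["openssl", "enc", "-aes-256-cbc", "-in", src,
                        "-out", dst, "-pass", "env:SCAN_PASSWORD"], check=True)
    subprocess.run(rsync_command(encrypted_dir, remote), check=True)
```

Comparing modification times keeps the encryption pass incremental, and rsync then only has to transfer the encrypted files that actually changed.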

Retention period

I tend to agree with Tim Bray in that you shouldn’t bother erasing old files, as the minimal disk space savings are not worth the risk of making a mistake. As for paper documents, you should ask your accountant what retention policy you should adopt, but a default of 2 years should be sufficient (the documents that need more, such as tax returns, are in the “file traditionally” category, in any case).

Fax

The original question was about fax. OS X can be configured to receive faxes on a modem and email them to you as PDF attachments, at which point you can edit them in Acrobat and fax them back if required, without ever having to kill a tree with printouts. Windows has similar functionality. Of course, fax belongs in the dust-heap of history, along with clay tablets, but habits change surprisingly slowly.

Update (2006-08-26):

I recently upgraded my shredder to a Staples SPL-770M micro-cut shredder. The particles generated by the shredder are incredibly minute, much smaller than those of conventional home or office grade shredders, and it is also very quiet to boot.

Unfortunately, it isn’t able to shred an entire unopened junk mail envelope, and the micro-cut shredding action does not work very well if you feed it folded paper (the particles at the fold tend to cling as if knitted together). This unit is also more expensive than conventional shredders (but significantly cheaper than near mil-spec DIN level 5 shredders that are the nearest equivalent). Staples regularly has specials on them, however. Highly recommended.

Update (2007-04-12):

I recently upgraded my document scanner to a Fujitsu fi-5120C. The ScanSnap has a relatively poor paper feed mechanism, which often jams or double-feeds. Many reviews of the new S500M complain it also suffers from double-feeding. The 5120C is significantly more expensive but it has a much more reliable paper feed with hitherto high-end features like ultrasonic double-feed detection. You do need to buy ScanTango software to run it on the Mac, however.

Update (2009-01-21):

I moved recently, and realized I have never yet had to open one of those envelopes. From now on, all papers not required for legal reasons (e.g. tax documents) go straight to the shredder after scanning.

Update (2009-09-08):

The new ScanSnap 1500 has ultrasonic double-feed detection. I bought a copy of ABBYY FineReader Express for the Mac. It used to be only available as bundled software with certain scanners like recent ScanSnaps, or software packages like DEVONthink, but you can now buy it as a standalone utility. It is not full-featured, missing some of the more esoteric OCR functionality of the Windows version, batch capabilities and scripting, but works well, unlike the crash-prone ReadIRIS I had but seldom used.

Update (2009-09-22):

Xamance is a really interesting French startup. Their product, the Xambox, integrates a document scanner, document management software and a physical paper filing system. The system can tell you exactly where to find the paper original for a scanned document (“use box 2, third document after tab 7”). In other words, essentially the same filing system I suggest above, but systematically managed in a database for easy retrieval.

It is quite expensive, however, making it more of a solution for businesses. I have moved on and no longer need the safety blanket of keeping the originals, but I can easily see how a complete solution like this would be valuable for businesses that are required for compliance to keep originals, such as notaries, or even government public records offices.

Credit card receipt slips and business cards are problematic for a paperless workflow. They are prone to jam in scanners, have non-standard layouts so hunting for information takes more time than it should, and are usually so trivial you don’t really feel they are worth scanning in the first place. I just subscribed to the Shoeboxed service to manage mine. They take care of the scanning and of pouring the resulting data into a form that can be directly imported into personal finance or contact-management software. I don’t yet have sufficient experience with the service, but on paper at least it seems like a valuable service that will easily save me an hour a week.

Update (2011-01-13):

I finally broke down and upgraded to a ScanSnap S1500M (we have one at work, and it is indeed a major improvement over the older models). In theory this is a downgrade as the fi-5120C is a business scanner, whereas the S1500M is a consumer/SoHo model, but with some simple customization, the integrated software bundle makes for a much more streamlined workflow: put the paper in the hopper, press the button, that’s it. With the fi-5120C, I had to select the scan settings in ScanTango, scan, press the close button, select a filename, drag the file into ABBYY FineReader, select OCR options, click save, click to confirm I do want to overwrite the original file, then dismiss the scan detection window. One step vs. nine.

Update (2012-06-19):

For portable storage of the documents, I don’t bother with manually encrypting the files any more. The IronKey S200 is a far superior option: mil-spec security and hardware encryption, with tamper-resistant circuitry, potted for environment resistance and using SLC flash memory for speed. Sure, it’s expensive, but you get what you pay for (I tried to cut costs by getting the MLC D200, and ended up returning it because it is so slow as to be unusable).