Fazal Majid's low-intensity blog

Sporadic pontification

Fazal Fazal

Deep packet inspection rears it ugly head

Last Friday I started noticing error messages in my production environment. URLs were being mangled, two consecutive characters being replaced by 0x80 and 0x01 or 0x80 and 0x04, causing UTF-8 decode exceptions to be logged, as well as failures for the cryptographic hash function we use to secure our URLs. As a general principle, I take any such unexpected exceptions very seriously and started investigating them, one concern being that some of our custom C extensions to nginx could be responsible for data corruption under heavy load.

I ran snoop (a Solaris utility similar to tcpdump) on one of our production servers, and after combing through 180MB of packet traces with Wireshark, it turned out the data was being corrupted before even hitting our web servers. While it was a relief to find out our own infrastructure was not to blame, I still had to identify the culprit, e.g. whether our hosting provider’s switches, firewalls or load-balancers were to blame.

TCP has built-in checksums, so a malfunctioning switch working at layers 1–3 would not cause this problem, a corrupted packet would be dropped and resent, with a slight hit on performance but no errors. Thus the problem would need to be at a L4 or higher device such as a load balancer.

I added some extra logging and let it run over the weekend. After analyzing the data, it turns out the problem is very circumscribed (76 requests out of hundreds of millions), and all the affected IP addresses come from the same ISP, Singapore Telecom Magix (AS9506). The only plausible explanation is that SingTel is running some sort of deep packet inspection gear, and some of the DPI gateways have corrupt memory or software bugs, that are causing the data flowing through them to get corrupted,

Deep Packet Inspection is a scourge the general public is insufficiently aware of. At a high level, DPI gateways watch over your shoulder as you use the Internet. They decode the data packets passing through them, reconstruct unencrypted HTTP requests (in other words, spy on your browsing history). In their transparent proxy incarnation, they can rewrite the requests or responses. Verizon Wireless uses the technology to resize and recompress images or videos requested by smartphones. Back when I used to work for France Telecom (circa 1996-1999), vendors would regularly approach us to peddle their wares and how they would allow us to price-gouge our customers more effectively. Hardware has progressed dramatically since and a single Xeon processor is capable of inspecting at least 10 Gbps of data.

The whole premise of DPI and other snooping devices is profoundly repugnant to me as a former network engineer, on both moral and technical grounds. Any additional “bump in the wire” slows things down and is yet another potential point of failure, as shown by this incident, but the potential for abuse is the real concern. Not to mince words, the legitimate purposes for the technology, such as fighting cybercrime, are just rationalizations, it was really developed for purposes most people would consider abusive.

When I joined FT, I had to go to a Paris courthouse and swear a solemn oath to defend the privacy of our customers’ communications, and report any infringement of the same. DPI technology originates in spy agencies, and is much beloved of authoritarian governments. China uses the technology, combined with voice recognition, to drop calls at the merest mention of the word “protest”. The Ben Ali regime in Tunisia used it to snoop Facebook users’ authentication cookies. Singapore’s government has a well-demonstrated intolerance of criticism, and who knows what SingTel is doing with their defective gear? Western companies like Cisco were disgracefully eager to sell censorware to dictatorships, but those governments now have homegrown capabilities from the likes of Huawei.

For telco oligopolies, the endgame is to practice perfect price discrimination, e.g. charge you more for packets that carry a voice over IP call or a Netflix video on demand session that compete with the carriers’ own services. Telcos and cablecos cannot be permitted to use their stranglehold over public networks for what is essentially racketeering. Strowger invented the automatic telephone switch because the operator at his manual exchange would divert his calls to one of his competitors, her husband. Telcos, in their monopolistic arrogance, feel a sense of entitlement to all the value the network creates, even when they are not responsible, and want to reverse this. Letting them get away with it, as is consistently the case in the US, is a recipe for long-term economic stagnation.

What can we as the general public do to fight back? The telcos are one of the largest lobbies in Washington, and wireless spectrum auction fees are one of the crutches propping up Western budgets, so help is unlikely to come from the venal legislatures. The most practical option is to start using SSL and DNSSEC for everything. Google now offers an encrypted search option and Facebook has an option to use SSL for the entire session, not just for login.

Update (2012-10-16):

It seems Verizon also uses DPI to build marketing profiles on its users, i.e. categorizes you based on your browsing history and sells you to marketers. You can opt out, but the practice is deeply worrisome.

Hey Apple…

Some improvements you should consider:

  • Sync iPods, iPhones and iPads over WiFi. Cables are so twentieth century. Palm had bluetooth sync working ten years ago, and 802.11n has the same real-world speed as USB. You could then simply extend this to sync the device to the cloud instead of a specific computer.
  • Ditching DVDs to offer an OS reinstall USB flash drive on the new MacBook Airs and Pros is a good idea, but the stick is easy to misplace. How about soldering a read-only USB drive directly onto the motherboard so it can never be lost?
  • When someone enters an address in a Calendar entry on iOS, make it clickable and linked to the Maps app, the way addresses in Contacts are. Copying and pasting them manually is a drag.
  • Stop adding useless frills like “stationery” to Mail.app, and make the default chronological sort order switchable to “most recent on top”.
  • Add HDMI CEC support to the AppleTV. It would be nice to have a HDTV automatically switch over to the AppleTV’s HDMI input when you try to access it. Speaking of which, it would be nice to have an option to disable the audio out on HDMI, e.g. if you have a decent surround sound system connected to it over Toslink and don’t want the TV’s tinny speakers to kick in.

Is this a Google Street View car?

Update (2011-05-12): the answer is no, it’s a Navteq 3D mapping car with a LIDAR array. Thanks to Darrell Kresge for the clarification.

As I was walking to lunch today, I caught sight of this weird contraption, and had just enough presence of mind to grab a few snaps of it.

One strange feature is a spinning white cylinder inside the arm canted at a 45 degree angle.

It doesn’t look like any of the Google Street View vehicles captured before, nor does it have the Google markings. The Michigan license plate is a bit odd as well. A prototype, perhaps? Or is some other company is getting into this racket, perhaps Microsoft?

Changing the WordPress table prefix

This may be of use to people experiencing the dreaded “You do not have sufficient permissions to access this page.” message when trying to reach WordPress’ admin page, even when logging in as a proper administrator. WordPress embeds the table prefix in 4 different locations:

  • The wp-config.php file
  • The name of the tables
  • The name of user metadata keys
  • The name of blog options

Thus if you want to change the prefix, you have to:

  1. Edit wp-config.php to change the prefix
  2. Rename your tables
  3. Rename your user metadata
  4. Rename your site options

Missing steps 1 or 2 will cause WordPress to not find the tables, and it will go through the initial install process again.

Missing step 3 will cause the account to lose its roles, and thus not be authorized to do much besides read public posts.

Missing step 4 is more insidious, as it destroys the option wp_user_roles, the link between roles and capabilities, and thus even if your account is an administrator, it is no longer authorized for anything.

It feels quite clunky to embed the database prefix in column values, not just tables, just like WordPress’ insistence on converting relative links to absolute links. The former makes moving tables around (e.g. when consolidating multiple blogs on a single MySQL database) harder than necessary. The latter makes moving a blog around in a site’s URL hierarchy break internal links. I suppose there are security reasons underlying Automattic’s design choice, but security by obscurity of the WordPress table prefix is hardly a foolproof measure.

If you are renaming the tables, say, from the default prefix wp_ to foo_, the MySQL commands necessary for steps 2–4 would be the following:

UPDATE wp_usermeta SET meta_key=REPLACE(meta_key, 'wp_', 'foo_')
WHERE meta_key LIKE 'wp_%';
UPDATE wp_options SET option_name=REPLACE(option_name, 'wp_', 'foo_')
WHERE option_name LIKE 'wp_%';
ALTER TABLE wp_commentmeta RENAME TO foo_commentmeta;
ALTER TABLE wp_comments RENAME TO foo_comments;
ALTER TABLE wp_links RENAME TO foo_links;
ALTER TABLE wp_options RENAME TO foo_options;
ALTER TABLE wp_postmeta RENAME TO foo_postmeta;
ALTER TABLE wp_posts RENAME TO foo_posts;
ALTER TABLE wp_redirection_groups RENAME TO foo_redirection_groups;
ALTER TABLE wp_redirection_items RENAME TO foo_redirection_items;
ALTER TABLE wp_redirection_logs RENAME TO foo_redirection_logs;
ALTER TABLE wp_redirection_modules RENAME TO foo_redirection_modules;
ALTER TABLE wp_term_relationships RENAME TO foo_term_relationships;
ALTER TABLE wp_term_taxonomy RENAME TO foo_term_taxonomy;
ALTER TABLE wp_terms RENAME TO foo_terms;
ALTER TABLE wp_usermeta RENAME TO foo_usermeta;
ALTER TABLE wp_users RENAME TO foo_users;