Network

Deep packet inspection rears its ugly head

Last Friday I started noticing error messages in my production environment. URLs were being mangled, two consecutive characters being replaced by 0x80 and 0x01 or 0x80 and 0x04, causing UTF-8 decode exceptions to be logged, as well as failures for the cryptographic hash function we use to secure our URLs. As a general principle, I take any such unexpected exceptions very seriously and started investigating them, one concern being that some of our custom C extensions to nginx could be responsible for data corruption under heavy load.
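To illustrate why this kind of corruption trips both checks at once, here is a minimal Python sketch. The key, the path and the choice of HMAC-SHA256 are hypothetical stand-ins, not our actual URL-signing scheme; the only detail taken from the incident is the replacement of two consecutive bytes by 0x80 and 0x01.

```python
import hashlib
import hmac

# Hypothetical key and scheme, for illustration only.
SECRET_KEY = b"hypothetical-secret"


def sign(path: bytes) -> str:
    """Return a hex HMAC-SHA256 signature for a URL path."""
    return hmac.new(SECRET_KEY, path, hashlib.sha256).hexdigest()


def corrupt(data: bytes, offset: int, replacement: bytes = b"\x80\x01") -> bytes:
    """Overwrite len(replacement) bytes at offset, as the faulty device did."""
    return data[:offset] + replacement + data[offset + len(replacement):]


def is_valid_utf8(data: bytes) -> bool:
    """True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


original = b"/download/report.pdf"   # hypothetical signed path
signature = sign(original)
mangled = corrupt(original, 5)

# 0x80 is a UTF-8 continuation byte; after an ASCII byte it cannot be decoded.
print(is_valid_utf8(original), is_valid_utf8(mangled))
# Any change to the signed bytes invalidates the signature.
print(hmac.compare_digest(sign(mangled), signature))
```

The length of the request is unchanged, which is why the damage only shows up when something actually decodes or verifies the bytes.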

I ran snoop (a Solaris utility similar to tcpdump) on one of our production servers, and after combing through 180MB of packet traces with Wireshark, it turned out the data was being corrupted before even hitting our web servers. While it was a relief to find out our own infrastructure was not to blame, I still had to identify the culprit, e.g. whether our hosting provider’s switches, firewalls or load-balancers were to blame.

TCP has built-in checksums, so a malfunctioning switch working at layers 1–3 would not cause this problem: a corrupted packet would be dropped and resent, with a slight hit on performance but no errors. Thus the problem had to lie in a device operating at L4 or higher, such as a load balancer, which handles the TCP stream itself and sends the data on with a freshly computed checksum.
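The checksum argument can be sketched in a few lines of Python. This is the 16-bit ones' complement sum described in RFC 1071, applied to a hypothetical payload only; a real TCP checksum also covers the TCP header and a pseudo-header built from the IP addresses, which this sketch omits.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"          # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:           # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Hypothetical request line; the two bytes at offset 8 are overwritten
# with 0x80 0x01, the pattern seen in the packet traces.
payload = b"GET /download/report.pdf HTTP/1.1\r\n"
corrupted = payload[:8] + b"\x80\x01" + payload[10:]

print(hex(internet_checksum(payload)))
print(hex(internet_checksum(corrupted)))
```

The two values differ, so a receiver would discard the segment if the bytes had been flipped in transit by a device that left the checksum alone. Corrupted data arriving with a valid checksum points to something that rewrote the segment and recomputed the checksum afterwards.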

I added some extra logging and let it run over the weekend. After analyzing the data, it turned out the problem was very circumscribed (76 requests out of hundreds of millions), and all the affected IP addresses came from the same ISP, Singapore Telecom Magix (AS9506). The only plausible explanation is that SingTel is running some sort of deep packet inspection gear, and some of the DPI gateways have corrupt memory or software bugs that are causing the data flowing through them to get corrupted.

Deep Packet Inspection is a scourge the general public is insufficiently aware of. At a high level, DPI gateways watch over your shoulder as you use the Internet. They decode the data packets passing through them and reconstruct unencrypted HTTP requests (in other words, they spy on your browsing history). In their transparent proxy incarnation, they can rewrite the requests or responses. Verizon Wireless uses the technology to resize and recompress images or videos requested by smartphones. Back when I used to work for France Telecom (circa 1996-1999), vendors would regularly approach us to peddle their wares and explain how they would allow us to price-gouge our customers more effectively. Hardware has progressed dramatically since and a single Xeon processor is capable of inspecting at least 10 Gbps of data.

The whole premise of DPI and other snooping devices is profoundly repugnant to me as a former network engineer, on both moral and technical grounds. Any additional “bump in the wire” slows things down and is yet another potential point of failure, as shown by this incident, but the potential for abuse is the real concern. Not to mince words, the legitimate purposes for the technology, such as fighting cybercrime, are just rationalizations. It was really developed for purposes most people would consider abusive.

When I joined FT, I had to go to a Paris courthouse and swear a solemn oath to defend the privacy of our customers’ communications, and report any infringement of the same. DPI technology originates in spy agencies, and is much beloved of authoritarian governments. China uses the technology, combined with voice recognition, to drop calls at the merest mention of the word “protest”. The Ben Ali regime in Tunisia used it to snoop Facebook users’ authentication cookies. Singapore’s government has a well-demonstrated intolerance of criticism, and who knows what SingTel is doing with their defective gear? Western companies like Cisco were disgracefully eager to sell censorware to dictatorships, but those governments now have homegrown capabilities from the likes of Huawei.

For telco oligopolies, the endgame is to practice perfect price discrimination, e.g. charge you more for packets that carry a voice over IP call or a Netflix video on demand session that compete with the carriers’ own services. Telcos and cablecos cannot be permitted to use their stranglehold over public networks for what is essentially racketeering. Strowger invented the automatic telephone switch because the operator at his manual exchange would divert his calls to one of his competitors, her husband. Telcos, in their monopolistic arrogance, feel a sense of entitlement to all the value the network creates, even when they are not responsible, and want to reverse this. Letting them get away with it, as is consistently the case in the US, is a recipe for long-term economic stagnation.

What can we as the general public do to fight back? The telcos are one of the largest lobbies in Washington, and wireless spectrum auction fees are one of the crutches propping up Western budgets, so help is unlikely to come from the venal legislatures. The most practical option is to start using SSL and DNSSEC for everything. Google now offers an encrypted search option and Facebook has an option to use SSL for the entire session, not just for login.

Update (2012-10-16):

It seems Verizon also uses DPI to build marketing profiles on its users, i.e. categorizes you based on your browsing history and sells you to marketers. You can opt out, but the practice is deeply worrisome.

RapidSSL 1 – GoDaddy 0

My new company’s website uses SSL. I ordered an “extended validation” certificate from GoDaddy, instead of my usual CA, RapidSSL/GeoTrust, because GoDaddy’s EV certificates were cheap. EV certificates are security theater more than anything else, and I probably should not have bothered.

Immediately after switching from my earlier “snake oil” self-signed test certificate to the production certificate, I saw SSL errors on Google Chrome for Mac and Safari for Mac, i.e. the two browsers that use OS X’s built-in crypto and certificate store. I suppose I should have tested the certificate on another server before going live, but I trusted GoDaddy (they are my DNS registrars, and competent, if garish).

Big mistake.

I called their tech support hotline, which is incredibly grating because of the verbose phone tree that keeps trying to push add-ons (I guess it is consistent with the monstrosity that is their home page).

After a while, I got a first-level tech. He asked whether I saw the certificate error on Google Chrome for Windows. At that point, I was irate enough to use a four-letter word. Our customers are Android mobile app developers. A significant chunk of them use Macs, and almost none (less than 5%) use IE, so know-nothing “All the world is IE” demographics are not exactly applicable.

After about half an hour of getting the run-around and escalating to level 2, with my business partner Michael getting progressively more anxious in the background, the level 1 CSR told me the level 2 one could not reproduce the problem (I reproduced it on three different Macs in two different locations). I gave them an ultimatum: fix it within 10 minutes or I would switch. At this point, the L1 CSR told me he had exhausted all his options, but I could call their “RA” department, and offered to transfer me. Inevitably, the call transfer failed.

I dialed their SSL number, and in parallel started the certificate application process on RapidSSL. They offered a free competitive upgrade, I tried it, and within 3 minutes I had my fresh, new and functional certificate, valid for 3 years, all for free and in less time than it takes to listen to GoDaddy’s obnoxious phone tree (all about “we pride ourselves in customer service” and other Orwellian corporate babble).

I then called GoDaddy’s billing department to get a refund. Surprisingly, the process was very fast and smooth. I guess it is well-trod.

The moral of the story: GoDaddy—bad. RapidSSL—good.

Update (2012-08-26):

I switched my DNS business from GoDaddy to Gandi.net in December 2011 after Bob Parsons’ despicable elephant-hunting stunt.

Clueless SaaS providers can leave you with egg on your face

While cleaning out my spam folders, I noticed a disturbing trend: a number of the spam messages were sent to vendor-specific email addresses I had set up to communicate with Parallels, Joyent and Shoeboxed. As a security measure, I do not give my personal email address to vendors, only aliases. The email address I used in the past for Dell was dell@majid.fm, for instance (I now use a different domain). A few years back, I started receiving pornographic spam at that address, which led me to think either Dell had secretly adopted a radically new diversification plan, or that their customer database had been compromised. Needless to say, this did not reflect well on Dell. I canceled that alias and stopped dealing with Dell.

I contacted the support for the three vendors. Joyent got back to me, and said:

We have traced this back to a third-party provider that was used to distribute service notifications. We have been in contact with this service provider, and they have determined that subscriber email addresses of their clients were compromised. They have launched their own investigation, which is ongoing, and have also reached out to their local FBI office.

After some digging, I found some interesting posts. An email marketing company called iContact, which I had never heard of before, was the source of the compromise. They claim to be SAS-70 compliant, but of course like most bureaucratic certifications, SAS-70 is mostly security theater that makes sysadmins’ lives miserable for no meaningful security benefit (SAS-70 auditors, on the other hand, profit handsomely).

Just another example of how outsourcing critical functions to outside vendors can backfire spectacularly and take down your own reputation in the process.

Broken SPF records

I have SPF verification enabled on my mail server. While SPF is no panacea for the problem of spam, it is quite effective at ensuring spammers do not forge the sending address to impersonate someone else, and cause some poor innocent soul to receive in a boomerang effect the torrent of complaints hurled at them.

Unfortunately, far too many lame organizations (cough, Google) qualify their SPF record using an overly permissive ?all or ~all clause, which means they may have servers other than those listed, and thus their SPF record is useless for filtering purposes.
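As a rough illustration of the distinction, the Python sketch below reads the qualifier on the all mechanism of an SPF record and maps it to the result names used in RFC 7208. The records are made up (example.com and documentation address ranges), and the parser deliberately ignores include, redirect and macros, so treat it as a sketch rather than a full SPF evaluator.

```python
# Qualifier characters and the SPF results they produce (RFC 7208).
QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def all_qualifier(record: str):
    """Return the result name attached to the 'all' mechanism, or None."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        raise ValueError("not an SPF record")
    for term in terms[1:]:
        lowered = term.lower()
        if lowered == "all":          # no qualifier means "+"
            return "pass"
        if len(lowered) == 4 and lowered[0] in QUALIFIERS and lowered[1:] == "all":
            return QUALIFIERS[lowered[0]]
    return None


def usable_for_filtering(record: str) -> bool:
    """Only a hard '-all' tells receivers to reject unlisted senders."""
    return all_qualifier(record) == "fail"


# Hypothetical records for illustration.
strict = "v=spf1 ip4:192.0.2.0/24 -all"
soft = "v=spf1 include:_spf.example.com ~all"
neutral = "v=spf1 mx ?all"

for record in (strict, soft, neutral):
    print(all_qualifier(record), usable_for_filtering(record))
```

Only the first record gives a receiving server grounds to reject mail from an unlisted address outright.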

In the last month, I noticed the opposite problem: I did not receive emails from Eurostar and BookMooch because their SPF records did not list the mail servers they actually use. If they are not clueful enough to manage a simple list of IP addresses, or to exercise basic change management discipline, they should do us all a favor and ditch the SPF record they clearly are incapable of maintaining.
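The failure mode is easy to reproduce offline. The following Python sketch checks a sending address against the ip4 and ip6 mechanisms of a record using the standard ipaddress module. The record and addresses are hypothetical documentation ranges, not the ones Eurostar or BookMooch published, and mechanisms that need DNS lookups (a, mx, include) are simply skipped.

```python
import ipaddress


def listed_in_ip_mechanisms(record: str, sender_ip: str) -> bool:
    """True if sender_ip falls inside any ip4:/ip6: network in the record."""
    sender = ipaddress.ip_address(sender_ip)
    for term in record.split()[1:]:          # skip the leading "v=spf1"
        term = term.lstrip("+-~?")           # drop an optional qualifier
        name, _, value = term.partition(":")
        if name.lower() in ("ip4", "ip6") and value:
            network = ipaddress.ip_network(value, strict=False)
            if sender.version == network.version and sender in network:
                return True
    return False


# Hypothetical record: the listed range no longer matches the real mail server.
record = "v=spf1 ip4:192.0.2.0/24 ip6:2001:db8::/32 -all"

print(listed_in_ip_mechanisms(record, "192.0.2.10"))    # listed server
print(listed_in_ip_mechanisms(record, "198.51.100.7"))  # new, unlisted server
```

With a -all record, mail from the second address is exactly what a server like mine rejects.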