Funding the vetting of the Software Supply-Chain

TL;DR: A way out of our software supply-chain security mess

As memorably illustrated by XKCD, the way most software is built today is by bolting together reusable software packages (dependencies) with a thin layer of app-specific integration code that glues it all together. Others have described more eloquently than I can the mess we are in, and the technical issues.

Crises like the log4j fiasco or the Solarwinds debacle are forcing the community to wake up to something security experts have been warning about for decades: this culture of promiscuous and undiscriminating code reuse is unsustainable. On the other hand, for most software developers without the resources of a Google or Apple behind them, being able to leverage third-parties for 80% of their code is too big an advantage to abandon.

This is fundamentally an economic problem:

  • To secure a software project to commercial standards (i.e. not the standards required for software that operates a nuclear power plant or the NSA’s classified systems, or that requires validation by formal methods like TLA+), some form of vetting and code reviews of each software dependency (and its own dependencies, and the transitive closure thereof) needs to happen.
  • Those code reviews are necessary but difficult, boring and labor-intensive; they require expertise, and somebody needs to pay for that hard work.
  • We cannot rely entirely on charitable contributions like Google’s Project Zero or volunteer efforts.
  • Each version of a dependency needs to be reviewed. Just because version 11 of foo is secure doesn’t mean a bug or backdoor wasn’t introduced in version 12. On the other hand, reviewing changes takes less effort than the initial review.
  • It makes no sense for every project that consumes a dependency to conduct its own duplicative independent code review.
  • Securing software is a public good, but there is a free-rider problem.
  • Because security is involved, there will be bad actors trying to actively subvert the system, and any solution needs to be robust to this.
  • This is too important to allow a private company to monopolize.
  • It is not just the Software Bill of Materials that needs to be vetted, but also the process. Solarwinds was probably breached because state-sponsored hackers compromised their Continuous Integration infrastructure, and there is Ken Thompson’s classic paper on the risks of Trusting Trust (original ACM article as a PDF).
  • Trust depends on the consumer and the context. I may trust Google on security, but I certainly don’t on privacy.

I believe the solution will come out of insurance, because that is the way modern societies handle diffuse risks. Cybersecurity insurance suffers from the same adverse-selection risk that health insurance does, which is why premiums are rising and coverage shrinking.

If insurers require companies to provide evidence that their software is reasonably secure, that creates a market-based mechanism to fund the vetting. This is how product safety is handled in the real world, with independent organizations like Underwriters Laboratories or the German TÜVs emerging to provide testing services.

Governments can ditch their current hand-wavy and unfocused efforts and push for the emergence of these solutions, notably by long-overdue legislation on software liability, and at a minimum use their purchasing power to make them table stakes for government contracts (without penalizing open-source solutions, of course).

What we need is, at a minimum:

  • Standards that will allow organizations like UL or individuals like Tavis Ormandy to make attestations about specific versions of dependencies.
  • These attestations need to have licensing terms associated with them, so the hard work is compensated. Possibly something like copyright or Creative Commons so open-source projects can use them for free but commercial enterprises have to pay.
  • Providers of trust metrics to assess review providers. Ideally this would be integrated with SBOM standards like CycloneDX, SPDX or SWID.
  • A marketplace that allows consumers of dependencies to request audits of a version that isn’t already covered.
  • A collusion-resistant way to ensure there are multiple independent reviews for critical components.
  • Automated tools to perform code reviews at lower cost, possibly using Machine Learning heuristics, even if the general problem can be proven to be computationally intractable.
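The attestation standard proposed above can be sketched in a few lines. The record format, field names and reviewer key below are hypothetical illustrations, and the stdlib HMAC is only a stand-in for a real asymmetric signature scheme (e.g. Ed25519, as used by projects like Sigstore); the point is that a verdict must be cryptographically bound to the exact artifact bytes, not just a package name and version:

```python
import hashlib
import hmac
import json

# Hypothetical reviewer signing key. A production scheme would use an
# asymmetric signature so consumers need only the reviewer's public key;
# HMAC keeps this sketch dependency-free.
REVIEWER_KEY = b"reviewer-secret-key"

def make_attestation(package, version, artifact_bytes, verdict):
    """Bind a review verdict to the exact artifact bytes, not just a name."""
    record = {
        "package": package,
        "version": version,
        "artifact_sha256": hashlib.sha256(artifact_bytes).hexdigest(),
        "verdict": verdict,  # e.g. "commercial-grade review, no issues found"
    }
    payload = json.dumps(record, sort_keys=True).encode()
    record["signature"] = hmac.new(REVIEWER_KEY, payload, hashlib.sha256).hexdigest()
    return record

def verify_attestation(record, artifact_bytes):
    """A consumer re-hashes the artifact it actually downloaded, then checks the signature."""
    if hashlib.sha256(artifact_bytes).hexdigest() != record["artifact_sha256"]:
        return False  # artifact was swapped after the review
    unsigned = {k: v for k, v in record.items() if k != "signature"}
    payload = json.dumps(unsigned, sort_keys=True).encode()
    expected = hmac.new(REVIEWER_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, record["signature"])

artifact = b"pretend this is the foo-12.0 tarball"
att = make_attestation("foo", "12.0", artifact, "reviewed, no issues found")
assert verify_attestation(att, artifact)
assert not verify_attestation(att, b"tampered tarball")
```

Because each attestation covers one (package, version, artifact hash) tuple, reviewing version 12 means issuing a new record, which is exactly the per-version granularity argued for above.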

Externalities again

I just wasted half an hour of my life on the phone with my credit card company’s fraud department, as someone attempted to buy expensive tickets from an airline in Panama. Most likely my card number was compromised by Target, although it could also be due to Adobe.

It is actually surprising such breaches do not occur on a daily basis: the people bearing the costs of a compromise (the card holder, defrauded merchants, and their credit card companies via the cost of operating fraud departments) are not the same as those who would pay for the security measures that could have prevented the breach, a textbook example of what economists call an externality. There are reputational costs to a business that suffers a major security breach, but breaches occur so often that consumers are becoming numb to them.

Many states have mandatory breach disclosure laws, following California’s example. It is time for legislatures to take the next step and impose statutory damages for data breaches, e.g. $100 per compromised credit card number, $1000 per compromised social security number, and so on. In Target’s case, 40 million compromised credit cards multiplied by $100 would mean $4 billion in damages. That would make management take notice and stop paying mere lip service to security. It might also jump-start the long overdue migration to EMV chip-and-PIN cards in the United States.
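The statutory-damages arithmetic is straightforward to compute. A minimal sketch, where the per-record amounts are the illustrative figures proposed above, not existing law:

```python
# Proposed per-record statutory damages (illustrative figures, not law).
DAMAGES = {"credit_card": 100, "ssn": 1000}

def statutory_damages(counts):
    """Total damages for a breach, given counts of each compromised record type."""
    return sum(DAMAGES[kind] * n for kind, n in counts.items())

# The Target example: 40 million compromised credit card numbers.
total = statutory_damages({"credit_card": 40_000_000})
assert total == 4_000_000_000  # $4 billion, as in the text
```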

The Gresham’s law of Amazon Web Services

In the bad (good?) old days when a currency’s worth was established by the amount of gold or silver in its coinage, kings would cut corners by debasing the currency with lead, which is almost as dense as gold or silver. In the New World, counterfeiters debased gold coins with platinum, which was first smelted by pre-Columbian civilizations. Needless to say, the fakes are now worth more than the originals.

The public was not fooled, however, and found ways to test coins for purity, including folkloric ones like biting a coin to see if it is made of malleable gold, rather than harder metals. People would then hoard pure gold coins, and try to rid themselves of debased coins at the earliest opportunity. This led to Gresham’s Law: bad money drives out good money in circulation.

After a year of using Amazon Web Services’ EC2 service at scale for my company (we moved to our own servers at the end of 2011), I conjecture there is a Gresham’s Law of Amazon EC2 instances: bad instances drive out good ones. Let me elaborate:

Amazon EC2 is a good way to launch a service for a startup, without incurring heavy capital expenditures when getting started and prior to securing funding. Unfortunately, EC2 is not a quality service. Instances are unreliable (across the more than 80 instances we ran at Amazon, we saw at least one instance failure a week, and sometimes up to four). Amazon instances have poor disk I/O performance that makes them particularly unsuitable for hosting non-trivial databases (EBS is even worse, and notoriously unreliable).

Performance is also inconsistent—I routinely observed “runt” m1.large instances that performed half as well as the others. We experienced all sorts of failure modes, including disk corruptions, disks that would block forever without timing out, sporadic losses of network connectivity, and many more. Even more puzzling, I would get 50% to 70% failure rate on new instances that would not come up cleanly after being launched.

Some of this is probably due to the fact we use an uncommon OS, OpenSolaris, that is barely supported on EC2, but I suspect a big part of it is that Amazon uses low-end commodity parts and does not proactively retire failed or flaky hardware from service. Instances that have the bad luck of being assigned to flaky hardware are more likely to fail or perform poorly, and thus more likely to be destroyed and released, with a new instance assigned to the same slot. The inevitable consequence is that new instances have a higher likelihood of being runts or otherwise defective than long-running ones.

One work-around is to spin up a large number of instances, test them, and destroy the poor-performing ones. AWS runts are usually correlated with slower CPU clock speeds, as older machines run older versions of the Xen hypervisor Amazon uses under the hood, and have less cache, slower drives and so on. Iterating through virtual machines as if picking melons at a supermarket is a slow and painful job, however, and even the newer machines have their share of runts. We were trying to keep only machines with 2.6 or 2.66GHz processors, but more than 70% of the instances we were assigned were 2.2GHz runts, and it took 5 or 6 instance launches on average to get a non-runt.
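The melon-picking loop above can be sketched as follows. The `launch_instance` function is a stand-in for the real provisioning call (e.g. boto's EC2 API plus a remote read of /proc/cpuinfo); here it simulates the 70% runt rate we observed, so the names and figures are illustrative, not an actual AWS client:

```python
import random

# Observed runt rate on m1.large in our experience; illustrative only.
RUNT_RATE = 0.7

def launch_instance():
    """Simulate launching an instance and reporting its CPU clock in GHz.

    A real implementation would call the EC2 API, wait for the instance
    to boot, and benchmark it (or read its advertised clock speed).
    """
    return 2.2 if random.random() < RUNT_RATE else random.choice([2.6, 2.66])

def provision_non_runts(needed, min_ghz=2.6, max_attempts=200):
    """Launch, benchmark, keep the fast instances, terminate the runts."""
    kept, discarded = [], 0
    for _ in range(max_attempts):
        if len(kept) >= needed:
            break
        ghz = launch_instance()
        if ghz >= min_ghz:
            kept.append(ghz)
        else:
            discarded += 1  # terminate the runt and try again
    return kept, discarded

random.seed(0)
kept, discarded = provision_non_runts(needed=10)
print(f"kept {len(kept)} instances, terminated {discarded} runts")
```

Note that every discarded runt still costs a partial instance-hour and operator time, which is why this strategy only mitigates, rather than fixes, the underlying quality problem.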

In the end, we migrated to our own facility in colo, because Amazon’s costs, reliability and performance were just not acceptable, and we had long passed the threshold beyond which it is cheaper to own than rent (I estimate it at $5,000 to $10,000 per month of Amazon spend, depending on your workload). It is not as if other cloud providers are any better—before Amazon we had started on Joyent, which supports OpenSolaris natively, and their MTBF was on the order of 2 weeks, apparently because they had replaced their original Sun hardware with substandard Dell servers and had issues with power-management C-states in the Dell server BIOS.

The dirty secret of cloud computing is that there is no reliable source of information on the actual performance and reliability of cloud services. This brings to mind another economic concept, from George Akerlof’s famous paper on the market for lemons: in a market with information asymmetry, the market will eventually collapse in the absence of guarantees. Until Amazon and others offer SLAs with teeth, you should remain skeptical about their ability to deliver on their promises.

What is heard, and what is not heard

French economist Frédéric Bastiat (1801–1850) wrote a pamphlet titled Ce qu’on voit et ce qu’on ne voit pas (What is Seen and What is Not Seen) where he demolishes the make-work fallacy in economics. When Jacques Bonhomme’s child breaks his window, paying for a replacement will circulate money in the economy, and stimulate the glassmakers’ trade. This is the visible effect. Bastiat urges us to consider what is not seen, i.e. opportunity costs, such as other, more productive uses for the money that are forgone due to the unexpected expense. This lesson is still relevant. The cost of repairing New Orleans after Katrina, or cleaning the Gulf after Deepwater Horizon, will cause a temporary boost in GDP statistics, but this is illusory and undesirable, another example of how poorly conceived metrics can distort thinking.

Another example is that of electric cars. Advocates for the blind have raised a ruckus about the dangers to blind people from quiet electric cars they cannot hear or dodge. Nissan just announced that their Leaf electric car will include a speaker and deliberately generate noise, in part to comply with the Japanese Transport Ministry’s requirements. To add injury to insult, the sound selected is apparently a sweeping sine wave, a type of sound that is incredibly grating compared to more natural sounds, including that of machinery.

Unfortunately, this illustrates the fallacy Bastiat pointed out. Authorities are focusing on the visible (well, inaudible) first-order effect, but what is not seen matters just as much. Most urban noise stems from transportation, and that noise pollution has a major adverse impact on stress levels and sleep, and causes high blood pressure and cardiac problems in children, adults and the elderly alike. According to the WHO, for 2006 in the UK alone, an estimated 3,000+ deaths due to heart attacks can be attributed to noise pollution (out of 100,000+).

These figures are mind-boggling. For a country the size of the US, that works out to roughly five or six 9/11 death tolls per year. Quiet electric cars should be hailed as a blessing, not a danger. There are other ways to address the legitimate concerns of the blind, e.g. by mandating transponders on cars and providing receivers to the blind.

Why do voters put up with bad politicians?

As a foreigner living in San Francisco for the last ten years, I never cease to be baffled by US voters’ tendency to vote for candidates who are clearly class warriors on the side of the rich and other influential special interests. Political scientists have long wondered why people vote against their own best interests, e.g. Americans voting for candidates beholden to health “care” provider lobbies and who hew to the status quo, saddling the US with grotesquely overpriced yet substandard health care. Another example would be the repulsive coddling of an increasingly brazen Wall Street kleptocracy.

Ideology cannot explain it all. Certainly, some people will put principle ahead of their pocketbook and vote for candidates that uphold their idea of moral values even if they simultaneously vote for economic measures that hurt their electorate. That said, there is nothing preventing a political candidate from adopting simultaneously socially conservative positions and economic policies that favor a safety net, what in Europe would be called Christian Democrats.

Media propaganda and brainwashing cannot explain it either; to believe so, as conspiracy theorists on both the right and the left of the US political spectrum do, is to seriously underestimate the intelligence (and cynicism) of the electorate. In a mostly democratic country like the United States, special interests can only prevail when the general population is apathetic, or at least consents to the status quo.

I believe the answer lies in loss aversion, the mental bias that causes people to fear a loss far more than they desire a gain. Our brains did not evolve in a way that favors strict rationality. Most people’s intuition about probability and statistics is unreliable and misleading—we tend to overestimate the frequency of rare events. The middle class, which holds a majority of votes, will tend to oppose measures that expose it to the risk of being pulled down by lower classes even if the same measures would allow them upward mobility into the upper classes. The upper class exploits this asymmetry to maintain its privileges, be they obscene taxpayer-funded bonuses for bankers who bankrupted their banks, or oligopoly rent-seeking by the medical profession.