joepie91's Ramblings

CloudFlare, We Have A Problem

14 Jul 2016

For the past few years, CloudFlare has been steadily gaining popularity - being used by a staggering number of websites, big and small. One of their frequently repeated claims to fame is that they "make web properties faster and safer".

I disagree.

In reality, CloudFlare has been systematically making the web less secure during these years. And they are incredibly good at selling that as a feature.

The Solution To No Problems

Back in 2011, when I ran AnonNews.org, I had to cope with frequent DDoS attacks - not all that surprising, given that it was a very popular news site and community for Anonymous, which was seeing the peak of its media coverage at the time. In 2011, however, it was pretty much impossible to get working DDoS mitigation for less than $100 a month, and that was simply not a budget I had to spend on it.

I eventually ran across CloudFlare, and - despite it not advertising DDoS mitigation anywhere at the time - I realized that with it being essentially a reverse proxy on beefy infrastructure, it would make for a useful pincushion against most DDoS attacks. And it did - it got in the way of many attacks, saved me some traffic as a bonus, and was overall a good solution to the problem at the time, even if it wasn't "real" DDoS mitigation.

Fast-forward to today, in 2016. It's not so clear anymore whether CloudFlare really solves any problems. Single-homed bandwidth can be had for $0.35/TB, DDoS mitigation services are plentiful and sometimes even provided by default, and the web is generally Fast Enough. Of course this doesn't stop CloudFlare from marketing to AWS customers - who are still grossly overpaying for bandwidth - or simply to those who are not aware of the changes in the hosting landscape.

Essentially, there's not really a reason to use CloudFlare anymore, and the majority of sites won't see any real benefit from it at all. I'll go into the alternatives further down the article, but I want to address some of the problems that CloudFlare introduces first.

Encryption? What's That?

A big issue with CloudFlare today is their "Universal SSL" feature. Hailed as a way to "support SSL connections to every CloudFlare customer", it actually does the very thing that SSL/TLS is meant to prevent. One instance of this occurred today, when visitors of The Pirate Bay suddenly started seeing network blocking messages from Airtel. This wasn't a compromise of CloudFlare; rather, the connection between CloudFlare and the real TPB servers wasn't encrypted, and so CloudFlare's network provider could intercept and mess with the traffic.

This is a simplified schematic illustrating what happened:

The reason this could happen is CloudFlare's "Flexible SSL" option, which is one of the modes in which "Universal SSL" can operate. In this mode, the TLS connection only runs from the end user to CloudFlare, and is plaintext beyond that point.

See the problem?

To the end user, it will look like they are using TLS, and that their connection is secure all the way to the site they are trying to reach - after all, that's how TLS is meant to work, right? In reality, anybody on the path between CloudFlare and the "origin server" could still have tampered with the traffic. The user is presented with a false sense of security. This breaks the TLS model, and is extremely dangerous; users will behave more carelessly because they believe they are being protected, resulting in a greater compromise.
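To make the gap concrete, here's a minimal sketch in Python that models the path as a list of hops and picks out the ones that travel in plaintext. The `Hop` class and the hop names are made up for illustration; this is a toy model of the setup described above, not anything CloudFlare actually exposes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    """One leg of the path a request travels (hypothetical model)."""
    source: str
    destination: str
    encrypted: bool


def plaintext_hops(path):
    """Return the hops on which the network in between can read the traffic."""
    return [hop for hop in path if not hop.encrypted]


# The path in "Flexible SSL" mode, as described above.
flexible_ssl = [
    Hop("visitor", "CloudFlare", encrypted=True),
    Hop("CloudFlare", "origin server", encrypted=False),
]

# What the padlock in the browser leads the visitor to assume.
assumed_by_visitor = [
    Hop("visitor", "origin server", encrypted=True),
]

for hop in plaintext_hops(flexible_ssl):
    print(f"plaintext: {hop.source} -> {hop.destination}")
```

The visitor's browser can only ever observe the first hop, which is why the second one goes unnoticed.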

The issue is made even worse by CloudFlare claiming that "there are no security flaws on its side". Apparently, they don't consider "complete compromise of the SSL/TLS trust model" a security flaw. Right.

But let's pretend that CloudFlare realizes that Flexible SSL was a mistake, and removes the option. They'd then require TLS between CloudFlare servers and the origin server as well. While this solves the specific problem of other ISPs meddling with the connection, it leaves a bigger problem unsolved: the fact that CloudFlare itself acts as an MITM (man-in-the-middle). By the very definition of how their system works, they must decrypt and then re-encrypt all traffic, meaning they will always be able to see all the traffic on your site, no matter what you do.

This may not sound that bad - after all, they're just a service provider, right? - but let's put this in context for a moment. Currently, CloudFlare essentially controls 11% of the 10k biggest websites, over 8% of the 100k biggest websites (source), and almost 5% of sites on the entire web (source). According to their own numbers from 2012(!), they had more traffic than several of the most popular sites and services on earth combined, and almost half the traffic of Facebook. It has only grown since. And unlike every other backbone provider and mitigation provider, they can read your traffic in plaintext, TLS or not.

Could you claim with a straight face that all this intercepted data isn't used by intelligence agencies, whether with CloudFlare's cooperation or not? It would be the perfect intelligence source, and the only way to have a guarantee that target sites will never start encrypting the data - after all, that's what they're expecting the service to do for them!

And what if somebody wanted to serve malware? What better place to do that than injected directly into potentially billions of sites, without any cross-domain restrictions whatsoever?

And all of this is completely inevitable, because CloudFlare's very business model is based on the ability to intercept and read HTTP traffic. They can't offer any of their services without it.

Which brings us to...

Packets, Please!

CloudFlare is frequently hailed as a "free DDoS mitigation provider", ever since they started marketing themselves as such. The reality is very different; you won't get any actual DDoS mitigation, even if you pay $200/month for their Business plan.

Traditional DDoS mitigation services work by analyzing the packets coming in, spotting unusual patterns, and (temporarily) blocking the origin of that traffic. They never need to know what the traffic contains, they only need to care about the patterns in which it is received. This means that you can tunnel TLS-encrypted traffic through a DDoS mitigation service just fine, without the mitigation service ever seeing the plaintext traffic... and you're still protected.
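As a rough illustration of why the payload never needs to be readable, here's a toy sliding-window detector in Python. It only ever sees a source address and an arrival time. The class name, the thresholds and the addresses (taken from the reserved documentation ranges) are all assumptions for the sake of the example; real mitigation gear looks at far more signals than a simple packet rate.

```python
from collections import defaultdict, deque


class RateDetector:
    """Flags sources that send more than `limit` packets within `window` seconds.

    Only metadata (source address, arrival time) is inspected. The payload is
    never passed in, so it can stay TLS-encrypted end to end.
    """

    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        self.arrivals = defaultdict(deque)

    def observe(self, source, timestamp):
        """Record one packet and return True if `source` exceeds the limit."""
        times = self.arrivals[source]
        times.append(timestamp)
        # Drop arrivals that have fallen out of the sliding window.
        while times and times[0] <= timestamp - self.window:
            times.popleft()
        return len(times) > self.limit


# Hypothetical traffic: one source sends 500 packets in half a second,
# another sends one packet per second.
detector = RateDetector(limit=100, window=1.0)
flood = [detector.observe("198.51.100.7", t / 1000) for t in range(500)]
normal = [detector.observe("203.0.113.9", float(t)) for t in range(5)]

print("flood flagged:", any(flood))
print("normal flagged:", any(normal))
```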

In contrast, CloudFlare is just a reverse proxy with a very fast connection. Layer 3/4 attacks (those aimed at the underlying network infrastructure, rather than the application or protocol itself) will only ever reach up to the point where it's handled by a server rather than just passed through, and in a "reverse proxy"-type setup, that server is CloudFlare. They're not actually mitigating anything, it just so happens that they are the other side of the connection and thus "take the hit"!

This is also why CloudFlare only supports HTTP(S), and not other protocols - they never actually pass through any traffic. Their servers will make a request to your site on the behalf of a visitor, and forward the response, after potentially modifying it first. They would have to write a custom reverse proxy for each protocol to support.
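Stripped down to its essence, that request flow looks something like the Python sketch below. The function names, the fake origin and the injected script are all hypothetical, and this is in no way CloudFlare's actual code; it only shows that the proxy terminates the visitor's request, makes a separate request of its own, and holds the plaintext response in its hands before anything goes back.

```python
def handle_visitor_request(path, fetch_from_origin, modify_response=None):
    """Serve one request the way a reverse proxy does (illustrative only)."""
    # 1. The visitor's connection ends here; the proxy has the plaintext request.
    # 2. The proxy makes its own, separate request to the origin.
    body = fetch_from_origin(path)
    # 3. The proxy may rewrite the response before sending it on.
    if modify_response is not None:
        body = modify_response(body)
    return body


def origin(path):
    """Stand-in for the real site (hypothetical)."""
    return f"<html><body>page at {path}</body></html>"


def inject_script(body):
    """Example of a modification made in transit (hypothetical)."""
    return body.replace("</body>", "<script src='/injected.js'></script></body>")


print(handle_visitor_request("/login", origin, inject_script))
```

Nothing here is specific to HTTP as such, but every step needs the proxy to understand the protocol it's speaking, which is why a different protocol would need a different proxy.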

At this point, you might be wondering "well okay, I get that, but why should I care as long as it protects my site?", and the answer to that would be: because it doesn't. You can't protect the rest of your infrastructure (mailservers, chat servers, gameservers, and so on), and even for your web-based services, CloudFlare will kick you off the Free and Pro plans if you get attacked too much and they can figure out that you are the target.

In other words: unless you pay them $200/month, they won't provide any protection that you wouldn't already have anyway. And if you do pay them $200/month, you'll get half-functional protection for a single protocol on a single domain, with all your users being completely exposed to CloudFlare and whatever other organizations might obtain access to their traffic or servers. As you'll see below, this is a pretty shitty deal, and there are far better options today.

Oh, and about that "I'm Under Attack" mode that you get on the Free plan as well? Yeah, well, it doesn't work. But don't take my word for it - here's proof. That code will solve the 'challenge' that it presents to your browser, in a matter of milliseconds. Any attacker can trivially do this. And the challenge can't be made more difficult, because it would make it prohibitively expensive for mobile and embedded devices to use anything hosted at CloudFlare.

But while it doesn't stop attackers, it does stop legitimate users. Which brings us to...

You Shall Not Pass

See, the "I'm Under Attack" system imposes some problems. By its very definition, it requires that you have JavaScript enabled to be able to view a site - note that I'm not talking about the CAPTCHA page here, but about the "Checking your browser..." page.

It's quite frequently claimed that "oh well, everybody has JS anyway", but this is simply not true. Eevee has written an excellent article about this problem, including many examples where this assumption doesn't hold true.

But I want to address an issue that I've had specifically with CloudFlare's "I'm Under Attack" mode. I'm involved in ArchiveTeam, essentially a loose collective of archivists who try to preserve culture and knowledge on an ever-rotting web - the rotting usually being a result of service providers throwing away user data on two weeks' notice because it's no longer profitable to them, not really caring about the consequences for the users.

One of the services that ArchiveTeam operates is ArchiveBot, essentially an IRC bot that archives whatever is thrown at it, and adds it to the Internet Archive. You can kind of think of it as an on-demand, public service Wayback Machine. To be able to do this, it needs to access websites - not as a browser, but just as a plain HTTP client - and spider their content. Somewhat predictably, ArchiveBot has very limited support for JavaScript.

Indeed, it is essentially impossible to archive something that's in "I'm Under Attack" mode, despite that usually being the exact moment where archival is necessary!

I've been told that ArchiveBot can be added to the internal whitelist that CloudFlare has, but this completely misses the point. Why do I or anybody else need to talk to a centralized gatekeeper to be able to access content on the web, especially if there might be any number of such gatekeepers? This kind of approach defeats the very point of the web and how it was designed!

And for a volunteer-run organization like ArchiveTeam, it's far trickier to implement support for these "challenge schemes" than it is for a botnet operator, who stands to profit from it. That problem only becomes worse as more services start implementing these kinds of schemes, and often it takes a while for people to notice that their requests are being blocked - sometimes losing important information in the process.

Some might argue that these kinds of archival bots are precisely what CloudFlare is meant to protect against, but that's not really true. If that were the case, why would there be an offer to add ArchiveBot to the whitelist to begin with? Why would the Wayback Machine be on that very same whitelist?

Speaking of which, perhaps you're using CloudFlare because of their blocking of spambots. Apart from the fact that blacklists for this are freely available and don't require sending your traffic through a centralized middleman, it's also a completely misguided approach. It's based entirely on the premise of "malicious IPs", but there is no such thing.

IP addresses change hands frequently, can be shared by tens of thousands of people, and can be reassigned to a different household 10 minutes later. In reality, there are only malicious clients and malicious users, and trying to identify them by IP will lead to a lot of false positives, and not just on Tor.

The effective way to deal with malicious clients and users isn't to block "known-bad IPs" - because again, those do not exist, and there's no correlation to clients or users. It's to detect patterns of abusive behaviour, and to encourage the behaviour that you desire. Blocking IPs is akin to banning trucks from the freeway - sure, you've reduced the number of truck-on-car collisions to zero, but was the loss of commercial transport really worth it?
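What "detecting patterns of behaviour" looks like depends entirely on your service, but as one possible sketch in Python: hold posts for review based on what an account does, without looking at its IP address at all. The `Account` fields, the seven-day cutoff and the link threshold are invented for this example, and you'd want to tune or replace them for a real site.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    """Hypothetical account record; only behaviour is stored, no IP address."""
    name: str
    age_days: int
    recent_posts: list = field(default_factory=list)  # post bodies, last hour


def needs_moderation(account, max_links_for_new_accounts=2):
    """Hold posts for review when a brand-new account is mostly posting links."""
    links = sum(
        post.count("http://") + post.count("https://")
        for post in account.recent_posts
    )
    is_new = account.age_days < 7
    return is_new and links > max_links_for_new_accounts


link_posts = ["buy at http://a.example", "also https://b.example and https://c.example"]
spammer = Account("new_spammer", age_days=0, recent_posts=link_posts)
regular = Account("regular", age_days=400, recent_posts=link_posts)
newcomer = Account("newcomer", age_days=1, recent_posts=["hello everyone"])

for account in (spammer, regular, newcomer):
    print(account.name, needs_moderation(account))
```

A legitimate newcomer who shares an address with the spammer is unaffected, and the spammer gains nothing by switching addresses.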

As somebody who has run various high-risk services over the years, attracting a lot of targeted abuse, I can confidently say that IP blocking is never necessary and rarely effective.

But The Speed! The Speed!

CloudFlare's original mass-market selling point: performance. Route your traffic through us, and everything will be magically faster! Well, as it turns out, that's not quite true. Where to start...

In most of the Western world, connectivity is pretty good. You can go from most places in the US to Europe and back - across the ocean! - in about 140 milliseconds. A commonly used metric in the web development industry is that your page and all your assets should be loaded in under 300 milliseconds.

Assuming you're declaring all the assets on your page directly, that would make it two roundtrips totalling about 280 milliseconds, since the assets can be retrieved in parallel. Even if you have to cross the Atlantic, you're still going to clock in under the guideline, without any CDN or geolocation whatsoever.
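The arithmetic can be spelled out in a few lines of Python. This is a back-of-the-envelope model only: it uses the 140 ms and 300 ms figures from above and assumes that connection setup, server processing time and transfer time are negligible, which they aren't always.

```python
TRANSATLANTIC_ROUND_TRIP_MS = 140  # figure quoted above
PAGE_LOAD_BUDGET_MS = 300          # guideline quoted above


def page_load_ms(round_trip_ms, asset_depth=1):
    """Network time for a page whose assets are nested `asset_depth` levels deep.

    One round trip fetches the HTML; each further level of assets costs one
    more round trip, because assets on the same level are fetched in parallel.
    """
    return round_trip_ms * (1 + asset_depth)


flat_page = page_load_ms(TRANSATLANTIC_ROUND_TRIP_MS)
nested_page = page_load_ms(TRANSATLANTIC_ROUND_TRIP_MS, asset_depth=2)

print(flat_page, flat_page <= PAGE_LOAD_BUDGET_MS)      # 280 True
print(nested_page, nested_page <= PAGE_LOAD_BUDGET_MS)  # 420 False
```

The second case, with assets that reference further assets, is where the budget starts to break.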

But some people still want to squeeze out more performance - for example, they might have assets referenced a few levels deep, or they consider every millisecond critical because they are in e-commerce. Whether that's a valid concern is something I'll leave open, but let's assume for now that it is. Even in this case, the problem is still static assets - CloudFlare can't cache the actual pageloads locally, because they are dynamic and different for everybody.

So why not just use a CDN? Using a CDN means you can still optimize your asset loading, but you don't have to forward all your pageloads through CloudFlare. Static assets are much less sensitive, from a privacy perspective.

But perhaps you're also targeting users in regions with historically poor connectivity, such as large parts of Asia. Well, turns out that it doesn't really work there either - CloudFlare customers routinely report performance problems in these regions that are worse than they were before they switched to CloudFlare.

This is not really surprising, given the mess of peering agreements in Asia; using CloudFlare just means you're adding an additional hop to go through, which increases the risk of ending up on a strange and slow route.

And this is the problem with CloudFlare in general - you can't usually make things faster by routing connections through somewhere, because you're adding an extra location for the traffic to travel to, before reaching the origin server. There are some cases where these kinds of techniques can make a real difference, but they are so rare that it's unreasonable to build a business model on it. Yet, that's precisely what CloudFlare has done.
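A quick sketch of the reasoning, in Python. All the latencies here are hypothetical numbers picked for illustration, not measurements; the point is only that an uncached request through a proxy pays for both legs, and that only a cache hit avoids the trip to the origin.

```python
def via_proxy_ms(client_to_proxy_ms, proxy_to_origin_ms, served_from_cache):
    """Round-trip time for a request that goes through a reverse proxy."""
    if served_from_cache:
        # Static asset already held by the proxy: the origin is never contacted.
        return client_to_proxy_ms
    # Dynamic pageload: the proxy has to go to the origin and back as well.
    return client_to_proxy_ms + proxy_to_origin_ms


# Hypothetical latencies, in milliseconds.
direct_ms = 140
dynamic_page_ms = via_proxy_ms(20, 130, served_from_cache=False)
cached_asset_ms = via_proxy_ms(20, 130, served_from_cache=True)

print("direct:", direct_ms)
print("dynamic page via proxy:", dynamic_page_ms)
print("cached asset via proxy:", cached_asset_ms)
```

The proxied dynamic page only comes out ahead if the two legs together happen to be shorter than the direct route, which is the rare case mentioned above.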

But perhaps you're thinking of the extra features that they offer like bundling assets, minification, cache headers, different loading orders, and so on. However, all of these are things that you can do on your own infrastructure, and without compromising the privacy of your users. Sending your traffic through a third party is completely unnecessary for that.
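Taking cache headers as an example, here's a small Python sketch using only the standard library's `http.server`. The extension list and the header values are my own assumptions, and the long-lived `immutable` policy is only safe if your static file names change whenever their contents do. In practice you'd set the equivalent in whatever web server you already run.

```python
import http.server

# Hypothetical list of extensions treated as static assets.
STATIC_EXTENSIONS = (".css", ".js", ".png", ".jpg", ".svg", ".woff2")


def cache_control_for(path):
    """Pick a Cache-Control value: long-lived for static assets, none for pages."""
    path_without_query = path.split("?", 1)[0]
    if path_without_query.lower().endswith(STATIC_EXTENSIONS):
        return "public, max-age=31536000, immutable"
    return "no-cache"


class CachingHandler(http.server.SimpleHTTPRequestHandler):
    """Serves files from the current directory and adds a Cache-Control header."""

    def end_headers(self):
        self.send_header("Cache-Control", cache_control_for(self.path))
        super().end_headers()


def serve(port=8000):
    """Start the server on localhost; call this yourself to try it out."""
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), CachingHandler)
    server.serve_forever()
```

Everything stays on your own machine; no request has to pass through a third party for the browser to cache your assets.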

To top it off, for most cases none of this matters anyway - more and more organizations are (unnecessarily) turning their sites into Single Page Applications, and don't realize that this adds entire seconds of rendering time on top of their milliseconds' worth of asset retrieval. Why bother with those 50 milliseconds of difference, especially at such a cost?

What Of My Web?

Unfortunately, all of these issues together mean that CloudFlare is essentially breaking the open web. Extreme centralization, breaking the trust model of SSL/TLS, a misguided IP blocking strategy, requiring specific technologies like JavaScript to be able to access sites, and so on. None of this benefits anybody but CloudFlare and its partners.

For you as a CloudFlare customer, the problem is bigger - by routing your traffic through CloudFlare, you are essentially exposing every single one of your users completely, to CloudFlare and more than likely to intelligence agencies as well. Browsing behaviour, credit card details, passwords, private conversations, everything. Even if you just have a small static site like a blog, your users can be tracked as having visited it, without either you or your users having any knowledge of it whatsoever.

We shouldn't want this, especially if it's completely unnecessary.

And What Of My Provider?

One argument I saw in response to this was that you are trusting your hosting provider anyway, and so adding CloudFlare doesn't really do any harm. This is really not true, though - not only does CloudFlare run at a far larger scale than any hosting provider, they are also in a much better position to maliciously intercept traffic (or be forced to do so). It's considerably easier to do dragnet interception on a reverse proxy than it is to compromise every single server in a datacenter. And either way, you now have two providers you need to trust, rather than one.

Another argument is that CloudFlare just does the same thing that load balancers have been doing for over a decade, and that this isn't really anything new. But while the functionality is the same, the context is not - traditional load balancers run on the same network as the servers they are balancing between, and so the risk of interception is almost non-existent. While there are some newer providers that offer similar services to CloudFlare - and I consider them bad on exactly the same grounds - they run on a much smaller scale, and have much less impact.

Whence Shall I Source My Solutions?

Of course, it'd be a bit strange for me to claim that CloudFlare has outlived its usefulness, and then not provide any alternatives. Thankfully I've had this discussion before on Hacker News, so I have some concrete alternatives handy.

I'll reiterate them here for your convenience:

DDoS mitigation

Use a real (network-level) mitigation provider.

Some providers include mitigation for free with your hosting service (OVH, Online.net, ServerCrate, ...). Others charge a small fee, typically between $1 and $5 (RamNode, BuyVM, SecureDragon, ...).

There are also dedicated mitigation providers for more demanding use cases (Akamai, Level3, Voxility, CNServers, Sharktech, ...) and some providers that resell and/or combine these services (e.g. X4B.net).

If you have your own physical infrastructure, you can also pick a mitigation appliance provider. There are quite a few.

Easy and free SSL/TLS

Let's Encrypt offers free browser-recognized SSL/TLS certificates. If you don't want the hassle of setting it up, Caddy is a web server that will automatically set up SSL/TLS using Let's Encrypt certificates out of the box, no configuration required.

Web Application Firewall

Run one on your own backend server(s) and/or load balancer(s). There's no benefit to doing this remotely, really. Even something relatively simple like ModSecurity will cover a wide array of problems.

Better performance

Do server-side optimizations of your code. Don't build an SPA unless you need it - it will significantly slow things down for your users.

Use a (real) CDN for your static assets, not a proxy like CloudFlare. If you want to optimize your dynamic pageloads as well, look into Anycast hosting (BuyVM, BHost, and so on).

Saving bandwidth

Don't bother, beyond the usual performance improvements described above. Use a provider that doesn't gouge you over it - a typical cost for both VPSes and dedicated servers is $1 to $5 per TB per month, with no additional fee for the connection.

Free DNS

Many hosting providers offer this for free with your plan. If you'd rather not put all your eggs into one basket, Hurricane Electric offers free dual-stack Anycast DNS.

Other things

If there's anything you're using CloudFlare (or similar services) for that isn't listed here, then please do let me know, and I'll do my best to find you a less harmful alternative. My contact details are at the bottom of this article.

You can also leave comments in the thread on Hacker News.