Securing your self-hosted website with Let's Encrypt, part 1

As I mentioned in my previous post, I am going to describe how to use Let's Encrypt to generate free digital certificates that you can use to serve secure https content instead of plain unencrypted http.

There's a lot to learn in this space, which is why I'm splitting this into several posts.

I'll start by clarifying the 'secure' aspect. When we say secure, we generally mean one of these two things:

  • a secure transmission (one in which only the interested parties can make sense of what's being transmitted)
  • a secure system (one in which no exploits exist and so unauthorised access does not happen)

Our goal with https is to establish secure transmissions between our web server and its users, or at least as secure as we can make them, as crackers are always at work trying to find the next vulnerability so they can break into systems and make money! The list of recent security incidents proves it: Heartbleed, POODLE, Logjam, and so on.

So how are HTTP and HTTPS related?

HTTP is the 'language' a web server and a browser use to communicate with each other. The browser makes requests to the server, and the server returns results. For example, the browser can say something like GET /index.html to the server, and the server will read that file and return its contents to the browser, using TCP as the transmission protocol.
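To make that a bit more concrete, here is a minimal sketch of what such a request looks like as raw bytes before it is handed to TCP. The helper name `build_get_request` and the host `example.com` are placeholders I made up for the illustration; real browsers send many more headers.

```python
def build_get_request(path: str, host: str) -> bytes:
    """Build a minimal HTTP/1.1 GET request as raw bytes (illustrative helper)."""
    lines = [
        f"GET {path} HTTP/1.1",  # request line: method, path, protocol version
        f"Host: {host}",         # HTTP/1.1 requires a Host header
        "Connection: close",     # ask the server to close the connection afterwards
        "",                      # an empty line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


# example.com is a placeholder host
request = build_get_request("/index.html", "example.com")
print(request.decode("ascii"))
```

Note that the request is just readable text: nothing in it is scrambled or hidden.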

TCP will ensure that even if the response is split across several parts because it's too big to fit in just one, the browser will not only receive all of them, but also receive them in order.
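The idea can be sketched with a toy model. This is not how a real TCP stack is implemented (real TCP also deals with acknowledgements, retransmissions and much more); it only assumes that each part carries a sequence number, here the byte offset, so that the receiver can put the parts back in order. The function names are hypothetical.

```python
import random


def split_into_segments(data: bytes, size: int) -> list[tuple[int, bytes]]:
    """Split data into (sequence_number, chunk) pairs; the number is the byte offset."""
    return [(offset, data[offset:offset + size]) for offset in range(0, len(data), size)]


def reassemble(segments: list[tuple[int, bytes]]) -> bytes:
    """Sort the segments by sequence number and join the chunks back together."""
    ordered = sorted(segments, key=lambda segment: segment[0])
    return b"".join(chunk for _, chunk in ordered)


response = b"<html><body>Hello, this is index.html</body></html>"
segments = split_into_segments(response, size=8)
random.shuffle(segments)  # simulate parts arriving out of order
print(reassemble(segments) == response)
```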

But when TCP prepares the data to send it across the network, it doesn't apply any sophisticated processing or encryption to it. At most, it will split the data into several packets, but that's it.

If someone gains access to the network and starts looking at the packets sent between a browser and a server using HTTP, they could make sense of what's going over the wires.
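Here is a small sketch of how little effort that takes. The `captured` bytes below are invented for the example (a login form posted to the placeholder host `example.com` over plain HTTP), and `read_form_fields` is a hypothetical helper; the point is simply that standard library tools are enough to read the credentials.

```python
from urllib.parse import parse_qs

# Hypothetical bytes seen on the network: a login form sent over plain HTTP.
captured = (
    b"POST /login HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Content-Type: application/x-www-form-urlencoded\r\n"
    b"Content-Length: 31\r\n"
    b"\r\n"
    b"username=alice&password=hunter2"
)


def read_form_fields(raw: bytes) -> dict[str, str]:
    """Return the form fields found in the body of a captured plain-HTTP request."""
    # Headers and body are separated by an empty line.
    _headers, _, body = raw.partition(b"\r\n\r\n")
    fields = parse_qs(body.decode("ascii"))
    # parse_qs returns a list of values per field; keep the first one.
    return {name: values[0] for name, values in fields.items()}


print(read_form_fields(captured))
```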

This is where HTTPS fits in. HTTPS "sits" between the HTTP server (the web server) and the transport layer (TCP), and encrypts the content of the packets before they are handed over to TCP and sent to the browser. Likewise, when the browser sends stuff back to the server over HTTPS, the data is first encrypted, then sent. The HTTPS layer in the server will decrypt it before handing it to the server itself, which to all intents and purposes just gets normal HTTP data. The protocol used for the encryption is normally TLS, although older systems might use SSL.
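That layering can be sketched from the client side with Python's standard `ssl` module: open a normal TCP connection, wrap it in TLS, and then speak plain HTTP through the wrapped socket. This is only a sketch under my own assumptions; the function names are made up, and a real client would use an HTTP library rather than hand-written requests.

```python
import socket
import ssl


def make_tls_context() -> ssl.SSLContext:
    """Create a client TLS context that verifies certificates and hostnames."""
    return ssl.create_default_context()


def fetch_over_https(host: str, path: str = "/", port: int = 443) -> bytes:
    """Send a plain HTTP request through a TLS-wrapped TCP socket (illustrative)."""
    request = (
        f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
    ).encode("ascii")
    context = make_tls_context()
    # 1. Ordinary TCP connection, exactly as with plain HTTP.
    with socket.create_connection((host, port), timeout=10) as tcp_sock:
        # 2. TLS layer on top of TCP: everything written below gets encrypted.
        with context.wrap_socket(tcp_sock, server_hostname=host) as tls_sock:
            # 3. The HTTP request itself is unchanged.
            tls_sock.sendall(request)
            chunks = []
            while True:
                data = tls_sock.recv(4096)
                if not data:
                    break
                chunks.append(data)
    return b"".join(chunks)
```

Calling `fetch_over_https("example.com")` needs network access (the host is a placeholder). The HTTP request is the same text as before; only the socket it travels through has changed.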

TCP/HTTP/HTTPS stacks

Because the encryption happens above the transport layer, and not at lower levels, the IP addresses and ports used to establish the connection are still disclosed and can be seen by attackers, although deciphering the transmission is way, way harder (vulnerabilities aside). The rest of the data, including the address of the page we wanted to retrieve, the contents of the response, etc., is encrypted.

This ensures two things:

  • Privacy: because only the server and the browser that initiated the connection know how to decrypt the data they send each other
  • Integrity: since no one else can tamper with the data undetected on its way to the user, we can trust that what arrives is exactly what the server sent. And likewise, the server can trust that what the user is sending back to it hasn't been altered by someone who intercepted the communication.

But it doesn't ensure that websites are not malicious in other ways. You should still exercise caution!

Attacks against insecure connections are on the rise

Initially pretty much only banks used https connections. Obviously you don't want someone spying on how much money you have and then modifying your payment data to redirect the money to some foreign scammer's account instead (and the bank doesn't want to be bothered by your claims either), so it made sense.

But over time, various types of evil techniques have been developed to extract as much money as possible out of people who use unencrypted http connections. Someone who intercepts http data can not only spy on you, but also do things such as:

  • Insert advertising on the page
  • Replace another site's advertising with their own (and earn the money that was supposed to be earned by the site)
  • Insert tracking 'devices' (e.g. a cookie) to obtain behavioural data about users
  • Steal user credentials for later use (e.g. your bank details)
  • Or steal the credentials and then use the access to a restricted, authenticated part of the site to look for more ways of causing havoc!

Attack vectors

Or: how does a user get attacked?

Potentially compromised layers

Essentially, whoever has access to the 'pipes' has access to the unencrypted traffic that goes through them. Nowadays this means not only physical connections, but quite often wireless connections too.

  • Public WiFi access points that do not use encryption or use weak encryption such as WEP can make it very easy for an attacker to insert themselves into the network and start listening to traffic
  • Likewise, home WiFi access points with default or easy-to-guess wireless keys or admin passwords are an easy target as well, especially in dense residential areas where you can reach many networks from just one place (compare with someone sitting in a car with a laptop on their knees, parked in a suburban street - it'd definitely look suspicious)

And as if random strangers trying to break into our lives weren't enough, it turns out that network providers sometimes attack their own customers too. For example:

  • Mobile ISPs have been known to modify internet data to track and profile their customers even after they opted out (e.g. Verizon's Super Cookie)
  • Home ISPs haven't resisted the temptation to modify their customers' web traffic to insert some ads here and there either (e.g. BT's Phorm)
  • And apparently even core network operators have been shown to engage in some activities that result in modifying the traffic for users that visit certain websites. GASP.

In light of all this, I guess you will agree with me that the unencrypted web is a dangerous place.

But if you still needed more reasons to use HTTPS...

  • HTTPS is safer by design, so it's harder for certain bad things to happen when pages are served over https.
  • You also show empathy with your users. Let's be honest: SECURITY IS HARD. Computers are confusing. Tech is overwhelming. Everything is just so incredibly complicated. And people reuse emails and passwords and usernames everywhere. If we can use our technical prowess to make it a bit safer for 'the average person', avoiding their credentials being transmitted in the clear, then we're doing good things. Do you not want to do good? Boo you.
  • Newer, privacy-sensitive JS APIs will eventually only work over HTTPS. If you want to use the Cool Stuff, you'll need to serve your site with https. Service Workers, WebRTC, Geolocation, Background Sync... all of those will require https to function, or browsers will simply refuse to execute them in an insecure context. Chrome has already 'deprecated' a number of these APIs, such as getUserMedia and Geolocation, when served via http (i.e. in an insecure context).
  • HTTP/2 only works over HTTPS in practice too, as browsers only support it over encrypted connections. If you not only want to use the newer JS APIs but also want to serve the code faster, you'll need https as well.
  • And finally, browsers are getting really serious about this, showing big huge warnings when forms are insecure, displaying very distinctive location bar information, and so on.

KEEPING USERS SAFE IS SRS BUSINESS

So here is part 1 of this series on Let's Encrypt. I hope I have instilled in you a burning desire to migrate all of your websites to https (or to go make silly memes with SRS CAT, I'm fine with that too).

We'll continue tomorrow, looking at HTTPS and certificate authorities. In the meantime, stay safe and don't enter usernames or passwords via insecure connections ;-)