Whilst working with Amazon’s S3 object storage service recently, it occurred to me that there are a lot of different options for encrypting objects in your buckets, and that people new to S3 or AWS might find it difficult to understand the differences between them or to know which one to use.
There are two key parts to storing encrypted data in S3. The first of these is data in transit (flowing over the wire as you upload it to AWS servers) and the second is data at rest (the data actually sat on disks in an AWS data centre somewhere). Data in transit encryption is provided for you by simply using HTTPS, so most of the decisions you’ll have to make relate to the at-rest encryption.
How Sensitive Is Your Data?
As with all aspects of information security, the amount of effort you’re going to want to put into protecting data depends on two things: how sensitive it is, and what your risk profile for the data is. Legal documents containing personal information are more sensitive than your collection of dog GIFs, so it’s logical that you’d want to put more effort into protecting them. Before you decide on the specifics of how you’re going to encrypt the data, you need to decide whether to opt for client-side or server-side encryption.
Client Side Encryption
If you opt for client side encryption, then you’ll encrypt the data before you upload it to S3. You’ll be responsible for the whole process: creating, managing and storing encryption keys and using appropriate software to actually perform the encryption. This is the approach to take if you’re paranoid about your data security or if you’re uploading very sensitive data. This way, your data will never pass through any of AWS’ systems in plaintext – it will always be encrypted and you’ll hold the keys.
Additionally, as your data is uploaded over HTTPS, it will be encrypted again inside that HTTPS tunnel – so even if someone can compromise that connection, your data still won’t be readable.
If you do decide to go down this route, the encryption and decryption process itself is made simpler with the excellent AWS Encryption SDK. This includes support for multiple programming languages as well as a brilliant CLI that supports you using either keys managed by KMS within AWS or keys from a custom provider you define yourself. Whilst you obviously don’t have to use the AWS tools for client-side encryption, the SDK is one of the simplest tools I’ve come across for this sort of work – so if you’re in the position where you don’t already have tools in place for this, I’d recommend giving it a look.
Server Side Encryption
Server side encryption (SSE) is by far the simpler option, and predictably for AWS it comes in a few similar but subtly different varieties. Whilst somewhat complex to understand at first, I think this is good – it gives you the choice you need to be able to employ the system that’s exactly right for you and you can make it as simple or complex as you want. With server side encryption, you upload the data as-is in plaintext, and it’s encrypted before it’s stored in S3. HTTPS is still used for the transfer, so data is encrypted in transit as well, but if an attacker can compromise your connection then they can read your data.
AWS handles the server side encryption entirely transparently to you – you upload, download and use data as normal, and as long as you’re authenticated with the console, CLI or SDK, you’ll get your objects back ready to use in plaintext.
There are three varieties of server side encryption on S3, which all have slightly different use cases. They all encrypt your data with AES-256, but manage the encryption keys in different ways:
SSE-S3
This is also referred to in the console as “Amazon S3 master-key” or simply “AES-256”. Every object that you upload is encrypted with its own unique key, and that key is in turn encrypted with a “master key” that AWS manage on your behalf, taking care of things like key rotation. This is the simplest way to encrypt content on S3: you just tick a box and your data will be automatically encrypted at rest when it’s stored. You won’t notice any difference in how you upload or access files, but if someone were to somehow steal the disks your data lived on from an AWS data centre, they wouldn’t be able to read your data. There really isn’t any reason not to use this, at a minimum.
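As a rough illustration of how little changes on the client, here is a minimal sketch of requesting SSE-S3 on upload. The helper name sse_s3_put_args and the bucket and key names are my own inventions for the example; ServerSideEncryption="AES256" is the parameter that boto3's put_object accepts for this mode.

```python
def sse_s3_put_args(bucket: str, key: str, body: bytes) -> dict:
    """Build keyword arguments for an S3 upload that requests SSE-S3.

    Hypothetical helper: it only assembles the arguments, so it can be
    inspected without AWS credentials.
    """
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        # "AES256" asks S3 to encrypt the object with S3-managed keys.
        "ServerSideEncryption": "AES256",
    }


args = sse_s3_put_args("example-bucket", "gifs/dog.gif", b"...")
# With boto3 installed and credentials configured, the upload would be:
#   boto3.client("s3").put_object(**args)
```

Downloads need nothing extra: a plain get_object call returns the plaintext, assuming you have permission to read the object.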
SSE-KMS
This is similar to SSE-S3, but you get more control over how the encryption keys are managed. Instead of AWS managing a master key for you, your data keys (which encrypt the individual objects) are encrypted with a key you manage using KMS, the Key Management Service. There are additional charges for using KMS, but this may be worth it if you need the additional security and audit trail that it provides.
KMS lets you create keys and assign permissions to them using IAM, so as an example use case, you could encrypt all of your HR data with a KMS key that only HR staff have permission to use for decryption operations. Suddenly, the decryption process isn’t quite so transparent for anyone outside the HR team. If you don’t have permission to use the key for decryption, you can’t read the data because you have no way to decrypt it. If you do have permission, the decryption is just as seamless as with SSE-S3. KMS also lets you track who used specific keys, when they used them and what they did with them.
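To make the HR example a bit more concrete, here is a sketch of the two pieces involved: the upload arguments that request SSE-KMS, and an IAM policy document that allows decryption with one specific key. The function names and the key ARN are placeholders I made up; ServerSideEncryption="aws:kms", SSEKMSKeyId and the kms:Decrypt action are the real names, but treat the policy as a starting point rather than a complete permission set.

```python
import json


def sse_kms_put_args(bucket: str, key: str, body: bytes, kms_key_id: str) -> dict:
    """Build keyword arguments for an upload encrypted under a given KMS key."""
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        "ServerSideEncryption": "aws:kms",
        "SSEKMSKeyId": kms_key_id,
    }


def decrypt_only_policy(kms_key_arn: str) -> dict:
    """IAM policy document allowing decryption with a single KMS key.

    Attach it only to the identities (for example, an HR role) that should
    be able to read objects encrypted under that key.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["kms:Decrypt"],
                "Resource": kms_key_arn,
            }
        ],
    }


# Placeholder ARN, for illustration only.
HR_KEY_ARN = "arn:aws:kms:eu-west-1:111122223333:key/EXAMPLE-KEY-ID"
print(json.dumps(decrypt_only_policy(HR_KEY_ARN), indent=2))
```

Anyone uploading with this key would also need kms:GenerateDataKey on it, since S3 asks KMS for a fresh data key for each object.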
SSE-C
With this variety, the “C” stands for “Customer”, and it’s so called because you manage your own encryption keys. That means that when you’re making requests to S3, you include the encryption key you want to use as part of your request. AWS doesn’t store the key. Instead, it retains a salted HMAC of the key, from which the key itself can’t be derived, and uses that to verify the key you provide when you later ask for the object to be decrypted.
As well as managing the encryption keys themselves, you also need to keep track of which keys were used to encrypt which objects, as AWS don’t keep track of this for you. This could get quite complicated quite quickly, especially if you’re practising your own key rotation or if different versions of your objects use different encryption keys. You’d use this process if you don’t want the hassle of managing client-side encryption, but you still want to retain full control over the key material.
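Here is a small sketch of what “including the key in your request” looks like at the HTTP level, plus the kind of bookkeeping you end up doing yourself. The three header names are the ones S3 defines for SSE-C; the helper name, the key_registry dictionary and the object name are purely illustrative.

```python
import base64
import hashlib
import os


def sse_c_headers(key: bytes) -> dict:
    """Build the request headers S3 expects for SSE-C with a 256-bit key."""
    if len(key) != 32:
        raise ValueError("SSE-C requires a 32-byte (256-bit) key")
    return {
        "x-amz-server-side-encryption-customer-algorithm": "AES256",
        # The key itself, base64-encoded. Only ever send this over HTTPS.
        "x-amz-server-side-encryption-customer-key":
            base64.b64encode(key).decode("ascii"),
        # Base64-encoded MD5 digest of the key, used as an integrity check.
        "x-amz-server-side-encryption-customer-key-MD5":
            base64.b64encode(hashlib.md5(key).digest()).decode("ascii"),
    }


# S3 will not remember which key belongs to which object, so you have to.
# A real system would keep this mapping somewhere durable and secure.
key_registry = {}
object_key = os.urandom(32)
key_registry["hr/contracts/2024.pdf"] = object_key
headers = sse_c_headers(object_key)
```

The same headers have to be sent again on every download of that object. If you use boto3 rather than raw HTTP, the equivalent put_object parameters are SSECustomerAlgorithm and SSECustomerKey.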
How Does S3 Encryption Work?
Server side encryption on S3 uses a concept called envelope encryption for securing objects that you upload. Every single object is encrypted with its own unique key using AES-256 – this is known as the data key.
Next, we encrypt the data key with a new key – the master key. This master key is the one that you supply – be it the generic S3 master key which AWS manage, or a KMS or customer provided key as discussed above. Because the data key is encrypted, you can store it alongside the encrypted data. This makes it easier to rotate your master keys, because you don’t need to re-encrypt all of your large objects again – you just need to re-encrypt the data keys themselves.
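The shape of envelope encryption, and why rotating a master key is cheap, can be sketched in a few lines. This is my own illustration and not how S3 implements it. To keep it dependency-free, the cipher below is a toy HMAC-SHA256 keystream standing in for AES-256, so please don’t use it to protect real data, and all of the function names are made up for the example.

```python
import hashlib
import hmac
import os


def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stand-in for AES-256: XOR data with an HMAC-SHA256 keystream."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        pad = hmac.new(key, nonce + counter, hashlib.sha256).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


def seal(key: bytes, plaintext: bytes) -> bytes:
    """Encrypt under key, prefixing the random nonce to the output."""
    nonce = os.urandom(16)
    return nonce + _keystream_xor(key, nonce, plaintext)


def unseal(key: bytes, blob: bytes) -> bytes:
    """Reverse seal(): split off the nonce and decrypt the rest."""
    return _keystream_xor(key, blob[:16], blob[16:])


def envelope_encrypt(master_key: bytes, plaintext: bytes) -> dict:
    """Encrypt data with a fresh data key, then wrap that key with the master key."""
    data_key = os.urandom(32)
    return {
        "ciphertext": seal(data_key, plaintext),
        "wrapped_key": seal(master_key, data_key),  # stored alongside the data
    }


def envelope_decrypt(master_key: bytes, envelope: dict) -> bytes:
    """Unwrap the data key with the master key, then decrypt the data."""
    data_key = unseal(master_key, envelope["wrapped_key"])
    return unseal(data_key, envelope["ciphertext"])


def rotate_master_key(old: bytes, new: bytes, envelope: dict) -> dict:
    """Re-wrap only the small data key; the large ciphertext is untouched."""
    data_key = unseal(old, envelope["wrapped_key"])
    return {"ciphertext": envelope["ciphertext"], "wrapped_key": seal(new, data_key)}
```

Note that rotate_master_key never touches the ciphertext, which is exactly the property that makes rotation practical when your objects are large.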
There’s another good reason for using envelope encryption in general: it allows you to combine symmetric encryption (such as AES-256, for the data itself) with asymmetric encryption (for encrypting the data keys). Each of these types has its own pros and cons, and this lets you take advantage of the benefits of each. Symmetric encryption is faster, so it’s great for encrypting your objects, which could be quite large. Asymmetric encryption can make key distribution easier, because the key used to wrap data keys can be shared freely while the key that unwraps them stays private. Bear in mind that S3’s own server side encryption doesn’t use asymmetric keys: SSE-KMS only supports symmetric KMS keys, and the key management benefits there come from using KMS to control which users and groups have permission to use each key.
S3 gives you the mechanism to enforce encryption of items that are uploaded to buckets through the use of bucket policies. You can define a policy to reject uploads that don’t conform to specific encryption requirements and like the rest of IAM, it’s pretty flexible. There are plenty of examples in this AWS blog post about enforcing encryption and you can extend it even further, such as denying uploads that aren’t encrypted with a specific KMS key.
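As a sketch of what such a policy can look like, the helper below builds a bucket policy that denies uploads unless they ask for SSE-KMS, and optionally unless they use one specific KMS key. The function name, bucket name and ARN are placeholders of mine; the condition keys s3:x-amz-server-side-encryption and s3:x-amz-server-side-encryption-aws-kms-key-id are the ones S3 provides for this purpose.

```python
import json
from typing import Optional


def require_sse_kms_policy(bucket: str, kms_key_arn: Optional[str] = None) -> dict:
    """Bucket policy denying PutObject requests that don't request SSE-KMS."""

    def deny(sid: str, condition: dict) -> dict:
        return {
            "Sid": sid,
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:PutObject",
            "Resource": f"arn:aws:s3:::{bucket}/*",
            "Condition": condition,
        }

    header = "s3:x-amz-server-side-encryption"
    statements = [
        # Reject uploads that ask for a different encryption mode.
        deny("DenyIncorrectEncryptionHeader", {"StringNotEquals": {header: "aws:kms"}}),
        # Reject uploads that don't send the encryption header at all.
        deny("DenyUnencryptedObjectUploads", {"Null": {header: "true"}}),
    ]
    if kms_key_arn is not None:
        # Optionally insist on one particular KMS key.
        key_header = "s3:x-amz-server-side-encryption-aws-kms-key-id"
        statements.append(
            deny("DenyWrongKmsKey", {"StringNotEquals": {key_header: kms_key_arn}})
        )
    return {"Version": "2012-10-17", "Statement": statements}


policy_json = json.dumps(require_sse_kms_policy("example-bucket"), indent=2)
# With boto3, the policy would be applied with:
#   boto3.client("s3").put_bucket_policy(Bucket="example-bucket", Policy=policy_json)
```

Because these statements check the request headers, uploads that rely on a bucket’s default encryption setting without sending the header would also be rejected, so test a policy like this against your own upload paths before rolling it out.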
Hopefully this post is useful and informative – I’m hoping to follow this up with more on AWS security topics, so stay tuned!