Introduction
Writing about Passport.js the other day got me thinking about how authentication actually works, and more importantly how many ways it can go wrong. The naive solution is to just store a user's username/email and password directly in the database, and then check the submitted password against the stored one. While this is essentially what should happen during authentication, there is actually a lot more to it.
In this article I'll explain how you should be doing authentication and the reasons behind it. You can go pretty nuts with some of this stuff (like the password requirements you enforce), so I'll try to stick to techniques that are:
- Reasonable for you to implement
- Reasonable for the user to follow
- Highly secure
Password Requirements
Like all of the other techniques I'll describe here, enforcing password requirements is an absolute must. You're essentially protecting users from themselves. Just look at this analysis of recent database dumps with plain-text passwords: many of them are as simple as 123456 or even password. It's up to you to ensure your users don't get lazy with their accounts.
The following requirements are a good place to start:
- Minimum 8 characters
- Must contain at least 1 lowercase letter (a-z)
- Must contain at least 1 uppercase letter (A-Z)
- Must contain at least 1 number (0-9)
- Must contain at least 1 special character (!, @, #, $, %, ^, etc)
For some reason some websites have been putting a maximum length on passwords, which you absolutely should not do. You'll be hashing the passwords before they're saved to the database, so they will all be the same length in the end anyway, regardless of whether the user enters an 8-character or 150-character password.
If you're worried about someone abusing this (like sending you a megabyte-long string) then just make sure the cap is really high, like a few hundred characters. A few hundred characters may seem a bit ridiculous, but some people do make crazy long/complicated passwords and then just copy-paste them during authentication, so don't assume it won't happen.
Also, don't restrict which types of special characters can be used. Anything the user can enter is fair game, even spaces.
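The requirements above can be sketched as a small validator. This is only an illustration: the function name, the messages, and the 256-character cap are assumptions of mine, and "special character" is taken to mean anything that isn't an ASCII letter or digit, so spaces count.

```python
import string

MIN_LENGTH = 8
MAX_LENGTH = 256  # assumed cap; high enough for long pasted passwords


def password_problems(password: str) -> list:
    """Return the unmet requirements; an empty list means the password passes."""
    problems = []
    if len(password) < MIN_LENGTH:
        problems.append(f"must be at least {MIN_LENGTH} characters")
    if len(password) > MAX_LENGTH:
        problems.append(f"must be at most {MAX_LENGTH} characters")
    if not any(c in string.ascii_lowercase for c in password):
        problems.append("must contain a lowercase letter (a-z)")
    if not any(c in string.ascii_uppercase for c in password):
        problems.append("must contain an uppercase letter (A-Z)")
    if not any(c in string.digits for c in password):
        problems.append("must contain a number (0-9)")
    # Anything that is not an ASCII letter or digit counts as special,
    # so spaces and non-English characters are accepted, not rejected.
    if all(c in string.ascii_letters + string.digits for c in password):
        problems.append("must contain a special character")
    return problems
```

Returning every problem at once (instead of stopping at the first) lets you show the user everything they need to fix in one go.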
For added security:
Force users to change their password periodically, like once per year. You can also do what Google does and not allow users to re-use any of their old passwords.
Transferring the Data
The first major step of the whole process is to send the data from the user (whether it's from an app or a web page) to your server. Although this should go without saying, you should always be using HTTPS here.
It used to be expensive and inconvenient to buy and manage your SSL certificates, but now there are some services that cost only $16.00 per year for a Domain Validated certificate. Take SSLMate, for example: they allow you to purchase and install your certificates right from the command line of your server. And if you choose to do so, you can have your certificates auto-renewed and auto-installed for you as well. There aren't many excuses left for not having HTTPS if you have a user-based webapp.
Side note: No, I'm not paid by SSLMate, I've just had good experiences with their product.
In addition to using HTTPS, you should only ever send sensitive data in the body of a POST request, never in the URL query string of a GET request. Just think about it, what if you signed in to a website using a URL like:
https://stackabuse.com/login?username=scott&password=myPassword
While this could technically work, it wouldn't be very smart. If you send sensitive data using this method then the URL will be saved in your browser's history and is most likely saved in the server's logs as well. POST data is not usually logged unless explicitly configured to be, which is rare, so it is much safer than putting the data in a GET query string. Logs are usually much easier to access (and handled more carelessly) than the data in a database is, so you'd be better off keeping all your users' data out of there.
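To make the difference concrete, here is a rough sketch using only Python's standard library. It builds the login request without sending it, and the helper name and form field names are just assumptions for the example.

```python
from urllib.parse import urlencode
from urllib.request import Request

LOGIN_URL = "https://stackabuse.com/login"  # endpoint from the example URL


def build_login_request(username: str, password: str) -> Request:
    """Build (but do not send) a login request with the credentials in the body."""
    body = urlencode({"username": username, "password": password}).encode("utf-8")
    # Supplying `data` makes urllib issue a POST; the URL itself stays clean.
    return Request(
        LOGIN_URL,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


req = build_login_request("scott", "myPassword")
print(req.get_method())  # POST
print(req.full_url)      # https://stackabuse.com/login
```

The password only ever lives in the request body, so nothing sensitive ends up in the URL that browsers and servers like to record.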
For added security:
Use HTTP Strict Transport Security, which is a standardized web security policy that allows web servers to tell the client's browser to only interact with that website via HTTPS and not HTTP. This will not only ensure that your users are getting the benefits of an HTTPS connection, but it also protects HTTPS websites from downgrade attacks.
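Turning HSTS on just means sending one extra response header. Below is a minimal sketch as WSGI middleware; the one-year max-age, the includeSubDomains flag, and the names add_hsts and hello_app are assumptions for illustration, and most web servers and frameworks have their own setting for this that you'd normally use instead.

```python
HSTS_VALUE = "max-age=31536000; includeSubDomains"  # one year; assumed policy


def add_hsts(app):
    """Wrap a WSGI app so every response carries the HSTS header."""
    def wrapped(environ, start_response):
        def start_with_hsts(status, headers, exc_info=None):
            # Append the header to whatever the wrapped app already set.
            headers = list(headers) + [("Strict-Transport-Security", HSTS_VALUE)]
            return start_response(status, headers, exc_info)
        return app(environ, start_with_hsts)
    return wrapped


def hello_app(environ, start_response):
    """Tiny placeholder app used only to demonstrate the wrapper."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


secure_app = add_hsts(hello_app)
```

Keep in mind that browsers only honor this header when it arrives over HTTPS, so it complements your certificate setup instead of replacing it.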
Hashing Passwords
First of all, never ever ever ever ever store passwords in plain text. Seriously, just don't. It is so incredibly easy to hash a password that there is really no reason to not do it. I'll even show you the code for how to do it in a few different popular languages to get you started.
There are quite a few types of hashing algorithms out there (MD5, SHA-0, SHA-1, SHA-3, etc) so it might not be obvious which one is the best to use. bcrypt is widely considered to be one of the best choices for hashing passwords (well, technically it's a key derivation function), largely because it is an adaptive function. What that means is you can increase the cost factor (each increment doubles the number of key expansion rounds) to make it slower, which will help it remain resistant to brute-force attacks as hardware gets faster.
Here is the code to hash passwords with bcrypt in a few popular languages:
JavaScript:
var bcrypt = require('bcryptjs');
var hash = bcrypt.hashSync('somePassword', bcrypt.genSaltSync(10));
Python:
import bcrypt
# bcrypt works on bytes in Python 3, so encode the password first
hashed = bcrypt.hashpw('somePassword'.encode('utf-8'), bcrypt.gensalt())
Ruby:
require 'bcrypt'
hash = BCrypt::Password.create('somePassword')
See, it's not so bad, right?
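The snippets above only create a hash. At login you derive the hash again from the submitted password and compare it to the stored one. Here is a rough sketch of that round trip. Since bcrypt is a third-party package in Python, this version swaps in PBKDF2 from the standard library, which is a different algorithm but is likewise tunable through an iteration count. The storage format, the function names, and the iteration count are all assumptions.

```python
import hashlib
import hmac
import os

ITERATIONS = 600_000  # assumed work factor; raise it as hardware gets faster


def hash_password(password: str, iterations: int = ITERATIONS) -> str:
    """Return 'iterations$salt_hex$hash_hex' for storage in the database."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return f"{iterations}${salt.hex()}${digest.hex()}"


def verify_password(password: str, stored: str) -> bool:
    """Re-derive the hash with the stored salt and cost, then compare."""
    iterations, salt_hex, hash_hex = stored.split("$")
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), bytes.fromhex(salt_hex), int(iterations)
    )
    # compare_digest is designed not to leak, through timing, where values differ.
    return hmac.compare_digest(digest, bytes.fromhex(hash_hex))
```

Storing the iteration count next to the hash is what keeps the scheme adaptive: you can raise the default later and old hashes will still verify with the cost they were created with.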
Another thing to consider is where the hashing should occur - client side or server side? It might make sense to you to hash on the client side and then send over the hash to the server, which would prevent the plain-text password from being sent over. While intuitively this makes sense, consider an attacker that snoops on the data being sent to the server. If they get the hashed password being sent over, then all they have to do is send that same hash to the server to authenticate, so they don't even need to know the password.
The main problem, however, lies in the scenario where an attacker gets access to a database of these client-side hashes, which would give them instant access to every account since no hashing occurred on the server. So I guess you could say that you can hash client-side, but then you'd still need to hash server-side as well.
For added security:
One method you can use for added security (although some people will argue that it isn't that helpful) is to make it even harder for attackers to break a hash by using unconventional hashing combinations like this:
sha3(sha2(password + salt))
sha3(sha1(password) + md5(salt))
This is called security through obscurity. If you're going to do it this way, at least make sure you're using a strong hashing algorithm. In the end, bcrypt is probably still a better choice since it's adaptive, but if you don't want to use it since it takes up a lot of computational resources then this at least gives you an alternative.
Making Your Hashes Stronger
Hashing the user's password is the first step, and a good place to start, but to make your hashes truly secure you'll want to add salt to the hash. Hash salt is just a random string added on to the string you want to hash, which helps prevent cracking with lookup tables or rainbow tables. In code, it might look like this:
salt = genSalt()
hash = hashAlg('somePassword' + salt)
The theory behind using salt is that if two users have the same password (without salt), then they'll also have the same hash. So if an attacker has a lookup table or has already cracked a password once, then there is no more work to do for all other matching passwords. The attacker would just use the lookup table. If salt is used, however, then even matching passwords will have unique hashes that are different from each other.
This holds true only if the salt is unique and unpredictable for each user, so make sure you use a Cryptographically Secure Pseudo-Random Number Generator (CSPRNG) to generate the salt. os.urandom in Python and java.security.SecureRandom in Java are CSPRNGs.
Here is what it looks like to salt a password hash:
hash1 = sha3('somePassword' + 'N7v4bL1YZU4xJw5A') // b7854b0f3c9422b4ee4f6e590d8c95897c53aacce9ab6e0c0ab05f0a4d986407
hash2 = sha3('somePassword' + 'z8WnrKRcu9D3BeOK') // df56f4a8876f53900575b784a594bbce7e2ab3e913ba146f13c6817c295e5f09
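If you want to try this yourself, here is a small runnable sketch in Python. It assumes the SHA3-256 variant and a hex-encoded 16-byte salt, both of which are my choices for the example, and it uses a bare hash only to show the effect of the salt. For real passwords, a slow function like bcrypt is still the better pick.

```python
import hashlib
import secrets


def salted_hash(password: str, salt: str) -> str:
    """Hash password + salt with SHA3-256, mirroring the pseudocode above."""
    return hashlib.sha3_256((password + salt).encode("utf-8")).hexdigest()


# secrets draws from the operating system's CSPRNG.
salt_a = secrets.token_hex(16)
salt_b = secrets.token_hex(16)

hash_a = salted_hash("somePassword", salt_a)
hash_b = salted_hash("somePassword", salt_b)

print(hash_a != hash_b)                               # same password, different hashes
print(hash_a == salted_hash("somePassword", salt_a))  # same salt reproduces the hash
```

Because the same salt is needed to reproduce the hash, you store it in the database right next to the hash. The salt isn't a secret, it just has to be different for every user.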
Conclusion
There are quite a few other things you can do to help keep your data and users secure (two-factor authentication, OAuth, etc), but many of them are out of the scope of this article, so I'll leave those for another time.
There are quite a few resources and code libraries out there to help you with some of the things I've covered here, so use them to your advantage. There's no reason to reinvent the wheel, although you still should at least know how and why it works.
What other tips do you have for securing your authentication process? Let us know in the comments!