Why You Shouldn’t be Using BCrypt and Scrypt

I start by saying that this is certainly an opinion piece. However it’s not just pure opinion. My opinion on this subject is backed by a mathematical analysis of hashing algorithms. There are certainly some complicating factors that may be argued to influence the described scenario in all reality. However, given that I believe the core logic holds, a strong argument may be made against the use of B/S-crypt.

If you aren’t already aware, as to what Bcrypt and Scrypt are, a brief explanation follows. BCrypt is a computationally difficult algorithm designed to store passwords by way of a one-way hashing function. You input your password to the algorithm and after significant (relative) computation, an output is produced. Bcrypt has been around since the late 90s and has handled significant scrutiny by the information security/cryptography community. It has proven reliable and secure over time.

Scrypt is an update to the same model from which Bcrypt arose. Scrypt is designed so as to rely on high memory requirements as opposed to high requirements on computational power. The realization that lead to this, was that specialized computer chips (FPGA/ASICs/GPUs) could be purchased at scale by an attacker easier than could huge amounts of memory for a traditional computer.

The goal for both of these algorithms is the same. To attempt to make it harder to run the algorithm. If an attacker wishes to brute force any given plaintext password -> given only a password hash; he must guess plaintexts, compute the hash of that plaintext then compare the output (the hash of his guess) with the known stored password hash. (Ex. Attacker captures hash_0; must guess A for A in all guesses until hash(A) = hash_0.)

As you can see by the above. If the password is hard to guess, then the attacker must run the algorithm a massive amount of times to be able to find the correct input to the algorithm which will match the stolen hash value on output. Therefore, the idea behind BCrypt and Scrypt is to increase the amount of time it takes the attacker to make any single guess, and by extension massively increase the time it takes him to find the correct answer.

There is actually an issue here though. Fundamentally, the idea that BCrypt and Scrypt are solutions is built on a faulty premise. People want to feel safe. It makes sense that any idea which could potentially be seen to offer businesses hope that their stored passwords will not be cracked in the event of a breach would be particularly compelling.

But right off the bat, there is a real problem with Bcrypt and Scrypt. By requiring extensive computation to process a logon event, they increase the load on a company’s server resources; stealing from business critical applications just to play around with math on the off chance that one day the stored passwords are stolen.

But not only are Bcrypt and Scrypt wasting server resources on an ongoing basis; regardless of any reality to the threat against a given service… they aren’t even the most logical response to the issue presented by contemporary password storage dilemmas.

There is a function that can be used to describe the time it takes for an attack to perform an “exhaustive search” to locate a password given the parameters defining the complexity of the underlying password and the method by which it is stored.

An exhaustive search is the maximal time it will take an attacker to find a password for sure given certain parameters. For a simplified version of this function all we need to know are a few simple variables. First we need to know the “character space” of the password. That is, the number of possible character types on the keyboard the password creator may have chosen. For example, if the password is a PIN and is constructed only of numbers then the character space -> C = 10 (as there are 10 decimal numbers 0-9). Lowercase alpha is 26, lower/upper alpha is 52, lower/upper alphanumeric is 62, etc.

The next thing we need to know is the length of the password. In the case of our PIN example above, it may be as short as 4 characters long. In that case we would say that length -> L = 4.

The final thing we want to know is how long does it take to make a single guess. This is where Bcrypt and Scrypt come in. They both attempt to slow down the guessing process by making the time it takes to make a guess longer than other cryptographically secure algorithms. Let’s use a simplified example of relative time… and say that our PIN is stored by way of using SHA256 which hashes in some time T = 1*SHAS for SHAS = the time it takes to run SHA256 on whatever hardware you’re using (this describes the relative nature of these computations against the processing power of the attacker).

In this case, the total time it would take an attacker to guess the PIN 100% for sure is equal to C^L * T = E = 10^4 * 1SHAS = 10000SHAS (with SHAS as relative time value for SHA256 on his hardware… SHAS equaling something small like 1/10^6 second).

The logic behind using Bcrypt or Scrypt to increase the time it takes the attacker to guess the PIN is by an increase in the T value. One way this may be accomplished is by “iterating” a cryptographically secure hashing algorithm against itself. So for example (though this would not be as slow as Bcrypt) let’s follow the same logic and iterate our algorithm 1000 times per run.

Now our T = 1000 and our equation looks like C^L * T = 10^4 * 1000SHAS = 10000000SHAS. This is a significant improvement over the last simulated attack. Now it should take the attacker significantly longer to break the PIN value. By this logic the use of slow hashing algorithms like Bcrypt and Scrypt should be secure, right?

But something is missing. Most specifically, the cost benefit analysis still begs the question, “is the increase in attack difficulty I’m seeing, really worth all the added work?” Remember, this additional work put into the value for T doesn’t just fall on the attacker. Your application must also perform 1000x more work now every single time someone logs in. With a large enough user base, that is no insignificant number.

In fact, as a professional security consultant, I have written reports for companies detailing how their use of Bcrypt could be leveraged by a clever attacker to bring their site offline with ease. Ex. Creating a large number of login events by manipulating profile data.

Give me a real solution…

Thankfully, there is actually a very valid and powerful solution to this issue.

If we look at our exhaustive search function again: C^L * T; using just a bit of college math we can see that there does exist a better solution. Imagine I was to ask you to make the output to this function as large as possible (which is our goal, given that a larger exhaustive search time means more work for the attacker)… Which variable would you want to increase to make the whole product increase the fastest? The Bcrypt and Scrypt folks would tell you that the answer is T. But they’re dead wrong.

It’s L. Given that T > 0, and C > 1; increases to L will cause this function to grow exponentially. Remember L means our length value. We don’t need a large value for T, even a small increase to L will result in a large increase to the total output. All we need is an 8 character PIN and we’ve already exceeded our example “slow hashing algorithm” from earlier. The best part, computationally speaking, increases to L are effectively free. Having your users use longer inputs will not stress your application’s resources.

Now the argument is that it will stress your users. That users need to use shorter passwords that they can remember. For that, I’d like to move the discussion from our inherently simple PIN example, to something a bit more real world.

For the longest time, the recommendation was to use complex passwords. A complex password increases the value C. And certainly there must be at least some minimum value for C given that 1^x = 1 for all x. Practically speaking, we also cannot expect our users to remember passwords of infinite length, so for our equation to grow fast we want at least SOME complexity. But that doesn’t mean you need to be jamming on in l33t speak with some password like $uP3rS1kR##t.

A famous XKCD said it best:

https://xkcd.com/936/

Passphrases give us the ability to easily remember high entropy password values. And rapidly outperform Bcrypt and Scrypt, at no cost to you.

For example, assuming a hashing algorithm that is 210,000x slower than SHA256… and a minimum password length of 10 (which is actually significantly longer than most sites require at this time still, so this is me being wildly generous). Now imagine that someone on this site thought that since the site was using Bcrypt (or whatever) they could have a 10 character password all lowercase alpha. Something like “tunafishgo”. Their E is equal to 26^10 * 210000.

Now imagine another site competing with this one. Simply using SHA256 once, and requiring passphrases. If they use the length of the example passphrase “correcthorsebatterystaple” as per the above… their E is equal to 26^25 * 1.

The raw difference between these two different approaches is that it will take 26^15/210000 ~= 2^53 (SHAS) longer to crack the passphrase site, than it will to crack the passwords protected by our “bcrypt” example. Keep in mind, that this is me being generous, as many sites “secure” their passwords with Bcrypt, then opt for shorter password lengths. I’ve seen lengths as short as 7 in the wild. Additionally, this is done for free by the winning team.. While the Bcrypt camp is out buying more servers just to keep their application running.

In Conclusion

You should not be using Bcrypt. You should be using passphrases. The dangers of Bcrypt and Scrypt are that you will:
– Be lulled into a false sense of security; and actually weaken your passwords
– Waste your resources processing hard algorithms; and lose the real game

One final thought. Of course there must be some sort of minimal difficulty level for the T value as if T = 0 then E = 0 in all cases. It is reasonable to expect that as computers become more powerful, so too should our hashing algorithms. However, as long as you’re using a modern cryptographically secure hashing algorithm like the SHA2 series, with passphrases, your servers, users, and shareholders will all be happier.

About the author

I do a lot of stuff. I'm a decent hacker. I jump out of airplanes a lot. Done some firefighting and stuff. Philosophy all the time, make music, acting in awesome indie films. Just, a lot of stuff.

Comments

  1. “correcthorsebatterystaple” as per the above… their E is equal to 26^25 * 1.

    Not true because here you would use a dictionary attack

  2. In a real-world engagement, I cracked an xkcd generated password in 6 days. I am not claiming that all such passwords will fall as easily, but I believe your conclusion to be only partially accurate. The issues of strong password storage and strong password hygiene are not mutually exclusive. (And as can be inferred by my success above, most real-world password cracking does not rely on walking an entire keyspace.)

  3. You misunderstand scrypt. The goal of scrypt is to trade off time and memory, in order to prevent hardware based massively parallel attacks.

    If you only have a small amount of memory available per processing core (such as in GPUs), then the time required to perform the calculation is high, since you are forced to continually recalculate intermediate results.

    If you have a lot more memory available (such as for a server CPU), you require less time, as you can hold on to the intermediate results.

    This ensures that processing requirements are kept low for business as usual activities, but are forced to be high for massively parallel hardware attacks.

  4. The assumption that bcrypt/scrypt will in any way affect the overall performance of the system is flawed. It is only true for poorly designed systems. In the real world, password authentication is only used for initial login and then an alternative mechanism is used for controlling access (JWT, OAuth, etc.). So this argument is flawed.

    Secondly, high-performance ASIC hardware for calculating SHA256 is available for very cheap (obsolete bitcoin miners), it is also available for scrypt (scrypt-based cryptocurrency miners) but not for bcrypt. So, this makes anything based on SHA256 vulnerable to brute force attacks by orders of magnitude, while bcrypt is affected the least.

    I totally agree that choosing high entropy passwords is way more effective diminishingly small percent of people know the difference and even fewer people know how to use this properly. It is very hard to enforce entropy when creating passwords (tools like zxcvbn are too computationally expensive and there are plenty of other implementations that just doing it wrong), so for foreseeable future we stuck with `John,12345` kind of passwords and the only algorithm that has an edge (and practical) is bcrypt.

Leave a Reply to Dmitri Cancel reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.