Data breaches happen every day, and there isn’t a whole lot you can do about them. Even with secure passwords, your data is at the mercy of the websites you use and the hashing algorithms they use to store your credentials.
A hash is essentially a one-way function. It’s an operation that takes little processing power to perform but is extremely difficult to reverse, turning a user-entered string into what appears to be a string of random alphanumeric characters. Cryptographic hash functions are designed so that converting a hash back to the original data is computationally infeasible. Most websites use hashes to store passwords: when you log in, the site hashes the text you entered and compares the result to the hash it has stored for your account. This process allows the server to verify your identity without ever saving your password. Hash functions are complex because they have to produce the same sequence of characters every time the same data is entered, yet make the output look like random data.
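To make the store-and-compare idea concrete, here is a minimal sketch in Python. It assumes SHA-256 from the standard library's hashlib, and the function names hash_password and verify are made up for this illustration. A bare, unsalted hash like this is only meant to show the principle, not how a real site should store passwords.

```python
import hashlib
import hmac


def hash_password(password: str) -> str:
    """Return the hex SHA-256 digest of a password (illustration only)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def verify(entered: str, stored_hash: str) -> bool:
    """Hash the entered text and compare it to the stored hash."""
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(hash_password(entered), stored_hash)


# The server keeps only the hash, never the password itself.
stored = hash_password("correct horse")

print(stored)                           # 64 hex characters that look random
print(verify("correct horse", stored))  # same input, same hash
print(verify("wrong guess", stored))    # different input, different hash
```

The same input always produces the same digest, which is what lets the server check a login without keeping the password around.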
When attackers compromise a company to access its stored passwords, they usually only get the hashes of the passwords, not the passwords themselves. Depending on the hashing function used, recovering the password behind a hash might be possible. For SHA-1 and MD5, two of the most commonly used hashing functions, the tools to reverse a basic hash are widely available. MD5 produces a 16-byte hash value, and many vulnerabilities have been found in it. Because it is fast and often used without a salt, precomputed rainbow tables have been released that let attackers crack MD5 hashes with ease. SHA (Secure Hash Algorithm) is a family of hash functions that began with SHA-0 and was followed by SHA-1 and SHA-2. SHA-1 is the most common of the three, and the first-ever SHA-1 collision was announced by Google earlier this year. A collision occurs when two different inputs produce the same hash; I’ll go in depth on collisions later.
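If you want to see the hash sizes for yourself, the short Python sketch below prints the digest length for MD5, SHA-1, and SHA-256 (one member of the SHA-2 family). The helper name digest_sizes and the sample input are just assumptions for this example.

```python
import hashlib


def digest_sizes(data: bytes) -> dict:
    """Map each algorithm name to the length of its digest in bytes."""
    sizes = {}
    for name in ("md5", "sha1", "sha256"):
        digest = hashlib.new(name, data).digest()
        sizes[name] = len(digest)
    return sizes


sizes = digest_sizes(b"password")
for name, size in sizes.items():
    # The output size is fixed per algorithm, no matter how long the input is.
    print(f"{name}: {size} bytes ({size * 8} bits)")
```

The digest length depends only on the algorithm, so a short password and a long document both hash to the same number of bytes.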
There are variations on these hash functions that make them stronger. A salt is a string of random data added to the sensitive data before it is sent into the hash function. The hash is then stored along with the salt, so two users with the same password end up with different hashes. Other algorithms feed the hash back through the hash function many times. This deliberately makes every guess more expensive: each round depends on the one before it, so the work for a single password can’t be parallelized, and some schemes also require a large amount of memory for intermediate values.
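Here is a rough sketch of salting and repeated hashing together, using PBKDF2 from Python's standard library as one example of an iterated scheme. The iteration count, the salt length, and the function names hash_with_salt and verify_salted are assumptions picked for this illustration, not a recommendation for any particular site.

```python
import hashlib
import hmac
import secrets
from typing import Optional, Tuple

ITERATIONS = 100_000  # assumed value for this sketch; real systems tune this
SALT_BYTES = 16       # assumed salt length


def hash_with_salt(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest); a fresh random salt is made if none is given."""
    if salt is None:
        salt = secrets.token_bytes(SALT_BYTES)
    # PBKDF2 runs HMAC-SHA-256 over the salted password ITERATIONS times.
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_salted(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-hash the entered password with the stored salt and compare."""
    _, digest = hash_with_salt(password, salt)
    return hmac.compare_digest(digest, expected)


# Two users with the same password get different salts and different hashes.
salt_a, hash_a = hash_with_salt("hunter2")
salt_b, hash_b = hash_with_salt("hunter2")
print(hash_a.hex())
print(hash_b.hex())
print(verify_salted("hunter2", salt_a, hash_a))
```

Because the salt is stored next to the hash, the server can always repeat the calculation at login, but an attacker can no longer reuse one precomputed table against every account.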
So how are hashes cracked?
There are a few different ways to crack hashes. The most common is to take known passwords from previous data breaches and hash them using many different algorithms. Once attackers have created this giant table of passwords and their possible hashes, they search it for matches against the hashes they need to crack. Once some passwords are recovered, statistical analysis can be used to find patterns in them and guess new passwords against the remaining hashes. Using GPUs instead of CPUs has allowed attackers to make millions of guesses in parallel, drastically reducing the time required to break hashes. Once those hashes are broken, attackers can take the passwords and information they’ve gleaned and test them on other websites, since many people reuse the same password across their online accounts.
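The table lookup described above can be sketched in a few lines of Python. Everything here is a toy assumption: the three-word list of known passwords, the choice of algorithms, the sample leaked hashes, and the helper names build_lookup and crack. Real attacks use lists with millions of entries and only work this easily against unsalted hashes.

```python
import hashlib

ALGORITHMS = ("md5", "sha1", "sha256")  # algorithms the attacker tries


def build_lookup(known_passwords):
    """Hash every known password with every algorithm: digest -> (password, algorithm)."""
    table = {}
    for password in known_passwords:
        for algo in ALGORITHMS:
            digest = hashlib.new(algo, password.encode("utf-8")).hexdigest()
            table[digest] = (password, algo)
    return table


def crack(leaked_hashes, table):
    """Return the leaked hashes that appear in the precomputed table."""
    return {h: table[h] for h in leaked_hashes if h in table}


# Toy wordlist standing in for passwords from earlier breaches.
known = ["123456", "password", "letmein"]

# Toy "leak": two weak passwords and one that is not in the wordlist.
leaked = [
    hashlib.sha1(b"password").hexdigest(),
    hashlib.md5(b"letmein").hexdigest(),
    hashlib.sha256(b"Tr0ub4dor&3").hexdigest(),
]

found = crack(leaked, build_lookup(known))
for digest, (password, algo) in found.items():
    print(f"{digest[:12]}... is {algo} of {password!r}")
print(f"cracked {len(found)} of {len(leaked)} hashes")
```

The password that never appeared in an earlier breach is the one that survives, which is the whole argument for not reusing passwords.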
So what can we do?
As I mentioned earlier, data breaches happen all the time and there isn’t much you can do about that. To limit the damage when your data is stolen, use a unique password for each of your online accounts. If you have trouble remembering all of them, use a password manager like 1Password or LastPass. If a service you use announces a data breach, change your password for that service and for any other account where you used the same one.
Cybersecurity engineers are smart people, but we can’t promise the attackers aren’t smarter.