Hashing vs. Encryption

Hashing differs from encryption in that hashing is irreversible by design. When we encrypt a piece of text, the process can be undone with the correct key. Hashing, however, is a one-way function: there is no known practical way to mathematically undo a well-designed hash. Furthermore, as with some encryption schemes, a small change to the input changes the resulting hash almost completely. This makes hashing well suited to validating secret information like passwords, because in theory the original information cannot be recovered from the hash. Inputs of any length all produce a hash of the same fixed length, unlike most encryption schemes, where the output grows with the input.
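The fixed output length and the effect of a small input change can be seen with a few lines of Python. This is a minimal sketch using the standard hashlib module; SHA-256 is chosen here only as an example hash, and the helper names sha256_hex and differing_bits are made up for illustration.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of text as a hex string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count the bit positions in which two hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


# Inputs of very different lengths give digests of the same length.
short_digest = sha256_hex("a")
long_digest = sha256_hex("a" * 10_000)
print(len(short_digest), len(long_digest))  # 64 64 (hex characters)

# Changing a single character changes a large share of the output bits.
first = sha256_hex("password")
second = sha256_hex("passwore")
print(first)
print(second)
print(differing_bits(first, second), "of 256 bits differ")
```

For a well-behaved hash, roughly half of the output bits are expected to flip for any change to the input, however small.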

What is MD5?

The MD5 message-digest algorithm is the fifth in a series of message-digest algorithms designed by Ron Rivest, and it produces a 128-bit message digest. MD5 is fast compared with other hash functions and is still widely used for purposes that don't require security. Its speed, along with other weaknesses such as practical collision attacks, is part of the reason it is deprecated for security use.
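As a quick illustration of the 128-bit digest, the sketch below computes MD5 with Python's standard hashlib module. The helper name md5_hex is made up for this example, and it assumes a Python build where MD5 is available (some FIPS-restricted builds disable it).

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as a 32-character hex string."""
    return hashlib.md5(data).hexdigest()


# The digest is 16 bytes, i.e. 128 bits, regardless of the input.
print(hashlib.md5(b"hello world").digest_size * 8)  # 128

# 128 bits are written as 32 hexadecimal characters.
print(md5_hex(b"hello world"))
print(len(md5_hex(b"hello world")))  # 32
```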

Brute Force Attack

Although we cannot reverse a hash to recover the input, we can hash guessed inputs one after another until the result matches the target hash. If the hashes match, we have very likely found the original input we were looking for.
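The guess-and-compare loop described above can be sketched as follows. This is a toy example, not a practical cracker: the function name brute_force_md5, the lowercase-only alphabet, and the length limit of 4 are all assumptions chosen to keep the search small.

```python
import hashlib
import itertools
import string
from typing import Optional


def brute_force_md5(
    target_hex: str,
    alphabet: str = string.ascii_lowercase,
    max_length: int = 4,
) -> Optional[str]:
    """Try every string over alphabet up to max_length characters.

    Returns the first candidate whose MD5 hex digest equals target_hex,
    or None if the search space is exhausted.
    """
    for length in range(1, max_length + 1):
        for chars in itertools.product(alphabet, repeat=length):
            candidate = "".join(chars)
            if hashlib.md5(candidate.encode("utf-8")).hexdigest() == target_hex:
                return candidate
    return None


# Pretend we only know the hash of a short secret.
target = hashlib.md5(b"abc").hexdigest()
print(brute_force_md5(target))  # abc
```

The search space grows as the alphabet size raised to the password length, which is why this approach stops being practical for long inputs.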

However, more secure hashes are harder to calculate and take more time per guess. Because of this, it quickly becomes impractical to try all possible combinations. But this protection is often undermined by the predictability of user input. For example, instead of trying every possible alphanumeric string to break a password hash, attackers can use lists of passwords that have been leaked in the past, together with some mutation rules, to reduce the number of combinations tried and increase the chance of a hit. For most capture the flag competitions, the famous RockYou password dump is all you need for most challenges.
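A wordlist attack with a few mutation rules can be sketched as below. The names mutate and dictionary_attack, the specific rules (capitalization and a handful of suffixes), and the three-word sample list are all illustrative assumptions; in practice the words would be read line by line from a wordlist file such as a local copy of the RockYou dump.

```python
import hashlib
from typing import Iterable, Iterator, Optional


def mutate(word: str) -> Iterator[str]:
    """Yield the word plus a few common variations (hypothetical rules)."""
    for base in (word, word.capitalize()):
        yield base
        for suffix in ("1", "123", "!"):
            yield base + suffix


def dictionary_attack(target_hex: str, wordlist: Iterable[str]) -> Optional[str]:
    """Hash each mutated wordlist entry with MD5 and compare to target_hex."""
    for word in wordlist:
        for candidate in mutate(word.strip()):
            if hashlib.md5(candidate.encode("utf-8")).hexdigest() == target_hex:
                return candidate
    return None


# A tiny stand-in for a real leaked-password list.
sample_words = ["letmein", "dragon", "sunshine"]
target = hashlib.md5(b"Dragon123").hexdigest()
print(dictionary_attack(target, sample_words))  # Dragon123
```

Here 3 words times 8 variations gives only 24 guesses, compared with the billions of candidates an exhaustive search over 9-character strings would need.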

Collision Attack

As mentioned previously, hashes are often used for validation, acting like a special signature that proves the authenticity of the data. Because a small change to the input completely changes the resulting hash, they make it obvious when data has been tampered with.

But this has its weaknesses. Because the number of possible hashes is finite while the number of possible inputs is infinite, it is certain that two different inputs exist that result in the same hash. This is called a collision. More modern hashes make these collisions harder to find. There are also specific mathematical techniques for finding them, which are far beyond the scope of this tutorial. What is important to keep in mind is that, given enough trial and error to find a collision, it is possible to swap out the input that a hash is supposed to vouch for.
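Finding a collision by trial and error can be demonstrated on a deliberately weakened hash. The sketch below keeps only the first 6 hex characters (24 bits) of MD5 so that a collision turns up after a few thousand attempts; this truncation is an assumption made purely to keep the demo fast, and the function names are made up. It does not produce a collision on full MD5, which in practice requires the specialized mathematical attacks mentioned above.

```python
import hashlib
import itertools
from typing import Dict, Tuple


def truncated_md5(data: bytes, hex_chars: int) -> str:
    """Return only the first hex_chars characters of the MD5 hex digest."""
    return hashlib.md5(data).hexdigest()[:hex_chars]


def find_truncated_collision(hex_chars: int = 6) -> Tuple[bytes, bytes]:
    """Hash distinct messages until two share the same truncated digest."""
    seen: Dict[str, bytes] = {}
    for counter in itertools.count():
        message = f"message-{counter}".encode("utf-8")
        tag = truncated_md5(message, hex_chars)
        if tag in seen:
            # Two different messages, same truncated hash.
            return seen[tag], message
        seen[tag] = message
    raise AssertionError("unreachable: itertools.count() never ends")


first, second = find_truncated_collision()
print(first, second)
print(truncated_md5(first, 6), truncated_md5(second, 6))
```

Because of the birthday effect, a collision on an n-bit hash is expected after roughly 2^(n/2) attempts, so about 2^12 tries here, versus about 2^64 for a generic search against the full 128-bit MD5 output.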

Tools