Hashing vs. Encryption
Hashing differs from encryption in that hashing is irreversible by design. When we encrypt a piece of text, the process
can be undone with the correct key. Hashing, however, is a one-way function: no practical method is currently known for
mathematically inverting a secure hash. Furthermore, as with some encryption schemes, a small change to the input
changes the resulting hash almost completely. This makes hashing well suited to validating secret information such as
passwords, because in theory the original information cannot be recovered from the hash. Inputs of any length all
produce a hash of the same fixed length, unlike most encryption schemes, where the output grows with the input.
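The two properties above (fixed output length and a large change in the hash for a small change in the input) can be
checked with a short sketch. It assumes Python's standard hashlib module and uses SHA-256 purely as an example hash;
the helper names sha256_hex and differing_bits are made up for this illustration.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of text as a hex string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count the bit positions at which two equal-length hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


# Inputs of very different lengths produce digests of the same length.
short_digest = sha256_hex("a")
long_digest = sha256_hex("a" * 10_000)
print(len(short_digest), len(long_digest))  # 64 64 (hex characters, i.e. 256 bits)

# Changing a single character changes a large share of the 256 output bits.
d1 = sha256_hex("password1")
d2 = sha256_hex("password2")
print(differing_bits(d1, d2), "of 256 bits differ")
```

For a well-behaved hash, the bit count printed on the last line should land somewhere near half of the output bits.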
What is MD5?
MD5 (the MD5 message-digest algorithm) is the fifth in a series of message-digest algorithms developed by Ron Rivest.
It produces a 128-bit message digest. MD5 is fast compared with many other hash functions and is still widely used for
purposes that don't require security, such as checksums. Its speed, which makes each guess cheap for an attacker, is
part of the reason behind its deprecation, along with other weaknesses such as practical collision attacks.
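As a quick illustration, the sketch below computes an MD5 digest with Python's standard hashlib module and confirms the
128-bit output size. The input "hello" is an arbitrary example; on some hardened (for instance FIPS-restricted) Python
builds, hashlib.md5 may be unavailable.

```python
import hashlib

# Hash an arbitrary example input with MD5.
digest = hashlib.md5(b"hello")

print(digest.hexdigest())      # 5d41402abc4b2a76b9719d911017c592
print(digest.digest_size * 8)  # 128 (digest_size is given in bytes)
```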
Brute Force Attack
Although we cannot reverse a hash to recover its input, we can hash guessed inputs one after another until the
resulting hash matches the target hash. If the hashes match, we have very likely found the original input we were
looking for.
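A minimal sketch of this guess-and-compare loop follows. It assumes the target is an MD5 hash of a short lowercase
string; the function name brute_force_md5, the alphabet, and the length limit are illustrative choices, not part of any
particular tool.

```python
import hashlib
import itertools
import string


def brute_force_md5(target_hex, alphabet=string.ascii_lowercase, max_len=4):
    """Hash every string over alphabet up to max_len; return the first match or None."""
    for length in range(1, max_len + 1):
        for chars in itertools.product(alphabet, repeat=length):
            guess = "".join(chars)
            if hashlib.md5(guess.encode()).hexdigest() == target_hex:
                return guess
    return None  # search space exhausted without a match


# Example: pretend we only know the hash, not the input "ctf".
target = hashlib.md5(b"ctf").hexdigest()
recovered = brute_force_md5(target)
print(recovered)  # ctf
```

The number of guesses grows exponentially with max_len, so this only works for very short or very constrained inputs.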
However, the more secure hashes, particularly those designed for storing passwords, are more expensive to compute and
take more time per guess. Because of this, it quickly becomes infeasible to try all possible combinations. This
protection is often undermined by the predictability of user input. For example, instead of trying every possible
alphanumeric string to break a password hash, attackers can use lists of passwords that have been leaked in the past,
together with some rules, to reduce the number of combinations tried and increase the chance of a hit. For most
capture the flag competitions, the famous RockYou password dump is all you need for most challenges.
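The wordlist-plus-rules idea can be sketched as follows. The three-word list is a hypothetical stand-in for a real
leaked-password file, and the rules (capitalization and a few common suffixes) are assumptions chosen for illustration;
dedicated cracking tools apply far larger rule sets.

```python
import hashlib


def candidates(words):
    """Yield each word plus a few simple rule-based variants of it."""
    for word in words:
        for base in (word, word.capitalize()):
            yield base
            for suffix in ("1", "123", "!"):
                yield base + suffix


def dictionary_attack(target_hex, words):
    """Return the first candidate whose MD5 hash equals target_hex, or None."""
    for guess in candidates(words):
        if hashlib.md5(guess.encode()).hexdigest() == target_hex:
            return guess
    return None


# Hypothetical stand-in for a leaked-password list.
wordlist = ["password", "dragon", "letmein"]

target = hashlib.md5(b"Dragon123").hexdigest()
found = dictionary_attack(target, wordlist)
print(found)  # Dragon123
```

Here only 24 candidates are hashed, whereas an exhaustive search over nine-character alphanumeric strings would need
trillions of guesses.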
Collision Attack
As mentioned previously, hashes are often used to validate input, acting like a special signature that proves the
authenticity of the data. Because a small change completely changes the resulting hash, they make it obvious when data
has been tampered with.
But this has its weaknesses. Because the number of possible hashes is finite while the number of possible inputs is
unlimited, it is certain that two different inputs can produce the same hash (the pigeonhole principle). Such a pair is
called a collision. More modern hashes make these collisions harder to find. There are also specific mathematical
techniques for finding them, which are far beyond the scope of this tutorial. What is important to keep in mind is
that, given enough trial-and-error attempts to find a collision, it is possible to swap out the input that a hash
vouches for.
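The trial-and-error approach can be demonstrated on a deliberately weakened hash. The sketch below assumes a toy hash
made by keeping only the first 6 hex characters (24 bits) of MD5, so that a collision appears after a few thousand
attempts; the names truncated_md5 and find_collision are made up for this example. It does not find collisions in full
MD5, which requires the specialized techniques mentioned above.

```python
import hashlib


def truncated_md5(data: bytes, hex_chars: int = 6) -> str:
    """Toy hash: the first hex_chars hex characters of the MD5 digest."""
    return hashlib.md5(data).hexdigest()[:hex_chars]


def find_collision(hex_chars: int = 6):
    """Hash distinct messages until two of them share the same truncated hash."""
    seen = {}  # truncated hash -> first message that produced it
    counter = 0
    while True:
        message = f"message-{counter}".encode()
        tag = truncated_md5(message, hex_chars)
        if tag in seen:
            return seen[tag], message, tag
        seen[tag] = message
        counter += 1


first, second, tag = find_collision()
print(first, second, tag)
```

Every message tried is different, so the two returned inputs are guaranteed to be distinct while sharing the same toy
hash. Each extra hex character kept makes the search roughly four times longer, which is why full-length modern hashes
resist this naive approach.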
Tools