## What is a cryptographic hash function?

A cryptographic hash function takes input of any size and securely generates a fixed-size output. Cryptographic hash functions are one-way functions: the input is used to calculate the output, but the output cannot feasibly be used to calculate the input. For example, the SHA256 algorithm always produces a 256-bit output (64 hexadecimal characters), no matter how large the input is.

Because a hash is calculated over the complete input, changing so much as one byte of the input produces a completely different hash. For example, the SHA256 hash of the phrase `aaa` is `9834876dcfb05cb167a5c24953eba58c4ac89b1adf57f28f2f9d09af107ee8f0`. However, the hash of `aab` is `38760eabb666e8e61ee628a17c4090cc50728e095ff24218119d51bd22475363`. A single changed character in the input yields an entirely different hash.

That being said, for some older and less secure hash functions there is the risk of something called a hash collision.

## Hash collisions

A hash collision occurs when two distinct inputs produce the same output when put through a cryptographic hash function. This can be incredibly dangerous from a cybersecurity standpoint. Let's use a standard login flow as an example:

1. A user provides a username and password to a login page.
2. The backend server looks up the username provided. The hash of the password tied to that username is stored in the database (this is common security best practice; never store plaintext passwords anywhere).
3. The backend server retrieves the password hash.
4. The user-provided password is then hashed, and that hash is compared to the one stored in the database.
5. If both hashes match, the server treats the user as having provided the right authentication details, and they are allowed to log in.

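
The login steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: `USER_DB`, `hash_password`, and `login` are hypothetical names, the "database" is just a dictionary, and plain SHA256 is used only to mirror the example in the text (real systems typically use a salted, deliberately slow password-hashing function).

```python
import hashlib
import hmac

# Hypothetical in-memory "database" mapping usernames to stored password hashes.
# Only the hash is stored, never the plaintext password.
USER_DB = {
    "alice": hashlib.sha256(b"correct horse battery staple").hexdigest(),
}


def hash_password(password: str) -> str:
    """Return the hex SHA256 digest of a password (simplified for illustration)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def login(username: str, password: str) -> bool:
    # Steps 2-3: look up the username and retrieve its stored password hash.
    stored_hash = USER_DB.get(username)
    if stored_hash is None:
        return False
    # Step 4: hash the password the user supplied.
    candidate_hash = hash_password(password)
    # Step 5: compare the two hashes (in constant time) and allow login on a match.
    return hmac.compare_digest(candidate_hash, stored_hash)


print(login("alice", "correct horse battery staple"))
print(login("alice", "wrong password"))
```

Note that the server only ever compares hashes. Any input that produces the stored hash would pass step 5, whether or not it is the real password.
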
If a malicious actor can provide an input that generates the same hash as the original password (without knowing the actual password), then we have a hash collision and, more importantly, they can log in and compromise the user's account.

Mathematically speaking, a fixed output size means there is only a finite number of possible outputs, while the number of possible inputs is unlimited, so some distinct inputs must share a hash. The smaller the output size, the greater the risk of a hash collision, as there is even less output space for hashes to occupy.
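
To get a feel for how output size affects collision risk, here is a rough sketch that deliberately weakens SHA256 by keeping only the first few bytes of its digest, then searches for two inputs that share the shortened hash. The truncation and the helper names `truncated_hash` and `find_collision` are assumptions made purely for illustration; no collision is publicly known for full-length SHA256.

```python
import hashlib
from itertools import count


def truncated_hash(data: bytes, n_bytes: int) -> bytes:
    """Keep only the first n_bytes of the SHA256 digest (a deliberately weak hash)."""
    return hashlib.sha256(data).digest()[:n_bytes]


def find_collision(n_bytes: int):
    """Try inputs b"0", b"1", b"2", ... until two of them share a truncated hash.

    Returns (first_input, second_input, number_of_inputs_tried).
    """
    seen = {}  # maps truncated digest -> the input that produced it
    for i in count():
        data = str(i).encode("ascii")
        digest = truncated_hash(data, n_bytes)
        if digest in seen:
            return seen[digest], data, i + 1
        seen[digest] = data


for n_bytes in (1, 2, 3):
    a, b, attempts = find_collision(n_bytes)
    print(f"{n_bytes * 8}-bit output: {a!r} and {b!r} collide after {attempts} inputs")
```

With an 8-bit output there are only 256 possible hashes, so a collision is guaranteed within 257 inputs. Each extra byte of output multiplies the number of possible hashes by 256, which is why the search takes noticeably longer as the output grows.
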