Continuing our series on cryptography, this article covers hash functions and their importance to blockchain and to cryptography in general. Starting with a simple definition, a cryptographic hash function is a one-way function that maps data of arbitrary size to data of fixed size. The input to the hash function is called the pre-image and the output is called the hash (or digest). The one-way property, also known as pre-image resistance, means that it is computationally infeasible to recover the pre-image when only the output hash is known. A brute-force approach is practically impossible: for an n-bit hash, finding a pre-image takes on the order of 2^n attempts, which is far out of reach for n = 256.
Therefore, a cryptographic hash function should have the following properties:
• Determinism: A given input message always produces the same hash output.
• Pre-image resistance: The one-way property described above.
• Easy to compute: Computing the hash of a message is efficient. The cost grows linearly with the message length, so a larger message takes proportionally longer to hash.
• Avalanche effect: Any change to the message, even a single bit, should change the hash output completely, so that there is no apparent correlation between the outputs.
• Irreversibility: As previously described, computing the message from its hash is infeasible.
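• Collision resistance: Strong collision resistance means that it is computationally infeasible to find two different input messages that produce the same output. Collisions must exist, since the input space is larger than the output space, but finding one should be impractical.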
• Large output space: Output space large enough to make a brute force search infeasible.
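Some of the properties above can be observed directly. The following is a minimal sketch using SHA-256 from Python's standard `hashlib` module; the helper names `sha256_digest` and `differing_bits` and the sample messages are illustrative choices, not part of any standard. It checks determinism, the fixed output size and the avalanche effect.

```python
import hashlib


def sha256_digest(message: bytes) -> bytes:
    """Return the 32-byte SHA-256 digest of a message."""
    return hashlib.sha256(message).digest()


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length digests differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


first = sha256_digest(b"blockchain")
again = sha256_digest(b"blockchain")
changed = sha256_digest(b"Blockchain")   # only the first letter differs
large = sha256_digest(b"x" * 1_000_000)  # a much larger input

# Determinism: the same input always gives the same digest.
print(first == again)

# Fixed output size: 32 bytes (256 bits) regardless of the input size.
print(len(first), len(large))

# Avalanche effect: a one-letter change flips a large share of the output bits.
print(differing_bits(first, changed), "of 256 bits differ")
```

For a well-behaved hash function, the last line should report a count close to half of the 256 output bits.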
With these properties, hash functions can be used in several applications, such as data fingerprinting, message integrity (error detection), Proof of Work, authentication, hash tables, pseudorandom number generators and peer-to-peer file sharing. They also play a vital role in blockchain. In Bitcoin's Proof of Work (PoW), SHA-256 is applied twice to the block header in order to verify the computational effort spent by miners. Digital signature algorithms hash the message as part of the signing process. Finally, the blocks of the blockchain are chained by block headers that contain the hash of the previous block header, so any change to a block changes its hash and is immediately detected.
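The chaining and Proof of Work ideas can be sketched with a toy example. This is not Bitcoin's real header format or difficulty rule: the header layout (previous hash, payload, 4-byte nonce), the `zero_bits` difficulty parameter and all function names are assumptions made for illustration. Only the double SHA-256 and the "header contains the previous header's hash" idea come from the description above.

```python
import hashlib

ZERO_BITS = 12  # toy difficulty: the hash must start with 12 zero bits


def double_sha256(data: bytes) -> bytes:
    """Apply SHA-256 twice, as Bitcoin does for block header hashes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def meets_target(header_hash: bytes, zero_bits: int = ZERO_BITS) -> bool:
    """Toy PoW check: the hash, read as an integer, must be below the target."""
    return int.from_bytes(header_hash, "big") < (1 << (256 - zero_bits))


def mine(prev_hash: bytes, payload: bytes) -> bytes:
    """Try nonces until the header's double SHA-256 meets the target."""
    nonce = 0
    while True:
        header = prev_hash + payload + nonce.to_bytes(4, "little")
        if meets_target(double_sha256(header)):
            return header
        nonce += 1


def build_chain(payloads: list[bytes]) -> list[bytes]:
    """Mine one header per payload, each committing to the previous hash."""
    chain, prev_hash = [], bytes(32)  # all-zero hash before the first block
    for payload in payloads:
        header = mine(prev_hash, payload)
        chain.append(header)
        prev_hash = double_sha256(header)
    return chain


def verify_chain(chain: list[bytes]) -> bool:
    """Check every link to the previous header and every PoW target."""
    prev_hash = bytes(32)
    for header in chain:
        header_hash = double_sha256(header)
        if header[:32] != prev_hash or not meets_target(header_hash):
            return False
        prev_hash = header_hash
    return True


chain = build_chain([b"tx-batch-1", b"tx-batch-2", b"tx-batch-3"])
print(verify_chain(chain))  # the untouched chain verifies

# Change one byte of the first block's payload (byte 32 follows the prev hash).
tampered = list(chain)
tampered[0] = tampered[0][:32] + b"X" + tampered[0][33:]
print(verify_chain(tampered))  # the tampering is detected
```

Changing the first block alters its hash, so the second header no longer points to it. Repairing the chain would require re-mining every later block, which is what makes the spent computational effort meaningful.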
There are several classes of hash functions and algorithms. The most relevant for blockchain technology belong to the Secure Hash Algorithm (SHA) family: SHA-256, used in Bitcoin, and Keccak-256, used in Ethereum. SHA-256 accepts input messages of less than 2^64 bits and outputs a 256-bit digest. The algorithm takes an iterated hash function approach, in which the input message is split into 512-bit blocks and a compression function is applied block by block, over multiple rounds, to produce the final digest. Keccak is the algorithm that was standardized as SHA-3, and it has a different structure from SHA-256. It uses a newer approach called the sponge construction, which absorbs the input into an internal state and then squeezes the output out of it, applying a fixed permutation between steps. Note that Ethereum's Keccak-256 follows the original Keccak submission and uses a different padding from the finalized SHA-3 standard, so the two produce different digests for the same input. Ethereum addresses are unique identifiers derived from public keys using the Keccak-256 hash function. As described in the previous Public Key article, the public key is derived from the private key using an Elliptic Curve Cryptography algorithm. Keccak-256 is then applied to the public key, and the 20 least significant bytes (the last 20 bytes) of the result are used as the Ethereum address.
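The sponge construction and the address derivation can be sketched in pure Python. This is an unoptimized teaching sketch, not production code. Python's standard `hashlib` provides the finalized SHA-3 (`hashlib.sha3_256`) but not Ethereum's Keccak-256, so the sponge is written out by hand here. The function names and the placeholder public key are assumptions for illustration; a real public key would be the 64-byte uncompressed elliptic curve point (X followed by Y).

```python
import hashlib

MASK64 = (1 << 64) - 1
RATE = 136  # bytes absorbed per block for a 256-bit output (1088-bit rate)


def rol64(value: int, shift: int) -> int:
    """Rotate a 64-bit lane to the left."""
    shift %= 64
    return ((value << shift) | (value >> (64 - shift))) & MASK64


def keccak_f1600(lanes):
    """Keccak-f[1600] permutation over a 5x5 array of 64-bit lanes."""
    lfsr = 1
    for _ in range(24):
        # Theta: mix each column's parity into the neighbouring columns.
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        d = [c[(x + 4) % 5] ^ rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # Rho and pi: rotate each lane and move it to a new position.
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], rol64(current, (t + 1) * (t + 2) // 2)
        # Chi: the non-linear step, applied row by row.
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # Iota: XOR in the round constant, generated by a small LFSR.
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    return lanes


def sponge_256(data: bytes, suffix: int) -> bytes:
    """Sponge with a 256-bit output; the suffix byte selects the padding."""
    padded = bytearray(data)
    padded.append(suffix)
    while len(padded) % RATE:
        padded.append(0)
    padded[-1] ^= 0x80
    lanes = [[0] * 5 for _ in range(5)]
    # Absorb: XOR each block into the state, then permute.
    for offset in range(0, len(padded), RATE):
        block = padded[offset:offset + RATE]
        for i in range(RATE // 8):
            lanes[i % 5][i // 5] ^= int.from_bytes(block[8 * i:8 * i + 8], "little")
        lanes = keccak_f1600(lanes)
    # Squeeze: the first four lanes give the 32 output bytes.
    return b"".join(lanes[i % 5][i // 5].to_bytes(8, "little") for i in range(4))


def keccak_256(data: bytes) -> bytes:
    """Original Keccak padding (suffix 0x01), as used by Ethereum."""
    return sponge_256(data, 0x01)


def ethereum_address(public_key: bytes) -> bytes:
    """Last 20 bytes of Keccak-256 over the 64-byte public key."""
    return keccak_256(public_key)[-20:]


# Sanity check: with the SHA-3 suffix (0x06) the sponge matches hashlib.
print(sponge_256(b"abc", 0x06) == hashlib.sha3_256(b"abc").digest())

placeholder_key = bytes(range(64))  # hypothetical bytes, not a real public key
address = ethereum_address(placeholder_key)
print("0x" + address.hex())
```

The only difference between `keccak_256` and SHA3-256 in this sketch is the padding suffix byte, which is why the two functions return different digests for the same input.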
With this, one can see how important cryptographic algorithms are to the advantages and characteristics of blockchain. These technologies are under constant study and development, and are also being applied to scalability, with Zero-Knowledge Proofs (ZKPs) being used in Ethereum rollups. These proofs make it possible to prove the validity of an assertion without revealing the information behind it, and will be the topic of future articles.