Hash Functions

Hash functions are a class of mathematical algorithms that take any digital data as input and generate a fixed-length value as output. They have many uses in computer science, including some critically important security applications: hash functions help verify the integrity of files, and they are a building block of the digital signatures used to authenticate a file's source.

If you are preparing for the CISSP, Security+, CySA+, or another security certification exam, you will need to understand how hash functions help protect systems, basic hashing terminology, and the common algorithms you may encounter. While you will not need to know the inner workings of hash functions, you will need a high-level understanding of their properties and of why some hash functions are more appropriate than others for security applications.

The Message and Message Digest

The input to the hash function is commonly called the message.  The message can be any digital data including text, images, audio, and even very large program files.

The output of the hash function is often called the message digest, but be aware that there are many other names for it, sometimes based on the way the hash function is being used. Common alternate terms include hash, hash value, hash total, CRC, fingerprint, checksum, and digital ID. Strictly speaking, a CRC or simple checksum is designed to detect accidental errors rather than deliberate tampering, but the terms are often used loosely.

How Hash Functions Are Used

Perhaps the easiest way to begin understanding how hash functions work is to consider a few hashing examples.  Below, we apply a standard security hashing algorithm called SHA-256 to three sentences:

1) Humpty Dumpty sat on a wall:
1b9ed7aa6346ee2c6e0faf0f75997b9308916160239b1844e7a77bee183f355d
2) Mary had a little lamb:
efe473564cb63a7bf025dd691ef0ae0ac906c03ab408375b9094e326c2ad9a76
3) Mary had a little lomb:
94c6af4c0b5d493181a0a762b9023b90b295fbad1e28a6f1616a4af59209ba7c

At a glance, we can identify two characteristics of a security-grade hash function. First, regardless of the length of the input, the output of the hash function has a fixed length. For SHA-256 that length is 256 bits, shown here as 64 hexadecimal characters. Second, a minor change in the input produces a significant change in the output. Examples 2 and 3 differ by a single character (lamb vs. lomb), but the outputs are completely different.
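Both observations can be reproduced with a few lines of Python. This is a minimal sketch using the standard `hashlib` module; the helper name `sha256_hex` is chosen here for illustration. The exact digests depend on the exact bytes hashed (trailing punctuation, line endings, character encoding), so the values printed may not match the ones listed above.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of text (UTF-8 encoded) as a hex string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

sentences = [
    "Humpty Dumpty sat on a wall",
    "Mary had a little lamb",
    "Mary had a little lomb",
]

digests = [sha256_hex(s) for s in sentences]

for sentence, digest in zip(sentences, digests):
    # Each hex character encodes 4 bits, so 64 characters = 256 bits.
    print(f"{len(digest) * 4} bits  {digest}  <- {sentence!r}")
```

Every line of output reports 256 bits regardless of the sentence length, and the second and third digests share no obvious resemblance despite the one-letter difference in their inputs.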

Hash functions provide a quick, easy, and reliable method to verify the integrity of a message. Suppose, for example, that an important software patch is available for system administrators to download from a website and run on their servers. If the software publisher also provides a message digest, the administrators can run the downloaded patch through the same hash function and compare the result with the published digest. If the two values match, the patch has not been infected with malware or otherwise corrupted in transit.
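The check an administrator would perform can be sketched as follows. This is an illustration only: the function names `file_sha256` and `matches_published_digest` are hypothetical, and SHA-256 is assumed as the publisher's algorithm. The file is read in chunks so that very large patches do not need to fit in memory.

```python
import hashlib

def file_sha256(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in fixed-size chunks and return the SHA-256 hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def matches_published_digest(path: str, published_hex: str) -> bool:
    """Compare the file's digest with the value the publisher posted.

    The published value is normalized (whitespace stripped, lowercased)
    because digests are often copied from a web page or a text file.
    """
    return file_sha256(path) == published_hex.strip().lower()

# Example usage (file name and digest are placeholders, not real values):
#   if not matches_published_digest("patch.bin", "<digest from vendor site>"):
#       raise SystemExit("Digest mismatch: do not install this file")
```

The comparison is only as trustworthy as the published digest itself, so the digest should come from a source the attacker could not also have modified.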

The Characteristics of Cryptographic Hash Functions

While there are many hashing algorithms, not all of them are well suited for security purposes. Hash functions that are suitable for security use are called cryptographic hash functions. The ideal cryptographic hash function meets the following basic requirements:

  • It accepts a message of any length
  • It produces a fixed-length message digest
  • It is easy (and therefore fast) to compute the message digest for any given message
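  • The hash is irreversible: it is not computationally feasible to recover a message from its message digest
  • A small change in a message should generate large changes in the message digest
  • The hash is collision resistant, meaning it is not computationally feasible to find two different messages that result in the same hash value (collisions must exist, because there are more possible messages than digests, but they should be impractical to find)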
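The "small change, large change" requirement can be made concrete by counting how many output bits differ between two digests. The sketch below uses a hypothetical helper, `differing_bits`, with SHA-256 as the example algorithm. For a well-behaved cryptographic hash, roughly half of the 256 output bits are expected to flip, although the exact count varies from input to input.

```python
import hashlib

def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions at which SHA-256(a) and SHA-256(b) differ."""
    da = int.from_bytes(hashlib.sha256(a).digest(), "big")
    db = int.from_bytes(hashlib.sha256(b).digest(), "big")
    # XOR leaves a 1 in every position where the two digests disagree.
    return bin(da ^ db).count("1")

changed = differing_bits(b"Mary had a little lamb", b"Mary had a little lomb")
print(f"{changed} of 256 bits differ")
```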

Common Hashing Algorithms

Over the years, numerous hashing algorithms have been introduced and recommended for security purposes. Some have fallen out of favor after weaknesses were identified. The exam will expect you to be familiar with current algorithms, along with some of the older ones. Review the following descriptions and pay particular attention to the length of the message digest for each algorithm.

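Secure Hash Algorithm (SHA-1) Designed by the U.S. National Security Agency (NSA) and published as a federal standard by NIST. Produces a 160-bit message digest. No longer recommended: collision attacks proved practical, and a real-world collision was demonstrated publicly in 2017.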

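Secure Hash Algorithm (SHA-2, SHA-3) SHA-2 replaced SHA-1 and includes a set of variants: SHA-224, SHA-256, SHA-384, and SHA-512, plus the truncated forms SHA-512/224 and SHA-512/256. The number in the variant indicates the length of the digest in bits. SHA-224 has a 224-bit message digest, and so on. SHA-3 was standardized in 2015 as a “drop in replacement” for SHA-2. It is built on a different internal design and offers the same four digest lengths (SHA3-224, SHA3-256, SHA3-384, and SHA3-512). SHA-2 has not been broken, so both families remain in use.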

Message Digest 5 (MD5) MD5 and its predecessors MD2 and MD4 produce 128-bit digests. These algorithms are no longer recommended due to known vulnerabilities, including practical collision attacks against MD4 and MD5.

Message Digest 6 (MD6) Developed in 2008 to replace MD5 and earlier MD types. Produces variable-length message digests of up to 512 bits.

Hash of Variable Length (HAVAL) A similar algorithm to MD5 that produces variable-length digests of 128, 160, 192, 224, or 256 bits. HAVAL is vulnerable to collisions.
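The digest lengths of the SHA and MD5 entries above can be checked directly. The following is a sketch using Python's standard `hashlib` module; MD6 and HAVAL are not included because `hashlib` does not provide them. The helper name `digest_bits` is illustrative, and some hardened or FIPS-mode builds of Python may refuse to construct `md5`.

```python
import hashlib

# Algorithm names as spelled by Python's hashlib module.
names = ["md5", "sha1", "sha224", "sha256", "sha384", "sha512",
         "sha3_224", "sha3_256", "sha3_384", "sha3_512"]

def digest_bits(name: str) -> int:
    """Return the digest length in bits for a hashlib algorithm name."""
    # digest_size is reported in bytes, so multiply by 8.
    return hashlib.new(name).digest_size * 8

sizes = {name: digest_bits(name) for name in names}

for name, bits in sizes.items():
    print(f"{name:<10} {bits}-bit digest")
```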

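Hash-based Message Authentication Code (HMAC) Not a hash algorithm in its own right, but a construction that combines a shared secret key with an underlying hash function, so that only someone who holds the key can produce or verify the result. This adds authentication of the sender to the integrity check. The length of the output varies because it matches the digest length of the underlying hash function.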
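A minimal sketch of HMAC in practice, using Python's standard `hmac` module with SHA-256 as the underlying hash. The function names `make_tag` and `verify_tag`, the key, and the message are all made up for illustration. A real key would be generated randomly and stored securely.

```python
import hashlib
import hmac

def make_tag(key: bytes, message: bytes) -> str:
    """Compute an HMAC-SHA256 tag for message under the shared key."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()

def verify_tag(key: bytes, message: bytes, tag_hex: str) -> bool:
    """Recompute the tag and compare it with the one received.

    hmac.compare_digest runs in constant time, which avoids leaking
    how many leading characters of a forged tag were correct.
    """
    return hmac.compare_digest(make_tag(key, message), tag_hex)

key = b"example-shared-key"            # illustrative value only
message = b"Transfer 100 to account 42"
tag = make_tag(key, message)

print(verify_tag(key, message, tag))                          # same key and message
print(verify_tag(key, b"Transfer 900 to account 42", tag))    # altered message
print(verify_tag(b"wrong-key", message, tag))                 # wrong key
```

Only the first check succeeds. Unlike a plain digest, the tag cannot be recomputed by an attacker who alters the message, because the attacker does not hold the key.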

Understanding hash functions is an important component of your preparation for a variety of security certification programs. If you’re interested in earning your next security certification, sign up for the free CertMike study groups for the CISSP, Security+, SSCP, or CySA+ exam.
