Implementing MD5 From Scratch Using Rust

June 11, 2023

I've been tinkering with Rust alot this past month and I decided to build a from scratch implementation of the md5 hashing algorithm. It was a fun project, so I figured I'd write a quick article about how md5 works and how I implemented it. For reference the source code is public on my personal github. (github.com/mdimovich/rusty_md5)

MD5 - A Short Overview

MD5, which is short for message digest algorithm 5, was a widely used cryptographic hash function until it was discovered to have security flaws that inevitably led to its decreased use. However, the md5 algorithm is still an important piece of computer history, and learning how it works is a fun dive into how hashing algorithms generate fixed length outputs.

The original MD5 rfc was written by Ronald Rivest in 1992. The MD5 rfc was a successor to the previously written MD4 rfc, which was a similar hashing algorithm that was more lightweight and fast. MD5 was meant to be a more robust replacement to MD4, and it added some extra steps for increased security.

Despite being more conservative in its implementation, MD5 is not considered a secure hashing algorithm anymore, and it is essentially useless for cryptographic purposes.

MD5 - Algorithm Details

Alright lets jump into the weeds a bit. The MD5 algorithm is capable of taking a variable-length quantity of data and converting it into a 128-bit digital fingerprint, called a "message digest". The idea of a message digest is that it is used to serve as a unique hash of a given file or string, such that no other file or string will have the same digest. As said before, with MD5 this has proven to be untrue, but it is the intention of the algorithm.

The algorithm can be broken into 5 steps:

Input Padding
Length Appending
Buffer Initialization
Message Processing
Output

Let's start from the beginning!

a quick note: it's worth stating that many of the operations in md5 are managed in a way where the lower-order bytes are listed first. this is made apparent in the source code, but its worth stating here.

Step 1: Bit Padding

For the MD5 algorithm to work correctly, the total size of our message must be a multiple of 512 bits. Our message can be any arbitrary bits, such as a file or string, as long as it can be converted to bits. For this first bit padding stage of the algorithm, we want our bit size to be equal to msg length % 512 == 448. We'll be using the extra 64 bits in the next step.

(note: if you are unfamiliar with the percentage operator, it is called the modulo operator, and it basically tells us that we want to ensure that when we divide our message length by 512, the remainder we get is 448)

Bit Padding Operation

The first thing we do is append a single "1" bit to the end of our bit-encoded message. Then we begin checking against the total length. Until message length % 512 = 448, we will append "0" bits to the end of our message. At minimum we will only append one bit, and at most we will append 512 bits.

Step 2: Appending Length

Once we have our appended bits such that the length of our message is equivalent to 448 % 512, we want to take a 64-bit representation of our original message length and append it to our current bit padded message. Now, this only works if our length is less than 2^64. In the case where our length is larger than that, we use only the lower order 64 bits.

Once our length is appended, we have a message that is a multiple of 512 bits in length, which is what we need to begin computing our message digest.

Step 3: Buffer Initialization + Table + Bitwise Operation Declaration

To arrive at our final 128-bit message, we use 4 32 bit registers (or buffers) to perform operations on. We call these 4 buffers A,B,C, and D.

To start the process, we initialize these 4 buffers to constants listed in the paper.

word A = 01 23 45 67
word B = 89 ab cd ef
word C = fe dc ba 98
word D = 76 54 32 10

Alongside these buffers, we also have a 64 element table, that we fill with constant values based on a mathematical operation using the absolute value of sine.

Here is that table formula:

for i in range(1,64):
	2^32 * abs(sin(i)), where i is in radians //only the integer part, not float

Now we also have 4 functions that we use to perform our operations and arrive at our final message digest. These 4 functions are called F, G, H, and I.

They are each bitwise operations performed on the 4 buffers.

F(X,Y,Z) = (X & Y) | (!X & Z)
G(X,Y,Z) = (X & Z) | (Y & !Z)
H(X,Y,Z) = (X xor Y xor Z)
I(X,Y,Z) = Y xor (X | !z)

These 4 operations serve as the crux of the algorithm.

Step 4: Message Processing

In this phase, we break down our message into 512 bit chunks, and we begin the process of performing our operations.

We loop through each 512 bit chunk and start by setting the initial values of our registers A,B,C,D to values called AA,BB,CC, and DD. This is to save their initial values for use later. We also want to copy our 512 bit block into a new array or vector for use in our operations.

Now comes the meat of the MD5 algorithm: We perform a series of 64 operations on the 4 registers with our 4 bitwise operation functions. There are 4 rounds in total, each round split into 16 operations.

Let's take a look at round 1: In round 1 we are given a series of numbers to denote the operation: [abcd k s i]

Those 7 variables plug into this formula: a = b + ((a + F(b,c,d) + X[k] + T[i]) <<< S)

This may look confusing at first glance but its actually pretty straightforward. Let's take a look at the very first operation performed in round one which is given as [ABCD 0 7 1]:

A = B + ((A + F(B,C,D) + X[0] + T[1]) <<< 7)

So for A,B,C, and D all we have to do is plug in the registers at their correct location in the equation. k and i both serve as indices: k is our index for our message, and i is the index for the table with the sine equation we constructed. s is a integer value that we use to shift our entire result left. So in this case once our internal equation is evaluation, we shift our entire result left by 7 bits.

I won't list all the operations here because that would be pretty wordy, but you can read them in the original rfc or in the source code. Its basically 64 operations just like the above.

Once all 64 operations are complete, we take AA,BB,CC, and DD which we created at the start of our operations, and we add them to the original message like this:

A = A + AA
B = B + BB
C = C + CC
D = D + DD

Step 5: Output

Now that we have all 4 registers complete, we can use them to output our final message. We create our final message by taking A,B,C, and D and outputting them together, beginning with the least-significant byte of A, and ending with the most-significant byte of D.

And that's it! That's how MD5 works.

Now that we understand the algorithm, lets talk about my implementation process.

Implementing MD5 In Rust

I set out on this journey mostly to learn about how MD5 worked, and to get more confident writing in Rust. My intention was never to make a production-ready implementation, but I am happy with the result! In the current iteration of the project, there are some constraints. Rigth now it only supports valid utf-8 strings. This is because I am using the &str type, which doesn't support all valid unicode chars. Also as of right now, I am only supporting strings and not files (but that may change if I decide to tinker more). One other point of note is I wanted to build everything using strictly the Rust standard library, and not use any external crates.

As a basic example, I started with a simple &str of the character "a", which if all works correctly, should return a message digest of "0cc175b9c0f1b6a831c399e269772661". I also created a tests.rs file to run through a series of unit tests to ensure the algorithm is working correctly. Each of the test cases are taken directly from the RFC, and are the original test cases from the C implementation.

Utility Functions and Bitwise Functions

The first thing I did was implement the functions that were clearly defined in the RFC, such as the 64 element table construction function and the 4 bitwise operation functions F,G,H, and I. For this I created 2 functions: the formula function, and one for building the table itself.

/*
* The 64 Element Table
*/

// The table formula function:
// K[i] = floor(2^32 * abs(sin(i))
fn table_construction_function(i: u32) -> u32 {
	let x: f64 = i as f64; // cast i to a 64-bit float
	let sin_eval = x.sin().abs(); // get the absolute value of the sine of x (where x in radians)

	// note: 4294967296 == 2^32
	return (4294967296.0 * sin_eval) as u32; // cast it back to unsigned int, cutting off decimal places
}

// Build our table
fn construct_value_table() -> Vec<u32> {
	let mut t: Vec<u32> = Vec::new();
	t.push(0x00000000); //push in a padding value so we can follow rfc, which is 1-indexed

	for i in 1..=64 {
		t.push(table_construction_value(i));
	}
	return t;
}

/*
* The Bitwise Functions
*/

fn f(x: u32, y: u32, z: u32) -> u32 {
    x & y | !x & z
}

fn g(x: u32, y: u32, z: u32) -> u32 {
    x & z | y & !z 
}

fn h(x: u32, y: u32, z: u32) -> u32 {
    x ^ y ^ z
}

fn i(x: u32, y: u32, z: u32) -> u32 {
    y ^ (x | !z)

The bitwise functions were pretty straightforward to implement, given that all of the necessary operations are built into Rust. Each function takes in 3, 32 bit unsigned integers as input, and returns a single 32 bit unsigned int as output.

note: in rust the ^ operator is a bitwise xor.

Bit Padding and Appending Length

The next thing I needed to do was build the bit padding and length appending steps of the md5 algorithm.

To start I created a function called bit_padding and took in a &str called input as a parameter, and returned out a vector of unsigned 8 bit ints. The first thing I needed to do was convert the original message to byte chunks, so I decided to use a u8 vector for that. So I wrote a function called convert_str_to_vec to handle that operation.


// this should only work with utf-8 encoding and not full unicode support
// due to multi-byte unicode chars
fn convert_str_to_vec(input: &str) -> Vec<u8> {
    let mut byte_vec: Vec<u8> = Vec::new();
	// we can easily create our vector by calling extend on 
	// the vector and converting the input to bytes using as_bytes()
    byte_vec.extend(input.as_bytes()); 
    return byte_vec;
}

One other function we need to write before putting together the bit_padding function is a function that takes the 64 bit length and splits it into 8 u8 chunks so that it can be pushed into our vector.


fn split_u64_to_u8_array(s: u64) -> [u8; 8] {
	// we are taking each 8 bit chunk, and bit shifting right so that we get the correct piece.
	// this allows us to split up our u64 into 8 u8s and return an array of u8's
    let u8_array = [
        s as u8,
        (s >> 8) as u8,
        (s >> 16) as u8,
        (s >> 24) as u8,
        (s >> 32) as u8,
        (s >> 40) as u8,
        (s >> 48) as u8,
        (s >> 56) as u8,
    ];
    return u8_array;
}

Now that we have our conversion and split functions, we can write our bit_padding function:

fn bit_padding(input: &str) -> Vec<u8> {
    let mut input_vector: Vec<u8> = convert_str_to_vec(input);

	// we multiply our length by 8 to get our length in bits instead of bytes
    let bit_length: u64 = (input.len() as u64) * 8u64; 

    // 128_u8 is the equivalent of padding 1 as an unsigned 8-bit integer
    // with lower-order bits first. (i.e. 10000000 in binary)
    input_vector.push(128_u8);
    
    //check if bit length % 512 is 448 (64 less than 512)
    while (input_vector.len() * 8) % 512 != 448 {
        input_vector.push(0_u8); // push in another 8-bit 0 padded value until the correct
                                 // result is reached;
    }

    let length_bits_as_u8_array = split_u64_to_u8_array(bit_length);
    // extend our vector with our length bits
    input_vector.extend(length_bits_as_u8_array);

    return input_vector;
}

Initialize Buffers and Create A U32 Vector

Now that we have our vector bytes and our message divisible by 512, we can begin the process of computing our message digest.

The first thing we need to do is initialize our 4 32-bit buffers, a,b,c, and d. I did that like this:

    let mut word_a = 0x67452301u32;
    let mut word_b = 0xefcdab89u32;
    let mut word_c = 0x98badcfeu32;
    let mut word_d = 0x10325476u32;

We have the bytes in such a way that the lower order bytes are first, so we are essentially swapping the position of each byte. For example: 67 comes at the beginning of the initialization, not the end.

Next we want to construct our 64 (or in our case 65) element table.

let table = construct_value_table();

With all of that, we are ready to start creating our message digest.

MD5 works by processing each 512 bit chunk as 16 "words". So in this case, each "word" will equal 32 bits. (32 x 16 = 512)

Rust has a great vector method called chunks_exact_mut, which lets us give it a byte size, and it will chunk our vector into that amount of bytes as a mutable slice. We will give it 64 bytes as a parameter, which will let us process our 512 bit chunk.

Also, since we need to process each 512 bit chunk as 16 words, we need to convert our u8 chunk into a u32 vector chunk. So let's write a function to do that:

fn convert_u8_chunk_to_u32(chunk: &mut [u8]) -> Vec<u32> {
    let mut x: Vec<u32> = Vec::new();

    let mut count = 0;

	// a temporary vector to hold each u8 piece of
	// our new 32 bit ints
    let mut temporary_vec: Vec<u8> = Vec::new();
    
    // iterate over our block and take
    // our 8 bit ints and convert them to
    // 32 bit integers
    for i in 0..chunk.len() {
        temporary_vec.push(chunk[i]);
        count += 1;
        if count == 4 {
            let temp_arr: [u8; 4] = vec_to_array(temporary_vec.clone());
			// construct an unsigned 32 bit int from our temporary array
			// (most modern systems use little endian, so we can use from_ne_bytes to convert)
            let value = u32::from_ne_bytes(temp_arr);
            x.push(value);
            // reset the count and temp vector after each new u32 is added to our return vector
            count = 0;
            temporary_vec.clear();
        }
        // we know this will always process every 8 bit chunk because our total amount is divisible by
        // 512, which is also divisible by 4
    }
    return x;
}

We also write a function for this piece of code called vec_to_array, which takes in our vector and converts it into an arrays so that we can use from_ne_bytes

// utility function to convert a vector of type T with size N into an array of type T with size N.
fn vec_to_array<T, const N: usize>(v: Vec<T>) -> [T; N] {
    v.try_into()
        .unwrap_or_else(|_v: Vec<T>| panic!("error converting vector to array - sizes don't match"))
}

But before we move further with this portion of the code, we need to write our round functions that actually do all our bitwise operations.

Round 1-4 Functions (and macro time)

Now that we have this, we also need each of the functions for our rounds. This is mostly a mechanical undertaking, since there are 64 total operations we need to do, and we need to specify each manually, since each of them are different.

This a perfect use case for a rust macro!

We can specify each operation as a macro, and then specify which registers we need to use on each macro call. So let's write a function called round_one_operation


/**
* Round 1 function is specified as 16 operations using the following:
* Let [abcd, k s i] denote the operation -
* a = b + ((a + f(b,c,d) + x[k] + table[i]) <<< s);
* where x[k] is a 32 bit chunk of our original message,
* table[i] is an entry generated by our sine function, s is an unsigned
* integer, and a,b,c,d are our 32 bit registers we perform operations on.
*/
fn round_one_operations(
    mut a: u32,
    mut b: u32,
    mut c: u32,
    mut d: u32,
    table: &Vec<u32>, // pass in refs to our u32 vec + our table
    x: &Vec<u32>,
) -> [u32; 4] {

	// we are using wrapping_add instead of normal addition because it is 
	// easy to overflow during these operations, and we don't want that to 
	// happen. we just want them to wrap back around instead of an overflow
	// error.
    macro_rules! round1 {
        ( $a:ident, $b:ident, $c:ident, $d:ident, $k:expr, $s:expr, $i: expr ) => {
            $a = $b.wrapping_add(
                ($a.wrapping_add(f($b, $c, $d))
                    .wrapping_add(x[$k])
                    .wrapping_add(table[$i]))
                .rotate_left($s),
            )
        };
    }

// each of these values are taken directly from the RFC, which shows you which operations
// to run.
    round1!(a, b, c, d, 0, 7, 1);
    round1!(d, a, b, c, 1, 12, 2);
    round1!(c, d, a, b, 2, 17, 3);
    round1!(b, c, d, a, 3, 22, 4);

    round1!(a, b, c, d, 4, 7, 5);
    round1!(d, a, b, c, 5, 12, 6);
    round1!(c, d, a, b, 6, 17, 7);
    round1!(b, c, d, a, 7, 22, 8);

    round1!(a, b, c, d, 8, 7, 9);
    round1!(d, a, b, c, 9, 12, 10);
    round1!(c, d, a, b, 10, 17, 11);
    round1!(b, c, d, a, 11, 22, 12);

    round1!(a, b, c, d, 12, 7, 13);
    round1!(d, a, b, c, 13, 12, 14);
    round1!(c, d, a, b, 14, 17, 15);
    round1!(b, c, d, a, 15, 22, 16);

    return [a, b, c, d];
}

The way the Rust macro works is that when it comes time to run each operation, it's going to substitute the identifiers and expressions in our macro with the parameters that we provide in the macro call. Macros are a common and popular feature of Rust, even used to do things such as print to console (println!("hello world") is a macro!).

I'm not going to list each round function here because its mostly mechanical and takes up a ton of lines, but if you want to see each round function, check out the repository: https://github.com/mdimovich/rusty_md5/blob/main/src/lib.rs#L116

Putting it All Together + Output the Result

With our round 1-4 functions done we can build our md5 function. The last thing we need to is put everything together. Going back to when we were creating our u32 vector, we also need to set AA, BB, CC, and DD to A,B,C, and D respectively. This is to save the initial values of the buffer for the end.

We also need to add AA, BB, CC, and DD to A,B,C, and D once all the round 1-4 operations are complete. Once again for that, we have to use wrapping_add to prevent overflow from happening.

Let's write out our function that does all this work.


fn compute_md5_digest(mut v: Vec<u8>) -> String {
    // as described in the rfc,
    // 4 32-bit words initialized as fixed constants.
    let mut word_a = 0x67452301u32;
    let mut word_b = 0xefcdab89u32;
    let mut word_c = 0x98badcfeu32;
    let mut word_d = 0x10325476u32;

    // construct the 64 element constant table.
    let table = construct_value_table();

    // let M[0 .. N-1] = words of resulting message, where N is multiple of 16
    for chunk in v.chunks_exact_mut(64) {
        let x = convert_u8_chunk_to_u32(chunk); // this is where we are building our u32 vector

        // set all values of a,b,c,d to aa,bb,cc, and dd respectively.
        // this is to save the initial values.
        let word_aa = word_a;
        let word_bb = word_b;
        let word_cc = word_c;
        let word_dd = word_d;

        // execute round 1
        let result = round_one_operations(word_a, word_b, word_c, word_d, &table, &x);
        word_a = result[0];
        word_b = result[1];
        word_c = result[2];
        word_d = result[3];

        // execute round 2
        let result = round_two_operations(word_a, word_b, word_c, word_d, &table, &x);
        word_a = result[0];
        word_b = result[1];
        word_c = result[2];
        word_d = result[3];

        // execute round 3
        let result = round_three_operations(word_a, word_b, word_c, word_d, &table, &x);
        word_a = result[0];
        word_b = result[1];
        word_c = result[2];
        word_d = result[3];

        // execute round 4
        let result = round_four_operations(word_a, word_b, word_c, word_d, &table, &x);
        word_a = result[0];
        word_b = result[1];
        word_c = result[2];
        word_d = result[3];

        // at end of loop iteration, add original word
        // to the current word value
        word_a = word_a.wrapping_add(word_aa);
        word_b = word_b.wrapping_add(word_bb);
        word_c = word_c.wrapping_add(word_cc);
        word_d = word_d.wrapping_add(word_dd);
    }

    // format and return the final result, which
    // is a 128-bit digest string.
    let message_digest = format!(
        "{:08x}{:08x}{:08x}{:08x}",
        word_a.swap_bytes(),
        word_b.swap_bytes(),
        word_c.swap_bytes(),
        word_d.swap_bytes()
    );
    
    return message_digest;
}

With that, we can use our main function to process input arguments and output the result!


fn md5(input: &str) -> String {
    let input_vec = bit_padding(input);
    return compute_md5_digest(input_vec);
}

// used to remove the first element of a vector
fn remove_first(vec: &mut Vec<String>) -> Option<String> {
    if vec.is_empty() {
        return None;
    }
    Some(vec.remove(0))
}

fn main() {
    let mut input_args: Vec<String> = std::env::args().collect();

	// we take the first element out of our input args, 
	// because this will be the program itself
    remove_first(&mut input_args);

    // join multi-word strings together by spaces
    let input = input_args.join(" ");

    let result = md5(input.as_str());
    println!("{}", result);
}

Conclusion

And just like that, we have a working version of MD5 implemented in Rust! If you are interested in trying this yourself, check out the repo and use the unit tests I wrote there (in tests/tests.rs folder).

I am relatively new to Rust, so i'm sure there are plenty of improvements that could be made here, and I also know there are crates such as byteorder that probably would've helped with some integer conversion, but I enjoyed building it and learned a lot in the process.

I hope you enjoyed reading about my journey building md5 from scratch.

References

Rusty MD5 Repo (my implementation) : https://github.com/mdimovich/rusty_md5/
RFC 1321 - The MD5 Message-Digest Algorithm: https://datatracker.ietf.org/doc/html/rfc1321
The Rust docs: https://doc.rust-lang.org/book/