AWS Lambda Python Hash Producing Different Results

Here is a short and sweet guide on what to do if Python's built-in hash() is producing different results on AWS Lambda!

When I deployed my Flask web application to AWS using Zappa, I discovered that users could not log in.

After reading the logs, I realised that the checksum generated each time a user logged in was different, even though the inputs (username and user ID) stayed the same.

user_name = 'jake'
user_id = '1001'
checksum = hash(user_name + user_id)

print(checksum)
# 3631206479466447074
# Your result will differ, read more to understand!
Snippet of the code to generate the checksum

I wondered why.

Following a consultation with Google, I came across this Stack Overflow article.

"Python uses a random hash seed to prevent attackers from tar-pitting your application by sending you keys designed to collide...By offsetting the hash with a random seed (set once at startup) attackers can no longer predict what keys will collide."

Here is a further explanation of the above paragraph!

Zappa works in tandem with AWS API Gateway and AWS Lambda. For each request to the site, API Gateway forwards the request to a Lambda worker. That worker may not be the one that handled the previous request, and each worker's Python process generates its own random hash seed at startup. As a result, hash() can return a different value for the same string on different workers.
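You can reproduce this behaviour locally without deploying anything. The sketch below is only an illustration: the helper name hash_in_fresh_process is made up for this post, and it uses the standard PYTHONHASHSEED environment variable to stand in for the random seed each worker would pick at startup.

```python
import os
import subprocess
import sys


def hash_in_fresh_process(text, seed=None):
    """Return hash(text) as computed by a brand-new Python interpreter.

    Hypothetical helper for illustration. Each call starts a separate
    process, much like a request landing on a separate Lambda worker.
    """
    env = dict(os.environ)
    # Drop any inherited seed so the child picks a random one by default.
    env.pop("PYTHONHASHSEED", None)
    if seed is not None:
        env["PYTHONHASHSEED"] = str(seed)

    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(hash(sys.argv[1]))", text],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


if __name__ == "__main__":
    # Two "workers" with different seeds: the values will not match.
    print(hash_in_fresh_process("jake1001", seed=1))
    print(hash_in_fresh_process("jake1001", seed=2))
    # No seed given: the interpreter chooses a random one at startup.
    print(hash_in_fresh_process("jake1001"))
```

Two processes started with the same seed agree with each other, while processes with different seeds (or with no seed set) generally do not, which is the situation a login request runs into when it lands on a new worker.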

How did I overcome this?

I cannot force all requests to be routed to a single Lambda worker, since that would create a bottleneck and hurt scalability. Moreover, Lambda workers are ephemeral: they live only for a short period of time and are shut down when there are no more requests.

A quick and easy fix is to change the hash function. Instead of the built-in hash function, you can use the hashlib library to generate an MD5 hash. MD5 does not depend on a per-process seed, so the same input produces the same digest regardless of where you run the code.

import hashlib

user_name = 'jake'
user_id = '1001'

checksum = hashlib.md5((user_name + user_id).encode('utf-8')).hexdigest()

print(checksum)
# cff72a0287fe0d6e40a60fd71c71d116
Snippet of code to generate checksum with md5
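If you want to reuse the MD5 approach above in several places, it could be wrapped in a pair of small helpers. This is a minimal sketch, not the code from my application: make_checksum and verify_checksum are hypothetical names, and the comparison uses hmac.compare_digest from the standard library, which compares two strings in constant time.

```python
import hashlib
import hmac


def make_checksum(user_name, user_id):
    """Return a hex MD5 digest that is identical on every worker."""
    payload = (user_name + user_id).encode("utf-8")
    return hashlib.md5(payload).hexdigest()


def verify_checksum(user_name, user_id, expected):
    """Check a previously issued checksum against freshly computed one."""
    actual = make_checksum(user_name, user_id)
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(actual, expected)


if __name__ == "__main__":
    issued = make_checksum("jake", "1001")
    print(verify_checksum("jake", "1001", issued))
    print(verify_checksum("jake", "1002", issued))
```

Because the digest depends only on the input bytes, a checksum issued by one Lambda worker can be verified by any other.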

That's all! Let me know if the solution works for you.

Photo used:  Liudmyla Denysiuk on Unsplash