Duplicate Hash Codes Created Due to Weakness of MD5
The MD5 message-digest algorithm has caused security problems for years, and its weaknesses are on display once again. On his blog, security researcher Nat McHugh shows how he was able to create two different images with the same MD5 hash (e06723d4961a0a3f950e7786f3766338).
Using what is known as a birthday attack, he was able to give the images matching hash codes for the low price of 65 cents and about 10 hours of time on an AWS large GPU instance. The name comes from what is known as the birthday problem, or birthday paradox: as the number of people in a group grows, it quickly becomes likely that two of them share a birthday. With just 23 people, the odds are already better than even.
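To make the birthday problem concrete, here is a minimal Python sketch that computes the chance of at least one shared birthday in a group. The function name and the assumption of 365 equally likely birthdays are ours for illustration, not something from McHugh's post.

```python
def birthday_collision_probability(people: int, days: int = 365) -> float:
    """Chance that at least two of `people` share a birthday.

    Assumes every one of `days` birthdays is equally likely (a simplification).
    """
    p_all_distinct = 1.0
    for i in range(people):
        # The (i+1)-th person must avoid the i birthdays already taken.
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct


if __name__ == "__main__":
    for n in (10, 23, 50, 70):
        print(n, round(birthday_collision_probability(n), 3))
```

The same arithmetic applies to hashes: swap "days" for the number of possible hash values and "people" for the number of inputs tried.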
The birthday attack isn’t a pure brute-force attack. A brute-force attack would try every possible input until it found one that produces a specific target hash. A birthday attack instead hashes many candidate inputs and looks for any two that match each other, which takes far fewer attempts than hitting one particular value. The attack then keeps the parts of the hash that already match and works on the remaining unmatched part until the two hashes are identical.
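The difference between the two approaches can be sketched with a toy birthday search in Python. This is not McHugh's method: it only looks for two inputs whose MD5 hashes agree in their first few bits, and the helper names and the `message-N` inputs are made up for illustration. A generic search like this would be hopeless against all 128 bits of MD5 (on the order of 2^64 hashes); practical MD5 collisions rely on weaknesses in the algorithm itself.

```python
import hashlib


def truncated_md5(data: bytes, bits: int) -> int:
    """Return the top `bits` bits of the MD5 digest of `data` as an integer."""
    digest = hashlib.md5(data).digest()
    return int.from_bytes(digest, "big") >> (128 - bits)


def find_truncated_collision(bits: int = 32):
    """Hash distinct messages until two share the same truncated digest.

    Returns (first_message, second_message, number_of_messages_hashed).
    """
    seen = {}  # truncated digest -> message that produced it
    counter = 0
    while True:
        message = f"message-{counter}".encode()
        tag = truncated_md5(message, bits)
        if tag in seen:
            # Any earlier message with the same tag will do; we are not
            # aiming at one specific target value.
            return seen[tag], message, counter + 1
        seen[tag] = message
        counter += 1


if __name__ == "__main__":
    a, b, tries = find_truncated_collision(bits=32)
    print(a, b, tries)
```

For a 32-bit truncation, the search typically finishes after tens of thousands of hashes rather than the billions a brute-force search for one specific 32-bit value would need.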
This isn’t the first time MD5’s weaknesses have been exploited. Back in 2012, the United States and Israeli governments, in a project called Flame, infiltrated Middle Eastern countries with malware. Flame exploited an MD5 collision to pass the malware off as a legitimate Windows update and infect the targeted countries. It is believed to have been active since at least 2010. The two main countries infected by Flame were Iran and Israel, specifically Palestine, but Sudan, Syria, Lebanon, and Saudi Arabia were infected as well. Originally, this task took (when emulated) the combined power of 200 PlayStation 3s and two days of processing time, at a cost of roughly 20,000 dollars to run on Amazon’s Elastic Compute Cloud (EC2). Compared with that, the 65-cent cost of Nat McHugh’s hash collision just four years later seems astounding.
Marc Stevens has created a program you can run if you’re worried that a file may contain a hidden collision. It is also worth moving to SHA-2, and probably skipping SHA-1, which has had known security issues as far back as 2005. So many, in fact, that the US Government stopped using SHA-1 back in 2010. McHugh summed up his blog post by saying,
“So I guess the message to take away here is that MD5 is well and truly broken.”
Martijn Grooten, author and security researcher, might’ve put it best,
“If you can’t even distinguish between Barry White and James Brown, it’s time to send MD5 to hashing algorithm heaven.”
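For readers who want to act on the SHA-2 suggestion above, here is a minimal Python sketch of fingerprinting files with SHA-256 (a member of the SHA-2 family) instead of MD5. The function names and the chunk size are illustrative choices, not part of any tool mentioned in this article.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of the file at `path`."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def same_fingerprint(path_a: str, path_b: str) -> bool:
    """True if both files have the same SHA-256 digest."""
    return sha256_of_file(path_a) == sha256_of_file(path_b)
```

Called on two image files, `same_fingerprint` reports whether their SHA-256 digests match.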