Over the last few years, Hashes.org’s database has slowly grown to approximately 1.1 billion entries (as of Oct. 2017). The scaling itself is not a problem: requests for hashes are still pretty fast and the backend runs stably. However, another problem has steadily grown over time:
Some people are uploading or searching “bad” hashes, and as a result the left lists are filling with hashes that are unlikely ever to be found. Examples include file checksum hashes and specially modified hashes. Some users also think it’s a good idea to submit salted hashes with the salts removed; these hashes can only be found if we can get hold of the original salted hashes, or by using MDXfind. In addition, hashes searched via the input field are imported as single hashes instead of being grouped together, so we are unable to remove them as an entire group when they turn out to be “bad” hashes (e.g. salted hashes without salts). When we identify bad hashes, it would be nice to be able to remove the entire source that imported them in the first place: if one hash is bad, it’s likely that the rest of the hashes imported from the same source are bad as well.
The total number of unfound hashes has now grown to over 100 million at the time of writing, and we suspect that a significant part of them are unresolvable. This makes it harder to crack the potentially resolvable hashes, as some crackers are not able to load a hashlist of this size, and if they can, cracking performance suffers.
Due to the concerns expressed above, I have decided to change how Hashes.org works: I am introducing Hashes.org version 3.
No Single Hashes & Everything in Lists
In order to manage junk or bad input, hashes are only saved in the database when they are uploaded as a hashlist. This way it’s easy to clean up the database when we find a hashlist with bad or junk hashes. To optimize the work on hashlists (especially where it doesn’t make sense to put all uncracked hashes together, e.g. salted hashes), all hashlists are public. Each list will have the left and found files available for download.
Single hash searches can still be done, but the searched hashes will not be saved to the database. Anyone who wants hashes stored persistently in the database needs to upload them as a hashlist.
Reduced number of left lists
Previous versions of Hashes.org supported hashes of length 8, 16, 48, 64, 96, and 128, but these left lists did not contain many hashes. Therefore, the new version of Hashes.org currently builds left lists only for lengths 32, 40, and 128. If left lists of other lengths are needed in the future, they can be added at a later date.
The minimum hash length on the new version is 13 (for DES support); shorter hashes produce collisions very easily and can be problematic. A collision means multiple solutions exist for the same hash, and it’s not possible to determine which is the correct one.
The tools section from the old version has been removed, as the tools were rarely used and highly inefficient. Most of their functionality can be achieved with simple command line tools.
We were able to clean up the founds by cracking nested and salted hashes to produce a better quality list. This is an ongoing process of analyzing solved hashes and checking whether the solution is the “actual” answer or just a nested hash that needs to be cracked further. The new version makes it easier to replace these hashes and update them with the correct password and algorithm, ultimately resulting in a better, cleaner plaintext list.
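The nested-hash check can be sketched roughly as follows. This is a minimal illustration, not the actual Hashes.org code: it assumes the nested layer is MD5 and treats any 32-character hex “plaintext” as a candidate inner hash to crack one level deeper.

```python
import hashlib
import re

# If a "solved" plaintext is itself a 32-character hex string, it may be an
# MD5 digest of the real password, so we try to crack it one level deeper.
HEX32 = re.compile(r"^[0-9a-f]{32}$")

def resolve_nested(found_plain: str, wordlist: list[str]) -> str:
    """Return the real plaintext if found_plain is a crackable MD5, else found_plain."""
    if not HEX32.match(found_plain):
        return found_plain  # looks like a normal password, keep it
    for candidate in wordlist:
        if hashlib.md5(candidate.encode()).hexdigest() == found_plain:
            return candidate  # nested hash cracked; algorithm becomes MD5(MD5(x))
    return found_plain  # could not crack further, keep the hex string for now

# Example: the stored "plaintext" is actually md5("hashes.org")
inner = hashlib.md5(b"hashes.org").hexdigest()
print(resolve_nested(inner, ["password", "hashes.org"]))  # prints "hashes.org"
```

In practice this check would run against the full founds list and a much larger wordlist, and a cracked inner hash would also update the recorded algorithm for that entry.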
The API has always been a bottleneck on the server. The new version will not have an API available, but there are plans to bring it back in the near future. If there is an urgent need for the API (or usage information) for applications that depend heavily on Hashes.org, please contact me.
Hashes.org now supports the TESTVEC plain format introduced by MDXfind. With this new feature, you can create repeating strings multiple megabytes in size.
$TESTVEC[6861736865732e6f7267 x 10000]
This means that the string ‘hashes.org’ is repeated ten thousand times, and the resulting string is used as the input to the hash. The string is provided in hex and can be 1 byte or longer. The maximum accepted repeat count is 10 million.
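A minimal sketch of how such a plain could be expanded (an assumed parser for illustration, not the official MDXfind or Hashes.org implementation):

```python
import hashlib
import re

# Matches the TESTVEC plain format: $TESTVEC[<hex string> x <repeat count>]
TESTVEC = re.compile(r"^\$TESTVEC\[([0-9a-fA-F]+) x (\d+)\]$")

def expand_testvec(plain: str, max_repeat: int = 10_000_000) -> bytes:
    """Expand a $TESTVEC plain into the raw bytes that get hashed."""
    m = TESTVEC.match(plain)
    if not m:
        raise ValueError("not a TESTVEC plain")
    hex_str, count = m.group(1), int(m.group(2))
    if count > max_repeat:
        raise ValueError("repeat count exceeds the 10 million limit")
    return bytes.fromhex(hex_str) * count

expanded = expand_testvec("$TESTVEC[6861736865732e6f7267 x 10000]")
print(len(expanded))  # 'hashes.org' is 10 bytes, so 10 * 10000 = 100000
print(hashlib.md5(expanded).hexdigest())
```

The expanded bytes, not the literal `$TESTVEC[...]` string, are what the hash is computed over.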
Hashes.org now supports additional basic algorithms, mainly non-hex hash formats, to extend the functionality of the site. The new algorithms are:
- PBKDF2-SHA512 (hex version with $ml$ signature)
I would also like to point out that there are some special modifications for algorithms, including outer loops and base64 encoding. The way algorithms are handled is still the same and is described here.
Voting & Reporting
The new hashlist rating system is available to every Hashes.org user. Each user can up- or down-vote a hashlist, and users can write reports about hashlists, which moderators can then read. This helps us identify bad hashlists: when users discover that the hashes in a particular list cannot be recovered, or can confirm that the list is missing salts, the list can be reported and the moderators will take action to remove the “bad” hashlist.
Some Numbers and Stats
The database for Hashes.org v2 used ~1.1 TB of SSD storage. The textual export of all data was approximately 128 GB in CSV format. There were 61 public and 725 private hashlists managed (plus 4,047 deleted private hashlists which were removed by their users).