TOBTU

It's the rot13 of the acronym of "go big or go home."
It was either this or NewFolder9.com, because that's the name Windows gave it :).
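
For anyone who wants to check the name, a couple of lines of Python will do it. This is just an illustration using the standard library's rot13 codec; the variable names are mine.

```python
import codecs

# Take the first letter of each word: "go big or go home" -> "GBOGH".
acronym = "".join(word[0] for word in "go big or go home".split()).upper()

# rot13 shifts every letter 13 places around the alphabet.
name = codecs.encode(acronym, "rot_13")

print(acronym, "->", name)  # GBOGH -> TOBTU
```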

This site will be the home of the largest "instant" MD5 database: the 2 ^ 40.028 passwords of "Omni-5." I'm implementing the database as a lossy hash table (LHT). The name says it all: it is a hash (lookup) table that is lossy.

I wanted it to be done today (April 1st, 2010), but I lost the weekend to helping a friend move, and I can't add (26+26+10=52, right?). Anyway, I finished step one, which took just under 10 hours. Step two will hopefully take only 20 hours, and I should be able to start it soon. The total size of the database will be 2.96 TB, but I'm thinking of going for 3.38 TB to make lookups faster. I also doubt that 2.96 TB can fit on 3 TB of disk with a file system (ext2), and I don't want to go with raw disk right now.
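
To make "a hash table that is lossy" a bit more concrete, here is a toy sketch in Python. It is not the actual TOBTU format, which this page doesn't describe. It assumes one plausible scheme: a few bits of the MD5 pick a bucket, each entry keeps only the low bits of the password's number in the key space, and a lookup re-hashes every candidate that matches those bits. The tiny key space (aaa through zzz), the bit widths, and all the function names are made up for illustration.

```python
import hashlib
from collections import defaultdict

ALPHABET = "abcdefghijklmnopqrstuvwxyz"
LENGTH = 3                            # toy key space: aaa..zzz
KEYSPACE = len(ALPHABET) ** LENGTH    # 17,576 passwords
INDEX_BITS = 12                       # hash bits that select a bucket (toy value)
PASSWORD_BITS = 8                     # password-number bits kept per entry (toy value)

def index_to_password(i):
    """Map a number in [0, KEYSPACE) to its password."""
    chars = []
    for _ in range(LENGTH):
        i, r = divmod(i, len(ALPHABET))
        chars.append(ALPHABET[r])
    return "".join(reversed(chars))

def md5_bytes(password):
    return hashlib.md5(password.encode("ascii")).digest()

def bucket_of(digest):
    """Use the top INDEX_BITS bits of the digest as the bucket number."""
    return int.from_bytes(digest[:4], "big") >> (32 - INDEX_BITS)

def build_table():
    """Store only the low PASSWORD_BITS bits of each password number."""
    table = defaultdict(list)
    mask = (1 << PASSWORD_BITS) - 1
    for i in range(KEYSPACE):
        digest = md5_bytes(index_to_password(i))
        table[bucket_of(digest)].append(i & mask)
    return table

def lookup(table, digest):
    """Rebuild the discarded bits by hashing every matching candidate."""
    for low in table.get(bucket_of(digest), ()):
        # Every password number whose low bits equal the stored value.
        for i in range(low, KEYSPACE, 1 << PASSWORD_BITS):
            candidate = index_to_password(i)
            if md5_bytes(candidate) == digest:
                return candidate
    return None

table = build_table()
print(lookup(table, md5_bytes("cat")))  # cat
```

In this toy, keeping fewer password bits per entry makes the table smaller but leaves more candidates to hash on each lookup, so size is traded against work.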

The layout:
(Indexed bits, mini indexed bits, password bits)
(40, 33, 19) is 2.96 TB, hoping for under 16 ms disk + 64 ms work (single threaded)
(40, 33, 22) is 3.38 TB, hoping for under 16 ms disk + 8 ms work (single threaded)
Hmm, I can cut disk time in half with 18 GB of SSD.

1.981 bits/password for the main index
0.125 bits/password for the mini index
19 or 22 bits/password for the password

Total bits/password are:
21.106 bits/password or 24.106 bits/password
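
As a sanity check, the per-password costs can be multiplied out against the Omni-5 key space to get the database sizes. A small sketch, assuming decimal terabytes (10^12 bytes) and no file-system overhead; the names are mine:

```python
OMNI5_PASSWORDS = 1_121_284_146_783   # Omni-5 key space
MAIN_INDEX_BITS = 1.981               # bits/password for the main index
MINI_INDEX_BITS = 0.125               # bits/password for the mini index

def database_size_tb(password_bits):
    """Return (bits per password, total size in decimal TB)."""
    bits_per_password = MAIN_INDEX_BITS + MINI_INDEX_BITS + password_bits
    total_bytes = OMNI5_PASSWORDS * bits_per_password / 8
    return bits_per_password, total_bytes / 1e12

for password_bits in (19, 22):
    bits, size = database_size_tb(password_bits)
    print(f"{password_bits} password bits: {bits:.3f} bits/password, {size:.2f} TB")
# 19 password bits: 21.106 bits/password, 2.96 TB
# 22 password bits: 24.106 bits/password, 3.38 TB
```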

My database will contain the passwords in "Omni-5."
Key Spaces:
Omni-5: 2 ^ 40.028 (1.12 Trillion [1,121,284,146,783])
Omni-6: 2 ^ 45.506 (50.0 Trillion [49,973,328,108,832])
Omni-7: 2 ^ 51.259 (2.69 Quadrillion [2,694,058,161,297,220])

* - 95 characters: space through ~
M - 62 characters: A-Z, a-z, and 0-9
m - 52 characters: A-Z and a-z
n - 36 characters: a-z and 0-9
a - 26 characters: a-z
0 - 10 characters: 0-9

Omni-5      Omni-6        Omni-7
*
**
***                       *****
****        *****         ******
*****       ******        *******
MMMMMM      MMMMMMM       MMMMMMMM
Mnnnnnn     Mnnnnnnn      Mnnnnnnnn
maaaaaaa    maaaaaaaa     maaaaaaaaa
00000000    000000000     0000000000
000000000   0000000000    00000000000
0000000000  00000000000   000000000000
            000000000000  0000000000000
                          00000000000000
maaa0000    maaaa0000     maaaaa0000
maaaa000    maaaaa000     maaaaaa000
maaaaa00    maaaaaa00     maaaaaaa00
maaaaaa0    maaaaaaa0     maaaaaaaa0
maaaa0000   maaaaa0000    maaaaaa0000
            maaaaaa000    maaaaaaa000
                          maaaaaaaa00 (maybe; this one is 40.31% of
                                       the total key space, so we'll see)
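
The Omni-5 key space can be checked by multiplying out each mask in the Omni-5 column with the character-set sizes from the legend and adding the results. A minimal sketch, assuming the masks don't overlap so a plain sum is right; the names are mine:

```python
import math

# Character-set sizes from the legend above.
CHARSET_SIZES = {"*": 95, "M": 62, "m": 52, "n": 36, "a": 26, "0": 10}

# The masks listed in the Omni-5 column.
OMNI5_MASKS = [
    "*", "**", "***", "****", "*****",
    "MMMMMM", "Mnnnnnn", "maaaaaaa",
    "00000000", "000000000", "0000000000",
    "maaa0000", "maaaa000", "maaaaa00", "maaaaaa0", "maaaa0000",
]

def mask_size(mask):
    """Number of passwords a mask covers: the product of its set sizes."""
    return math.prod(CHARSET_SIZES[symbol] for symbol in mask)

total = sum(mask_size(mask) for mask in OMNI5_MASKS)
print(f"{total:,} passwords = 2 ^ {math.log2(total):.3f}")
# 1,121,284,146,783 passwords = 2 ^ 40.028
```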

The plan (this all depends on a few things like funding and cryptohaze.com):

The Omni-6 rainbow tables will be free; the lossy hash table (if I generate it, which is not an easy task) will cost something. The Omni-7 rainbow tables will also cost something. The only reason I'm thinking of spending $4,000 on servers and colo is that I might recoup my losses in maybe 6 months. Anyway, they'll be very cheap for those who help generate them.

"Why pay for something I can generate on my own?" Last time I looked, there weren't any tables of this quality, size, reduction function, or format, and it has been 8 years. IRT ("Indexed Rainbow Table") is better than 1/2 the size of the RT format, and IPRT ("Indexed Perfect Rainbow Table") is about 1/3 the size of the RT format (IPRT tables are required to be perfect).

Yeah, yeah, I know brute-force-style key spaces are not all that good for guessing passwords, but this is just a proof of concept, and I am looking at generating 2 ^ 40 of the most likely passwords. I'm probably going to do something like Matt Weir's "Probabilistic Password Cracker."

I need to go through these again to make sure I didn't foo bar something. I think the phpBB stats are for unique passwords and the RockYou stats include duplicates.

Omni-5 can crack:
??.??% of phpBB's unsalted passwords
68.38% of RockYou's passwords

Markov 283 max length 11 can crack:
77.57% of phpBB's unsalted passwords
82.21% of RockYou's passwords

  1. TOBTU.com "Well, it looks like I'm on top."
    MD5 and 16 "bit" MD5
    1,121,284,146,783 each (2 ^ 40.028)
    2,242,568,293,566 total (2 ^ 41.028)
    (16 "bit" MD5 totally doesn't count, but I did set it up with it in mind: all I do is rearrange the MD5 so it's BCAD instead of ABCD, because 16 "bit" MD5 is just BC)

  2. cmd5.org (paid service)
    MD5, Double MD5, Double Binary MD5, and SHA1
    462,921,590,751 each (2 ^ 38.752)
    1,851,686,363,004 total (2 ^ 40.752)
    (This could be just three databases, namely Double MD5, Double Binary MD5, and SHA1, since plain MD5s can be searched in either the Double MD5 or Double Binary MD5 database)

  3. c0llision.net
    LM, NTLM, and MD5
    10,578,558,976 LM (2 ^ 33.300)
    295,413,219,328 each for NTLM and MD5 (2 ^ 38.104)
    601,404,997,632 total (2 ^ 39.130)

  4. MD5Decrypter.co.uk
    NTLM, MD5 and SHA1
    70,000,000 SHA1 (2 ^ 26.061)
    5,124,854,698 NTLM (2 ^ 32.255)
    7,355,145,497 MD5 (2 ^ 32.776)
    12,550,000,195 total (2 ^ 33.547)

  5. TMTO.org (current size is unknown, but it had this many at one time)
    MD5
    308,288,137,504 total (2 ^ 38.165)

All the sites I've found are either smaller or don't say how many passwords are in their database.
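
The BCAD rearrangement mentioned under entry 1 can be sketched in a few lines of Python. This assumes that A, B, C, and D are the four 4-byte words of the MD5 digest in order, and that 16 "bit" MD5 means the middle 16 hex characters (words B and C), which is what "just BC" suggests. The function name is mine.

```python
import hashlib

def rearrange_bcad(digest):
    """Reorder a 16-byte MD5 digest from ABCD to BCAD (4-byte words)."""
    a, b, c, d = (digest[i:i + 4] for i in range(0, 16, 4))
    return b + c + a + d

digest = hashlib.md5(b"password").digest()
stored = rearrange_bcad(digest)

print(digest.hex())       # full MD5, ABCD order
print(stored.hex())       # same 16 bytes, BCAD order
print(stored[:8].hex())   # leading BC = the 16-character MD5
```

With BC moved to the front, a lookup keyed on the leading bytes works the same whether it is handed a full MD5 or only the 16-character form.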