A while back, I was telling you about my attempts to crack a cipher left behind by a serial killer who called himself The Zodiac.
Like everyone else who has worked on it, I haven't broken the cipher yet. I realize there are authors out there who claim to have solved it, but the solutions I've seen don't read clearly. The killer's other ciphers did, and I'm betting that when this one is cracked it will have a much more readable plaintext than the ones I've seen.
I walked away from the project for a while because, to be honest, I got a little bored and burnt out from it.
I had some ideas for improving the algorithms I was using, so I started again recently. I rewrote my program in Perl so that I could run it easily on every computer I had ready access to (PC, Mac, Linux, etc.) without having to reinvent it. Interestingly, it sped up a fair amount in the process.
Along with the rewrite, I figured out some optimizations that dramatically reduced the range of possible solutions, and I improved my scoring system for possible decodes.
I also changed the way I number the symbols and the order of evaluation, so that I'm working on the most common symbols in the message first and the least common last. This should increase the "bang for the buck" for the CPU cycles I'm consuming.
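To make the frequency ordering concrete, here is a minimal sketch in Python (not the Perl the real program is written in). The function name and the toy message are made up for illustration, and breaking ties by first appearance is an assumption, not something the actual program is known to do.

```python
from collections import Counter

def order_symbols_by_frequency(ciphertext):
    """Return the distinct symbols in the message, most common first.

    Ties are broken by first appearance in the message (an assumption
    made here so that the ordering is deterministic).
    """
    counts = Counter(ciphertext)
    first_seen = {}
    for position, symbol in enumerate(ciphertext):
        first_seen.setdefault(symbol, position)
    return sorted(counts, key=lambda s: (-counts[s], first_seen[s]))

# Toy message standing in for the real 340-symbol cipher.
toy_message = "ABACABADCB"
print(order_symbols_by_frequency(toy_message))  # ['A', 'B', 'C', 'D']
```

Working on the symbols at the front of this list first means each guess fills in as many positions of the message as possible, which is what makes an early partial decode worth scoring.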
Right now, my laptop (an HP Pavilion DV8000t with a 2.16GHz Intel Core Duo, running Windows XP Pro) is cranking away at the program. It's trying to sort out the 10 most common symbols in the message. The solution set I'm testing for those symbols contains approximately 4,000,000 squared (roughly 16 trillion) possibilities. For each combination, I partially decode the message and compare that decode to a dictionary of around 2,900 words that the killer used in his writings. The more words I find, the more of the message's 340 symbols those words cover, and the higher the average word length, the better the score.
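The scoring idea can be sketched roughly as follows, again in Python rather than Perl. Everything here is an assumption for illustration: the function names, the "?" placeholder for unsolved symbols, the greedy longest-match scan, and especially the way the three factors are multiplied together. The description above only says that more words, more coverage, and longer words are better, not how they are combined.

```python
def partial_decode(ciphertext, key, unknown="?"):
    """Apply a partial symbol-to-letter key; unsolved symbols become `unknown`."""
    return "".join(key.get(symbol, unknown) for symbol in ciphertext)

def score_decode(decode, words):
    """Score a decode by words found, coverage, and average word length.

    Scans left to right and, at each position, takes the longest
    dictionary word that starts there (a simple greedy choice).
    Returns (score, list_of_words_found).
    """
    longest_first = sorted(words, key=len, reverse=True)
    found = []
    i = 0
    while i < len(decode):
        for word in longest_first:
            if decode.startswith(word, i):
                found.append(word)
                i += len(word)
                break
        else:
            i += 1  # no dictionary word starts here; move on
    if not found:
        return 0.0, found
    covered = sum(len(word) for word in found)
    coverage = covered / len(decode)   # fraction of the message used
    average_length = covered / len(found)
    # Hypothetical combination: the product of the three factors.
    return len(found) * coverage * average_length, found

# Toy example: two symbols ("c" and "d") both stand for L.
toy_key = {"a": "K", "b": "I", "c": "L", "d": "L"}
toy_words = {"KILL", "ILL", "I"}
decode = partial_decode("abcdbe", toy_key)   # "KILLI?"
score, found = score_decode(decode, toy_words)
print(decode, found)
```

A real run would load the roughly 2,900-word dictionary once and call the scorer for each candidate key, so the per-call cost of this loop is what limits the number of tests per minute.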
The laptop is testing a little more than 20,000 solutions a minute. My main desktop tests about as many, as do my other two desktops. Some other systems running the script for me are able to test another 100,000 or so a minute. So I'm doing something in the range of 150,000 tests per minute. While that sounds like a lot, at this rate it will take about 74,000 days (roughly 200 years!) of constant operation at that level to test all the possibilities.
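As a sanity check on that kind of estimate, here is a small Python sketch that redoes the arithmetic from the figures quoted above (4,000,000 squared candidates at about 150,000 tests per minute). The function name is made up for illustration. The step to watch is the unit conversion: minutes become days by dividing by 60 × 24 = 1,440, and dividing by 24 alone overstates the result sixty-fold.

```python
def estimate_runtime(candidates, tests_per_minute):
    """Return (days, years) needed to test every candidate at a constant rate."""
    minutes = candidates / tests_per_minute
    days = minutes / (60 * 24)   # 1,440 minutes in a day
    years = days / 365.25
    return days, years

candidates = 4_000_000 ** 2      # 1.6e13 combinations
days, years = estimate_runtime(candidates, 150_000)
print(f"{days:,.0f} days, about {years:,.0f} years")  # 74,074 days, about 203 years
```

Either way the search is far too large for a handful of machines, which is the point of the estimate.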
I'm hoping to sort out a way to turn this into a distributed computing project that I can share through the web site.