A while back, I was telling you about my attempts to crack a cipher left behind by a
serial killer who called himself The Zodiac.
Like the other people who have worked on
it, I haven't broken the cipher yet. I realize there are authors
out there who claimed to have solved the cipher, but the solutions
I've seen don't read clearly. The killer's other ciphers did,
and I'm betting that when this one is cracked it will have a much
more readable plaintext than those I've seen.
I walked away from the project for a while
because, to be honest, I got a little bored and burnt out on
it.
I had some ideas for improving the
algorithms I was using, so I started again recently. I rewrote my
program in Perl so that I could run it easily on every computer I
had ready access to (PC, Mac, Linux, etc.) without having to
reinvent it. Interestingly, it sped up a fair amount in the
process.
In addition to rewriting the program, I
also figured out some optimizations that dramatically reduced the
size of the solution space, and I improved my scoring
system for possible decodes.
I also renumbered the symbols and
changed the order of evaluation so that I'm
working on the most common symbols in the message first, and the
least common last. This should increase the "bang for the buck" for
the CPU cycles I'm consuming.
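Here is a minimal sketch of the "most common symbols first" ordering, written in Python for brevity rather than the Perl the program uses. The function name and the toy message are assumptions for illustration only, not the actual code or the actual cipher text.

```python
from collections import Counter


def order_symbols_by_frequency(ciphertext):
    """Return the distinct symbols, most common first.

    Ties are broken by first appearance so the order is deterministic.
    """
    counts = Counter(ciphertext)
    first_seen = {}
    for index, symbol in enumerate(ciphertext):
        first_seen.setdefault(symbol, index)
    return sorted(counts, key=lambda s: (-counts[s], first_seen[s]))


# Hypothetical toy message: each character stands in for one cipher symbol.
sample = "ABACABADABE"
order = order_symbols_by_frequency(sample)
print(order)
```

Guessing the symbols at the front of this list first means each guess fills in more of the message, so a bad guess can be scored and thrown out sooner.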
Right now, my HP Pavilion DV8000t laptop (Windows XP Pro, Intel Core
Duo 2.16GHz) is cranking away at the
program. It's trying to sort out the 10 most common symbols in the
message. The solution set I'm testing for those symbols contains
approximately 4,000,000 squared (about 1.6 x 10^13) possibilities. For each
combination, I partially decode the message and compare that decode
to a dictionary of around 2,900 words that the killer used in his
writings. The more words I find, the more characters I use out of
the 340 symbols, and the higher the average word length, the better
the score.
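To make the scoring idea concrete, here is a rough sketch in Python (again, not the Perl program itself). It decodes only the symbols that have a guessed letter, looks for dictionary words in the result, and combines the three ingredients mentioned above: words found, characters covered, and average word length. The function names, the "?" placeholder, the equal weighting of the three terms, and the tiny word list are all assumptions made for illustration; the real weights and the 2,900-word dictionary are not shown here.

```python
def partial_decode(cipher, key, unknown="?"):
    """Replace each symbol that has a guessed letter; mark the rest unknown."""
    return "".join(key.get(symbol, unknown) for symbol in cipher)


def score_decode(text, words):
    """Score a partial decode: more words, more coverage, longer words."""
    covered = [False] * len(text)
    found = []
    for word in words:
        start = text.find(word)
        while start != -1:
            found.append(word)
            for i in range(start, start + len(word)):
                covered[i] = True
            start = text.find(word, start + 1)
    if not found:
        return 0.0
    chars_covered = sum(covered)
    average_length = sum(len(w) for w in found) / len(found)
    # Assumed weighting: the three terms are simply added together.
    return len(found) + chars_covered + average_length


# Hypothetical toy data: digits stand in for cipher symbols.
cipher = "12345123"
words = ["THE", "HE"]
good_key = {"1": "T", "2": "H", "3": "E"}
bad_key = {"1": "X", "2": "Q", "3": "Z"}

good_score = score_decode(partial_decode(cipher, good_key), words)
bad_score = score_decode(partial_decode(cipher, bad_key), words)
print(good_score, bad_score)
```

Note that this simple version counts overlapping matches (a short word found inside a longer one scores again), which a more careful scorer might want to avoid.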
The laptop is testing a little more than
20,000 solutions a minute. My main desktop tests about as many. My
other two desktops test about that many. Some other systems running
the script for me are able to test another 100,000 or so a minute.
So I'm doing something in the range of 150,000 tests per minute.
While that sounds like a lot, at this rate it will take about 74,000
days (roughly 200 years!) of constant operation at that level to
test all the possibilities.
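The running-time estimate is just the number of candidates divided by the combined testing rate, converted from minutes into days and years. A quick sketch of that arithmetic, taking the 4,000,000 squared candidates and the 150,000 tests per minute quoted above at face value (the variable names are only for illustration):

```python
candidates = 4_000_000 ** 2      # size of the solution set quoted above
tests_per_minute = 150_000       # combined rate across all machines

minutes = candidates / tests_per_minute
hours = minutes / 60             # minutes -> hours
days = hours / 24                # hours -> days
years = days / 365               # days -> years (ignoring leap years)

print(f"{days:,.0f} days, about {years:,.0f} years")
```

The conversion goes through hours before days; dividing the minute count by 24 alone would overstate the result sixty-fold.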
I'm hoping to sort out a way to turn this
into a distributed computing project that I can share through the
web site.