Computers in General
Written by Michael Salsbury
Friday, 05 May 2006
As I mentioned in yesterday's article, I am working to crack the "340
cipher" sent to police by the Zodiac Killer, who operated in the 1960s
and 1970s in California. I also mentioned the assumptions I've made
about the message (which could well be wrong) and the staggering size of
the potential solution space. Clearly I needed to shortcut that 100-year
process as much as possible since it's very unlikely I'll live to be 140
to see the end of it.
Some of the shortcuts I can take include:
-
I know that the symbols can't all translate to the exact same
letter, though it's highly likely that several of them do represent
the same letter. Thus, I can (probably) discard any potential message
key that assigns "too many" symbols to one letter. That reduces the
size of the solution space a good bit.
-
I know that when the message is cracked, it's highly likely that there
will be a pretty standard breakdown of the letters as seen in typical
English texts. By spending a minimal amount of time on any key that
generates a "possible solution" of the message whose character
breakdown is too much "off" from that breakdown, I can speed up my
trip through the solution space.
-
I know that when the message is cracked, it should contain a certain
percentage of the most popular bigrams (2-letter combinations) and
trigrams (3-letter combinations) found in English texts. By checking a
possible solution against those percentages, I can avoid wasting time
on "solutions" which are filled with unlikely bigrams (such as "QZ")
and trigrams ("QZQ").
-
I know that the message is most likely written in English, so I can
build a dictionary of the English language from online sources and
compare any possible solution which has the right breakdown of
characters, bigrams, and trigrams against that dictionary to see how
many real words are in the message. The more words we find in the
possible solution we're looking at, the more likely I'll have found
the "right" solution.
-
Based on my analysis of the enciphered message, there are some symbols
that occur too frequently to be likely to be letters like Z, Q, or X.
When trying potential keys, I can discard those which are attempting
to replace those symbols with characters they're unlikely to be.
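As a rough illustration of the first and last shortcuts in the list above, here is a minimal Python sketch that rejects a key before any decryption is attempted. The function name key_is_plausible and both threshold values are assumptions made up for this example; the article does not say what "too many" or "too frequently" means in numbers.

```python
from collections import Counter

# Letters the article names as unlikely for frequently used symbols.
RARE_LETTERS = set("ZQX")


def key_is_plausible(key, ciphertext, max_symbols_per_letter=7,
                     frequent_share=0.03):
    """Return False for keys that can be discarded without decrypting.

    key: dict mapping each cipher symbol to a plaintext letter.
    ciphertext: sequence of cipher symbols.
    Both thresholds are hypothetical values chosen for illustration.
    """
    # Shortcut 1: no letter may be assigned to "too many" symbols.
    letters_used = Counter(key.values())
    if letters_used and max(letters_used.values()) > max_symbols_per_letter:
        return False

    # Shortcut 5: a symbol that occurs often should not map to a rare letter.
    total = len(ciphertext)
    for symbol, count in Counter(ciphertext).items():
        if count / total > frequent_share and key.get(symbol) in RARE_LETTERS:
            return False
    return True
```

Because both checks look only at the key and the symbol counts, they cost far less than decrypting and scoring a full candidate message.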
Not being a mathematician (I blame the lousy Calculus teachers I had at
Syracuse University for squelching my confidence in my ability to do
math), I have no idea just how much the above will cut down my solution
space. Still, I expect it slices the space down pretty handily overall. To
implement the above rules, I developed a scoring system that requires a
potential solution to pass through a number of "gates" before going on to
an analysis that is more thorough or more costly in computer time. The
scoring method works something like this:
-
I generate a possible message key.
-
I attempt to decrypt the message using that key.
-
I run the potential solution through a character counter to verify
that it has approximately the right number of the "most common"
characters. If not, I move on to the next key.
-
I compare the number of individual characters found against their
typical frequencies in English. If there are far more or far fewer of
a given character than expected, I move on to the next key.
-
I compare the message against the most common bigrams and trigrams
used in English. If those don't occur in approximately the right
proportions, I move on to the next key.
-
I compare the message to a dictionary of 20,000 English words. I
weight the scoring in favor of larger words, and heavily in favor of
words the killer used most frequently (especially those he liked to
misspell). The more words I find and the larger the words are, the
higher the "score" I get for the message.
-
I analyze the percentage of the characters in the solution that are
"swallowed up" by the words I found in the message. The higher the
percentage of "words to overall characters" the more likely this key
is to have broken the message, so the higher the score will be.
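The steps above can be sketched in Python roughly as follows. This is only an illustration of the "gates" idea, not the actual program: the function names, the tolerance values, the short bigram list, and the way the word score and coverage are combined are all assumptions. The letter percentages are approximate, commonly quoted figures for English. The trigram check and the extra weight for the killer's favorite words are left out to keep the sketch short.

```python
from collections import Counter

# Approximate English frequencies (percent) of a few common letters.
EXPECTED_PERCENT = {"E": 12.7, "T": 9.1, "A": 8.2, "O": 7.5, "I": 7.0, "N": 6.7}
COMMON_BIGRAMS = {"TH", "HE", "IN", "ER", "AN", "RE"}


def passes_letter_gate(text, tolerance=4.0):
    """Gate: each common letter's share must be near its English share."""
    if not text:
        return False
    counts = Counter(text)
    for letter, expected in EXPECTED_PERCENT.items():
        observed = 100.0 * counts[letter] / len(text)
        if abs(observed - expected) > tolerance:
            return False
    return True


def passes_bigram_gate(text, min_share=0.05):
    """Gate: enough adjacent letter pairs must be common English bigrams."""
    pairs = [text[i:i + 2] for i in range(len(text) - 1)]
    if not pairs:
        return False
    hits = sum(1 for pair in pairs if pair in COMMON_BIGRAMS)
    return hits / len(pairs) >= min_share


def word_score(text, words):
    """Return (score, coverage); longer words earn more (length squared)."""
    covered = [False] * len(text)
    score = 0
    for word in words:
        start = text.find(word)
        while start != -1:
            score += len(word) ** 2
            for i in range(start, start + len(word)):
                covered[i] = True
            start = text.find(word, start + 1)
    coverage = sum(covered) / len(text) if text else 0.0
    return score, coverage


def score_candidate(text, words):
    """Run the cheap gates first; return None as soon as one fails."""
    if not passes_letter_gate(text) or not passes_bigram_gate(text):
        return None
    score, coverage = word_score(text, words)
    # Hypothetical combination: boost the word score by the coverage.
    return score * (1.0 + coverage)
```

Putting the cheap checks first means most keys are rejected after a single pass over the text, and the dictionary search only runs on the few candidates that survive.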
In the next installment, I'll talk more about the program's logic to try and accomplish the above.
Last Updated ( Thursday, 04 May 2006 )
Written by Michael Salsbury
Thursday, 04 May 2006
I've not been posting a lot of new articles into the blog lately, and I
thought it was about time I explained why. Aside from the fact that
things have gotten busier at the office, and at home, I've also been
channeling what little energy I do have into a few projects. First is to
publicize my spam-inspired cartoon site, next is to publicize my site to
help bloggers find ideas, and finally (which is the point of this little
missive) to try and crack a very old cipher written by a serial killer
about 30 years ago.
The serial killer in question is the Zodiac Killer, who operated in the
San Francisco area in the 1960s and 1970s. He killed an unknown number
of people, but took credit for double-digit numbers. In spite of the
fact that he taunted police by writing letters to them and to the news
media, he was never caught. Among the communications received from
him was a 340-character cipher which has never been
cracked (at least it isn't publicly believed to have been cracked). I
decided to take a crack at it.
I should begin by stating that I am not a cryptographer or any kind of
an expert in the subject. I'm a computer geek, to be sure, but have no
special training or background in such things. Regardless, I do have a
morbid curiosity to know what this cipher says and what it might reveal
about the killer. I'm also very curious to see if I can design and write
a program which will crack this cipher.
Having read a bit about cryptography, I know that there is a pretty
consistent frequency with which letters appear in English texts. I know
that there are also certain pairs of letters which tend to appear
together ("bigrams") and certain 3-letter combinations which tend to
appear more frequently together ("trigrams"). Cryptanalysts use this
information to help them find the key used to decrypt messages. I've
found and made use of this same data in the work I've done thus far.
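To make the idea concrete, here is a small Python sketch of how letters, bigrams, and trigrams can be counted in a piece of text. The function name ngram_counts is made up for this example. It ignores spaces and punctuation, so pairs are counted across word boundaries, which seems reasonable for a cipher that is expected to have no word breaks.

```python
from collections import Counter


def ngram_counts(text, n):
    """Count every run of n consecutive letters in text.

    n=1 gives letter counts, n=2 bigrams, n=3 trigrams.
    Non-letters are dropped and everything is upper-cased first.
    """
    letters = [ch for ch in text.upper() if ch.isalpha()]
    return Counter(
        "".join(letters[i:i + n]) for i in range(len(letters) - n + 1)
    )
```

Calling ngram_counts(sample, 2).most_common(10) on a long sample of English should put pairs such as "TH" and "HE" near the top of the list.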
I began by analyzing the known writings of the Zodiac Killer, verifying
that his letter frequencies match typical English letter frequencies
(they do), that the bigrams and trigrams in his writing occur
approximately the same as in normal English texts (they do), and
building a list of his "vocabulary" used in previous messages. Armed
with this information, I am fairly confident that if I do crack this
cipher in the future, any computer program I write should be able to use
standard cryptanalysis tactics to identify a break.
The encoded message in question is referred to by analysts of the Zodiac
Killer as "the 340 cipher" because it is written as 20 rows of 17
symbols each (20 x 17 = 340). There appear to be 62 individual symbols
and/or letters used in the message. It is likely that the Zodiac used
the "extra" symbols to make it harder to identify the most commonly used
letters in English (e.g., he may have used 4-5 symbols to represent the
letter "E" and the letter "T").
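For a feel of what 62 symbols means for a brute-force search, here is a back-of-the-envelope calculation in Python. It assumes each of the 62 symbols could independently stand for any of the 26 letters, which is a naive upper bound rather than a refined estimate, and the search rate of one billion keys per second is purely hypothetical.

```python
SYMBOLS = 62   # distinct symbols counted in the 340 cipher
LETTERS = 26   # letters each symbol might stand for

# Naive upper bound: every symbol independently maps to any letter.
possible_keys = LETTERS ** SYMBOLS
digits = len(str(possible_keys))

KEYS_PER_SECOND = 1_000_000_000          # hypothetical search rate
SECONDS_PER_YEAR = 60 * 60 * 24 * 365
years = possible_keys // (KEYS_PER_SECOND * SECONDS_PER_YEAR)

print(f"Possible keys: a {digits}-digit number")
print(f"Years to try them all: a {len(str(years))}-digit number")
```

Numbers of this size are why trying every key is out of the question and some way of discarding most keys early is needed.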
Before I could begin instructing a computer to attack this cipher, I had
to make some assumptions about it, which I fully recognize could be
completely wrong. Still, I had to start somewhere if I was going to
break the thing. My working assumptions at this point are the following:
-
The killer's previous ciphers were all simple substitution ciphers
(i.e., the killer substituted one letter or symbol for another, and
any time he used the same symbol it meant the same letter).
-
The killer's previous ciphers are all written in English, and thus
this cipher is also in English.
-
The cipher contains an actual message and isn't just random scribbling
that the killer sent to annoy the police and cryptanalysts.
-
When properly deciphered, the message will yield a string of words
with no punctuation in them, just like the killer's prior ciphers.
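Under the first assumption above, trying a key is just a symbol-by-symbol lookup. A minimal Python sketch, with a made-up function name and a toy key that has nothing to do with the real cipher:

```python
def decrypt(ciphertext, key, unknown="?"):
    """Apply a substitution key to a sequence of cipher symbols.

    key: dict mapping each cipher symbol to one plaintext letter.
    Several symbols may map to the same letter, but a given symbol
    always means the same letter. Symbols missing from the key are
    shown as `unknown`.
    """
    return "".join(key.get(symbol, unknown) for symbol in ciphertext)


# Toy example: four invented symbols standing for H, E, L, and O.
toy_key = {"+": "H", "#": "E", "@": "L", "%": "O"}
message = decrypt("+#@@%", toy_key)
```

The output is a single run of letters with no spaces or punctuation, matching the fourth assumption, so finding the word boundaries is left to a later step.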
In the next article, I'll discuss the method I used to build a program to try to crack this cipher.
Last Updated ( Thursday, 04 May 2006 )
Written by Michael Salsbury
Thursday, 20 April 2006
I remember not that long ago when PeoplePC advertised the sale of
personal computers, probably used or refurbished ones, at
somewhat-lower-than-normal-retail pricing. This, as I recall, was why
they were called "PeoplePC" and not "PeopleInternet" or something along
those lines.
Today, in 2006, PeoplePC sells AOL-like Internet service. As near as I
can tell, that's all they're selling. I can't even find a reference to
the fact that they once sold used PCs. Am I imagining it? I don't think
so. They used to run television commercials about it.
Here's all the evidence I could find on the Internet.
Picture of a truck showing boxes, presumably holding PCs. (You can't
really tell because of the resolution.):
And this old scan of an article from PCWorld:
If you go to the PeoplePC.com web
site today, you don't see any reference to the option of getting a PC
from them.
However, there's some indication that it might still be possible if
you're willing to pay for it, at this
web page.
I went part way through their sign-up process, as far as I could go
without actually setting up an account or giving them my real name and
number, and I didn't see any reference to buying PCs.
I guess I'm not surprised. The days of buying Internet service bundled
with PCs are over, I think. A few years ago you'd see deals where you
could get a PC "nearly free" if you committed to two years' worth of
Internet service at some inflated price.
Last Updated ( Saturday, 22 April 2006 )
Written by Michael Salsbury
Thursday, 20 April 2006
What is a Domain Name?
If you look at your web browser's address or location bar, you'll
usually see in it a URL like "http://www.mikesalsbury.com" (the URL for
this site). The "domain name" for this site is therefore
"mikesalsbury.com". Since that domain name is already registered to me
(and will be for some time), you can't register that one for your own
site, even if your name happens to match mine. But you could register
some other domain name that you like, such as "mike-s.com" if that's
available. All that domain names really do is make it easier for human
beings to remember the address of your web site. Without domain names,
we'd have to give people addresses like "199.205.42.113" to find our sites,
which wouldn't be as easy to remember as "mikesalsbury.com" or
"spamtoons.com".
How Do You Get a Domain Name?
Getting a domain name is actually pretty easy. You find a company that
has the authority to register domain names with one of the Internet
authorities, pay them a small fee, and they'll register the name for
you. This assumes that the name you want is not already registered to
someone else.
Once you've registered a domain name, it's yours for at least one year.
Some registrars allow you to register the domain name for several years
in advance. Pricing can vary greatly. Some registrars will allow you to
register a ".com" domain name for as little as $2.99. For example, Yahoo
Small Business is currently allowing new customers to register domain
names for $2.99 for the first year. GoDaddy.com offers domain
registration for $1.99 if you purchase some other product, such as their
web hosting services. A quick Internet search should reveal any number
of registrars and prices.
Last Updated ( Monday, 24 April 2006 )
Written by Michael Salsbury
Monday, 03 April 2006
I recently started a little side project to this blog, a cartoon site
called Spamtoons.com. I've been
cobbling cartoons together to this point using my very meager skills
drawing in The GIMP and Inkscape
along with some free and public domain clip art. While that's working,
I've had some ideas that require me to create some original stuff of my
own. It seemed like I needed a drawing tablet.
I did some searching on eBay and elsewhere on the web and just couldn't
find a really good deal. I was surfing some of the "green light
specials" on Geeks.com when I
found the Aquila-L1
Graphics Tablet with Cordless Stylus Pen for $17.99 (normal
price $21.99). I decided it was worth risking $20 on it. It arrived
Saturday and I've already done quite a bit of work with it.
Below is a photo of the unopened Aquila-L1 package:
 The tablet in its packaging
Last Updated ( Friday, 21 April 2006 )