Teaching Google to Mambo
Written by Michael Salsbury   
Monday, 11 July 2005
Earlier this year, I moved the feeble amount of existing content from my hand-generated HTML web site over to a new content management system, Mambo Open Source (MOS). Since then, I've set about adding new content (like this) on a fairly regular basis. I try to add something every day if I can, and at least something every week. A couple of months later, when I did a Google search on some of my favorite topics from my site, nothing showed up. Needless to say, that disappointed me considerably. When I then asked Google to show me everything it had indexed from my site, almost none of it was from the new Mambo site. Virtually all of it was content from my previous "plain old HTML" site, which no longer existed. Now I was really bummed. What to do? It was clear to me that Google needed to "learn how to Mambo". That is, I needed to help Google understand that there WAS real content on this site and exactly what content that was.

Becoming More Search Engine Friendly

The first step in my quest was to look on the Mambo forums to see what others suggested I do to improve my "Google-ability". That led me to the "Search Engine Friendly URLs" feature of Mambo. To ensure that the URLs you use are "friendly" to search engines, log in to your Mambo administrator page and click the "Global Configuration" icon. Under the "SEO" tab are two options. I suggest setting both of these to "Yes" if you want to help Google figure out your site. The first option makes Mambo use URLs that are easier for search engines like Google to digest. The second option ensures that page titles match the contents, which will help Google better categorize your site.
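
To give a feel for what a "friendly" URL means here, the sketch below converts a query-string article URL into a path-style one. The parameter names (option, task, id, Itemid) and the resulting path layout are assumptions made for illustration; the URLs your own Mambo installation produces may look different, and the function name friendly_path is made up for this example.

```python
from urllib.parse import urlsplit, parse_qs


def friendly_path(url: str) -> str:
    """Rewrite a query-string article URL as a path-style URL.

    The path layout used here is an illustrative assumption, not
    necessarily the exact scheme Mambo emits.
    """
    query = parse_qs(urlsplit(url).query)
    option = query["option"][0]
    # Drop the "com_" prefix from the component name if it is present.
    component = option[4:] if option.startswith("com_") else option
    task = query["task"][0]
    article_id = query["id"][0]
    item_id = query["Itemid"][0]
    return f"/{component}/{task}/{article_id}/{item_id}/"


if __name__ == "__main__":
    print(friendly_path(
        "http://example.com/index.php?option=com_content&task=view&id=123&Itemid=45"
    ))
```

The point is simply that a path with no "?" and "&" in it gives a crawler one clean address per page instead of a bundle of parameters.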

Giving Google a Map to Follow

After making this change and waiting a few weeks, I saw that Google had indeed picked up my new site. Asking Google for a list of all the pages from my site showed both the obsolete pages from the "pre-Mambo" days and some (but far from all) of the current pages. Clearly, I had more work to do. I went to Google's site to see if there was anything I could do to make it easier for them to find and index all my pages. Turns out, there is. See the following page on "sitemaps" from Google:

http://www.google.com/webmasters/sitemaps/docs/en/protocol.html

This page discusses a file format in XML that helps Google locate the pages on sites whose content management systems may not make indexing easy for them. After taking the time to code one of these files manually, I realized it was going to be a real pain to update on a regular basis. I went looking for tools. The first one I found was an official add-on component for Mambo. Unfortunately, this one only seemed to put about 28 pages in the sitemap file. There were well over 100 articles on my site, so I knew this was missing a lot. A little more digging turned up the following Java-based web tool that will "spider" your site for you and help you generate a sitemap file automatically:

http://www.auditmypc.com/free-sitemap-generator.asp

Using this tool, and some trial-and-error adjusting of its parameters, I was able to build a sitemap file that included all the content on my web site and save it to disk. I FTP'd that file up to my web server and told Google where it was. Then I waited a while and did another search for the content on my site that Google knew about. This time around, a huge chunk of my site was visible. It was now time to see whether my pages came up where I thought they should in the search results.  Unfortunately, that tool didn't remember some of the parameters I'd set for each page, so I decided to find something better.  That led me to GSitemap from Vigos:

http://www.vigos.com/products/gsitemap/

This Windows-based tool performs an initial spidering of your site and lets you specify sitemap parameters for each link.  It remembers those parameters between executions, so you don't have to enter your data twice.  It can also re-spider and update itself as needed.  The only problem I have with it, and this isn't the product's fault, is that Mambo sometimes gives it multiple URLs that point to the same article.  My sitemap only needs to specify a given article once, so I need to watch each time I re-spider the site to be sure that only one URL for a given page is included.  Fortunately, that's pretty easy.
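
If you'd rather script the duplicate check than eyeball it, here is a rough Python sketch of the idea. It assumes the duplicate URLs differ only in their Itemid parameter and that articles are identified by the option, task and id parameters; check that against your own site's URLs before trusting it. The namespace is the one from the current sitemaps.org protocol, which may not match the version described on the Google page linked above, and the function names are made up for this example.

```python
from urllib.parse import urlsplit, parse_qs
import xml.etree.ElementTree as ET

# Namespace from the current sitemaps.org protocol (an assumption here).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def article_key(url):
    """Identify an article by host, option, task and id, ignoring Itemid."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    return (
        parts.netloc.lower(),
        query.get("option", [""])[0],
        query.get("task", [""])[0],
        query.get("id", [""])[0],
    )


def dedupe(urls):
    """Keep only the first URL seen for each article, preserving order."""
    seen = set()
    unique = []
    for url in urls:
        key = article_key(url)
        if key not in seen:
            seen.add(key)
            unique.append(url)
    return unique


def build_sitemap(urls, changefreq="weekly"):
    """Return sitemap XML with one <url> entry per unique article."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in dedupe(urls):
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        ET.SubElement(entry, "changefreq").text = changefreq
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    spidered = [
        "http://example.com/index.php?option=com_content&task=view&id=7&Itemid=2",
        "http://example.com/index.php?option=com_content&task=view&id=7&Itemid=9",
        "http://example.com/index.php?option=com_content&task=view&id=8&Itemid=2",
    ]
    print(build_sitemap(spidered))
```

Keeping the first URL seen for each article is an arbitrary choice. If one form of the URL is the one you link to from your menus, you may prefer to keep that one instead.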

Learning What Google Likes

As you'll see if you poke around this site a little bit, I've got a number of cigar reviews here that I've written. When I did a search on a given cigar name, followed by "review", I found that often my reviews weren't showing up at all, or were showing up way down the list. In one case, my review was about sixteenth in the list of search results. Looking at the 15 results above it, most of them weren't reviews of that cigar at all. They were ads to buy it or blogs that mentioned it. While being sixteenth on the list meant that someone would find me on the second page of search results, it didn't seem fair to be that far below other non-relevant results. So I asked myself the question of how Google decides where to rank one site versus another in its search results. Naturally, their exact methodology is a trade secret, but the general overview of what they do is covered in their patent filings with the US Patent and Trademark Office. So I had a look at one of the more recent ones and found that Google uses a number of criteria to decide where to put your site in its rankings:

  • Length of domain name registration:  If your domain name is only registered a year at a time, and it's relatively new, Google probably assumes that you aren't too serious (at least not yet) about publishing information on the web.  Many search engine spammers, for example, register a domain for a year, spam search results like crazy, then get out the following year because the domain is blacklisted with search engines.  Registering your domain name for more than a year should help push you up a little in the search results.
  • Inbound links to your site from others:  If your site has any valuable information on it, it's logical to assume that others will link to your site from theirs.  Thus, a part of Google's ranking system asks the question "how many other sites link to this one?" and raises your ranking if more sites link to yours.  Participating in link exchanges can therefore help, though I suspect Google's algorithms are smart enough to tell a "pure" link exchange apart from a link exchange among related web sites.
  • Clicks to your site from the search results:  If users see your site in the search results and tend to click on it, Google can infer that your content may indeed be relevant to the search criteria.  For example, if you're searching for information about the Model T Ford and you see a link for a "Model T Collector Info Site", you'll probably click on that rather than the link for a "Sanford Toaster Model T5" that might come up in the same results.  By noting that a number of people clicked on the collector site when searching for Model T information, Google can infer that the collector site is probably more relevant to that search than the toaster site, which few or no people clicked on.  What we can learn from this is that our pages, when they show up in Google search results, should give the viewer a solid idea of why our page is relevant to their search (or not).  Google also appears to take notice when people click on the link to a page and then quickly go back to the search results, inferring that the page looked relevant but really wasn't.  Thus, it's to our benefit if people want to click on our page in the search results, but only if they stay there long enough to glean some useful information from it.
  • Updates to your content:  Google monitors the number and frequency of updates to your site's content.  If you keep your site updated regularly, this will count in your favor more (in most cases) than if you only update it rarely.  If you're wondering how Google can track which results people click (the previous point), try right-clicking on a link in a search result in Netscape (this doesn't work in IE), copying the link location, and pasting it into Notepad.  You'll see a link that looks like this:

    http://www.google.com/url?sa=t&ct=res&cd=1&url=http%3A//www.mtfca.com/&ei=HOQJQ5OmHbq4atGX3asO

    where you might have expected a result like http://www.mtfca.com.  This link tells Google that you performed a specific search (probably encoded in that blob of letters at the end of the URL) and that you chose to click on the URL for "mtfca.com" out of the available search results.  From this, Google can infer that a "human being" decided that "mtfca.com" is a good search result for "Model T Ford" (which it certainly is, since mtfca.com is a Model T Ford Collector Association site).

    Update 08/22/05: The folks at BoingBoing apparently noticed this behavior today, based on one of Cory Doctorow's blog entries there:  http://www.boingboing.net/2005/08/22/google_stealthily_mo.html
  • The "density" of keywords on your page:  If someone is searching for "polar bear information" and your page only references polar bears once, while another page mentions polar bears 50 times, Google is going to figure that your "polar bear" page isn't as relevant to the search as the other person's page because theirs is more "dense" with references to the phrase "polar bear".  What we learn from this is that it's important to load our text with all the keywords we think people would use to find it.  Sticking a paragraph or two of keywords and phrases at the end of the article might help boost its density if you can't logically improve the keyword density in the main text.
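
If you want to see where one of those tracking links really points without clicking it, a few lines of Python will pull the destination out. This sketch assumes the destination is carried in a query parameter named "url", as in the sample link in the list above. I can only guess at what the other parameters mean, so the code ignores them, and the function name redirect_target is made up for this example.

```python
from urllib.parse import urlsplit, parse_qs


def redirect_target(link):
    """Return the destination stored in a link's "url" query parameter.

    Returns None when the link carries no such parameter, for example
    when it is already a plain link to the destination.
    """
    query = parse_qs(urlsplit(link).query)
    targets = query.get("url")
    return targets[0] if targets else None


if __name__ == "__main__":
    sample = (
        "http://www.google.com/url?sa=t&ct=res&cd=1"
        "&url=http%3A//www.mtfca.com/&ei=HOQJQ5OmHbq4atGX3asO"
    )
    print(redirect_target(sample))
```

Note that parse_qs also undoes the percent-encoding, so the "%3A" in the sample comes back as a colon.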

Applying the above knowledge to my site, I decided that there were a few things I could do:

  • Confirm that my domain name was registered for more than one year. It was.
  • Make sure I regularly updated my site and kept any time-critical content current. Fortunately the Mambo content management system makes this pretty easy to do.
  • Improve the "keyword density" of my articles. To do this, I began adding extra references to key search terms in the main body text, adding more keywords and keyword phrases in the meta tags, and even tacking on a paragraph of "stream of consciousness" keyword references at the end of the article for search engines to find, with a note indicating that humans shouldn't bother to read it.  In the long term, I want to find a better way to do this.
  • Increase the number of click-throughs to my page from search results. I did this by making sure the meta tags contained a very clear and concise description of what the page was about. Then, whenever I was on a different computer with a different IP address, I'd search out my pages and click on them in the results, staying there a while before doing anything else. This should have given Google's algorithms the impression that different people doing a search for those pages found mine of interest, more so than the results above them.
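
For the keyword density item in the list above, it helps to have a number to watch while editing an article. The sketch below uses one simple definition that I'm assuming for illustration: the share of the words on the page that belong to an occurrence of the phrase. Whatever measure Google actually uses isn't public, so treat this as a rough gauge only. The function name keyword_density is made up for this example.

```python
import re


def keyword_density(text, phrase):
    """Fraction of the words in text that belong to occurrences of phrase.

    This is an assumed, simplified definition for comparing drafts of a
    page, not a description of how any search engine scores pages.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    target = re.findall(r"[a-z0-9']+", phrase.lower())
    if not words or not target:
        return 0.0
    n = len(target)
    # Count every position where the phrase appears as a run of words.
    hits = sum(
        1 for i in range(len(words) - n + 1) if words[i:i + n] == target
    )
    return hits * n / len(words)


if __name__ == "__main__":
    page = "Polar bear facts: the polar bear lives in the Arctic."
    print(keyword_density(page, "polar bear"))
```

Running it on a draft before and after an edit shows whether the change moved the density at all, which is about as much as a homemade number like this can tell you.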

That left me with figuring out a way to increase the number of in-bound links to my web site's content from other sites on the Internet. I went about this in several ways:

  • Since Mambo offers RSS distribution and I use that capability, I made sure that my site was linked on a number of the major RSS search engines and RSS discovery sites. That would get me some number of inbound links and perhaps some RSS subscribers. Robin Good's web site (http://www.masternewmedia.org/rss/top55/) helped there.
  • There are a number of good blog directories out there, and some well known blogs like BoingBoing, Fark, and Slashdot. I did my best to get my URL listed in the major directories and provide a link to my blog to the creators of the major blogs out there. Hopefully if I say anything interesting and they like it, I'll get some in-bound links and traffic that way, which will help.
  • Visiting a number of community forums for people interested in some of the same things I am, I was able to answer some visitors' questions with overviews of full-length articles on my web site, including a link to the full-length article. By posting those answers and links on the forums, I effectively created in-bound links to my site from the forum's site. Since I chose the forums carefully so that they were related to my site (and didn't just "spam" every forum I could find), the information was received positively by forum regulars and some number of people were motivated to visit my site as a result. I will probably need to do a lot more of this to increase inbound traffic and inbound links.  This is kind of a win-win for everyone involved.  People get answers to their questions, and I get inbound links (which help Google recognize the relevance of my articles) and traffic back here.

The question you should be asking at this point is whether all this effort paid off. Did any of this improve my ranking in Google search results? Did it increase traffic to my site? Here's the answer...

One Article's Journey Through the Rankings

Using my review of the Perdomo Reserve Cabinet Series "P" maduro cigar as a benchmark, here's what happened along the way. For about the first month, Google didn't even show this page although it was on my site. When I added the sitemap file, the review came in around 16th in the results for the phrase "perdomo reserve maduro cigar". After I added a very nonsensical but "keyword dense" passage at the end of the review, formatted with a "cross-through" and carrying a note telling human readers it was for search engines only, the page dropped to 21st. After I removed the cross-through formatting, bolded relevant words in the article and in the keyword-dense paragraph, and submitted the sitemap again, the article moved to 11th in the results.

Along the way, every opportunity I had to use a web browser on a different computer, I did. I performed that exact same search and always clicked (only) on my link in the results. I then either shut down the browser or stayed on that page for quite a while, so that Google got the message that this page was relevant to the search criteria. As of this morning, that page is now 10th in the search results, and appears on the first page. While I'd love to be #1, I'm perfectly happy with being on the first page of results for the search, since my page is at least as relevant as anything else listed. If anyone else does a search for a review of this particular cigar, they should find my page fairly quickly and (I hope) find it of value.

The unfortunate part is that although I've now gotten pretty good exposure for my cigar reviews in the Google search results (they're nearly all on the first or second page of results now), many of the other articles on my site don't show up at all. Hopefully that will change as the forum visitors read about my web pages and visit the links, and as Google recognizes that my site is linked from a number of relevant sources. I'll come back and update this article in a few weeks to see if my efforts on the forums have helped. I'll also be monitoring the number of hits on my site, which right now is 15,858.

Last Updated ( Thursday, 30 March 2006 )