Automatic Google Sitemaps for Mambo
Written by Michael Salsbury   
Saturday, 18 March 2006
A while back, Google began beta testing a new tool to improve its ability to index web sites, the Google Sitemap.  A Google Sitemap is an XML file (similar to an HTML file) which lists the URLs on your site that Google should index and, for each one, how often you expect the page to change, how important the page is relative to the rest of your site, and the date/time the page was last modified.  More simply, it's a big list of all the pages on your site that helps Google find all your content more easily.

The Mambo content management system (version 4.5.3 and below) doesn't generate Google Sitemaps natively.  As a result, many webmasters who use Mambo end up using some other tool to generate their sitemaps - or they write them by hand.  Before I created the program discussed here, I wrote the file by hand at first, then switched to the SoftPlus GSiteCrawler program.  The problem with GSiteCrawler is that it, like Google, is a "spider" that crawls your site to index its contents.  That means it chews up a lot of your network bandwidth just to generate the sitemap file, and that file goes out of date each time you add content to your site.  Not a great thing.

When I moved my site to a new hosting provider (godaddy.com), I had to move the MySQL database in which Mambo stores its information, including your site content.  Looking in the MySQL "export" from my prior hosting provider, I was able to see that Mambo stores all your content in a table aptly named "content".  That table has a large number of fields which hold each article's title, the date it was written, the date it was last modified, whether it's been published or is still a draft, etc.  I reasoned that I could probably learn enough PHP to write a program that would use the data in the "content" table to automatically generate a current sitemap file for Google any time it asked.


Doing a little research online, I did indeed find enough information to write such a program.  It works something like this:

  • Connect to the Mambo database and grab all the content records
  • For each record (content item):
    • Check to see if the item is in "published" status (i.e., the "state" column has a value of "1")
    • Check to see if the item's "Start Publishing" date is earlier than or equal to the current date
    • Check to see if the item's "Finish Publishing" date is "Never" or after/equal-to the current date
    • Use the item's last-modified date if it has one; otherwise use its creation date
    • If we can't identify a creation or last modification date, substitute today's date instead
    • Build a URL record in Google Sitemap form
  • Once all records have been processed, output the Sitemap file
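
The per-record steps above can be sketched as a single function.  This is only an illustration, not the script itself: the function name sitemap_entry and the example.com URL are made up for the example, and the row is assumed to be an array with the same column names the script reads ("state", "publish_up", "publish_down", "created", "modified", "id").

```php
<?php
// Hypothetical helper (not part of the original script): returns one <url>
// record for a content row, or null if the row should be left out.
function sitemap_entry($row, $today, $siteprefix)
{
    $zero = "0000-00-00";   // how an unset date shows up in the table
    $up   = substr($row["publish_up"], 0, 10);
    $down = substr($row["publish_down"], 0, 10);

    // Skip anything that isn't published, hasn't started, or has expired.
    if ($row["state"] != "1") return null;
    if ($up > $today) return null;
    if ($down != $zero && $down < $today) return null;

    // Prefer the modified date, then the created date, then today.
    $lastmod = substr($row["modified"], 0, 10);
    if ($lastmod == $zero) $lastmod = substr($row["created"], 0, 10);
    if ($lastmod == $zero) $lastmod = $today;

    return "<url>\n<loc>".$siteprefix.$row["id"]."/</loc>\n"
         . "<lastmod>".$lastmod."</lastmod>\n"
         . "<changefreq>monthly</changefreq>\n"
         . "<priority>0.5</priority>\n</url>\n";
}

// Example row with made-up values.
$example_row = array(
    "id" => "416", "state" => "1",
    "publish_up" => "2006-03-01 00:00:00",
    "publish_down" => "0000-00-00 00:00:00",
    "created" => "2006-03-01 10:00:00",
    "modified" => "2006-03-16 09:00:00",
);
echo sitemap_entry($example_row, "2006-03-18", "http://www.example.com/content/view/");
```

Because the dates are compared as "YYYY-MM-DD" strings, a plain string comparison puts them in calendar order, which is the same trick the full script relies on.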

When it's finished, we get a Google Sitemap file that looks something like this:
<?xml version="1.0" encoding="UTF-8" ?>
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
<url>
<loc>http://www.mikesalsbury.com/mambo/content/view/416/</loc>
<lastmod>2006-03-16</lastmod>
<changefreq>monthly</changefreq>
<priority>0.5</priority>
</url>
.
. <snip>
.
<url>
<loc>http://www.mikesalsbury.com/mambo/content/view/1/</loc>
<lastmod>2006-03-17</lastmod>
<changefreq>monthly</changefreq>
<priority>0.5</priority>
</url>
</urlset>

This file is in the format Google expects, and Google can use it to crawl your site looking for content it hasn't indexed yet.  The nice thing about this is that because we generate this file automatically at the time Google asks for it, it will always contain all the content displayed on our site (i.e., it's always accurate and complete).  You never need to do anything once you tell Google where to get this information.
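
One detail worth considering when a PHP script stands in for an XML file: with PHP's default settings the response is labelled as text/html.  The sketch below shows one way to label it as XML before any output is sent.  It is an optional addition, not something the script in this article does, and the function name sitemap_preamble is made up for the example.

```php
<?php
// Hypothetical helper: the same two header lines the sitemap starts with.
function sitemap_preamble()
{
    return "<?xml version=\"1.0\" encoding=\"UTF-8\" ?>\n"
         . "<urlset xmlns=\"http://www.google.com/schemas/sitemap/0.84\">\n";
}

// header() only works before the first byte of output, so check first.
if (!headers_sent()) {
    header("Content-Type: text/xml; charset=UTF-8");
}
echo sitemap_preamble();
```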

To tell Google where to pick up the new sitemap, give it the URL to this program.  For example, on my site that URL is "http://mikesalsbury.com/sitemap.php".  Submit this Sitemap to Google and you should get an "OK" a few minutes or hours later.  (It may take a day or two, so be patient.)

Before you can use this program, you need to modify the lines under the "//Connect To Database" line to reflect the correct parameters needed to access your Mambo database.  Where do you get the correct parameters?  In Mambo's Configuration page, on the Database tab:

Mambo Database Settings


Change the information in quote marks in the "$hostname" line to match the value of "Hostname":
$hostname="myserver.net";
Change the information in quote marks in the "$username" line to match the value of "MySQL Username":
$username="myuname";
Change the "$password" line to match the "MySQL Password" value displayed in Mambo:
$password="thepassword";
Change the "$dbname" line to match the "MySQL Database" value displayed in Mambo:
$dbname="my_mamb1";
Change the "$usertable" line to the "MySQL Database Prefix" value displayed in Mambo, followed by the word "content":
$usertable="mos_content";

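Next, adjust the "www.mikesalsbury.com" in the "$siteprefix" line to match the URL of your own web site.  If Mambo isn't installed in a "/mambo/" directory on your server, adjust that part of the path as well.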
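
As a small illustration of how the last two settings are put together, the sketch below builds them from their parts.  The variable names $dbprefix and $siteurl and the example.com address are made up for the example; the script itself simply has you type the finished values.

```php
<?php
// Hypothetical inputs; substitute the values from your own Mambo setup.
$dbprefix = "mos_";                          // "MySQL Database Prefix" in Mambo
$siteurl  = "http://www.example.com/mambo";  // assumed base URL of the Mambo install

// The content table is the prefix plus the word "content".
$usertable = $dbprefix."content";

// Content item URLs are the base URL plus "/content/view/", then the item id.
$siteprefix = rtrim($siteurl, "/")."/content/view/";

echo $usertable."\n".$siteprefix."\n";
```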

This should be all you need to do for the script to do its job.

When you're finished, you should end up with a script file that looks like this:

<?php

#
# Output the XML header for the Sitemap
#
$xmlheader = "<?xml version=".chr(34)."1.0".chr(34)." encoding=".chr(34)."UTF-8".chr(34)." ?>\n";
$xmlheader = $xmlheader."<urlset xmlns=".chr(34)."http://www.google.com/schemas/sitemap/0.84".chr(34).">\n";
echo $xmlheader;

//Connect To Database
$hostname="myserver.net";
$username="myuname";
$password="thepassword";
$dbname="my_mamb1";
$usertable="mos_content";

$siteprefix="http://www.mikesalsbury.com/mambo/content/view/";

# Today's date in YYYY-MM-DD form, for comparing against the publish dates.
$now = date("Y-m-d");
#
# Connect to the database
#
mysql_connect($hostname,$username, $password);
mysql_select_db($dbname);
#
# Get the content items from the database
#
$query = "SELECT * FROM $usertable WHERE 1 ORDER BY id DESC";
$result = mysql_query($query);
#
# Blank out the sitemap information in case it contains anything.
#
$urlinfo = "";

# If we have any content...
if($result)
{
# Go through each item of content in the system...
#
while($row = mysql_fetch_array($result))
{
#
# Check to see if this item has no expiration date...
#
$forever = 0;
if(substr($row["publish_down"],0,10) == "0000-00-00")
{
   $forever = 1;
}
#
# Make sure it's published and not in draft or other status.
#
if($row["state"]=="1")
{
#
# If the content item is currently published on the site, include it.
#
if($now >= substr($row["publish_up"],0,10))
{
#
# If it hasn't expired yet...
#
if(($forever == 1) or ($now <= substr($row["publish_down"],0,10)))
{
$urlinfo = $urlinfo."<url>\n<loc>".$siteprefix.$row["id"]."/</loc>\n";
#
# If it has a non-zero modification date, use that as our lastmod date...
#
if(substr($row["modified"],0,10) != "0000-00-00")
{
$urlinfo = $urlinfo."<lastmod>".substr($row["modified"],0,10)."</lastmod>\n";
}
#
# If it has a zero modification date, use the created date instead...
#
if(substr($row["modified"],0,10)=="0000-00-00")
{
#
# But if the created date is zero, use today's date instead...
#
if(substr($row["created"],0,10)=="0000-00-00")
{
$urlinfo = $urlinfo."<lastmod>".$now."</lastmod>\n";
} else {
$urlinfo = $urlinfo."<lastmod>".substr($row["created"],0,10)."</lastmod>\n";
}
}
#
# Set the changefreq to monthly for all items.
#
$urlinfo = $urlinfo."<changefreq>monthly</changefreq>\n";
#
# Set the priority to 0.50 for all items.
#
$urlinfo = $urlinfo."<priority>0.5</priority>\n";
#
# Close out the URL record.
#
$urlinfo = $urlinfo."</url>\n";  
}
}
}
}
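}
#
# Close out the URL list.  This happens even if the query failed, so the
# <urlset> tag opened in the header is always closed.
#
$urlinfo = $urlinfo."</urlset>\n";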
#
# Write the URL information to the browser/bot.
#
echo $urlinfo;
?>
If you'd rather not go through the hassle of copying and pasting the above information into your favorite text editor and saving it, you can download the script as a ZIP file here.
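
The script above fetches every row and does the filtering in PHP.  A possible variation is to let MySQL do the filtering, so only the rows that belong in the sitemap come back.  The sketch below is untested against a real Mambo database, and the function name sitemap_query is made up for the example.  It also assumes a MySQL setup that allows the all-zero dates Mambo uses, and it compares full timestamps rather than dates, so an item may appear or drop out a few hours earlier or later than with the script above.

```php
<?php
// Hypothetical helper: builds a query returning only currently published rows.
function sitemap_query($usertable)
{
    return "SELECT id, created, modified FROM ".$usertable
         . " WHERE state = 1"
         . " AND publish_up <= NOW()"
         . " AND (publish_down = '0000-00-00 00:00:00' OR publish_down >= NOW())"
         . " ORDER BY id DESC";
}

echo sitemap_query("mos_content")."\n";
```

With a query like this, the loop would only need to work out the lastmod date for each row.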

Once you've edited the PHP file as directed above, save it as "sitemap.php".  FTP it to the server where you're running Mambo.  Point your browser to the URL where the file is located.  If you get an error message, check to make sure you didn't make any mistakes editing the file.  If you see a bunch of URLs and other text on the browser window, it worked!  To confirm that, right-click the browser window and select "View Source".  You should see something that resembles the Google Sitemap example at the start of this article.
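
If you'd like something a little more systematic than eyeballing "View Source", a rough check like the one below can count the records in the output.  It is only a sketch: the function name sitemap_summary is made up, the sample text is invented, and it doesn't validate the XML, it just counts tags.

```php
<?php
// Hypothetical check: counts <url> records and <loc> entries, and confirms
// that the list was closed with </urlset>.
function sitemap_summary($xml)
{
    $urls = preg_match_all("/<url>/", $xml, $found);
    $locs = preg_match_all("/<loc>[^<]+<\/loc>/", $xml, $found);
    $closed = (strpos($xml, "</urlset>") !== false);
    return array("urls" => $urls, "locs" => $locs, "closed" => $closed);
}

// Made-up sample; in practice, paste in the source of your own sitemap.php output.
$sample = "<urlset>\n<url>\n<loc>http://www.example.com/content/view/1/</loc>\n</url>\n</urlset>\n";
print_r(sitemap_summary($sample));
```

A healthy sitemap should report the same number for "urls" and "locs", with "closed" set.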

You're now ready to tell Google where to find your new Sitemap.  Go to their site and point them to it.

Limitations/Restrictions

This program is provided free of charge without warranty or support of any kind.  By choosing to try to use it, you agree to accept all responsibility for what happens.  Since it doesn't write any information anywhere, you shouldn't experience any damage or data loss as a result of trying to use it - but I am not promising that.  Basically, I built this to automate Google Sitemap generation for myself and it works for what I need it to do.  It may or may not work for what you need.  At least it was free, right?

The program works on my site, which has Search Engine Friendly URLs turned on in the Mambo Configuration page.  I don't use the FAQ section and have little or no static content on the site.  If you use these or other features of Mambo, I can't promise that this program will do you much good.  All I can suggest is that you try it and see what you think.  If it works for you, great.  If it doesn't, please DO NOT contact me.  I'm making this available free, in exchange for which you agree not to ask for support.

The program automatically marks all URLs as updated monthly and with a priority of "0.5".  If you don't like that, you're welcome to put your PHP skills to work making it do what you want it to.  If you make anything really robust and powerful out of it, please contribute your work to the Mambo project so that all Mambo users can benefit.
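
As one example of the kind of change you could make, the sketch below picks a changefreq value based on how recently an item was modified, instead of always using "monthly".  The function name changefreq_for and the 7-day and 60-day cutoffs are arbitrary choices made up for this illustration.

```php
<?php
// Hypothetical rule: recently touched items are assumed to change more often.
// Both arguments are dates in YYYY-MM-DD form.
function changefreq_for($lastmod, $today)
{
    $days = (strtotime($today) - strtotime($lastmod)) / 86400;
    if ($days <= 7)  return "daily";
    if ($days <= 60) return "weekly";
    return "monthly";
}

echo "<changefreq>".changefreq_for("2006-03-16", "2006-03-18")."</changefreq>\n";
```

In the script, the result would take the place of the fixed "monthly" text in the changefreq line.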

 

 




This article discusses a PHP program I wrote that automatically generates a Google Sitemap of all the content pages on a Mambo Open Source or Joomla web site, directly from the Mambo database.  The sitemap conforms to Google's rules and is accepted by its spidering bot, which should help you get more of your content recognized by Google.  Since the script generates the sitemap at the time Google asks for it, the sitemap is always complete and accurate as of that request, which also helps ensure that your latest articles appear.


Last Updated ( Saturday, 25 March 2006 )