
The XML Sitemap module is a great way to generate a sitemap that can be automatically submitted to search engines. The admin interface lets you set the default inclusion behavior for content types, taxonomy, menus, etc. On a few projects, I've needed to programmatically remove certain nodes and taxonomy term landing pages from the generated XML Sitemap. In this quick code snippet I'll show how I was able to remove these items.

I got started by reviewing the xmlsitemap.api.php file included in the module. The function hook_xmlsitemap_link_alter(&$link) looked promising because it accepts the link as an argument, passed by reference, which would allow you to make modifications to it.

I implemented this hook in my module...

<?php
function MYMODULE_xmlsitemap_link_alter(&$link) {

  // define a variable to determine if the link should be removed
  // NOTE: when the xml sitemap is generated, every link will be passed through this hook
  $remove_link = FALSE;

  // check for taxonomy term links
  if ($link['type'] == 'taxonomy_term') {

    // define a list of taxonomy term ids to exclude
    // NOTE: in my actual projects, this code was more dynamic :)
    $excluded_terms = array(1, 2, 3, 4, 5);

    // check if the link should be excluded
    if (in_array($link['id'], $excluded_terms)) {
      $remove_link = TRUE;
    }

  }
  // check for node links
  elseif ($link['type'] == 'node') {

    // define a list of node ids to exclude
    $excluded_nodes = array(1, 2, 3, 4, 5);

    // check if the link should be excluded
    if (in_array($link['id'], $excluded_nodes)) {
      $remove_link = TRUE;
    }

  }

  // remove link as necessary
  // NOTE: if the link has been flagged to be excluded,
  // setting the $link data to an empty array should remove it from the xml sitemap
  if ($remove_link) {
    $link = array();
  }

}
?>

Now when my XML sitemap is generated, each link is passed through this hook. If the link meets my conditional logic, it is flagged for removal and excluded from the XML output. This approach works well when you do not want to permanently exclude links from your sitemap, because the decision is made in code each time the hook runs.
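
The two hard-coded ID lists in the hook could be folded into a single lookup so the exclusion check lives in one place. Here is a minimal sketch in plain PHP, assuming a hypothetical helper named mymodule_sitemap_link_is_excluded() and an exclusion map keyed by link type. Neither is part of the XML Sitemap module, and the IDs are made up for illustration.

```php
<?php
/**
 * Hypothetical helper (not part of XML Sitemap): decides whether a link
 * should be dropped, given a map of link type => list of excluded ids.
 */
function mymodule_sitemap_link_is_excluded(array $link, array $excluded) {
  // Links without a type or id cannot be matched, so keep them.
  if (!isset($link['type'], $link['id'])) {
    return FALSE;
  }
  $ids = isset($excluded[$link['type']]) ? $excluded[$link['type']] : array();
  // Ids may arrive as strings, so cast to int and compare strictly.
  return in_array((int) $link['id'], $ids, TRUE);
}

// Illustrative data only; a real module might load this map from a setting.
$excluded = array(
  'taxonomy_term' => array(1, 2, 3),
  'node' => array(10, 11),
);

// Same removal step as in the hook: empty the link when it is excluded.
$link = array('type' => 'node', 'id' => '10', 'loc' => 'node/10');
if (mymodule_sitemap_link_is_excluded($link, $excluded)) {
  $link = array();
}
```

Inside the hook, the body would then shrink to one call to the helper followed by the same $link = array() assignment.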

Most SEO tutorials claim that meta keywords are not very important to search engines. If you still want to output each node's taxonomy terms as meta keywords, this tutorial shows how to do it. I added the following function to the template.php file in my theme.

<?php
function _phptemplate_variables($hook, $vars) {
  // check for page scope
  if ($hook == 'page') {
    // ensure this page is a node view
    if (isset($vars['node']) && is_object($vars['node'])) {
      // get a list of taxonomy terms for this node
      $terms = taxonomy_node_get_terms($vars['node']->nid);

      // ensure terms exist
      if (is_array($terms)) {

        // loop through terms, and collect the names
        $t = array();
        foreach ($terms as $k => $v) {
          $t[] = $v->name;
        }

        // ensure names exist ($t is always an array here, so test for emptiness)
        if (!empty($t)) {
          // escape the names, implode them into a string, and add to the head variable
          $vars['head'] .= "<meta name='keywords' content='" . check_plain(implode(",", $t)) . "'>";
        }

      }

    }

  }
  return $vars;
}
?>

To verify this code is working, check out the tags I used on this page and then view the source. Hopefully, they match!

(NOTE: my homepage is a view, so there should not be any meta keywords.)


NOTE: when I upgraded my theme to Drupal 6, I had to update the code above:

<?php
function MYTHEME_preprocess_page(&$variables) {
  // only act on node pages that have taxonomy terms
  if (!empty($variables['node']->taxonomy) && is_array($variables['node']->taxonomy)) {

    // collect the unique term names
    $tags = array();
    foreach ($variables['node']->taxonomy as $tid => $term) {
      if (!in_array($term->name, $tags)) {
        $tags[] = $term->name;
      }
    }

    if (count($tags)) {
      sort($tags);
      // escape the names so quotes in a term cannot break the tag
      $variables['head'] .= "<meta name='keywords' content='" . check_plain(implode(",", $tags)) . "'>";
    }
  }
}
?>
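
The tag-building step can also be tried outside Drupal. Below is a minimal sketch in plain PHP, assuming a hypothetical helper named mytheme_keywords_meta_tag(). It uses PHP's htmlspecialchars() rather than any Drupal function, and the sample term names are made up.

```php
<?php
/**
 * Hypothetical helper: builds a keywords meta tag from term names.
 * Returns an empty string when there is nothing to output.
 */
function mytheme_keywords_meta_tag(array $names) {
  // Trim the names, drop empty ones and duplicates.
  $tags = array_unique(array_filter(array_map('trim', $names), 'strlen'));
  if (!$tags) {
    return '';
  }
  // Sort for stable output, as the preprocess function does.
  sort($tags);
  // Escape quotes and angle brackets so a term name cannot break the tag.
  $content = htmlspecialchars(implode(',', $tags), ENT_QUOTES, 'UTF-8');
  return "<meta name='keywords' content='" . $content . "'>";
}

// Illustrative input: one duplicate and one name containing an apostrophe.
$tag = mytheme_keywords_meta_tag(array('SEO', 'Drupal', 'SEO', "Eric's notes"));
```

Escaping matters here because the attribute is wrapped in single quotes, so an apostrophe in a term name would otherwise end the attribute early.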

In the future, I will try to elaborate more on how to improve SEO in Drupal. But for right now, here are some notes on what I have done with this site.

1. Install the Google Analytics module (http://drupal.org/project/google_analytics). You'll need to create a Google account if you have not already done so. This will monitor your visitors and web traffic.

2. Configure your URLs
- Enable clean URLs. This uses Apache mod_rewrite to replace query strings such as ?q=node/1 with paths that look like a directory structure (node/1).
- Enable the path module so you can rename URLs to whatever you like. Instead of node/#, you can make them more descriptive.
- Install the Pathauto module (http://drupal.org/project/pathauto). This module can be configured to automatically create a URL path alias based on taxonomy, node title, menu structure, etc. I find it useful to configure path aliases based on menu structure and node titles. For example, here is a sample menu structure and the aliases I would use for its two nodes:

Home
>> My Hobbies
   >> Photography
      >> node/67
      >> node/68

My-Hobbies/Photography/Hiking
My-Hobbies/Photography/Ralphie-the-Cat

3. Install the XML Sitemap module (http://drupal.org/project/xmlsitemap). This module allows you to generate an XML sitemap that can be submitted to search engines automatically. You can see mine here: (http://ericlondon.com/sitemap.xml). I set my site to submit the sitemap to each available search engine. I also recommend signing up for the Google Webmaster Tools (http://www.google.com/webmasters/tools). This will allow you to monitor and configure the way Google analyzes your XML sitemap.

4. Install the Global Redirect module (http://drupal.org/project/globalredirect). This module checks whether a path alias exists and redirects the user as necessary. For instance, if a user went to node/#, this module would redirect them to the more search engine friendly URL alias.

5. Use the taxonomy module. It's a great way to categorize your content. For this site, I use free tagging, so I do not have to maintain a definitive list of terms. It lets me type in comma-separated lists of terms that relate to my content. It also allows your users to click on a taxonomy term and view other content that has been tagged with the same term.
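
To illustrate how the menu-based aliases from step 2 can be derived, here is a minimal sketch in plain PHP. It does not reproduce Pathauto's actual cleaning rules. The helpers example_alias_segment() and example_build_alias() are hypothetical names, and the rule of replacing every run of non-alphanumeric characters with a hyphen is an assumption made for illustration.

```php
<?php
/**
 * Hypothetical helper: turns one label into a URL-friendly segment.
 */
function example_alias_segment($label) {
  // Replace every run of non-alphanumeric characters with a hyphen.
  $segment = preg_replace('/[^A-Za-z0-9]+/', '-', $label);
  // Remove hyphens left over at either end.
  return trim($segment, '-');
}

/**
 * Hypothetical helper: joins a menu trail and a node title into an alias.
 */
function example_build_alias(array $menu_trail, $title) {
  $parts = array_map('example_alias_segment', $menu_trail);
  $parts[] = example_alias_segment($title);
  // Skip segments that ended up empty after cleaning.
  return implode('/', array_filter($parts, 'strlen'));
}

// Menu trail and title taken from the sample structure in step 2.
$alias = example_build_alias(array('My Hobbies', 'Photography'), 'Ralphie the Cat');
```

With the sample menu trail, $alias holds My-Hobbies/Photography/Ralphie-the-Cat, matching the second alias listed in step 2.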

Additional Notes
- If you are using automatic path aliases via Pathauto, be careful when editing your nodes; your aliases may be updated, which can affect your menu structure and page links.
- I think it's important to set up Pathauto immediately after installing Drupal. That way, when you're ready to start adding content to your site, you'll already be following a standard naming convention.
- I had some difficulty getting XML Sitemap and Pathauto to work together. At first, not all of my pages and taxonomy terms were showing up in the sitemap. I found a module called Module Weight (http://drupal.org/project/moduleweight) which helped alleviate some of my headache.
