Posts tagged with lucene

Created by Eric.London on 2011-04-10
Please note: the content on this page originates from ericlondon.com.
I recently found some time to switch my site's search framework from Lucene to Apache Solr. The module's README.txt makes installation for small production sites easy and straightforward.

Following the installation guide, I started the Solr Java process by changing into the Solr directory and running the jar:

$ java -jar start.jar


Everything was up and running in minutes... until I closed my terminal and the Java service ended with my shell process. Short term, I decided to write a bash shell script to ensure Solr is running, and to have cron run it every five minutes.

Here are the contents of my bash shell script:


#!/bin/bash

# look up the process id of a running Solr instance (empty if none is found)
# NOTE: the pattern is quoted so the shell passes it to grep unchanged
pid=$(ps ax | grep -i 'java.*jar.*start\.jar' | grep -iv grep | awk '{print $1}')

# check if pid is not an integer
if ! [[ "$pid" =~ ^[0-9]+$ ]] ; then

  # start service (abort if the directory does not exist)
  cd /path/to/my/apache-solr-1.4.1/installation || exit 1
  java -jar start.jar &

  # send email notification
  message='ericlondon.com: starting solr service'
  subject='ericlondon.com: starting solr service'
  to='myemail@example.com'
  echo "$message" | mail -s "$subject" "$to"

  exit 1

fi
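
The integer test is what decides whether Solr gets started, so it helps to see how it behaves on the kinds of output the ps pipeline can produce. The sketch below is only an illustration: is_single_pid and describe are hypothetical helpers (not part of the script above), and the test is written with a portable case statement rather than [[ =~ ]], which should agree with it on these inputs.

```shell
#!/bin/sh

# is_single_pid is a hypothetical helper for illustration only.
# It succeeds only when its argument is a single run of digits.
is_single_pid() {
  case "$1" in
    '' | *[!0-9]*) return 1 ;;  # empty, or contains a non-digit (e.g. a newline)
    *) return 0 ;;
  esac
}

# describe what a check script built on this test would do
describe() {
  if is_single_pid "$1"; then
    echo "running"
  else
    echo "would start solr"
  fi
}

none=''
one='28670'
two='28670
28869'

describe "$none"   # ps pipeline matched nothing
describe "$one"    # exactly one matching process
describe "$two"    # two matching processes, pids separated by a newline
```

Note the last case: when more than one matching process exists, the value is not a single integer, so a script relying on this test would try to start yet another instance.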


And I added the following cron job:


$ crontab -l
*/5 * * * * /path/to/my/scripts/folder/check_solr.sh
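
The */5 in the minute field is a step value: the job fires on every minute that is a multiple of five. As a rough illustration of that rule only (minute_matches_step is a made-up name for this sketch, and real cron implementations also handle ranges and lists), it can be written as:

```shell
#!/bin/sh

# minute_matches_step MINUTE STEP
# Hypothetical helper: succeeds when MINUTE (0-59) would be selected
# by a "*/STEP" entry in the minute field.
minute_matches_step() {
  [ $(( $1 % $2 )) -eq 0 ]
}

# collect the minutes of an hour on which "*/5" fires
fired=''
m=0
while [ "$m" -lt 60 ]; do
  if minute_matches_step "$m" 5; then
    fired="$fired $m"
  fi
  m=$((m + 1))
done

echo "minutes:$fired"
```

In other words, with this schedule a stopped Solr process can go unnoticed for up to five minutes before the check script runs again.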


A better option would be to set up an init script for the process (/etc/init.d/), or to install Solr in a more permanent way, but I guess this will do for the time being :)




Part 2, Using Supervisor (updated: 2011/04/12)

As mentioned above, using a cron job is probably not the best solution. I decided to install and configure supervisord to monitor the process.

Unfortunately, supervisor was not available via yum for CentOS 5.5 (RHEL):


$ yum search supervisor
Finished
Warning: No matches found for: supervisor
No Matches found


Luckily, I found some RPMs via http://rpmfind.net. I installed supervisor and its one dependency:


# downloading RPMs
$ wget ftp://rpmfind.net/linux/epel/5/x86_64/supervisor-2.1-3.el5.noarch.rpm
$ wget ftp://rpmfind.net/linux/epel/5/x86_64/python-meld3-0.6.3-1.el5.x86_64.rpm

# installing RPMs
$ rpm -Uvh python-meld3-0.6.3-1.el5.x86_64.rpm
$ rpm -Uvh supervisor-2.1-3.el5.noarch.rpm

# setting run level for supervisord
$ chkconfig --level 2345 supervisord on

# starting supervisor
$ /etc/init.d/supervisord start


Next, I created a simple shell script to start the Solr process and made the script executable. NOTE: file contents have been simplified:


#!/bin/bash

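# enter solr dir (abort if it does not exist)
cd /path/to/my/apache-solr-1.4.1/installation || exit 1

# start solr; exec replaces this shell so supervisor manages the java process directly
exec java -jar start.jar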


Lastly, I added a few lines to my supervisor conf file (/etc/supervisord.conf):


[program:apache_solr]
command=/path/to/my/scripts/folder/apache-solr-supervisor-run.sh


Upon restarting supervisor, Solr started automatically:


$ /etc/init.d/supervisord restart

$ ps aux | grep -i java | grep -iv grep
root     28670  0.1  8.5 1041076 43548 ?       Sl   13:30   0:02 java -jar start.jar


I killed the Java process and supervisor immediately brought it back (with a different process ID)!


$ kill 28670

$ ps aux | grep -i java | grep -iv grep
root     28869 62.0  5.3 1021532 27016 ?       Sl   13:50   0:00 java -jar start.jar
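
At its core, what supervisor does here is a restart loop: run the command, wait for it to exit, and start it again. The following is only a toy sketch of that idea in shell, not how supervisord is actually implemented. The function name supervise, the start limit, and the use of "false" as a stand-in for the Solr command are all assumptions made for illustration.

```shell
#!/bin/sh

# supervise MAX_STARTS COMMAND [ARGS...]
# Hypothetical toy supervisor: runs COMMAND in the foreground and starts it
# again whenever it exits, giving up after MAX_STARTS starts.
supervise() {
  max_starts=$1
  shift
  starts=0
  while [ "$starts" -lt "$max_starts" ]; do
    starts=$((starts + 1))
    # run the command and record its exit status
    if "$@"; then
      status=0
    else
      status=$?
    fi
    echo "start $starts exited with status $status"
  done
}

# "false" stands in for a process that dies immediately
supervise 3 false
```

A real supervisor adds what this loop lacks: logging, back-off between restarts, and control commands, which is why I would rather configure supervisord than maintain a loop like this.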
Created by Eric.London on 2010-11-23
In this tutorial I'll show you how to create a custom Lucene Search facet from a taxonomy vocabulary and integrate it into the search form block. A great first step would be to review the Drupal Lucene API documentation (see the luceneapi.api.php file in the luceneapi module folder).

I assigned the search form block to a region in my theme (admin/build/block), which generates this form:

[Image: Search form block]

I added a new vocabulary called "Topics", and added a few terms (admin/content/taxonomy).

[Image: Topic and terms]

The goal of this code is to integrate the Topic vocabulary into the search form block to allow the user to select a taxonomy term as they search. To start, you need to implement hook_luceneapi_facet_realm() and a callback function in your custom module.

<?php
/**
 * Implements hook_luceneapi_facet_realm()
 */
function MYMODULE_luceneapi_facet_realm() {

  $realms = array();

  $realms['form'] = array(
    'title' => t('Search form block'),
    'callback' => 'MYMODULE_luceneapi_facet_realm_callback_search_form_block',
    'callback arguments' => array(),
    'allow empty' => TRUE,
    'description' => t('Displays facets in the search form block.'),
  );
  
  return $realms;
  
}

/**
 * Implements hook_luceneapi_facet_realm() callback function
 */
function MYMODULE_luceneapi_facet_realm_callback_search_form_block($facets, $realm, $module) {

  $form = array();

  // loop through facets
  foreach ($facets as $name => $facet) {

    // NOTE: luceneapi_facet_to_fapi_convert() converts a Lucene facet to Drupal Form API data
    $form = array_merge_recursive($form, luceneapi_facet_to_fapi_convert($facet));

  }

  return $form;

}
?>


At this point, if you go to the facets admin page (admin/settings/luceneapi_node/facets), you can see the newly created realm. I assigned the taxonomy vocabulary "Topic" to this realm.

[Image: Assigning facets to realms]

Up next is implementing hook_form_FORM_ID_alter(), the form-specific variant of hook_form_alter(), to integrate the facet into the search form.

<?php
/**
 * Implementation of hook_form_FORM_ID_alter().
 */
function MYMODULE_form_search_block_form_alter(&$form, &$form_state) {
  
  // get default search module (e.g. luceneapi_node)
  $module = luceneapi_setting_get('default_search');

  // check if default search module is defined in lucene searchable module list
  if (array_key_exists($module, luceneapi_searchable_module_list())) {

    // get index type (e.g. node)
    $type = luceneapi_index_type_get($module);

    // fetch realm facets
    $elements = luceneapi_facet_realm_render('form', $module, $type);

    // if facet form elements exist, recursively merge with current form object
    if (!empty($elements)) {
      $form = array_merge_recursive($form, $elements);
    }

  }

}
?>


The search form block should now show an empty "Topic" facet.

[Image: Search form block, empty topic]

The last piece of code implements hook_luceneapi_facet_postrender_alter(), which gives you the opportunity to modify the facet and, in this case, add its options.

Immediately after implementing this hook, if you krumo() or dsm() the $items argument, you'll see the form element has no options.

[Image: Using krumo to see empty form element]

The next section of code is adapted from the contrib module "Lucene Node". [See file: luceneapi/contrib/luceneapi_node/luceneapi_node.module; function: luceneapi_node_luceneapi_facet_postrender_alter()]

<?php
/**
 * Implements hook_luceneapi_facet_postrender_alter()
 */
function MYMODULE_luceneapi_facet_postrender_alter(&$items, $realm, $module, $type = NULL) {

  if ($realm == 'form' && $module == 'luceneapi_node' && $type == 'node' && is_array($items['category'])) {
    
    // get taxonomy form data
    $taxonomy = module_invoke('taxonomy', 'form_all', 1);
    
    // get enabled facets
    $facets_enabled = luceneapi_facet_enabled_facets_get($module, $realm);
    
    // loop through enabled facets, validate, and fetch weight
    $weights = array();
    foreach ($facets_enabled as $name => $value) {
    
      // check for "category" facet
      // FORMAT: category_{VOCABID}
      if (preg_match('/^category_(\d+)$/', $name, $match)) {
      
        // load taxonomy vocabulary      
        if ($vocabulary = taxonomy_vocabulary_load($match[1])) {
        
          // ensure category and vocab id is enabled for this module and realm
          if (luceneapi_facet_enabled($match[0], $module, 'form')) {
                     
            // fetch weight
            $variable = sprintf('luceneapi_facet:%s:%s:%s:weight', $module, $realm, $name);
            $weights[$vocabulary->name] = variable_get($variable, 0);
          
          }
        
        }
      
      }
    
    // end foreach
    }
    
    // gets weighted taxonomy array
    asort($weights);
    $taxonomy_weighted = array();
    foreach ($weights as $vocab_name => $weight) {
      $taxonomy_weighted[$vocab_name] = $taxonomy[$vocab_name];
    }

    // create array of fapi data to override
    $category_data = array(
      '#prefix' => '<div class="criterion">',
      '#suffix' => '</div>',
      //'#size' => 10,
      '#options' => $taxonomy_weighted,
      '#multiple' => TRUE,
      '#default_value' => luceneapi_facet_value_get('category', array()),
      '#title' => NULL,
      '#description' => NULL,
    );

    // merge data
    $items['category'] = array_merge($items['category'], $category_data);
    
    // sets weight as the lowest weight of all taxonomy facets
    if (is_array($items['category']['#weight'])) {
      $items['category']['#weight'] = min($items['category']['#weight']);
    }    
          
  // end if
  }

}
?>


Reloading the page will now show the search form with a completed facet.

[Image: Search form with facet]

Submitting the search form block with a selected taxonomy term will now take the user to the search results page with the taxonomy facet pre-selected!

Special thanks to Chris Pliakas for all of his great Lucene API work! (I miss working with you, Chris)
Created by Eric.London on 2010-08-28
I created a Drupal site to host my photography in CCK Imagefield nodes and used Lucene to enhance my search functionality. By default, Drupal's search results are text-based, so I decided to add some code to show image thumbnails in my search results. I checked out the Drupal Lucene API hooks and decided to implement hook_luceneapi_result_alter() in my existing module.

<?php
function MYMODULE_luceneapi_result_alter(&$result, $module, $type = NULL) {
  
  // check for node results
  if ($type == 'node') {
  
    // check node type
    if ($result['node']->type == 'image') {
    
      // define an imagecache image path for image thumbnail
      $imagecache_path_thumbnail = file_directory_path() . '/imagecache/thumbnail' . str_replace(file_directory_path(),'',$result['node']->field_image[0]['filepath']);      
      
      // define an imagecache image path for image (large)
      $imagecache_path_large = file_directory_path() . '/imagecache/large' . str_replace(file_directory_path(),'',$result['node']->field_image[0]['filepath']);
    
      // define theme_image() variables
      $alt = check_plain($result['node']->title);
      $title = check_plain($result['node']->title);
      // add rel=lightbox to enable lightbox2 module
      $attributes = array(
        'rel' => 'lightbox',
      );
      // let imagecache define the size
      $getsize = FALSE;
      // generate the image HTML
      $image_html = theme('image', $imagecache_path_thumbnail, $alt, $title, $attributes, $getsize);      
    
      if ($image_html) {
                
        // define lightbox link
        $image_link = l(
          $image_html,
          $imagecache_path_large,
          array(
            'html' => true,
            'attributes' => array(
              'rel' => 'lightbox',
            )
          )
        );

        // add data to the result variable, passed by reference
        $result['image_thumbnail'] = $image_link;
        
      }
    
    }
  
  }

}
?>


The above code adds additional data to my search results variables. I then implemented a hook_preprocess_search_result() function in my theme's template.php file to pass this data to the search-result.tpl.php template file.

<?php
function MYTHEME_preprocess_search_result(&$variables) {

  // ...snip...

  // check for lucene node search results
  if ($variables['type']=='luceneapi_node') {

    // check for image
    // check for image (the key is only set for image nodes)
    if (!empty($variables['result']['image_thumbnail'])) {

      // pass additional data to theme template file
      $variables['image_thumbnail'] = $variables['result']['image_thumbnail'];

    }
    
  }

}
?>


And in my theme's search-result.tpl.php template file, I added the following PHP to show the new variable.


<div class="search-result <?php print $search_zebra; ?>">

  <?php if (!empty($image_thumbnail)): ?>
    <?php print $image_thumbnail; ?>
  <?php endif; ?>

  <!-- ...snip... -->


I also added a few lines of CSS in my theme's style.css file to tidy up the layout.


.search-results.luceneapi_node-results .search-result {
  clear: both;
}

.search-results.luceneapi_node-results .search-result img {
  float: left;
  margin: 0px 20px 20px 0px;
}


The visual results can be seen here on my photo gallery.

[Image: Visual search results]