Geospatial Apache Solr searching in Drupal 6 by upgrading Solr to 3.1

Created by Eric.London on 2011-05-18
 
Please note: the content on this page originates from ericlondon.com.
I've been working on a few Drupal projects recently that required geospatial Solr integration. After numerous failed attempts with patches, outdated modules, and Java Solr plugins, I decided to try a different approach: upgrading the Solr library from 1.4.1 to 3.1.0. I wish I had seen this earlier, but spatial search is built into Solr 3.1; see http://wiki.apache.org/solr/SpatialSearch

This article is a companion to my configuration documented here: http://ericlondon.com/creating-centos-server-installation-apache-mysql-tomcat-php-drupal-and-solr, and assumes you have a similar working environment using the 1.4.1 Solr library.

I started by downloading apache-solr-3.1.0.tgz and replacing my installed Solr library 1.4.1 with the new files:


$ tar -xzf apache-solr-3.1.0.tgz

# copy/rename solr war file into Tomcat webapps directory
$ cp ~/downloads/apache-solr-3.1.0/dist/apache-solr-3.1.0.war /var/lib/tomcat6/webapps/solr.war

# copy solr files
$ cp -r ~/downloads/apache-solr-3.1.0/example/solr/ /var/lib/tomcat6/solr/


I then re-copied the Drupal apachesolr module's configuration files into my Tomcat Solr directory:


$ cp /var/www/vhosts/example.com/sites/all/modules/contrib/apachesolr/protwords.txt /var/lib/tomcat6/solr/conf/
$ cp /var/www/vhosts/example.com/sites/all/modules/contrib/apachesolr/schema.xml /var/lib/tomcat6/solr/conf/
$ cp /var/www/vhosts/example.com/sites/all/modules/contrib/apachesolr/solrconfig.xml /var/lib/tomcat6/solr/conf/


Unfortunately, the Drupal apachesolr module and its bundled XML configuration files do not account for the new field types in 3.1.0. I modified my schema.xml file and added the following:


<fieldType name="point" class="solr.PointType" dimension="2" subFieldSuffix="_d"/>
<fieldType name="location" class="solr.LatLonType" subFieldSuffix="_coordinate"/>
<fieldtype name="geohash" class="solr.GeoHashField"/>

<field name="coordinates" type="location" indexed="true" stored="true"/>
<dynamicField name="*_coordinate"  type="tdouble" indexed="true"  stored="false"/>


Next, I added a hook_apachesolr_update_index() function to a custom module to index Location CCK data into the Solr Document.

<?php
function MYMODULE_apachesolr_update_index(&$document, $node) {

  // check for location data
  if (empty($node->field_location[0]['latitude']) || empty($node->field_location[0]['longitude'])) {
    return;
  }
    
  $document->coordinates = $node->field_location[0]['latitude'] . ',' . $node->field_location[0]['longitude'];
  
}
?>

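One caveat with the check above: empty() also rejects a coordinate that is exactly 0 (or "0"), and it does not catch values outside the valid latitude/longitude range. A minimal sketch of a stricter helper follows, assuming the Location CCK values arrive as numbers or numeric strings. The name MYMODULE_format_coordinates is hypothetical and is not part of the apachesolr or Location modules.

```php
<?php
/**
 * Hypothetical helper: build the "lat,lon" string for a LatLonType field.
 *
 * Returns NULL when either value is missing, non-numeric, or out of range,
 * so the caller can simply skip setting the field.
 */
function MYMODULE_format_coordinates($latitude, $longitude) {
  if (!is_numeric($latitude) || !is_numeric($longitude)) {
    return NULL;
  }
  $latitude = (float) $latitude;
  $longitude = (float) $longitude;
  if ($latitude < -90 || $latitude > 90 || $longitude < -180 || $longitude > 180) {
    return NULL;
  }
  // %F is not locale aware, so the decimal separator is always a period.
  return sprintf('%F,%F', $latitude, $longitude);
}
```

Inside the hook, the result could be assigned to $document->coordinates only when it is not NULL.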

I then created a bunch of nodes that had a Location CCK field, and varied their locations around New England. After re-indexing and running cron, I queried Solr directly to confirm that the location data had been added to the documents.

[Image: Solr Query Coordinates]

I then used the new spatial query syntax (documented here: http://wiki.apache.org/solr/SpatialSearch) to pass in Boston coordinates. The results were limited to 3 matches within the specified distance.

Example query syntax, with coordinates and distance, searching for "ma"

?q=ma&fq={!geofilt sfield=coordinates pt=42.346617,-71.098747 d=10}

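When the query above is sent over HTTP from code rather than typed into a browser, the fq value needs to be URL-encoded. The following is a rough sketch using PHP's http_build_query(). The function name example_geofilt_url and the base URL are assumptions for illustration; adjust the host, port, and path to your own Tomcat setup. Per the Solr wiki, the d parameter of geofilt is a distance in kilometers.

```php
<?php
/**
 * Hypothetical helper: build a Solr select URL with a geofilt filter query.
 *
 * $base_url is an assumption, e.g. 'http://localhost:8080/solr/select'.
 */
function example_geofilt_url($base_url, $keywords, $point, $distance_km) {
  $params = array(
    'q' => $keywords,
    'fq' => '{!geofilt sfield=coordinates pt=' . $point . ' d=' . $distance_km . '}',
  );
  // http_build_query() URL-encodes the braces, spaces, "=" and "," in fq.
  return $base_url . '?' . http_build_query($params, '', '&');
}
```

Calling example_geofilt_url('http://localhost:8080/solr/select', 'ma', '42.346617,-71.098747', 10) returns a URL whose fq parameter decodes back to the filter shown above.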

[Image: Solr Query GeoSpatial Search]

The next challenge was integrating with the Drupal apachesolr module. The spatial query syntax has an augmented "fq" value, but the apachesolr Drupal module sets the fq parameter automatically (via: $query->get_fq()):

<?php
# snippet from file: apachesolr.module
function apachesolr_modify_query(&$query, &$params, $caller) {
  // ...snip...

  // TODO: The query object should hold all the params.
  // Add array of fq parameters.
  if ($query && ($fq = $query->get_fq())) {
    $params['fq'] = $fq;
  }
  // ...snip...
?>


After much struggle, I changed one line in the apachesolr.module file so that fq values added through $params are merged in rather than overwritten. (I normally refuse to patch ANY contrib module. If anyone knows a way to do this without patching, please let me know!)

<?php
function apachesolr_modify_query(&$query, &$params, $caller) {
  // ...snip...

  // TODO: The query object should hold all the params.
  // Add array of fq parameters.
  if ($query && ($fq = $query->get_fq())) {
    $params['fq'] = array_merge($fq, $params['fq']);
  }
  // ...snip...
?>


Actual diff:

1264c1264
<     $params['fq'] = $fq;
---
>     $params['fq'] = array_merge($fq, $params['fq']);

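The one-line change assumes that $params['fq'] is already an array whenever the query object has filters. If no hook has set it, or a hook set it to a single string, array_merge() will not behave as intended. A more defensive sketch is shown here, assuming $query->get_fq() returns an array as the patched line implies. The helper name example_merge_fq is hypothetical.

```php
<?php
/**
 * Hypothetical helper: combine the query object's filters with any "fq"
 * values that hook implementations already placed in $params.
 */
function example_merge_fq($query_fq, $params) {
  $extra = isset($params['fq']) ? $params['fq'] : array();
  // A single filter may have been set as a plain string.
  if (!is_array($extra)) {
    $extra = array($extra);
  }
  return array_merge($query_fq, $extra);
}
```

With this helper, the patched line would read $params['fq'] = example_merge_fq($fq, $params);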

I then added a hook_apachesolr_modify_query() function to my custom module to override the Solr query params.

<?php
function MYMODULE_apachesolr_modify_query(&$query, &$params, $caller) {

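  // only alter queries issued by the main search page;
  // "continue" is invalid outside a loop, so use "return" to exit early
  if ($caller != 'apachesolr_search') {
    return;
  }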

  // NOTE: hardcoded Boston area coordinates:  
  $coordinates = '42.346617,-71.098747';  

  $distance = 10;
  
  // modify params
  $params['fq'][] = "{!geofilt sfield=coordinates pt=$coordinates d=$distance}";
  
}
?>

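The coordinates and distance above are hardcoded. If they were to come from user input instead, it would be worth validating them before interpolating them into the local-params string. The helper below is only a sketch under that assumption; the name example_geofilt_filter is hypothetical, and how the values reach the hook (form, session, URL) is left open.

```php
<?php
/**
 * Hypothetical helper: build a {!geofilt} filter string from untrusted input.
 *
 * Returns NULL when the input is unusable, so the caller can leave the
 * query unfiltered.
 */
function example_geofilt_filter($field, $latitude, $longitude, $distance_km) {
  // Only allow simple field names inside the local params.
  if (!preg_match('/^[A-Za-z_][A-Za-z0-9_]*\z/', $field)) {
    return NULL;
  }
  if (!is_numeric($latitude) || !is_numeric($longitude) || !is_numeric($distance_km)) {
    return NULL;
  }
  if ($distance_km <= 0) {
    return NULL;
  }
  // %F formats floats without locale-specific decimal separators.
  return sprintf('{!geofilt sfield=%s pt=%F,%F d=%F}', $field, $latitude, $longitude, $distance_km);
}
```

In the hook, a non-NULL result could be appended with $params['fq'][] = $filter;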

I then searched for "ma" in Drupal this time (!), and my results reflected the nodes within the Boston area.

[Image: Drupal Solr GeoSpatial Results]

Comments

 
  • Any reason why you didn't
    Created by Anonymous on 2011-05-20
    Any reason why you didn't start with http://drupal.org/project/localsolr and contribute patches to improve that module's functionality?
    • yes
      Created by Eric.London on 2011-05-20
      I have not yet found a module that implemented native geospatial searching (>= 3.1), so I decided to try it out. At this point, I do not know when 3.1 will replace 1.4.1 in the ApacheSolr Drupal module: http://drupal.org/node/1122722. In addition, my code is still a work in progress.
      • http://drupal.org/node/685794#comment-4494444
        Created by Eric.London on 2011-05-20
        http://drupal.org/node/685794#comment-4494444
        • nice job
          Created by Anonymous on 2011-06-27
          I implemented your modifications as per this page, works great!
          I found that the following line stopped apachesolr_autocomplete from returning any suggestions
          
            if ($caller != 'apachesolr_search') {
              continue;
          }
          

          how can this be adjusted to allow apachesolr autocomplete to work?
          • testing expanding
            Created by Anonymous on 2011-06-28
            I have been trying to integrate several features into this D6/Solr 3.1 setup. Any guidance would be appreciated, and I'm happy to share results.

            1) hook_apachesolr_modify_query variables $coordinates and $distance
            with my modified search form
            hook_form_search_block_form_alter(&$form, $form_state)
            essentially trying to pass the form_state data into modify_query to make it dynamic
            no luck yet, surprisingly difficult

            2) on the results page: It would also be nice to get a facet block (for changing location filter)
            and/or an entry into the current search block similar to
            (-) greater Boston area/Postal Code (friendly name of lat/long coordinates)
            (-) Proximity 100km

            3) I am also integrating a gmap view into the search results as per one of your previous posts
            http://ericlondon.com/embedding-gmap-view-search-results

            4) To set the proximity to unlimited could we use -1, basically a show all results ordered by the proximity from a user selected coordinate.

            5) I also noticed that apachesolr stopped displaying results for node types without location fields. I have a mix of articles/story nodes and directory/location nodes along with content type filters.
            
              // check for location data
              if (empty($node->field_location[0]['latitude']) || empty($node->field_location[0]['longitude'])) {
                return;
              }
            

            I could remove this check, which is halting the indexing of nodes with no location info... would that have negative ramifications? I'll give it a try.
            • 2.x.dev and SOLR3.4
              Created by Anonymous on 2011-11-08
              fyi I added a post to the issue queue for setting up with apachesolr 2.x.dev & SOLR3.4
              http://drupal.org/node/1334960
  • Modification to apachesolr.module
    Created by Anonymous on 2011-07-23
    In order to not break facets I had to do the following:

    
    
      if ($query && ($fq = $query->get_fq())) {
        $fq[] = $params['fq'];
        $params['fq'] = $fq;
      }
    
    
    • Thank you for the great post
      Created by Anonymous on 2011-08-19
      Is there any way to achieve this without patching the contributed module?
    • CALLER_finalize_query
      Created by Anonymous on 2011-08-29
      The next challenge was integrating with the Drupal apachesolr module. The spatial query syntax has an augmented "fq" value, but the apachesolr Drupal module sets the fq parameter automatically (via: $query->get_fq()):

      To do this I implemented the function (hook?) CALLER_finalize_query. Within that, I made the fq item creation. Worked for me, hopefully for you too.
  • No need to modify apachesolr.module
    Created by Anonymous on 2011-12-20
    Hi there, thanks so much for your blog post! By reading this and someone else's experience (and with a bit of digging around), I've managed to get this working perfectly. You said you'd like to hear if there was a way whereby you wouldn't have to hack any contrib modules - once you have everything set up, it's possible to simply add a filter in like so:

    $query->add_filter('_query_', '"{!geofilt sfield=coordinates pt='.$coordinates.' d='.$distance.'}"');

    For more information, see my comment at http://drupal.org/node/1334960#comment-5385216

    Thanks again!