Drupal 6: Geospatial Apache Solr searching by upgrading Solr to 3.1

I've been working on a few Drupal projects recently that required geospatial Solr integration. After numerous failed attempts with patches, outdated modules, and Java Solr plugins, I decided to try a different approach: upgrading the Solr library from 1.4.1 to 3.1.0. I wish I had noticed sooner that spatial search is built into Solr 3.1; see: http://wiki.apache.org/solr/SpatialSearch

This article is a companion to the server configuration I documented here: http://ericlondon.com/creating-centos-server-installation-apache-mysql-tomcat-php-drupal-and-solr
It assumes you have a similar working environment using the 1.4.1 Solr library.

I started by downloading apache-solr-3.1.0.tgz and replacing my installed Solr library 1.4.1 with the new files:

$ tar -xzf apache-solr-3.1.0.tgz

# copy/rename solr war file into Tomcat webapps directory
$ cp ~/downloads/apache-solr-3.1.0/dist/apache-solr-3.1.0.war /var/lib/tomcat6/webapps/solr.war

# copy solr files
$ cp -r ~/downloads/apache-solr-3.1.0/example/solr/ /var/lib/tomcat6/solr/

I then re-copied the Drupal apachesolr module's configuration files (protwords.txt, schema.xml, and solrconfig.xml) into my Tomcat Solr directory:

$ cp /var/www/vhosts/example.com/sites/all/modules/contrib/apachesolr/protwords.txt /var/lib/tomcat6/solr/conf/
$ cp /var/www/vhosts/example.com/sites/all/modules/contrib/apachesolr/schema.xml /var/lib/tomcat6/solr/conf/
$ cp /var/www/vhosts/example.com/sites/all/modules/contrib/apachesolr/solrconfig.xml /var/lib/tomcat6/solr/conf/

Unfortunately, the Drupal apachesolr module and its bundled configuration files do not account for the new field types in 3.1.0. I modified my schema.xml file and added the following (the fieldType definitions go inside the <types> element, the field definitions inside <fields>):

<fieldType name="point" class="solr.PointType" dimension="2" subFieldSuffix="_d"/>
<fieldType name="location" class="solr.LatLonType" subFieldSuffix="_coordinate"/>
<fieldType name="geohash" class="solr.GeoHashField"/>

<field name="coordinates" type="location" indexed="true" stored="true"/>
<dynamicField name="*_coordinate"  type="tdouble" indexed="true"  stored="false"/>

Next, I added a hook_apachesolr_update_index() implementation to a custom module to index the Location CCK data into the Solr document.

<?php
function MYMODULE_apachesolr_update_index(&$document, $node) {

  // check for location data
  if (empty($node->field_location[0]['latitude']) || empty($node->field_location[0]['longitude'])) {
    return;
  }

  $document->coordinates = $node->field_location[0]['latitude'] . ',' . $node->field_location[0]['longitude'];

}
?>
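
One thing to watch in the hook above: empty() treats a coordinate of exactly 0 as missing, and it lets non-numeric strings through. Here is a minimal sketch of a stricter formatter. The helper name mymodule_format_latlon() and the 6-decimal precision are my own assumptions, not part of apachesolr or the Location module:

```php
<?php
/**
 * Hypothetical helper: format a latitude/longitude pair as the "lat,lon"
 * string expected by a Solr LatLonType field, or return NULL if invalid.
 */
function mymodule_format_latlon($lat, $lon) {
  // Reject missing or non-numeric values ('' and NULL are not numeric).
  if (!is_numeric($lat) || !is_numeric($lon)) {
    return NULL;
  }
  $lat = (float) $lat;
  $lon = (float) $lon;
  // Reject values outside the valid geographic ranges.
  if ($lat < -90 || $lat > 90 || $lon < -180 || $lon > 180) {
    return NULL;
  }
  // number_format() with an explicit "." keeps the output locale-independent.
  return number_format($lat, 6, '.', '') . ',' . number_format($lon, 6, '.', '');
}
```

Inside the hook, the return value could be assigned to $document->coordinates only when it is not NULL.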

I then created several nodes with a Location CCK field and varied their locations around New England. After re-indexing and running cron, I queried Solr directly to confirm that the location data had been added to the documents.

[Image: Solr Query Coordinates]

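I then used the new spatial query syntax (documented here: http://wiki.apache.org/solr/SpatialSearch) to pass in Boston coordinates and a distance. The geofilt filter takes the spatial field (sfield), the center point (pt, as latitude,longitude), and the distance (d, in kilometers). The results were limited to the 3 matches within the specified distance.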

Example query syntax, with coordinates and distance, searching for "ma":

?q=ma&fq={!geofilt sfield=coordinates pt=42.346617,-71.098747 d=10}
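
When the query is sent over HTTP, the braces, spaces, and comma in the fq value need to be URL-encoded. Here is a small sketch that builds the full URL with PHP's http_build_query(). The host, port, and path in $base are assumptions for a local Tomcat install, not values taken from my setup:

```php
<?php
// Assumed Solr select URL for a local Tomcat install; adjust to your environment.
$base = 'http://localhost:8080/solr/select';

$params = array(
  'q' => 'ma',
  // Filter to documents within 10 km of the Boston coordinates.
  'fq' => '{!geofilt sfield=coordinates pt=42.346617,-71.098747 d=10}',
);

// http_build_query() URL-encodes each value (spaces become "+").
$url = $base . '?' . http_build_query($params, '', '&');
echo $url, "\n";
```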

[Image: Solr Query GeoSpatial Search]

The next challenge was integrating with the Drupal apachesolr module. The spatial query syntax relies on a custom "fq" value, but the apachesolr Drupal module sets the fq parameter itself (via $query->get_fq()), overwriting anything already in $params['fq']:

<?php
# snippet from file: apachesolr.module
function apachesolr_modify_query(&$query, &$params, $caller) {
  // ...snip...

  // TODO: The query object should hold all the params.
  // Add array of fq parameters.
  if ($query && ($fq = $query->get_fq())) {
    $params['fq'] = $fq;
  }
  // ...snip...
?>

After much struggle, I changed one line in the apachesolr.module file so that fq values already added to $params are preserved. I normally refuse to patch ANY contrib module, so if anyone knows a way to do this without patching, please let me know! :)

<?php
function apachesolr_modify_query(&$query, &$params, $caller) {
  // ...snip...

  // TODO: The query object should hold all the params.
  // Add array of fq parameters.
  if ($query && ($fq = $query->get_fq())) {
    $params['fq'] = array_merge($fq, $params['fq']);
  }
  // ...snip...
?>

Actual diff:

1264c1264
<     $params['fq'] = $fq;
---
>     $params['fq'] = array_merge($fq, $params['fq']);
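
One caveat with this change: array_merge() expects both arguments to be arrays, so it misbehaves if no hook has set $params['fq']. Here is a more defensive sketch of the same merge, written as a standalone helper so it can be tried outside Drupal. The name mymodule_merge_fq() is hypothetical and not part of apachesolr:

```php
<?php
/**
 * Hypothetical helper: combine the query object's filters with any fq
 * values that hooks have already placed in $params.
 */
function mymodule_merge_fq(array $query_fq, array $params) {
  $extra = array();
  if (isset($params['fq'])) {
    // Cast so a single string filter is treated as a one-item array.
    $extra = (array) $params['fq'];
  }
  return array_merge($query_fq, $extra);
}
```

The patched line would then read something like $params['fq'] = mymodule_merge_fq($fq, $params);, though that still means patching the module.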

I then added a hook_apachesolr_modify_query() implementation to my custom module to add the spatial filter to the Solr query params.

<?php
function MYMODULE_apachesolr_modify_query(&$query, &$params, $caller) {

  if ($caller != 'apachesolr_search') {
    return;
  }

  // NOTE: hardcoded Boston area coordinates:
  $coordinates = '42.346617,-71.098747';

  $distance = 10;

  // modify params
  $params['fq'][] = "{!geofilt sfield=coordinates pt=$coordinates d=$distance}";

}
?>
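
In a real site the coordinates and distance would come from user input rather than constants, so it is worth validating them before they go into the local-params string. Here is a minimal sketch. mymodule_geofilt() is a hypothetical helper of my own, and the output precision is an arbitrary choice:

```php
<?php
/**
 * Hypothetical helper (not part of apachesolr): build a {!geofilt} filter
 * string, or return NULL when any argument is unusable.
 */
function mymodule_geofilt($sfield, $lat, $lon, $km) {
  // Allow only plain field names so nothing can break out of the local params.
  if (!is_string($sfield) || !preg_match('/^[A-Za-z_][A-Za-z0-9_]*\z/', $sfield)) {
    return NULL;
  }
  // All three numbers must be numeric; is_numeric() rejects '' and NULL.
  if (!is_numeric($lat) || !is_numeric($lon) || !is_numeric($km)) {
    return NULL;
  }
  $lat = (float) $lat;
  $lon = (float) $lon;
  $km = (float) $km;
  // Latitude and longitude must be in range; distance must be positive and finite.
  if ($lat < -90 || $lat > 90 || $lon < -180 || $lon > 180 || $km <= 0 || !is_finite($km)) {
    return NULL;
  }
  // %F formats floats with a "." decimal point regardless of locale.
  return sprintf('{!geofilt sfield=%s pt=%.6F,%.6F d=%.3F}', $sfield, $lat, $lon, $km);
}
```

In the hook, the result could be appended to $params['fq'] only when it is not NULL.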

I then searched for "ma", in Drupal this time, and the results contained only the nodes within the Boston area.

[Image: Drupal Solr GeoSpatial Results]

Updates

Created by Anonymous on 2011-12-20

It's possible to simply add a filter, like so:

<?php
$query->add_filter('_query_', '"{!geofilt sfield=coordinates pt='.$coordinates.' d='.$distance.'}"');
?>