Drupal 6: Geospatial Apache Solr searching by upgrading Solr to 3.1
I’ve been working on a few Drupal projects recently that required geospatial Solr integration. After numerous failed attempts with patches, outdated modules, and Java Solr plugins, I decided to try a different approach: upgrading the Solr library from 1.4.1 to 3.1.0. I wish I had seen this earlier: spatial search is built into Solr 3.1 core. See: http://wiki.apache.org/solr/SpatialSearch
This article is a companion to my configuration documented here: http://ericlondon.com/creating-centos-server-installation-apache-mysql-tomcat-php-drupal-and-solr, and assumes you have a similar working environment using the 1.4.1 Solr library.
I started by downloading apache-solr-3.1.0.tgz and replacing my installed Solr library 1.4.1 with the new files:
$ tar -xzf apache-solr-3.1.0.tgz
# copy/rename solr war file into Tomcat webapps directory
$ cp ~/downloads/apache-solr-3.1.0/dist/apache-solr-3.1.0.war /var/lib/tomcat6/webapps/solr.war
# copy solr files
$ cp -r ~/downloads/apache-solr-3.1.0/example/solr/ /var/lib/tomcat6/solr/
I then re-copied the Drupal apachesolr module's configuration files (protwords.txt, schema.xml, and solrconfig.xml) into my Tomcat Solr directory:
$ cp /var/www/vhosts/example.com/sites/all/modules/contrib/apachesolr/protwords.txt /var/lib/tomcat6/solr/conf/
$ cp /var/www/vhosts/example.com/sites/all/modules/contrib/apachesolr/schema.xml /var/lib/tomcat6/solr/conf/
$ cp /var/www/vhosts/example.com/sites/all/modules/contrib/apachesolr/solrconfig.xml /var/lib/tomcat6/solr/conf/
Unfortunately, the Drupal apachesolr module and its bundled XML configuration files do not account for the new field types in Solr 3.1.0. I modified my schema.xml file, adding the following field types to the types section and the field definitions to the fields section:
<fieldType name="point" class="solr.PointType" dimension="2" subFieldSuffix="_d"/>
<fieldType name="location" class="solr.LatLonType" subFieldSuffix="_coordinate"/>
<fieldType name="geohash" class="solr.GeoHashField"/>
<field name="coordinates" type="location" indexed="true" stored="true"/>
<dynamicField name="*_coordinate" type="tdouble" indexed="true" stored="false"/>
Next, I implemented hook_apachesolr_update_index() in a custom module to index Location CCK data into the Solr document.
<?php
/**
 * Implements hook_apachesolr_update_index().
 */
function MYMODULE_apachesolr_update_index(&$document, $node) {
  // Skip nodes that have no location data.
  if (empty($node->field_location[0]['latitude']) || empty($node->field_location[0]['longitude'])) {
    return;
  }
  // The "location" field type expects a "latitude,longitude" string.
  $document->coordinates = $node->field_location[0]['latitude'] . ',' . $node->field_location[0]['longitude'];
}
?>
I then created a bunch of nodes with a Location CCK field and varied their locations around New England. After marking the content for re-indexing and running cron, I queried Solr directly to confirm that the location data had been added to the documents.
I then used the new spatial query syntax (documented here: http://wiki.apache.org/solr/SpatialSearch) to pass in Boston coordinates. The results were limited to 3 matches within the specified distance.
Example query syntax, with coordinates (pt, as latitude,longitude) and distance (d, in kilometers), searching for “ma”:
?q=ma&fq={!geofilt sfield=coordinates pt=42.346617,-71.098747 d=10}
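The query above is shown unencoded for readability. When it is sent over HTTP, the braces, spaces, and equals signs in the fq value need to be URL-encoded. Here is a minimal sketch of building such a URL in PHP. The function name and the base URL (http://localhost:8080/solr/select) are assumptions for illustration only; substitute the address of your own Solr instance.

```php
<?php
/**
 * Builds a Solr select URL with a geofilt filter query.
 *
 * Hypothetical helper for illustration; not part of Solr or the apachesolr
 * module. The base URL passed in is an assumption about your environment.
 */
function example_build_geofilt_url($base_url, $keywords, $lat, $lon, $distance_km) {
  // Same local-params syntax as the raw query: sfield, pt, and d.
  $fq = sprintf('{!geofilt sfield=coordinates pt=%s,%s d=%s}', $lat, $lon, $distance_km);

  // http_build_query() URL-encodes each value; the separator is passed
  // explicitly so a custom arg_separator.output setting cannot change it.
  $query_string = http_build_query(array('q' => $keywords, 'fq' => $fq), '', '&');

  return $base_url . '?' . $query_string;
}

// Usage: the Boston coordinates and 10 km radius from the example query.
echo example_build_geofilt_url('http://localhost:8080/solr/select', 'ma', '42.346617', '-71.098747', 10), "\n";
```

The resulting URL can be pasted into a browser or fetched with curl to check the results before involving Drupal at all.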
The next challenge was integrating with the Drupal apachesolr module. The spatial query syntax is passed as an extra “fq” (filter query) value, but the apachesolr module sets the fq parameter itself (via $query->get_fq()), which overwrites anything a hook has already placed in $params['fq']:
<?php
# snippet from file: apachesolr.module
function apachesolr_modify_query(&$query, &$params, $caller) {
  // ...snip...
  // TODO: The query object should hold all the params.
  // Add array of fq parameters.
  if ($query && ($fq = $query->get_fq())) {
    $params['fq'] = $fq;
  }
  // ...snip...
?>
After much struggle, I changed one line in the apachesolr.module file to allow the fq value to be modified. (I normally refuse to patch ANY contrib module. If anyone knows a way to do this without patching, please let me know! :)
<?php
function apachesolr_modify_query(&$query, &$params, $caller) {
  // ...snip...
  // TODO: The query object should hold all the params.
  // Add array of fq parameters.
  if ($query && ($fq = $query->get_fq())) {
    $params['fq'] = array_merge($fq, $params['fq']);
  }
  // ...snip...
?>
Actual diff:
1264c1264
< $params['fq'] = $fq;
---
> $params['fq'] = array_merge($fq, $params['fq']);
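One caveat with this one-line change: array_merge() expects every argument to be an array. If no module has set $params['fq'], the second argument is not an array, which as far as I know produces a warning and a NULL result on PHP 5 and a TypeError on PHP 8. Below is a sketch of a more defensive merge. The helper name is hypothetical and it is not part of the apachesolr module; it only illustrates the guard you might want around the patched line.

```php
<?php
/**
 * Merges the query object's filters with any fq values added by hooks.
 *
 * Hypothetical helper for illustration; not part of the apachesolr module.
 *
 * @param array $query_fq
 *   The filters returned by $query->get_fq().
 * @param array $params
 *   The Solr params array, which may or may not contain an 'fq' key.
 */
function example_merge_fq($query_fq, $params) {
  // Treat a missing fq as an empty list, and wrap a single string in an array.
  $hook_fq = isset($params['fq']) ? (array) $params['fq'] : array();

  return array_merge((array) $query_fq, $hook_fq);
}

// Usage: a filter from the query object plus one added by a hook.
$params = array('fq' => array('{!geofilt sfield=coordinates pt=42.346617,-71.098747 d=10}'));
print_r(example_merge_fq(array('type:page'), $params));
```

In the patched function, the equivalent line would be $params['fq'] = example_merge_fq($fq, $params); with the same caveat that this is still a patch to a contrib module.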
I then added a hook_apachesolr_modify_query() implementation to my custom module to add the spatial filter to the Solr query params.
<?php
/**
 * Implements hook_apachesolr_modify_query().
 */
function MYMODULE_apachesolr_modify_query(&$query, &$params, $caller) {
  // Only alter queries issued by the main search page.
  if ($caller != 'apachesolr_search') {
    return;
  }
  // NOTE: hardcoded Boston area coordinates:
  $coordinates = '42.346617,-71.098747';
  $distance = 10;
  // modify params
  $params['fq'][] = "{!geofilt sfield=coordinates pt=$coordinates d=$distance}";
}
?>
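The coordinates above are hardcoded. If they came from user input instead, they would need to be validated before being placed into the filter string, since the value is interpolated directly into the Solr query. The following is a minimal sketch of that idea. The function name, the 'lat', 'lon', and 'd' keys, and the 100 km cap are all hypothetical choices for illustration, not something the apachesolr module provides.

```php
<?php
/**
 * Builds a geofilt filter string from request values.
 *
 * Hypothetical helper for illustration. The 'lat', 'lon', and 'd' keys and
 * the default maximum radius are assumptions, not part of any module.
 *
 * @return string|null
 *   The filter string, or NULL when the input is missing or invalid.
 */
function example_geofilt_from_request($request, $field = 'coordinates', $max_km = 100) {
  // Every value must be present and numeric.
  foreach (array('lat', 'lon', 'd') as $key) {
    if (!isset($request[$key]) || !is_numeric($request[$key])) {
      return NULL;
    }
  }

  $lat = (float) $request['lat'];
  $lon = (float) $request['lon'];

  // Reject coordinates outside the valid latitude/longitude ranges.
  if (abs($lat) > 90 || abs($lon) > 180) {
    return NULL;
  }

  // Clamp the radius (kilometers) between 0 and the configured maximum.
  $distance = min(max((float) $request['d'], 0), $max_km);

  // %F formats floats without locale-specific decimal separators.
  return sprintf('{!geofilt sfield=%s pt=%F,%F d=%F}', $field, $lat, $lon, $distance);
}

// Usage: values as they might arrive in $_GET.
echo example_geofilt_from_request(array('lat' => '42.346617', 'lon' => '-71.098747', 'd' => '10')), "\n";
```

Inside the hook, a NULL return would simply mean that no spatial filter is added to $params['fq'].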
I then searched for “ma”, in Drupal this time, and the results were limited to the nodes within the Boston area.
Updates
Created by Anonymous on 2011-12-20
It’s possible to simply add a filter, like so:
<?php
$query->add_filter('_query_', '"{!geofilt sfield=coordinates pt='.$coordinates.' d='.$distance.'}"');
?>