Drupal 7: Geospatial Apache Solr searching in Drupal 7 using the Search API module (Ubuntu version)
In this tutorial, I’ll share the notes and code I used to set up geospatial Apache Solr searching in Drupal 7 using the Search API module. For this tutorial I created a minimal Ubuntu server virtual machine. All the commands should be executed as a user with permission to modify files, or prefixed with “sudo”.
The first thing I do with a fresh virtual machine is check for package upgrades.
$ apt-get update
$ apt-get upgrade
I find it cumbersome to type in a virtual machine window, so I’ll install OpenSSH and ssh in from my Mac. If you plan to do so, you’ll need to find your virtual machine’s IP address using ifconfig. For this tutorial I added a local DNS entry (/etc/hosts) on my Mac to point “drupal7.vm” to my VM’s IP.
$ apt-get install openssh-server
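For the local DNS entry mentioned above, a small helper can append the line to a hosts file only when the name is not already mapped. This is a minimal sketch rather than part of my original notes: the function name add_host_entry and the IP 192.168.56.101 are hypothetical, so substitute whatever ifconfig reports for your VM.

```shell
# Append "IP<TAB>NAME" to a hosts file unless NAME is already mapped there.
# Usage: add_host_entry IP NAME [HOSTS_FILE]   (HOSTS_FILE defaults to /etc/hosts)
add_host_entry() {
  local ip="$1" name="$2" hosts="${3:-/etc/hosts}"
  # Scan non-comment lines; every field after the first (the IP) is a host name.
  if awk -v n="$name" '
        $1 ~ /^#/ { next }
        { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }' "$hosts"; then
    echo "$name is already listed in $hosts"
  else
    printf '%s\t%s\n' "$ip" "$name" >> "$hosts"
    echo "added $name -> $ip"
  fi
}
```

Writing to /etc/hosts needs root, so either run the function from a root shell or point it at a copy of the file first to see what it would add.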
Install the LAMP stack. The following packages will install Apache httpd as a dependency.
$ apt-get install php5 php5-cli php5-common php5-curl php5-gd php5-mysql php-pear mysql-server
At this point, browsing to your VM/server’s IP address will give you the standard Apache welcome message: It works! This is the default web page for this server. The web server software is running but no content has been added, yet.
Install version control.
$ apt-get install git-core
Create a mysql database for Drupal 7.
$ mysql -u youruser -p
mysql> create database drupal7;
mysql> grant all privileges on drupal7.* to 'drupal7'@'localhost' identified by 'somepassword';
mysql> exit
Install drush via Pear.
$ pear upgrade-all
$ pear channel-discover pear.drush.org
$ pear install drush/drush
Verifying drush is installed.
$ which drush
/usr/bin/drush
$ drush --version
drush version 4.5
Create an Apache vhost directory
$ mkdir -p /var/www/vhosts
Download drupal via drush
$ cd /var/www/vhosts
$ drush dl drupal
# rename folder (as necessary)
$ mv drupal-7.10 drupal7
Integrate drupal file system with git
$ cd drupal7
$ git init
$ git add .
$ git commit -am "initial commit of drupal7"
Install drupal via drush
$ drush site-install standard --db-url=mysql://drupal7:somepassword@localhost/drupal7
Add Apache2 vhost
$ cd /etc/apache2/sites-available
# create new file, called "drupal7" with contents:
<VirtualHost *:80>
ServerName drupal7.vm
DocumentRoot /var/www/vhosts/drupal7
ErrorLog /var/log/apache2/drupal7-error_log
CustomLog /var/log/apache2/drupal7-access_log combined
<Directory /var/www/vhosts/drupal7>
AllowOverride All
</Directory>
</VirtualHost>
# create symlink
$ cd ../sites-enabled
$ ln -s ../sites-available/drupal7 001-drupal7.conf
# enable apache2 mod_rewrite module
$ a2enmod rewrite
# restart apache2
$ /etc/init.d/apache2 restart
At this point browsing to your VM/server’s hostname should show a Drupal installation.
Part 2, Tomcat/Solr
Installing java jdk and tomcat6
$ apt-get install openjdk-6-jdk tomcat6 tomcat6-admin tomcat6-common tomcat6-user
Browsing to your VM/server’s hostname on port 8080 (ex: http://drupal7.vm:8080) will show the generic Tomcat welcome message:
It works !
If you’re seeing this page via a web browser, it means you’ve setup Tomcat successfully. Congratulations!
Installing Solr in Tomcat
$ mkdir ~/downloads
$ cd ~/downloads
# Download the latest stable version of Apache Solr from:
url: http://www.apache.org/dyn/closer.cgi/lucene/solr/
# example:
$ wget http://www.motorlogy.com/apache//lucene/solr/3.5.0/apache-solr-3.5.0.tgz
$ tar -xzf apache-solr-3.5.0.tgz
Copy/rename java war file into Tomcat webapps directory
$ cp ~/downloads/apache-solr-3.5.0/dist/apache-solr-3.5.0.war /var/lib/tomcat6/webapps/solr.war
Note: copying the java war file into the Tomcat webapps folder will create this directory automatically: /var/lib/tomcat6/webapps/solr
Copy solr files
$ cp -r ~/downloads/apache-solr-3.5.0/example/solr/ /var/lib/tomcat6/solr/
Create Catalina config file to link war file to solr directory
$ cd /etc/tomcat6/Catalina/localhost
# create new file: "solr.xml", with the contents:
<?xml version="1.0" encoding="UTF-8"?>
<Context docBase="/var/lib/tomcat6/webapps/solr.war" debug="0" privileged="true" allowLinking="true" crossContext="true">
<Environment name="solr/home" type="java.lang.String" value="/var/lib/tomcat6/solr" override="true" />
</Context>
Setup Tomcat admin user(s)
# edit file: /etc/tomcat6/tomcat-users.xml, ensure similar contents exist:
<?xml version='1.0' encoding='utf-8'?>
<tomcat-users>
<role rolename="admin"/>
<role rolename="manager"/>
<user username="eric" password="supersecretpassword" roles="admin,manager"/>
</tomcat-users>
Update webapps WEB-INF/web.xml file
# edit file: /var/lib/tomcat6/webapps/solr/WEB-INF/web.xml, update "solr/home" section to reflect solr path:
<env-entry>
<env-entry-name>solr/home</env-entry-name>
<env-entry-value>/var/lib/tomcat6/solr</env-entry-value>
<env-entry-type>java.lang.String</env-entry-type>
</env-entry>
Download search api drupal modules that contain solr xml configuration files, and copy into solr conf directory
$ mkdir -p /var/www/vhosts/drupal7/sites/all/modules/contrib
$ cd /var/www/vhosts/drupal7/sites/all/modules/contrib
$ drush dl search_api search_api_solr
$ cp /var/www/vhosts/drupal7/sites/all/modules/contrib/search_api_solr/solrconfig.xml /var/lib/tomcat6/solr/conf/
$ cp /var/www/vhosts/drupal7/sites/all/modules/contrib/search_api_solr/schema.xml /var/lib/tomcat6/solr/conf/
Reset tomcat permissions, and restart tomcat
$ cd /var/lib
$ chown -R tomcat6:tomcat6 tomcat6
$ /etc/init.d/tomcat6 restart
You should now be able to browse to the solr admin java page. Example: http://drupal7.vm:8080/solr/admin/
If things aren’t working well at this point, check the Tomcat logs and look for SEVERE log entries, here: /var/log/tomcat6/catalina.out
In addition, the solr java module should be listed in the Tomcat Web Application Manager Ex URL: http://drupal7.vm:8080/manager/html
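When Tomcat misbehaves, catalina.out can be long, so it helps to pull out only the SEVERE entries mentioned above. A minimal sketch follows; the function name tomcat_severe and the default of 20 entries are my own choices, not anything Tomcat provides.

```shell
# Show the most recent SEVERE entries from a Tomcat log, prefixed with line numbers.
# Usage: tomcat_severe [LOGFILE] [MAX_ENTRIES]
tomcat_severe() {
  local logfile="${1:-/var/log/tomcat6/catalina.out}"
  local max="${2:-20}"
  # grep -n adds the line number so the full stack trace can be found in the file.
  grep -n 'SEVERE' "$logfile" | tail -n "$max"
}
```

Run without arguments it reads /var/log/tomcat6/catalina.out; the printed line numbers make it easier to jump to the surrounding stack trace in an editor.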
Part 3, Drupal code
Getting the solr-php-client library from code.google.com
$ mkdir -p /var/www/vhosts/drupal7/sites/all/libraries
$ cd /var/www/vhosts/drupal7/sites/all/libraries
# URL: http://code.google.com/p/solr-php-client/downloads/list
# File: SolrPhpClient.r60.2011-05-04.tgz
$ wget http://solr-php-client.googlecode.com/files/SolrPhpClient.r60.2011-05-04.tgz
$ tar -xzf SolrPhpClient.r60.2011-05-04.tgz
Downloading and installing contrib drupal modules
$ cd /var/www/vhosts/drupal7
$ drush dl entity views ctools facetapi
$ drush en search_api search_api_views search_api_solr search_api_facetapi entity views views_ui ctools facetapi
(Optionally) I install devel, admin_menu, and disable overlay/toolbar
$ drush dl devel admin_menu
$ drush en devel admin_menu
$ drush dis overlay toolbar
Add the tomcat/solr server to Search API configuration:
- URL: /admin/config/search/search_api
- click on "+ Add Server"
- server name: Solr 3.5.0
- Service class: Solr service
- Solr host: localhost
- Solr port: 8080
- Solr path: /solr
- click Create Server
You should receive some confirmation messages: The server was successfully created. The Solr server could be reached (latency: # ms). If not, ensure tomcat/solr is reachable at the url you specified and the tomcat service is running.
At this point Solr is ready to send/receive data and index content, but there is nothing to index. For this tutorial, I decided to build off of user profiles and store latitude and longitude using the geolocation field module.
$ drush dl geolocation
$ drush en geolocation
Adding some user profile fields:
- URL: /admin/config/people/accounts/fields
- First Name | field_name_first | Text
- Last Name | field_name_last | Text
- Geolocation | field_geolocation | Geolocation | Latitude/Longitude
I then added a bunch of users with latitude/longitude coordinates (URL: /admin/people/create). Note: I used Google Geocoding API to fetch the coordinates: http://code.google.com/apis/maps/documentation/geocoding/
Adding the search api index:
- URL: /admin/config/search/search_api
- click "+ Add index"
- Index name: People
- Item type: User
- Server: Solr 3.5.0
- click: Create Index
On the next admin page, you can select which fields to index. For this tutorial, I chose: User ID, Name, Email, URL, First Name, and Last Name. Unfortunately, at the time of writing this, the geolocation lat/lng fields are not exposed to the Entity API. I assume this is a temporary problem, and there are numerous patches in the geolocation issue queue. @see (for example):
- Property Info callback for Entity API - http://drupal.org/node/1366642
- Fix for Search API not picking up the entity to index it’s fields - http://drupal.org/node/1320564
I copied code directly from the issue queue, made some modifications, and created a custom module to expose the geolocation field data to the Entity API module. In addition, I added a new property “lat_lon” that concatenates lat and lng with a comma, which is the format Solr expects for a location value. @see: http://wiki.apache.org/solr/SpatialSearch
<?php
/**
* Implements hook_field_info_alter()
*/
function MYMODULE_field_info_alter(&$info) {
if (isset($info['geolocation_latlng'])) {
$info['geolocation_latlng']['property_type'] = 'geolocation';
$info['geolocation_latlng']['property_callbacks'] = array('geolocation_property_info_callback');
}
}
function geolocation_property_info_callback(&$info, $entity_type, $field, $instance, $field_type) {
$name = $field['field_name'];
$property = &$info[$entity_type]['bundles'][$instance['bundle']]['properties'][$name];
$property['type'] = ($field['cardinality'] != 1) ? 'list<geolocation>' : 'geolocation';
$property['getter callback'] = 'entity_metadata_field_verbatim_get';
$property['setter callback'] = 'entity_metadata_field_verbatim_set';
$property['auto creation'] = 'geolocation_default_values';
$property['property info'] = geolocation_data_property_info();
unset($property['query callback']);
}
function geolocation_default_values() {
return array(
'lat' => '',
'lng' => '',
'lat_sin' => '',
'lat_cos' => '',
'lat_rad' => '',
'lat_lon' => '',
);
}
function geolocation_data_property_info($name = NULL) {
// Build an array of basic property information for the geolocation field.
$properties = array(
'lat' => array(
'label' => t('Latitude'),
),
'lng' => array(
'label' => t('Longitude'),
),
'lat_sin' => array(
'label' => t('Sine of Latitude'),
),
'lat_cos' => array(
'label' => t('Cosine of Latitude'),
),
'lat_rad' => array(
'label' => t('Radian Latitude'),
),
'lat_lon' => array(
'label' => t('Latitude,Longitude'),
),
);
// Add the default values for each of the geolocation field properties.
foreach ($properties as $key => &$value) {
switch ($key) {
case 'lat_lon':
$value += array(
'description' => !empty($name) ? t('!label of field %name', array('!label' => $value['label'], '%name' => $name)) : '',
'type' => 'text',
'getter callback' => '_MYMODULE_geolocation_entity_property_verbatim_get',
'setter callback' => '_MYMODULE_geolocation_entity_property_verbatim_set',
);
break;
default:
$value += array(
'description' => !empty($name) ? t('!label of field %name', array('!label' => $value['label'], '%name' => $name)) : '',
'type' => 'text',
'getter callback' => 'entity_property_verbatim_get',
'setter callback' => 'entity_property_verbatim_set',
);
break;
}
}
return $properties;
}
function _MYMODULE_geolocation_entity_property_verbatim_get($data, array $options, $name, $type, $info) {
if (is_array($data) && isset($data['lat']) && isset($data['lng'])) {
return $data['lat'] . ',' . $data['lng'];
}
return '';
}
function _MYMODULE_geolocation_entity_property_verbatim_set(&$data, $name, $value, $langcode, $type, $info) {
// TODO
return;
}
?>
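The PHP above has to live in a module that Drupal can discover, which in Drupal 7 means a directory containing a .info file next to the .module file. The following is a minimal sketch of that scaffolding, not the exact layout I used: the helper name make_module_skeleton, the sites/all/modules/custom location, and the dependency list are assumptions.

```shell
# Create NAME/NAME.info and a starter NAME/NAME.module under a parent directory.
# Usage: make_module_skeleton PARENT_DIR MODULE_NAME
make_module_skeleton() {
  local parent="$1" name="$2"
  local dir="$parent/$name"
  mkdir -p "$dir" || return 1
  # Drupal 7 .info files are plain "key = value" lines.
  cat > "$dir/$name.info" <<EOF
name = $name
description = Exposes geolocation field data to the Entity API.
core = 7.x
dependencies[] = entity
dependencies[] = geolocation
EOF
  # Start the .module file with the PHP opening tag, unless one already exists.
  [ -e "$dir/$name.module" ] || printf '<?php\n' > "$dir/$name.module"
  echo "$dir"
}
```

For example, `make_module_skeleton /var/www/vhosts/drupal7/sites/all/modules/custom MYMODULE` prints the new directory; paste the functions into MYMODULE.module below the opening tag and enable the module with drush en.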
I added this code to a custom module, renamed the functions (as necessary), and enabled it. Update the solr index to add the new fields to the index:
- URL: /admin/config/search/search_api/index/people/fields
- Expand "Add Related Fields"
- Choose Geolocation, click Add fields
The above makes the following fields available to the index:
- Geolocation » Latitude
- Geolocation » Longitude
- Geolocation » Sine of Latitude
- Geolocation » Cosine of Latitude
- Geolocation » Radian Latitude
- Geolocation » Latitude,Longitude
Enable “Geolocation » Latitude,Longitude” and save changes.
Index the content, URL: /admin/config/search/search_api/index/people/status. Click: Index now. Note: if you had already indexed the content, you’ll probably need to clear it first. In my environment, I got the following confirmation message:
Successfully indexed 7 items.
I find it very helpful to check the XML response from Solr directly after making changes to the index/schema. The following URL structure will query solr for all results and return all fields: Ex URL: http://drupal7.vm:8080/solr/select/?q=&fl=*
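The same query can be issued from the command line, which is handy when already ssh'd into the VM. Below is a rough sketch of a wrapper around curl; the function name solr_select is my own invention, and it assumes curl is installed. Passing each parameter through --data-urlencode avoids hand-escaping characters such as spaces and braces.

```shell
# Query a Solr select handler, URL-encoding every "name=value" parameter.
# Usage: solr_select BASE_URL PARAM...
solr_select() {
  local base="$1" param
  shift
  # Rewrite the argument list so each PARAM becomes "--data-urlencode PARAM".
  for param in "$@"; do
    shift
    set -- "$@" --data-urlencode "$param"
  done
  # -G sends the encoded data as a GET query string instead of a POST body.
  curl -sS -G "$base/select/" "$@"
}
```

For the URL above this would be: solr_select http://drupal7.vm:8080/solr 'q=' 'fl=*'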
A sample XML document response.
<doc>
<str name="f_ss_search_api_language"/>
<str name="f_ss_url">http://drupal7.vm/user/3</str>
<str name="id">people-3</str>
<str name="index_id">people</str>
<long name="is_uid">3</long>
<str name="item_id">3</str>
<arr name="spell">
<str>nashua</str>
<str>nashua@example.com</str>
<str>nashua</str>
<str>nashua</str>
<str>42.933692,-72.278141</str>
</arr>
<str name="ss_search_api_id">3</str>
<str name="ss_search_api_language"/>
<str name="ss_url">http://drupal7.vm/user/3</str>
<arr name="t_field_geolocation:lat_lon">
<str>42.933692,-72.278141</str>
</arr>
<arr name="t_field_name_first">
<str>nashua</str>
</arr>
<arr name="t_field_name_last">
<str>nashua</str>
</arr>
<arr name="t_mail">
<str>nashua@example.com</str>
</arr>
<arr name="t_name">
<str>nashua</str>
</arr>
</doc>
Take note of the field name in the following XML; it is used in the next file edit.
<arr name="t_field_geolocation:lat_lon">
<str>42.933692,-72.278141</str>
</arr>
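Since the exact field name matters for the schema edit, it can help to list every name Solr returns rather than reading the XML by eye. This is a small sketch under the assumption that the response uses the name="..." attribute style shown above; the function name solr_field_names is hypothetical.

```shell
# Read a Solr XML response on stdin and print each distinct name="..." value.
solr_field_names() {
  grep -o 'name="[^"]*"' \
    | sed -e 's/^name="//' -e 's/"$//' \
    | LC_ALL=C sort -u
}
```

Pipe a saved response or the output of curl into it. Note that it also lists the names of wrapper elements such as responseHeader, so look for the entries that start with the Search API prefixes (t_, ss_, is_ and so on).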
Update the solr schema.xml configuration and add the geospatial fieldType and field data.
# Edit file: /var/lib/tomcat6/solr/conf/schema.xml
# Just prior to the closing "</types>" tag, I inserted: (around line 287)
<fieldType name="point" class="solr.PointType" dimension="2" subFieldSuffix="_d"/>
<fieldType name="location" class="solr.LatLonType" subFieldSuffix="_coordinate"/>
<fieldtype name="geohash" class="solr.GeoHashField"/>
# And, just after the opening "<fields>" tag, I inserted:
<field name="t_field_geolocation:lat_lon" type="location" indexed="true" stored="true"/>
<dynamicField name="*_coordinate" type="tdouble" indexed="true" stored="false"/>
Restart Tomcat
$ /etc/init.d/tomcat6 restart
Since the schema and solr data types have been updated, the content will have to be re-indexed:
- URL: /admin/config/search/search_api/index/people/status
- click: Clear index
- click: Index now
Returning to the solr query above will now show updated xml: (note: no longer an array)
<str name="t_field_geolocation:lat_lon">42.933692,-72.278141</str>
Verify the native solr geospatial searching is working using the following query syntax:
URL: http://drupal7.vm:8080/solr/select/?q=&fl=*&fq={!geofilt sfield=t_field_geolocation:lat_lon pt=42.933692,-72.278141 d=100}
By putting a distance parameter of 100 (kilometers) and Nashua NH coordinates, I get 2 results: Nashua and Portsmouth, awesome.
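The geofilt filter is easy to mistype, and pasting the URL above straight into curl does not work well because curl treats braces as its URL globbing syntax and the spaces are not encoded. Here is a minimal sketch that builds the filter string from its four parts; the helper name geofilt_fq is an assumption of mine, while the {!geofilt ...} syntax itself is the one used in the query above.

```shell
# Print a Solr geofilt filter query for a field, a centre point, and a distance in km.
# Usage: geofilt_fq FIELD LAT LNG DISTANCE_KM
geofilt_fq() {
  printf '{!geofilt sfield=%s pt=%s,%s d=%s}\n' "$1" "$2" "$3" "$4"
}
```

The result can then be handed to curl with --data-urlencode so the braces and spaces are encoded for you:
$ curl -sS -G 'http://drupal7.vm:8080/solr/select/' --data-urlencode 'q=' --data-urlencode 'fl=*' --data-urlencode "fq=$(geofilt_fq t_field_geolocation:lat_lon 42.933692 -72.278141 100)"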
Create a solr integrated view:
- URL: /admin/structure/views/add
- View name: People
- Show: People
- Create a Page [checked]
- Path: people
- Continue & edit
Note: at this point, you have free rein over view configuration. For this tutorial, I set the format to Grid, and added some fields:
- Geolocation: Latitude,Longitude (indexed)
- Indexed User: Email
- Indexed User: First Name
- Indexed User: Last Name
- Indexed User: Name
Save the view when edits are complete.
Browsing to the view will show something like this: Ex URL: http://drupal7.vm/people
The next chunk of custom code modifies the solr query executed and adds geospatial filtering. @see: hook_search_api_solr_query_alter(array &$call_args, SearchApiQueryInterface $query)
<?php
function MYMODULE_search_api_solr_query_alter(array &$call_args, SearchApiQueryInterface $query) {
$lat = 42.933692;
$lng = -72.278141;
$distance = 100;
$call_args['params']['fq'][] = "{!geofilt sfield=t_field_geolocation:lat_lon pt={$lat},{$lng} d={$distance}}";
}
?>
The above code will limit the view’s results using the hardcoded coordinates.
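One way to sanity-check which users should survive the filter is to compute the great-circle distance between the hardcoded point and each user's stored coordinates. The sketch below uses the haversine formula in awk; the function name haversine_km and the 6371 km Earth radius are my assumptions, and Solr's own calculation may differ slightly, so treat borderline results with care.

```shell
# Great-circle distance in kilometres between two lat/lng points (haversine formula).
# Usage: haversine_km LAT1 LNG1 LAT2 LNG2
haversine_km() {
  LC_ALL=C awk -v lat1="$1" -v lng1="$2" -v lat2="$3" -v lng2="$4" 'BEGIN {
    pi = atan2(0, -1)
    rad = pi / 180                # degrees to radians
    r = 6371                      # mean Earth radius in km (an approximation)
    dlat = (lat2 - lat1) * rad
    dlng = (lng2 - lng1) * rad
    a = sin(dlat / 2) ^ 2 + cos(lat1 * rad) * cos(lat2 * rad) * sin(dlng / 2) ^ 2
    c = 2 * atan2(sqrt(a), sqrt(1 - a))
    printf "%.1f\n", r * c
  }'
}
```

Call it with the hardcoded point first and a user's latitude and longitude second; anything that comes out under the $distance value should appear in the view.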
Clearly it works, but there are loose ends to tie up:
- automatically fetch a user’s coordinates to store in the geolocation field
- add a search form to the people view page to allow the user to search for a location (instead of hard coded coordinates, blah)
- translate the user’s location search input to coordinates using an API
Hopefully I can find more time to elaborate on this tutorial in the near future! Cheers.
User comments:
Created by Martin on 2012-08-10:
Great tutorial that helped me get on track with SOLR. About the hook for proximity filtering, you can instead set the option “spatial” on your search query like this:
<?php
$query->setOption("spatial", array(
'lat' => "61.8",
'lng' => "12.916666999999961",
'radius' => "100",
'field' => "field_geofield:latlon",
'radius_measure' => "km",
));
?>