background image
HomeRecent PostsDrupalSearchTagsRSSContactAboutAccount
Eric.London's picture

For this article, I'll share some old school procedural PHP scripts I used to scan a directory for duplicate images and display the results for comparison. A while back, I had a hardware failure and had to write some rsync commands to manually pull my iPhoto images off of a dying Time Machine external harddrive. The basic gist of these script is simple: find all the images, create a unique MD5 hash of the image, collect some other details, write the records to a MySQL database, execute some SQL to find MD5 duplicates, and show the results side by side for comparison. Since I was executing this code on my iMac, I used MAMP to provide the Apache and MySQL services.

The first script, which will be included all the rest just sets up a MySQL database connection.

Script: db.php

<?php
// define mysql credentials
$db_user = 'picture_data';
$db_pass = 'picture_data';
$db_database = 'picture_data';
$db_table = 'picture_data';
$db_host = 'localhost';

// connect to mysql database
$db = mysql_connect($db_host, $db_user, $db_pass);

// check for mysql connection
if (!$db) {
  die(
'Could not connect to database.');
}
?>

I then created the script to find all the images, create the md5 hash, and store the data in MySQL. I put this script outside my Apache vhost docroot and only had to execute it once.

Script: scan.php

<?php
//////////////////////////////////////////////////
// DATABASE SETUP

require_once('db.php');

setup_database();

//////////////////////////////////////////////////
// PROCESSING IMAGES

// specify path to images
$images_path = '/Users/Eric/Pictures/iPhoto Library/Originals';

// ensure directory exists
if (!is_dir($images_path)) {
  die(
'Directory does not exist.');
}

// change directory
chdir($images_path);

// get a list of files
$files = `find . -type f | sed 's/^\.\///'`;

// explode files list on newline
$files = explode("\n", trim($files));

// define a list of file extensions to process
$file_extensions = array(
 
'jpg',
 
'jpeg',
 
'png',
 
'bmp',
 
'gif',
 
'tiff',
);

// loop through files
foreach ($files as $file_path) {

 
// get path info
 
$path_info = pathinfo($file_path);
 
$file_name = $path_info['basename'];
 
$file_extension = strtolower($path_info['extension']);
 
 
// check file extension
 
if (!in_array($file_extension, $file_extensions)) {
    continue;
  }

 
// get md5 hash of file
 
$file_md5 = md5_file($file_path);

 
// get file modified time
 
$file_modified = date('Y-m-d H:i:s', filemtime($file_path));

 
// create sql to insert record
 
$sql = sprintf(
   
"insert into `%s` (file_path, file_name, file_extension, file_md5, file_modified) values ('%s','%s','%s','%s','%s')",
   
mysql_real_escape_string($db_table),
   
mysql_real_escape_string($images_path . '/' . $file_path),
   
mysql_real_escape_string($file_name),
   
mysql_real_escape_string($file_extension),
   
mysql_real_escape_string($file_md5),
   
mysql_real_escape_string($file_modified)
  );

 
// execute sql
 
$result = mysql_query($sql, $db);

}

//////////////////////////////////////////////////
// FUNCTIONS

function setup_database() {

  global
$db;
  global
$db_database;
  global
$db_table;

 
// create database if it is does not exist
 
$sql = sprintf(
   
"create database if not exists `%s`",
   
mysql_real_escape_string($db_database)
  );
 
$result = mysql_query($sql, $db);
 
 
// check for error
 
if (!$result) {
    die(
mysql_error());
  }
 
 
// select database
 
$result = mysql_select_db($db_database, $db);
 
 
// check for error
 
if (!$result) {
    die(
mysql_error());
  }
 
 
// create table if it does not exist
 
$sql = sprintf("
    CREATE TABLE IF NOT EXISTS `%s` (
      `fid` int(11) NOT NULL AUTO_INCREMENT,
      `file_path` varchar(255) NOT NULL,
      `file_name` varchar(255) NOT NULL,
      `file_extension` varchar(10) NOT NULL,
      `file_md5` varchar(32) NOT NULL,
      `file_modified` datetime NOT NULL,
      PRIMARY KEY (`fid`),
      KEY `idx_file_md5` (`file_md5`)
    ) ENGINE=MyISAM DEFAULT CHARSET=latin1"
,
   
mysql_real_escape_string($db_table)
  );
 
$result = mysql_query($sql, $db);
 
 
// check for error
 
if (!$result) {
    die(
mysql_error());
  }
 
 
// drop existing records from table
 
$sql = sprintf(
   
"truncate table `%s`",
   
mysql_real_escape_string($db_table)
  );
 
$result = mysql_query($sql, $db);
 
 
// check for error
 
if (!$result) {
    die(
mysql_error());
  }

}
?>

I then ran the script on the command line. It took a while to go through all 25K+ images in my directory.

$ php scan.php

The next script I wrote will aid in the display of the images. I wrote this script because the absolute path of my images was outside my Apache vhost docroot. It checks for 2 $_GET variables: the md5 hash and a integer representing which duplicate image to show. The images is read and displayed, so this script can be inserted into the "scr" attribute of an img tag.

Script: view-image.php

<?php
//////////////////////////////////////////////////
// DATABASE

require_once('db.php');

// select database
$result = mysql_select_db($db_database, $db);

// check for error
if (!$result) {
  die(
mysql_error());
}

//////////////////////////////////////////////////
// PROCESS REQUEST

$md5 = $_GET['md5'];
$index = intval($_GET['index']);

// fetch images with md5 index
$sql = sprintf("
  select *
  from `%s`
  where file_md5 = '%s'
  order by fid asc
  "
,
 
mysql_real_escape_string($db_table),
 
mysql_real_escape_string($md5)
);

$result = mysql_query($sql, $db);

// check for error
if (!$result) {
  die(
mysql_error());
}

// fetch results
$rows = array();
while (
$row = mysql_fetch_object($result)) {
 
$rows[] = $row;
}

// get image data
$file_path = $rows[$index]->file_path;
$file_extension = $rows[$index]->file_extension;

header("Content-type: image/$file_extension");
readfile($file_path);
?>

The last script ties everything together. It determines which duplicates exist and allows you to view them. For my environment, I decided to store the list of MD5 duplicates in the $_SESSION, to prevent repeat SQL.

Script: view.php

<?php
//////////////////////////////////////////////////
// DATABASE SETUP

require_once('db.php');

// select database
$result = mysql_select_db($db_database, $db);

// check for error
if (!$result) {
  die(
mysql_error());
}

//////////////////////////////////////////////////
// FETCHING MD5S

// start session
session_start();

// check for session data
if (!is_array($_SESSION['md5s']) || empty($_SESSION['md5s'])) {
 
fetch_md5s();
}

// determine which md5 to show
$md5_index = intval($_GET['md5_index']);

// fetch images with md5 index
$sql = sprintf("
  select *
  from `%s`
  where file_md5 = '%s'
  order by fid asc
  "
,
 
mysql_real_escape_string($db_table),
 
mysql_real_escape_string($_SESSION['md5s'][$md5_index])
);

$result = mysql_query($sql, $db);

// check for error
if (!$result) {
  die(
mysql_error());
}

// fetch results
$rows = array();
while (
$row = mysql_fetch_object($result)) {
 
$rows[] = $row;
}

// create image output in a table. note the image scr is calling the view-image.php script with $_GET arguments.
$output = "";
$output .= "<table><tr>";
foreach (
$rows as $index => $data) {
 
$output .= "<td style='width: " . (100/count($rows)) . "%'>";
 
$output .= "<img style='width: 100%' src='/view-image.php?md5=" . $data->file_md5 . "&index=" . $index . "' />";
 
$output .= $data->file_name . "<br/>";
 
$output .= $data->file_path . "<br/>";
 
$output .= "</td>";
}
$output .= "</tr></table>";

$output .= "<a href='/view.php?md5_index=" . ($md5_index+1) . "'>Next >></a>";

print
$output;

//////////////////////////////////////////////////
// FUNCTIONS

function fetch_md5s() {

  global
$db;
  global
$db_table;

 
// get a list of md5 hashes with dupes
 
$sql = sprintf("
    select file_md5
    from `%s`
    group by file_md5
    having count(*) > 1
    "
,
   
mysql_real_escape_string($db_table)
  );
 
 
$result = mysql_query($sql, $db);
 
 
// check for error
 
if (!$result) {
    die(
mysql_error());
  }
 
 
// fetch results
 
$md5s = array();
  while (
$row = mysql_fetch_object($result)) {
   
$md5s[] = $row->file_md5;
  }
 
 
// store md5s in session
 
$_SESSION['md5s'] = $md5s;

}
?>

Now, I went to my browser to execute the view.php script and view the results.

Picture Duplicates

In this article, I'll show the commands I have been using to set up a fresh Centos server, configured for Apache, MySQL, PHP, Tomcat, Drupal, and Apache Solr. For my article, I used Parallels to create a virtual machine from the Centos 5.6 64bit ISOs I downloaded. To simply this article, all commands are being executed as root, firewall configurations and performance tweaks are not accounted for.

Once the distribution is installed, the first thing I do is upgrade all packages.

$ yum update

Install PHP, Apache, and MySQL

$ yum install php53 php53-gd php53-mbstring php53-mysql php53-xml mysql-server httpd

Set runlevels for Apache and MySQL

$ chkconfig --level 2345 httpd on
$ chkconfig --level 2345 mysqld on

Install subversion. I chose to use subversion for this article because the Drupal 6.x installation works well with svn:externals to fetch the SolrPhpClient library. All subversion commands are connecting to a local subversion repository. If you are using an external server (like Beanstalk), you will have to transpose all commands from using "file://" to "https://".

$ yum install subversion

Add a new local subversion repository (OPTIONAL).

$ mkdir /var/subversion
$ svnadmin create /var/subversion/example.com
$ svn mkdir file:///var/subversion/example.com/trunk -m "added trunk"
$ svn mkdir file:///var/subversion/example.com/branches -m "added branches"
$ svn mkdir file:///var/subversion/example.com/tags -m "added tags"

Download/setup drush

$ cd /var/www
$ wget http://ftp.drupal.org/files/projects/drush-7.x-4.4.tar.gz
$ tar -xzf drush-7.x-4.4.tar.gz
$ ln -s /var/www/drush/drush /usr/local/bin/drush

Create a vhost location on the server for the Drupal installation.

$ mkdir /var/www/vhosts
$ cd /var/www/vhosts
$ drush dl drupal
$ mv drupal-7.0/ example.com

Integrate the Drupal files with subversion

$ cd /var/www/vhosts/example.com
$ svn co file:///var/subversion/example.com/trunk .
$ svn add * .htaccess
$ svn commit -m "downloaded drupal"

Download the Drupal apachesolr module

# make a folder for contrib modules
$ mkdir /var/www/vhosts/example.com/sites/all/modules/contrib
$ cd /var/www/vhosts/example.com/sites/all/modules/contrib

# note: in the below command, you may be prompted to choose which version of the Solr module to install. I choose option 2 for the Supported version
$ drush dl apachesolr

# commit to subversion
$ cd /var/www/vhosts/example.com/sites/all/modules
$ svn add contrib
$ svn commit -m "added contrib folder and apachesolr module"

Setup MySQL

# start mysql
$ /etc/init.d/mysqld start

# set root password
$ /usr/bin/mysqladmin -u root password 'new-password'

# create new database, user, and set permissions
$ mysql --execute="create database db_example"
$ mysql --execute="grant all privileges on db_example.* to 'example-user'@'localhost' identified by 'some_password'"

Setup Apache vhost

$ cd /etc/httpd/conf.d

# create a new file "example.com.conf", with the contents:

NameVirtualHost *:80

<Directory /var/www/vhosts>
  AllowOverride All
</Directory>

<VirtualHost *:80>
  ServerName example.com
  DocumentRoot /var/www/vhosts/example.com
  ErrorLog logs/example.com-error_log
  CustomLog logs/example.com-access_log common
</VirtualHost>

Reset Apache file permissions. NOTE: you will need a more solid/secure configuration for this!

$ cd /var/www
$ chown -R apache.apache drush*
$ chown -R apache.apache vhosts

Start Apache

$ /etc/init.d/httpd start

Install Drupal via drush

$ cd /var/www/vhosts/example.com

# note: you can set your user 1 username, password, email, etc in the following command if desired. type "drush help si" for more install options
$ drush site-install standard --sites-subdir=example.com --db-url=mysqli://example-user:some_password@localhost/db_example

At this point, you should be able to browse to your site and it will be up and running.
Drupal Installed

Now, we move onto Tomcat and Solr!

Installing Tomcat and Java. The default Centos yum repositories provide Tomcat5. I prefer Tomcat6, so there are some extras steps below and a dependency issue I had to resolve.

# added repo file to get tomcat6:
$ cd /etc/yum.repos.d/
$ wget http://www.jpackage.org/jpackage50.repo

# install Java JDK:
$ yum install java-1.6.0-openjdk

# install tomcat6:
$ yum install tomcat6 tomcat6-admin-webapps tomcat6-webapps

# dang, dependency issue... (!)

java-1.4.2-gcj-compat-1.4.2.0-40jpp.115.x86_64 from base has depsolving problems
  --> Missing Dependency: /usr/bin/rebuild-security-providers is needed by package java-1.4.2-gcj-compat-1.4.2.0-40jpp.115.x86_64 (base)
Error: Missing Dependency: /usr/bin/rebuild-security-providers is needed by package java-1.4.2-gcj-compat-1.4.2.0-40jpp.115.x86_64 (base)
You could try using --skip-broken to work around the problem
You could try running: package-cleanup --problems
                        package-cleanup --dupes
                        rpm -Va --nofiles --nodigest
The program package-cleanup is found in the yum-utils package.

# Fixing dependency issue (OPTIONAL):
$ mkdir ~/downloads
$ cd ~/downloads
$ wget http://plone.lucidsolutions.co.nz/linux/centos/images/jpackage-utils-com...
$ rpm -Uvh jpackage-utils-compat-el5-0.0.1-1.noarch.rpm
$ yum install tomcat6 tomcat6-admin-webapps tomcat6-webapps

# setting tomcat runlevels
$ chkconfig --level 2345 tomcat6 on

# starting tomcat
$ /etc/init.d/tomcat6 start

At this point, you should be able to access Tomcat in your browser (http://example.com:8080)
Tomcat homepage

Downloading Solr Java library.

$ cd ~/downloads
# note: you may need to choose a different mirror to download
$ wget http://www.fightrice.com/mirrors/apache//lucene/solr/1.4.1/apache-solr-1...
$ tar -xzf apache-solr-1.4.1.tgz

# copy/rename solr war file into Tomcat webapps directory
$ cp ~/downloads/apache-solr-1.4.1/dist/apache-solr-1.4.1.war /var/lib/tomcat6/webapps/solr.war

# copy solr files
$ cp -r ~/downloads/apache-solr-1.4.1/example/solr/ /var/lib/tomcat6/solr/

Copying the java war file into the Tomcat webapps folder will create this directory automatically:

/var/lib/tomcat6/webapps/solr

Create Catalina config file to link war file to solr directory:

$ cd /etc/tomcat6/Catalina/localhost

# create new file: "solr.xml", with the contents:

<?xml version="1.0" encoding="UTF-8"?>
<Context docBase="/var/lib/tomcat6/webapps/solr.war" debug="0" privileged="true" allowLinking="true" crossContext="true">
<Environment name="solr/home" type="java.lang.String" value="/var/lib/tomcat6/solr" override="true" />
</Context>

Setup Tomcat admin user(s):

# edit file: /etc/tomcat6/tomcat-users.xml, ensure similar contents exist:

<?xml version='1.0' encoding='utf-8'?>
<tomcat-users>
<role rolename="admin"/>
<role rolename="manager"/>
<user username="eric" password="supersecretpassword" roles="admin,manager"/>
</tomcat-users>

Update WEB-INF/web.xml file:

# edit file: /var/lib/tomcat6/webapps/solr/WEB-INF/web.xml, update section to reflect solr path:

<env-entry>
  <env-entry-name>solr/home</env-entry-name>
  <env-entry-value>/var/lib/tomcat6/solr</env-entry-value>
  <env-entry-type>java.lang.String</env-entry-type>
</env-entry>

Copy conf files from Drupal apachesolr module into Tomcat Solr conf directory (overwrite):

$ cp /var/www/vhosts/example.com/sites/all/modules/contrib/apachesolr/protwords.txt /var/lib/tomcat6/solr/conf/
$ cp /var/www/vhosts/example.com/sites/all/modules/contrib/apachesolr/schema.xml /var/lib/tomcat6/solr/conf/
$ cp /var/www/vhosts/example.com/sites/all/modules/contrib/apachesolr/solrconfig.xml /var/lib/tomcat6/solr/conf/

Reset Tomcat permissions/ownership:

$ cd /var/lib
$ chown -R tomcat.tomcat tomcat6/

Restart Tomcat

$ /etc/init.d/tomcat6 restart

At this point, you should be able to access the solr/admin tomcat Page (http://example.com:8080/solr/admin)
Solr Admin

If things are not working well at this point, check the Tomcat logs:

/var/log/tomcat6/catalina.out

And, ensure the solr java module is listed in the Tomcat Web Application Manager: http://example.com:8080/manager/html

If all is well, you can now enable the Drupal apachesolr modules:

$ cd /var/www/vhosts/example.com
$ drush en apachesolr apachesolr_search apachesolr_taxonomy apachesolr_access --uri=example.com

Log into your Drupal site. NOTE: default account (via drush): admin/admin

Edit default Apache Solr Host Settings:
URL: http://example.com/admin/config/search/apachesolr/server/solr/edit
Change url to: http://example.com:8080/solr, and save form.

Go to Drupal search settings page:
URL: http://example.com/admin/config/search/settings
Change the default search mode to "Apache Solr search", and save form.

Now, you are ready to test the indexing and integration.

Add a new piece of content to test indexing.
# Example:
# http://example.com/node/add/article
# title: Test Article
# Body: test test test

Browse to solr index page:
URL: http://example.com/admin/config/search/apachesolr/index
Select: Index queued content, and click on Begin button
You should see a status message like: "1 item processed successfully." and "Number of documents in index: 0 (1 sent but not yet processed)"
A few minutes later, refreshing the index page should show: "Number of documents in index: 1"

Search for "test" to verify Solr results.
URL: http://example.com/search/site/test
Solr Search Results

You can also review search results via solr/admin
URL: http://example.com:8080/solr/admin/
Enter "test" in query string box and click search

Part 2, Multicore Configuration (OPTIONAL)

If you need to run multiple sites off a single Solr Tomcat installation, you can setup multicore..

Copy the multicore xml file into your solr directory:

$ cp ~/downloads/apache-solr-1.4.1/example/multicore/solr.xml /var/lib/tomcat6/solr/

Create a new directory for each multisite in the solr directory:

$ mkdir /var/lib/tomcat6/solr/example.com

Replicate the solr conf directory into the new multisite directory:

cp -r /var/lib/tomcat6/solr/conf /var/lib/tomcat6/solr/example.com/conf/

Update the solr.xml file:

# edit file: /var/lib/tomcat6/solr/solr.xml, added <core> section for each site:
<cores adminPath="/admin/cores">
  <core name="example.com" instanceDir="example.com" />
</cores>

Restart Tomcat

$ /etc/init.d/tomcat6 restart

Now, your new multicore site will be accessible here: http://example.com:8080/solr/example.com/admin/

Eric.London's picture

In this article, I'll share a bash shell script I use periodically to import a directory of CSV files into MySQL tables. This script is most helpful when you need to process a large number of CSV files, and the tables have not yet been created. Of course you could use a GUI tool to accomplish this, but what's the fun in that?

The following script will get a list of CSV files, loop through them, add each table, add each column to the table (based on the first row), and then use the mysqlimport command to load all the CSV records. There are a few caveats though: 1. the first row of each CSV file must contain the column names; 2. it works best when your column names are simple text; and 3. your MySQL user must have permission to process files (see: File_priv).

#!/bin/bash

# show commands being executed, per debug
set -x

# define database connectivity
_db="csv_imports"
_db_user="csv_imports"
_db_password="changeme"

# define directory containing CSV files
_csv_directory="/path/to/the/csv/files"

# go into directory
cd $_csv_directory

# get a list of CSV files in directory
_csv_files=`ls -1 *.csv`

# loop through csv files
for _csv_file in ${_csv_files[@]}
do
   
  # remove file extension
  _csv_file_extensionless=`echo $_csv_file | sed 's/\(.*\)\..*/\1/'`
 
  # define table name
  _table_name="${_csv_file_extensionless}"
 
  # get header columns from CSV file
  _header_columns=`head -1 $_csv_directory/$_csv_file | tr ',' '\n' | sed 's/^"//' | sed 's/"$//' | sed 's/ /_/g'`
  _header_columns_string=`head -1 $_csv_directory/$_csv_file | sed 's/ /_/g' | sed 's/"//g'`
 
  # ensure table exists
  mysql -u $_db_user -p$_db_password $_db << eof
    CREATE TABLE IF NOT EXISTS \`$_table_name\` (
      id int(11) NOT NULL auto_increment,
      PRIMARY KEY  (id)
    ) ENGINE=MyISAM DEFAULT CHARSET=latin1
eof
 
  # loop through header columns
  for _header in ${_header_columns[@]}
  do

    # add column
    mysql -u $_db_user -p$_db_password $_db --execute="alter table \`$_table_name\` add column \`$_header\` text"

  done

  # import csv into mysql
  mysqlimport --fields-enclosed-by='"' --fields-terminated-by=',' --lines-terminated-by="\n" --columns=$_header_columns_string -u $_db_user -p$_db_password $_db $_csv_directory/$_csv_file 
 
done
exit

After creating my shell script file, and making it executable, I executed it. Since I added the line "set -x" to the script, a lot of helpful info is shown to debug.

./import.sh

Lastly I executed some SQL on the command line to verify the results. nice.

$ mysql -u csv_imports -pchangeme csv_imports --execute="select * from albums"
+----+-------------+---------------------+
| id | band        | album               |
+----+-------------+---------------------+
|  1 | band        | album               |
|  2 | black keys  | attack & release    |
|  3 | the dodos   | no color            |
|  4 | the xx      | xx                  |
|  5 | surf city   | kudos               |
|  6 | toro y moi  | underneath the pine |
|  7 | cut copy    | zonoscope           |
|  8 | twin shadow | forget              |
+----+-------------+---------------------+

Eric.London's picture

In this tutorial, I'll show how you can use awk, grep, and sed (my favorite command line tools) to backup and archive your MySQL databases. This can be useful to schedule a cron job, transfer your databases to another server, or any other type of scripting.

First, you'll have to get acquainted with connecting to and dumping your database on the command line. Depending on your user, credentials, and where the databases are located, your command might look something like this. Please note, there is no space between the password and the "-p" flag.

$ mysqldump -u user -pPASSWORD -h hostname database > database.sql

To simplify my example, I'm going to shorten the mysqldump command to the follow.

$ mysqldump database > database.sql

Now that we're MySQL command line pros, I'll break down each command. I'll start by showing all the databases.

Eric-Londons-MacBook-Pro:backup Eric$ mysql --execute="show databases"
+---------------------+
| Database            |
+---------------------+
| customers           |
| db_pics_ericlondon  |
| db_thedrupalblog_d6 |
| drupal              |
| drupal-pics         |
| drupalmusicproject  |
| itunes              |
+---------------------+

Now, I'll "pipe" the output from the previous command into awk to show the first column data.

Eric-Londons-MacBook-Pro:backup Eric$ mysql --execute="show databases" | awk '{print $1}'
Database
customers
db_pics_ericlondon
db_thedrupalblog_d6
drupal
drupal-pics
drupalmusicproject
itunes

And use grep to remove the first line that says "Database".

Eric-Londons-MacBook-Pro:backup Eric$ mysql --execute="show databases" | awk '{print $1}' | grep -iv ^Database$
customers
db_pics_ericlondon
db_thedrupalblog_d6
drupal
drupal-pics
drupalmusicproject
itunes

And use sed to build the mysqldump command. This one is kinda tricky, sorry. As you can see, I also embedded the date command in there to generate today's date in the format: YYYYMMDD.

Eric-Londons-MacBook-Pro:backup Eric$ mysql --execute="show databases" | awk '{print $1}' | grep -iv ^Database$ | sed 's/\(.*\)/mysqldump \1 > \1.'$(date +"%Y%m%d")'.sql/'
mysqldump customers > customers.20100825.sql
mysqldump db_pics_ericlondon > db_pics_ericlondon.20100825.sql
mysqldump db_thedrupalblog_d6 > db_thedrupalblog_d6.20100825.sql
mysqldump drupal > drupal.20100825.sql
mysqldump drupal-pics > drupal-pics.20100825.sql
mysqldump drupalmusicproject > drupalmusicproject.20100825.sql
mysqldump itunes > itunes.20100825.sql

Lastly, if everything looks good, you can pipe the output back to the command line.

Eric-Londons-MacBook-Pro:backup Eric$ mysql --execute="show databases" | awk '{print $1}' | grep -iv ^Database$ | sed 's/\(.*\)/mysqldump \1 > \1.'$(date +"%Y%m%d")'.sql/' | sh

Eric-Londons-MacBook-Pro:backup Eric$ ls -1
customers.20100825.sql
db_pics_ericlondon.20100825.sql
db_thedrupalblog_d6.20100825.sql
drupal-pics.20100825.sql
drupal.20100825.sql
drupalmusicproject.20100825.sql
itunes.20100825.sql

You could even take this one step further and pipe the output through gzip to compress the dumps :)

Eric.London's picture

In anticipation of my new iPhone 4 (and a larger hard drive), I decided to write a script to help choose which albums I should include. Through iTunes you can export your library as text, which can then be imported into MySQL, and SQL can be executed to show.. well, whatever your heart desires!

I started by exporting my entire music library, and chose Plain Text as the format. This iTunes functionality creates a tab separated list of columns and data.

I then created a new MySQL database and table (SEE: TABLE_NAME) to contain my iTunes song data. I also added an additional column as a primary key, called sql_id.

Next, I created a PHP script to parse the text file, dynamically add the columns to my table, and import all my song data. This code uses PEAR's DB library for the database access layer.

<?php
// define table name
define('TABLE_NAME', 'itunes');

// create database connection
require_once('DB.php');
$dsn = 'mysqli://db_user:db_password@db_host/database_name';
$DB =& DB::connect($dsn);
if (
DB::isError($DB)) {
  die(
$DB->getMessage());
}
$DB->setFetchMode(DB_FETCHMODE_ASSOC);

// load text file
$file = file_get_contents('Music.txt');

// explode on new line
$file = explode("\r", $file);

// get a list of existing columns
$sql = "show columns in " . TABLE_NAME;
$result = $DB->getAll($sql);
$existing_columns = array();
foreach (
$result as $key => $value) {
 
$existing_columns[] = $value['Field'];
}

// create a variable to contain sql column names
$sql_column_name = array();

// loop through each line in the file
foreach ($file as $key => $value) {

 
// explode on tab to get column list
 
$exploded = explode("\t", $value);

 
// check for first row, which contains column headers
 
if ($key == 0) {

   
// loop through column list and ensure they exist
   
foreach ($exploded as $new_table) {
   
     
// create a new column name without spaces
     
$new_table = str_replace(' ' , '_', $new_table);

     
// check if the new column should be added   
     
if (!in_array($new_table, $existing_columns)) {
          
       
// define sql to add new column
       
$sql = "alter table " . TABLE_NAME . " add column `$new_table` varchar(255) default null";
       
$result = $DB->query($sql);
       
       
// check for error
       
if (!$result) {
          echo
"<pre>" . print_r($result, true) . "</pre>";
          die;
        }
               
       
// define sql to add index
       
$sql = "alter table " . TABLE_NAME . " add index `index_$new_table` ($new_table)";
       
$result = $DB->query($sql);
       
       
// check for error
       
if (!$result) {
          echo
"<pre>" . print_r($result, true) . "</pre>";
          die;         
        }
     
      }
// end if
     
     
$sql_column_names[] = $new_table;
   
    }
// end foreach
     
 
} // end if ($key[0])
 
else {
   
   
// prepare values to insert
   
$insert_values = array();
    foreach (
$exploded as $k => $v) {
     
$insert_values[$k] = $DB->quoteSmart($v);
    }
   
   
// define SQL to insert data into table
   
$sql = "insert into " . TABLE_NAME . " (" . implode(',', $sql_column_names) . ") values (" . implode(',', $insert_values) . ")";
   
   
// execute sql
   
$result = $DB->query($sql);
   
   
// check for SQL error
   
if (!$result) {
      echo
"<pre>" . print_r($result, true) . "</pre>";
    }
 
  }

}
?>

Now, all my iTunes data is accessible by SQL, sweet. I decided to create a view showing my albums with an average ranking.

create view album_rankings as
select Artist, Album, Genre, avg(My_Rating) as album_rating, count(*) as countX
from itunes
where Album != 'misc'
group by Artist, Album
order by avg(My_Rating) desc, countX desc

And finally, I executed the following SQL to show my highest rated albums.

select *
from album_rankings
where countX > 3

Top Rated Albums

Syndicate content