Eric.London's picture

Just thought I'd share a PHP shell script to scan a file system path, search for all .git/.svn directories recursively, and collect a unique list of all remote repository URLs. It uses "svn info" or "git remote" to get the repository URL path.

<?php
// define a path to scan
$scan_path = '/var/www/vhosts';

// keep track of current working dir
$original_cwd = getcwd();

// get a list of all .git and .svn directories
// (no -r on egrep: it filters the piped find output, not files on disk)
$files = trim(`find "$scan_path" -type d | egrep -i '\.(git|svn)$'`);

// explode on "\n"
$files = explode("\n", $files);

// loop through files
$repo_list = array();
foreach ($files as $key => $file) {

  // .git or .svn ?
  $repo_type = substr($file, -4);
  $repo_path = substr($file, 0, -4);

  switch ($repo_type) {
    case '.svn':

      // get svn repo root
      $repo_root = trim(`svn info "$repo_path" | grep ^Repository\ Root | sed 's/Repository Root: //'`);
      if ($repo_root !== '' && !in_array($repo_root, $repo_list)) {
        $repo_list[] = $repo_root;
      }
      break;

    case '.git':

      // change dir
      chdir($repo_path);

      // get git remote path
      $repo_root = trim(`git remote -v | grep -i fetch | awk '{print \$2}' | head -1`);
      if ($repo_root !== '' && !in_array($repo_root, $repo_list)) {
        $repo_list[] = $repo_root;
      }
      break;
  }

}

// sort
sort($repo_list);

// go back to cwd
chdir($original_cwd);

// output repo list into file
file_put_contents('repo_list.txt', implode("\n", $repo_list) . "\n");
?>

I put this PHP in a file called "scan-for-version-control.php", and run it by typing:

$ php scan-for-version-control.php

It created a file in the current working directory, repo_list.txt, containing entries like:

git@myuser.someversioncontrolhost.com:myuser/somerepo1.git
git@myuser.someversioncontrolhost.com:myuser/somerepo2.git
https://myuser.svn.someversioncontrolhost.com/somerepo1
https://myuser.svn.someversioncontrolhost.com/somerepo2
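
The two extraction steps can also be tried on their own, without PHP. This is only a rough sketch: the function names first_fetch_url and svn_repo_root are my own, and the sample input is made up to look like "git remote -v" output.

```shell
#!/bin/bash

# Read `git remote -v` output on stdin and print the first fetch URL.
first_fetch_url() {
  awk '$3 == "(fetch)" { print $2; exit }'
}

# Read `svn info` output on stdin and print the repository root URL.
svn_repo_root() {
  sed -n 's/^Repository Root: //p'
}

# Hypothetical sample input, shaped like `git remote -v` output.
printf 'origin\tgit@example.com:myuser/somerepo1.git (fetch)\n' | first_fetch_url
```

In real use the input would come from "git remote -v" or "svn info" run inside each working copy, and the collected URLs could be piped through "sort -u" to get the unique list.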

In this article, I'll show an example of how to implement a Subversion pre-commit hook to integrate with Drupal. Pre-commit hooks can be used to execute any arbitrary code, such as deployment procedures, archiving databases, etc. For this example, I will show how to check for the creation of a subversion tag and archive the database.

To get started, I created a local subversion repository.

# create folder for subversion repositories
$ mkdir /var/subversion

# create the subversion repository
$ svnadmin create /var/subversion/project

Upon creating a new local svn repository, a hooks directory will be created. Example: /var/subversion/project/hooks

Inside this directory will be a bunch of sample scripts ending in ".tmpl" which contain example hook scripts. Here are the contents of the example pre-commit hook without comments:

$ cat pre-commit.tmpl | egrep -iv "(^#|^$)"
REPOS="$1"
TXN="$2"
SVNLOOK=/usr/bin/svnlook
$SVNLOOK log -t "$TXN" "$REPOS" | \
   grep "[a-zA-Z0-9]" > /dev/null || exit 1
commit-access-control.pl "$REPOS" "$TXN" commit-access-control.cfg || exit 1
exit 0

As noted in the pre-commit.tmpl file, there are 2 arguments being passed to the pre-commit script:

[1] REPOS-PATH   (the path to this repository)
[2] TXN-NAME     (the name of the txn about to be committed)
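
The first check in the template rejects a commit whose log message contains no letters or digits. Pulled out into a function it looks roughly like the sketch below. The name has_log_message is my own, and the sample message stands in for what "svnlook log -t" would return.

```shell
#!/bin/bash

# Succeed only if the log message on stdin contains a letter or digit,
# mirroring the grep check in pre-commit.tmpl.
has_log_message() {
  grep -q '[a-zA-Z0-9]'
}

# In a real hook the input would come from: svnlook log -t "$TXN" "$REPOS"
if printf 'Tagging beta release\n' | has_log_message; then
  echo "commit allowed"
else
  echo "commit rejected: empty log message"
fi
```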

I created a new file called "pre-commit" and added the following contents:

#!/bin/bash

/var/www/vhosts/project.vm/scripts/svn-pre-commit.php "$1" "$2"

I then made the file executable.

$ chmod ug+x pre-commit

The above script simply passes the arguments to a PHP script contained within the Drupal project.

In my scripts folder (/var/www/vhosts/project.vm/scripts), I created the PHP script "svn-pre-commit.php" with the following contents:

#!/usr/bin/php
<?php

// get args
$repo = $argv[1];
$txn = $argv[2];

// define path to mysql backups
$mysql_backups_path = '/var/www/vhosts/project.vm/database';

// define path to drupal docroot
$drupal_docroot_path = '/var/www/vhosts/project.vm/htdocs';

// get changed path for the pending transaction (-t);
// without -t, svnlook reports on the latest committed revision
// example output:
// A   tags/20110503/
$svn_look = `svnlook changed -t "$txn" "$repo"`;

// define pattern to break apart svnlook changed
$pattern = '/^\s*([A-Za-z])\s*(.*)$/';

// execute preg match
preg_match($pattern, $svn_look, $matches);
$svn_action = $matches[1];
$svn_changed_path = $matches[2];

// check if a tag is being created
if ($svn_action == 'A' && substr($svn_changed_path, 0, 5)=='tags/') {

  // get tag name
  $exploded = explode('/', $svn_changed_path);
  $tag_name = $exploded[1];
 
  // change dir to drupal docroot
  chdir($drupal_docroot_path);

  // backup mysql database using drush
  `/var/www/drush/drush sql-dump > {$mysql_backups_path}/tag_{$tag_name}.sql`;

}

I also made this script executable:

$ chmod ug+x svn-pre-commit.php
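
The tag detection itself is easy to try in isolation. Here is a small bash sketch of the same logic. The function name tag_from_changed is my own, and it only looks at the first line of the "svnlook changed" output, like the PHP above.

```shell
#!/bin/bash

# Read `svnlook changed` output on stdin. If the first line is an added
# path under tags/, print the tag name and succeed; otherwise fail.
tag_from_changed() {
  local action path
  read -r action path || [ -n "$action" ] || return 1
  case "$action:$path" in
    A:tags/?*)
      path="${path#tags/}"
      printf '%s\n' "${path%%/*}"
      ;;
    *)
      return 1
      ;;
  esac
}

# Sample line taken from the comment in the PHP script.
printf 'A   tags/20110503/\n' | tag_from_changed
```

Any other action, or a path outside tags/, makes the function return non-zero, so a hook could skip the database dump in that case.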

Now, assuming that my Drupal site is integrated with the subversion repository, and development is at a point to deploy/create a new tag, I created the subversion tag (named "beta-0.1" in this example).

To verify, I entered the directory containing my database dumps to check out the result:

$ cd /var/www/vhosts/project.vm/database

$ ls -1
tag_beta-0.1.sql


In this article, I'll share a bash shell script I use periodically to import a directory of CSV files into MySQL tables. This script is most helpful when you need to process a large number of CSV files, and the tables have not yet been created. Of course you could use a GUI tool to accomplish this, but what's the fun in that?

The following script will get a list of CSV files, loop through them, add each table, add each column to the table (based on the first row), and then use the mysqlimport command to load all the CSV records. There are a few caveats though: 1. the first row of each CSV file must contain the column names; 2. it works best when your column names are simple text; and 3. your MySQL user must have permission to process files (see: File_priv).

#!/bin/bash

# show commands being executed, for debugging
set -x

# define database connectivity
_db="csv_imports"
_db_user="csv_imports"
_db_password="changeme"

# define directory containing CSV files
_csv_directory="/path/to/the/csv/files"

# go into directory (stop if it does not exist)
cd "$_csv_directory" || exit 1

# get a list of CSV files in directory
_csv_files=`ls -1 *.csv`

# loop through csv files
for _csv_file in ${_csv_files[@]}
do
   
  # remove file extension
  _csv_file_extensionless=`echo $_csv_file | sed 's/\(.*\)\..*/\1/'`
 
  # define table name
  _table_name="${_csv_file_extensionless}"
 
  # get header columns from CSV file
  _header_columns=`head -1 $_csv_directory/$_csv_file | tr ',' '\n' | sed 's/^"//' | sed 's/"$//' | sed 's/ /_/g'`
  _header_columns_string=`head -1 $_csv_directory/$_csv_file | sed 's/ /_/g' | sed 's/"//g'`
 
  # ensure table exists
  mysql -u $_db_user -p$_db_password $_db << eof
    CREATE TABLE IF NOT EXISTS \`$_table_name\` (
      id int(11) NOT NULL auto_increment,
      PRIMARY KEY  (id)
    ) ENGINE=MyISAM DEFAULT CHARSET=latin1
eof
 
  # loop through header columns
  for _header in ${_header_columns[@]}
  do

    # add column
    mysql -u $_db_user -p$_db_password $_db --execute="alter table \`$_table_name\` add column \`$_header\` text"

  done

  # import csv into mysql
  mysqlimport --fields-enclosed-by='"' --fields-terminated-by=',' --lines-terminated-by="\n" --columns=$_header_columns_string -u $_db_user -p$_db_password $_db $_csv_directory/$_csv_file 
 
done
exit

After creating my shell script file, and making it executable, I executed it. Since I added the line "set -x" to the script, a lot of helpful info is shown to debug.

./import.sh
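
If the column names come out wrong, the header handling can be tested by itself. This sketch wraps the same tr/sed pipeline in a function. The name header_to_columns and the sample header are made up for illustration.

```shell
#!/bin/bash

# Turn a CSV header line into one column name per line:
# split on commas, strip surrounding double quotes, replace spaces with underscores.
header_to_columns() {
  printf '%s\n' "$1" | tr ',' '\n' | sed -e 's/^"//' -e 's/"$//' -e 's/ /_/g'
}

# Hypothetical header line.
header_to_columns '"band","album name"'
```

Note that this simple split does not cope with commas inside quoted column names, which is one reason the script works best with simple text headers.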

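Lastly I executed some SQL on the command line to verify the results. Note that the CSV header row shows up as the first record (id 1), because the script does not tell mysqlimport to skip it; adding --ignore-lines=1 to the mysqlimport command should avoid that.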

$ mysql -u csv_imports -pchangeme csv_imports --execute="select * from albums"
+----+-------------+---------------------+
| id | band        | album               |
+----+-------------+---------------------+
|  1 | band        | album               |
|  2 | black keys  | attack & release    |
|  3 | the dodos   | no color            |
|  4 | the xx      | xx                  |
|  5 | surf city   | kudos               |
|  6 | toro y moi  | underneath the pine |
|  7 | cut copy    | zonoscope           |
|  8 | twin shadow | forget              |
+----+-------------+---------------------+
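
As a variation, the table and all of its columns could be created in one statement instead of one ALTER TABLE per column. This is not what the script above does, just a sketch of the alternative. The function name create_table_sql is my own, and it expects column names that have already been cleaned up.

```shell
#!/bin/bash

# Print a single CREATE TABLE statement for a table name and a
# comma-separated list of already-cleaned column names.
create_table_sql() {
  local table="$1" columns="$2" col
  printf 'CREATE TABLE IF NOT EXISTS `%s` (\n' "$table"
  printf '  id int(11) NOT NULL auto_increment,\n'
  # word-split the comma-separated list, one text column per name
  for col in $(printf '%s' "$columns" | tr ',' ' '); do
    printf '  `%s` text,\n' "$col"
  done
  printf '  PRIMARY KEY (id)\n);\n'
}

create_table_sql albums 'band,album'
```

The printed statement could then be piped into the mysql client in place of the heredoc and the ALTER TABLE loop.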

I recently found some time to switch my site's search framework from Lucene to Apache Solr. The module's README.txt makes installation for small production sites easy and straightforward.

Following the installation guide, I started the java Solr process by entering the right directory and executing the java jar..

$ java -jar start.jar

Everything was up and running in minutes.. until I closed my terminal and the java service ended with my shell process. Short term, I decided to write a bash shell script to ensure Solr is running, and cron it to run every five minutes.

Here are the contents of my bash shell script:

#!/bin/bash

# check for process id (pattern is quoted so the shell does not expand it)
pid=`ps ax | grep -i 'java.*jar.*start\.jar' | grep -iv grep | awk '{print $1}'`

# check if pid is not an integer
if ! [[ "$pid" =~ ^[0-9]+$ ]] ; then

  # start service
  cd /path/to/my/apache-solr-1.4.1/installation
  java -jar start.jar &

  # send email notification
  message='ericlondon.com: starting solr service'
  subject='ericlondon.com: starting solr service'
  to='myemail@example.com'
  echo "$message" | mail -s "$subject" $to

  exit 1;

fi

And I added the following cronjob:

$ crontab -l
*/5 * * * * /path/to/my/scripts/folder/check_solr.sh
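
The process check is the part most likely to misfire, so here is a sketch of it as a standalone function. The name solr_is_running and the sample ps line are hypothetical. The "[j]ava" bracket is a common trick: the pattern still matches "java", but grep's own command line no longer matches it, so the extra "grep -v grep" is not needed.

```shell
#!/bin/bash

# Read `ps ax` output on stdin; succeed if a Solr start.jar process is listed.
solr_is_running() {
  grep -q '[j]ava.*start\.jar'
}

# Hypothetical ps line for illustration.
if printf '28670 ?  Sl  0:02 java -jar start.jar\n' | solr_is_running; then
  echo "solr is running"
else
  echo "solr is not running"
fi
```

In the cron script this would be used as "ps ax | solr_is_running", with the start-up and email steps moved into the failure branch.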

A better option would be to set up initialization scripts for the process (/etc/init.d/), or install Solr as a more permanent solution, but I guess this will do for the time being :) ...


Part 2, Using Supervisor (updated: 2011/04/12)

As mentioned above, using a cronjob is probably not the best solution. I decided to install and configure supervisord to monitor the process.

Unfortunately supervisor was not available for CentOS 5.5 (RHEL):

$ yum search supervisor
Finished
Warning: No matches found for: supervisor
No Matches found

Luckily, I found some RPMs via http://rpmfind.net. I installed supervisor and its one dependency:

# downloading RPMs
$ wget ftp://rpmfind.net/linux/epel/5/x86_64/supervisor-2.1-3.el5.noarch.rpm
$ wget ftp://rpmfind.net/linux/epel/5/x86_64/python-meld3-0.6.3-1.el5.x86_64.rpm

# installing RPMs
$ rpm -Uvh python-meld3-0.6.3-1.el5.x86_64.rpm
$ rpm -Uvh supervisor-2.1-3.el5.noarch.rpm

# setting run level for supervisord
$ chkconfig --level 2345 supervisord on

# starting supervisor
$ /etc/init.d/supervisord start

Next, I created a simple shell script to start the Solr process and made the script executable. NOTE: file contents have been simplified:

#!/bin/bash

# enter solr dir
cd /path/to/my/apache-solr-1.4.1/installation

# start solr; exec replaces the shell so supervisor manages the java process directly
exec java -jar start.jar

Lastly, I added a few lines to my supervisor conf file (/etc/supervisord.conf):

[program:apache_solr]
command=/path/to/my/scripts/folder/apache-solr-supervisor-run.sh

Upon restarting supervisor, Solr started automatically:

$ /etc/init.d/supervisord restart

$ ps aux | grep -i java | grep -iv grep
root     28670  0.1  8.5 1041076 43548 ?       Sl   13:30   0:02 java -jar start.jar

I killed the process and it immediately came back (with a different process ID)!

$ kill 28670

$ ps aux | grep -i java | grep -iv grep
root     28869 62.0  5.3 1021532 27016 ?       Sl   13:50   0:00 java -jar start.jar

Rsync is a great command line program for copying and sync'ing data. It can use the standard SSH protocol (default port 22) to copy files from computer to computer, or locally from one path to another. It usually comes preinstalled on Linux/Unix systems, but if you're using Windoze, I suggest installing Cygwin.

Part One
The first step in this tutorial is to set up passwordless SSH. Open a terminal on the computer you want to copy files from, referred to in this article as "local".

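# use the ssh-keygen command to generate a public and private key
# I left the passphrase empty, and used the default path: ~/.ssh/id_dsa
# NOTE: OpenSSH 7.0 and later disable DSA keys by default; on those systems
# use "-t rsa" or "-t ed25519" instead and adjust the id_dsa file names below
local$ ssh-keygen -t dsa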

# the above command will create two files (public and private keys)
local$ ls -l ~/.ssh/id_dsa*
-rw-------  1 Eric  staff  668 Feb 26 11:32 /Users/Eric/.ssh/id_dsa
-rw-r--r--  1 Eric  staff  611 Feb 26 11:32 /Users/Eric/.ssh/id_dsa.pub

SCP the public key file (id_dsa.pub) to the computer that will receive the files, referred to as "remote".

# NOTE: you'll need to replace "Eric@remote" with your remote username and IP address
local$ scp ~/.ssh/id_dsa.pub Eric@remote:~/.ssh/id_dsa.pub.transferred

SSH to the remote system and execute a few commands to enable passwordless SSH:

# SSH to remote system
local$ ssh Eric@remote

# append public key to "authorized_keys"
remote$ cat ~/.ssh/id_dsa.pub.transferred >> ~/.ssh/authorized_keys

# remove obsolete public key
remote$ rm ~/.ssh/id_dsa.pub.transferred

# exit remote system
remote$ exit

To verify that the public/private keys are working, SSH to the remote system. You should not be prompted for a password this time.
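
If you still get a password prompt, permissions are a common culprit: with sshd's default StrictModes setting, the key is ignored when ~/.ssh or authorized_keys is writable by other users. The sketch below tightens them. The function name fix_ssh_perms is my own, and the demo runs on a temporary directory rather than the real ~/.ssh.

```shell
#!/bin/bash

# Restrict an SSH directory and its authorized_keys file to the owner.
# The directory is a parameter so the function can be tried on a scratch copy.
fix_ssh_perms() {
  local ssh_dir="$1"
  chmod 700 "$ssh_dir" && chmod 600 "$ssh_dir/authorized_keys"
}

# Demonstrate on a temporary directory rather than the real ~/.ssh.
demo_dir=$(mktemp -d)
touch "$demo_dir/authorized_keys"
fix_ssh_perms "$demo_dir"
ls -ld "$demo_dir" "$demo_dir/authorized_keys"
rm -rf "$demo_dir"
```

On the remote system the equivalent call would be fix_ssh_perms ~/.ssh.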

Part Two
The second step of this tutorial is creating an executable shell script that will transfer the files. I chose to put my scripts in the folder "~/scripts/", but you could put them anywhere you want.

Open up your favorite text editor (emacs, vi, nano, etc) and enter your rsync command.

#!/bin/bash
rsync -avz --delete /path/on/local/computer/ Eric@remote:/path/on/remote/computer/

Please note, the "--delete" flag is optional, and will remove files on the remote computer that do not exist on the local computer. Please use caution.
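
One way to exercise that caution is rsync's --dry-run flag, which lists what would be copied or deleted without changing anything. The sketch below only prints the command line, so it is safe to run as-is. The function name rsync_command and the "preview" argument are my own invention.

```shell
#!/bin/bash

# Print the rsync command that would be run. Pass "preview" as the first
# argument to add --dry-run; any other value prints the real command.
rsync_command() {
  local extra=""
  if [ "$1" = "preview" ]; then
    extra=" --dry-run"
  fi
  printf 'rsync -avz --delete%s %s %s\n' "$extra" "$2" "$3"
}

rsync_command preview /path/on/local/computer/ Eric@remote:/path/on/remote/computer/
```

Running the printed preview command by hand shows which remote files "--delete" would remove before you commit to the real transfer.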

For my real life example, I set up a script to rsync my iTunes library from my iMac to my MacBookPro.

#!/bin/bash
rsync -avz --delete --exclude '*.m4v' --exclude '*.mp4' ~/Music/iTunes/ Eric@remote:~/Music/iTunes/

After saving the script, set it to be executable using chmod.

local$ chmod u+x /path/to/local/rsync.script.sh

Test your script on the command line, and then SSH to the remote computer to verify the copied files.

local$ /path/to/local/rsync.script.sh

If all is working well, you can set up a cron job to run at your desired time interval. Remember, both computers must be running for this to be automated, so choose a time you know they'll both be on. For example, to run this script daily at midnight:

local$ crontab -e

# min hour dayMonth month dayWeek command
0 0 * * * /path/to/local/rsync.script.sh
