Eric.London's picture

In this article, I'll share a bash shell script I use periodically to import a directory of CSV files into MySQL tables. This script is most helpful when you need to process a large number of CSV files, and the tables have not yet been created. Of course you could use a GUI tool to accomplish this, but what's the fun in that?

The following script gets a list of CSV files, loops through them, creates a table for each file, adds a column for each field in the first row, and then uses the mysqlimport command to load all the CSV records. There are a few caveats though: 1. the first row of each CSV file must contain the column names; 2. it works best when your column names are simple text, since the header is split on commas and spaces are replaced with underscores; and 3. your MySQL user must have permission to process files (see: File_priv).

#!/bin/bash

# show commands being executed, per debug
set -x

# define database connectivity
_db="csv_imports"
_db_user="csv_imports"
_db_password="changeme"

# define directory containing CSV files
_csv_directory="/path/to/the/csv/files"

# go into directory, and stop if that fails
cd "$_csv_directory" || exit 1

# loop through csv files
for _csv_file in *.csv
do

  # skip the unexpanded pattern when the directory has no CSV files
  [ -e "$_csv_file" ] || continue
   
  # remove file extension
  _csv_file_extensionless="${_csv_file%.*}"
 
  # define table name
  _table_name="${_csv_file_extensionless}"
 
  # get header columns from CSV file
  _header_columns=$(head -n 1 "$_csv_directory/$_csv_file" | tr -d '\r' | tr ',' '\n' | sed 's/^"//' | sed 's/"$//' | sed 's/ /_/g')
  _header_columns_string=$(head -n 1 "$_csv_directory/$_csv_file" | tr -d '\r' | sed 's/ /_/g' | sed 's/"//g')
 
  # ensure table exists
  mysql -u "$_db_user" -p"$_db_password" "$_db" << eof
    CREATE TABLE IF NOT EXISTS \`$_table_name\` (
      id int(11) NOT NULL auto_increment,
      PRIMARY KEY  (id)
    ) ENGINE=MyISAM DEFAULT CHARSET=latin1
eof
 
  # loop through header columns
  for _header in $_header_columns
  do

    # add column
    mysql -u "$_db_user" -p"$_db_password" "$_db" --execute="alter table \`$_table_name\` add column \`$_header\` text"

  done

  # import csv into mysql
  mysqlimport --fields-enclosed-by='"' --fields-terminated-by=',' --lines-terminated-by="\n" --columns="$_header_columns_string" -u "$_db_user" -p"$_db_password" "$_db" "$_csv_directory/$_csv_file"
 
done
exit

After creating my shell script file and making it executable, I executed it. Since the script contains the line "set -x", the shell prints each command as it runs, which helps with debugging.

./import.sh

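Lastly, I executed some SQL on the command line to verify the results. Nice. Note that row 1 holds the CSV header line: the script does not skip it, so it gets imported as data. Adding --ignore-lines=1 to the mysqlimport command would skip it.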

$ mysql -u csv_imports -pchangeme csv_imports --execute="select * from albums"
+----+-------------+---------------------+
| id | band        | album               |
+----+-------------+---------------------+
|  1 | band        | album               |
|  2 | black keys  | attack & release    |
|  3 | the dodos   | no color            |
|  4 | the xx      | xx                  |
|  5 | surf city   | kudos               |
|  6 | toro y moi  | underneath the pine |
|  7 | cut copy    | zonoscope           |
|  8 | twin shadow | forget              |
+----+-------------+---------------------+
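
The script above creates an empty table and then runs one ALTER TABLE per header field. As an alternative, here is a minimal sketch that turns the header row into a single CREATE TABLE statement. The function name csv_create_table_sql is made up for this example, and it assumes the same simple comma-separated header the script expects.

```shell
# Print a CREATE TABLE statement for a CSV file, with one TEXT column per
# header field. csv_create_table_sql is a hypothetical helper name.
# $1: path to the CSV file; the table name is the file name minus extension.
csv_create_table_sql() {
  _file="$1"
  _table=$(basename "$_file")
  _table="${_table%.*}"

  # first line: drop carriage returns and quotes, turn spaces into
  # underscores, then put each column name on its own line
  _columns=$(head -n 1 "$_file" | tr -d '\r"' | sed 's/ /_/g' | tr ',' '\n')

  printf 'CREATE TABLE IF NOT EXISTS `%s` (\n' "$_table"
  printf '  id INT NOT NULL AUTO_INCREMENT,\n'
  printf '%s\n' "$_columns" | while IFS= read -r _col; do
    printf '  `%s` TEXT,\n' "$_col"
  done
  printf '  PRIMARY KEY (id)\n);\n'
}
```

The output could then be piped into the mysql client, for example: csv_create_table_sql albums.csv | mysql -u csv_imports -p csv_imports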


In this tutorial, I'll show how you can use awk, grep, and sed (my favorite command line tools) to back up and archive your MySQL databases. This can be useful for scheduling a cron job, transferring your databases to another server, or any other type of scripting.

First, you'll have to get acquainted with connecting to and dumping your database on the command line. Depending on your user, credentials, and where the databases are located, your command might look something like this. Please note, there is no space between the password and the "-p" flag.

$ mysqldump -u user -pPASSWORD -h hostname database > database.sql

To simplify my example, I'm going to shorten the mysqldump command to the following.

$ mysqldump database > database.sql

Now that we're MySQL command line pros, I'll break down each command. I'll start by showing all the databases.

Eric-Londons-MacBook-Pro:backup Eric$ mysql --execute="show databases"
+---------------------+
| Database            |
+---------------------+
| customers           |
| db_pics_ericlondon  |
| db_thedrupalblog_d6 |
| drupal              |
| drupal-pics         |
| drupalmusicproject  |
| itunes              |
+---------------------+

Now, I'll "pipe" the output from the previous command into awk to show the first column data.

Eric-Londons-MacBook-Pro:backup Eric$ mysql --execute="show databases" | awk '{print $1}'
Database
customers
db_pics_ericlondon
db_thedrupalblog_d6
drupal
drupal-pics
drupalmusicproject
itunes

And use grep to remove the first line that says "Database".

Eric-Londons-MacBook-Pro:backup Eric$ mysql --execute="show databases" | awk '{print $1}' | grep -iv ^Database$
customers
db_pics_ericlondon
db_thedrupalblog_d6
drupal
drupal-pics
drupalmusicproject
itunes

And use sed to build the mysqldump command. This one is kinda tricky, sorry. As you can see, I also embedded the date command in there to generate today's date in the format: YYYYMMDD.

Eric-Londons-MacBook-Pro:backup Eric$ mysql --execute="show databases" | awk '{print $1}' | grep -iv ^Database$ | sed 's/\(.*\)/mysqldump \1 > \1.'$(date +"%Y%m%d")'.sql/'
mysqldump customers > customers.20100825.sql
mysqldump db_pics_ericlondon > db_pics_ericlondon.20100825.sql
mysqldump db_thedrupalblog_d6 > db_thedrupalblog_d6.20100825.sql
mysqldump drupal > drupal.20100825.sql
mysqldump drupal-pics > drupal-pics.20100825.sql
mysqldump drupalmusicproject > drupalmusicproject.20100825.sql
mysqldump itunes > itunes.20100825.sql
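
To make the sed step easier to follow, here is the same substitution wrapped in a small function, with the date passed in as an argument instead of spliced into the quoting. The name build_dump_commands is only for illustration.

```shell
# Build one mysqldump command per database name read from stdin.
# build_dump_commands is a hypothetical name used for this sketch.
# $1: date stamp to embed in the file name, e.g. the output of: date +%Y%m%d
build_dump_commands() {
  _stamp="$1"
  # \(.*\) captures the whole line (the database name);
  # \1 inserts that name twice: once as the argument, once in the file name
  sed "s/\(.*\)/mysqldump \1 > \1.${_stamp}.sql/"
}
```

For example, printf 'customers\n' | build_dump_commands 20100825 prints the first command from the listing above.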

Lastly, if everything looks good, you can pipe the generated commands into sh to execute them.

Eric-Londons-MacBook-Pro:backup Eric$ mysql --execute="show databases" | awk '{print $1}' | grep -iv ^Database$ | sed 's/\(.*\)/mysqldump \1 > \1.'$(date +"%Y%m%d")'.sql/' | sh

Eric-Londons-MacBook-Pro:backup Eric$ ls -1
customers.20100825.sql
db_pics_ericlondon.20100825.sql
db_thedrupalblog_d6.20100825.sql
drupal-pics.20100825.sql
drupal.20100825.sql
drupalmusicproject.20100825.sql
itunes.20100825.sql

You could even take this one step further and pipe the output through gzip to compress the dumps :)
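
A rough sketch of that gzip idea, written as a loop rather than generated commands. The function name dump_all_gzipped and the DUMP_CMD variable are assumptions of this example (DUMP_CMD only exists so the loop can be tried without a MySQL server); by default it calls mysqldump as in the steps above.

```shell
# Dump every database named on stdin into a gzip-compressed file in the
# current directory. dump_all_gzipped and DUMP_CMD are hypothetical names:
# DUMP_CMD defaults to mysqldump and can be overridden for a dry run.
dump_all_gzipped() {
  _stamp=$(date +%Y%m%d)
  while IFS= read -r _name; do
    # ignore empty lines
    [ -n "$_name" ] || continue
    "${DUMP_CMD:-mysqldump}" "$_name" | gzip > "${_name}.${_stamp}.sql.gz"
  done
}
```

It would be fed the same list of names as before: mysql --execute="show databases" | awk '{print $1}' | grep -iv '^Database$' | dump_all_gzipped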


Here's a quick script to reset ownership on a directory and then commit all changes (deletions, additions, and modifications) to subversion...

#!/bin/sh

_DIR="/path/to/my/svn/directory"
_DATE=`date +"%Y-%m-%d %H:%M:%S"`

_USER="Eric"
_GROUP="Eric"

# reset file ownership
find "${_DIR}" -exec chown "${_USER}:${_GROUP}" {} +

# add new files
svn stat "${_DIR}" | grep '^?' | sed 's/^? */svn add "/' | sed 's/$/"/' | sh

# remove deleted files
svn stat "${_DIR}" | grep '^!' | sed 's/^! */svn del "/' | sed 's/$/"/' | sh

# commit modifications
svn commit "${_DIR}" -m "Automated Commit: ${_DATE}"
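
Generating commands with sed and piping them into sh can trip over file names that contain quotes. As a hedged alternative, the sketch below only extracts the paths for a given status character, so they can be handed to svn one at a time. The function name svn_paths_with_status is made up, and it assumes the status character is in the first column followed by spaces, as in the output the script above parses.

```shell
# Print the paths from "svn status" output (read from stdin) whose first
# column matches the given status character, e.g. '?' or '!'.
# svn_paths_with_status is a hypothetical helper name.
svn_paths_with_status() {
  _status="$1"
  while IFS= read -r _line; do
    case "$_line" in
      "$_status"*)
        _path="${_line#?}"                    # drop the status character
        _path="${_path#"${_path%%[! ]*}"}"    # drop the spaces after it
        printf '%s\n' "$_path"
        ;;
    esac
  done
}
```

A possible use: svn stat "${_DIR}" | svn_paths_with_status '?' | while IFS= read -r p; do svn add "$p"; done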


If you've ever gotten the following error, you might need to reset the file permissions and ownership on all of your project files, including the hidden subversion files (located in the .svn folders). This can occur if you've ever executed a subversion command as root, which I try to avoid doing.

svn: Can't open file 'PATH/TO/YOUR/FILES/.svn/lock': Permission denied

The following commands can be used to reset the permissions and ownership for all the files in your directory. NOTE: only execute these commands if you feel comfortable with the shell and know what the file permissions should be set to for your files.

# go to the path of your project
cd /PATH/TO/MY/PROJECT

# reset ownership
# NOTE: replace apache:staff with your user and group
sudo find . -exec chown apache:staff {} +

# reset permissions
# NOTE: replace 2770 with your file permissions
sudo find . -exec chmod 2770 {} \;

# Now you can run the cleanup command to repair your .svn folders
svn cleanup
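
The chmod command above gives directories and regular files the same mode. If you would rather not mark every file as executable, one option is to set the two separately. This is only a sketch: the function name reset_modes and the modes in the usage line are placeholders, so substitute whatever your project actually needs.

```shell
# Apply one mode to directories and another to regular files under a path.
# reset_modes is a hypothetical helper name.
# $1: project path, $2: mode for directories, $3: mode for regular files
reset_modes() {
  find "$1" -type d -exec chmod "$2" {} +
  find "$1" -type f -exec chmod "$3" {} +
}
```

For example, reset_modes /PATH/TO/MY/PROJECT 2770 660 (run with sudo if your user does not own the files).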


Here's a command to add all the new files in your current path to subversion:

svn stat | grep '^?' | sed 's/^? */svn add "/' | sed 's/$/"/' | sh
svn commit -m "added all my new files"
