In this blog post, I’ll share some code and configuration I have used to set up and demo Hadoop, Pig, and Ruby Map/Reduce via Homebrew on OS X.
Install Hadoop and Pig via Homebrew:
Add environment variables (file: ~/.bash_profile or ~/.bashrc). Note: I upgraded Java on OS X to version 1.7 using Oracle’s binary.
Revise the default Hadoop configuration in $HADOOP_CONF_DIR:
Execute the first example:
Create a Ruby script with the map and reduce functions:
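The original listing is not reproduced here, so what follows is only a minimal sketch of what such a script might look like, assuming a word-count job. The file name wordcount.rb, the method names, and the map/reduce command-line switch are all my own placeholders rather than anything from the original script.

```ruby
#!/usr/bin/env ruby
# Hypothetical file name: wordcount.rb (an assumption, not the original script).
# Usage: ruby wordcount.rb map    < input.txt
#        ruby wordcount.rb reduce < sorted_pairs.txt

# Map step: emit "word<TAB>1" for every word on every input line.
def map_stream(input, output)
  input.each_line do |line|
    line.downcase.scan(/\w+/).each { |word| output.puts "#{word}\t1" }
  end
end

# Reduce step: sum the counts of consecutive lines that share a key.
# Assumes the input is already sorted by key.
def reduce_stream(input, output)
  current = nil
  total = 0
  input.each_line do |line|
    key, count = line.chomp.split("\t", 2)
    if key != current
      # A new key starts: flush the finished one first.
      output.puts "#{current}\t#{total}" unless current.nil?
      current = key
      total = 0
    end
    total += count.to_i
  end
  output.puts "#{current}\t#{total}" unless current.nil?
end

# Dispatch on the first argument; with no argument the script does nothing.
case ARGV[0]
when 'map'    then map_stream($stdin, $stdout)
when 'reduce' then reduce_stream($stdin, $stdout)
end
```

Both steps read and write plain lines of text, which is what lets the same file be used in a shell pipeline and as a streaming mapper and reducer.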
A simple streaming example using pipes and sort (outside Hadoop):
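To show why the sort sits between the two steps, here is a small in-process stand-in for a map | sort | reduce pipeline, written in Ruby rather than as the shell one-liner. It is a sketch under my own assumptions: the word-count logic, the method name word_counts, and the sample lines are invented for illustration.

```ruby
# In-process stand-in for: mapper | sort | reducer
# (word_counts is a hypothetical helper, not part of the original post).
def word_counts(lines)
  # "mapper": one [word, 1] pair per word
  pairs = lines.flat_map { |line| line.split.map { |word| [word, 1] } }

  # "sort": bring identical keys next to each other
  sorted = pairs.sort_by { |word, _| word }

  # "reducer": sum each run of consecutive identical keys
  sorted.chunk_while { |a, b| a[0] == b[0] }
        .map { |run| [run[0][0], run.sum { |_, n| n }] }
end

# Sample input invented for this sketch.
word_counts(['pig hadoop pig', 'ruby hadoop pig']).each do |word, n|
  puts "#{word}\t#{n}"
end
```

The reducer only ever compares a key with the one before it, so it relies on the sort having grouped equal keys together. Hadoop does the same grouping between the map and reduce phases.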
Execute the above script in Hadoop via Hadoop Streaming:
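The exact command depends on where Homebrew put the streaming jar, which I won't guess at here. As a rough sketch, the helper below assembles a Hadoop Streaming invocation from its parts. The helper name, the script name wordcount.rb, its map/reduce arguments, and every path are placeholders of mine. The streaming options used (-input, -output, -mapper, -reducer, -file) are standard ones, though newer Hadoop releases prefer the generic -files option over -file.

```ruby
require 'shellwords'

# Builds the argument list for a Hadoop Streaming run.
# Every path and the script name are placeholders, not values from the post.
def streaming_command(jar:, input:, output:, script:)
  name = File.basename(script)
  [
    'hadoop', 'jar', jar,
    '-input', input,
    '-output', output,
    '-mapper', "ruby #{name} map",
    '-reducer', "ruby #{name} reduce",
    '-file', script # ship the script to the task nodes
  ]
end

cmd = streaming_command(
  jar: '/path/to/hadoop-streaming.jar', # placeholder: locate the jar in your install
  input: 'input',                       # placeholder HDFS input path
  output: 'output',                     # placeholder HDFS output path (must not exist yet)
  script: 'wordcount.rb'                # hypothetical script name
)

# Print the command rather than running it, so the sketch is safe to try.
puts Shellwords.join(cmd)
```

Once the printed command looks right, passing the same array to system(*cmd) would run it without any shell quoting problems.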
Using Pig (on Hadoop with HDFS)
Create a JSON file using the above script:
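How the original script produced its JSON is not shown, so this is a separate, hedged sketch: a small converter that turns tab-separated "word, count" lines into one JSON object per line, which is the layout I understand Pig's JsonLoader to expect. The field names word and count, the method name, and the file names in the usage comment are assumptions.

```ruby
require 'json'

# Turns one "word<TAB>count" line into a single-line JSON object.
# The field names 'word' and 'count' are assumptions for this sketch.
def to_json_line(line)
  word, count = line.chomp.split("\t", 2)
  JSON.generate('word' => word, 'count' => count.to_i)
end

# Hypothetical usage: ruby to_json.rb counts.tsv > counts.json
# With no file arguments the script does nothing.
ARGV.each do |path|
  File.foreach(path) do |line|
    puts to_json_line(line) unless line.strip.empty?
  end
end
```

Keeping each record on its own line means the file can be split across mappers without breaking a record in half.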
Use the Pig command-line tool to load the JSON file and convert it to CSV:
Output the CSV data to the local filesystem:
More examples to follow :)