Jugnu Life :-): Sqoop installation tutorial

Sqoop is a tool for importing data from an RDBMS into HDFS and exporting it back from HDFS to an RDBMS.

It can be downloaded from the Apache website. At the time of writing, Sqoop is an Apache Incubator project, but it is expected to become a full project in the near future.

Sqoop is a client tool, so you do not need to install it on every node of the cluster. The best practice is to install it only on a client machine (or an edge node of the cluster). In case you are worried about traffic between the machine where Sqoop is installed and the database: the data transfer happens directly between the cluster and the database.

Installation steps

You can download the latest version of Sqoop from the Apache website:
http://sqoop.apache.org/

Getting started with Sqoop for development purposes is fairly simple:

Download the latest Sqoop binary release.

Extract it into a folder of your choice.

Set SQOOP_HOME and add Sqoop's bin directory to PATH so that the sqoop commands can be run directly.

For example, I extracted Sqoop into the following directory, and my environment variables look like this:
export SQOOP_HOME="/home/hadoop/software/sqoop-1.4.3"

export PATH=$PATH:$SQOOP_HOME/bin
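
If you prefer to script the extract and environment steps, here is a minimal sketch. The function name setup_sqoop and its two arguments (the tarball path and the target folder) are my own placeholders, not part of Sqoop, and the sketch assumes the archive unpacks into a single top-level folder such as sqoop-1.4.3.

```shell
# Hypothetical helper (not shipped with Sqoop): unpack a Sqoop tarball
# and export SQOOP_HOME and PATH for the current shell session.
# Usage: setup_sqoop /path/to/sqoop-tarball.tar.gz /home/hadoop/software
setup_sqoop() {
    sqoop_tarball="$1"
    sqoop_dest="$2"

    mkdir -p "$sqoop_dest" || return 1

    # Assumption: the first archive entry is the single top-level folder.
    sqoop_top=$(tar -tzf "$sqoop_tarball" | head -n 1 | cut -d/ -f1) || return 1
    [ -n "$sqoop_top" ] || return 1

    tar -xzf "$sqoop_tarball" -C "$sqoop_dest" || return 1

    export SQOOP_HOME="$sqoop_dest/$sqoop_top"
    export PATH="$PATH:$SQOOP_HOME/bin"
}
```

The exports only last for the current session. To keep them, the two export lines can be added to a shell startup file such as ~/.bashrc.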

Sqoop can connect to various types of databases.

For example, it can talk to MySQL, Oracle, and PostgreSQL databases. It uses JDBC to connect to them, so Sqoop needs the JDBC driver for each database it connects to.

The JDBC driver jar for each database can be downloaded from the internet. For example, the MySQL driver is available at the link below:

http://dev.mysql.com/downloads/connector/j/

Download the MySQL Connector/J jar and place it in the lib directory under the Sqoop home folder.
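
The copy step can be sketched as a small shell function. The name install_jdbc_driver is hypothetical (it is not a Sqoop command), and the jar path you pass in depends on which Connector/J version you downloaded, so treat it as a placeholder.

```shell
# Hypothetical helper: copy a downloaded JDBC driver jar into Sqoop's
# lib directory. Assumes SQOOP_HOME is already set as shown earlier.
# Usage: install_jdbc_driver /path/to/downloaded-driver.jar
install_jdbc_driver() {
    driver_jar="$1"

    if [ -z "${SQOOP_HOME:-}" ] || [ ! -d "$SQOOP_HOME/lib" ]; then
        echo "SQOOP_HOME is unset or has no lib directory" >&2
        return 1
    fi

    if [ ! -f "$driver_jar" ]; then
        echo "driver jar not found: $driver_jar" >&2
        return 1
    fi

    cp "$driver_jar" "$SQOOP_HOME/lib/"
}
```

The same function should work for other databases' driver jars, since Sqoop looks for them in the same lib directory.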

That's it.

Test your installation by typing:

$ sqoop help

You should see the list of Sqoop commands along with their usage.
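
If the command is not found instead, the PATH setting is the usual suspect. Here is a small sketch that checks for it. The function name check_sqoop is my own, and the database host and user in the commented example are placeholders you would replace with your own values.

```shell
# Hypothetical check: confirm the sqoop launcher is reachable through PATH.
check_sqoop() {
    if command -v sqoop >/dev/null 2>&1; then
        echo "sqoop found at $(command -v sqoop)"
    else
        echo "sqoop not found on PATH; check SQOOP_HOME and PATH" >&2
        return 1
    fi
}

# Example follow-up checks (commented out; they need a working install):
#   check_sqoop && sqoop version
#   sqoop list-databases --connect jdbc:mysql://<db-host>/ --username <user> -P
```

The list-databases line is a quick way to confirm that the JDBC driver jar is picked up, since it has to open a real connection to the database.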

Happy sqooping :)

4 comments:

NIKHIL, June 22, 2012 at 2:34 AM
Thanks for the post. Helpful
Unknown, November 14, 2012 at 3:48 PM
Should Sqoop be installed on the development machine or on the Hadoop node?

How can I run it under Windows?
Jugnu, November 15, 2012 at 2:22 PM
Sqoop is client software, so there is no need to run it on a Hadoop node.

You can install it on some machine and configure it to send data to the cluster.
fozz, December 11, 2012 at 12:42 PM
I can't make Sqoop work on Windows.

Could you please provide steps on how to set it up on Windows? Thanks!
