How to run a simple Hadoop program
Let us write a simple Hadoop program and try to run it on Hadoop.
Copy the following code into a class file in your Eclipse Java project.
Configure the Eclipse build path to resolve any errors. If you need help setting up Eclipse for Hadoop, please see the other post.
package org.jagat.hdfs;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HDFSCopyAll {

    public static void main(String[] args) throws IOException {
        // Load the Hadoop configuration (core-site.xml etc. from the classpath)
        Configuration conf = new Configuration();
        // Handle to HDFS (not used yet, but this is how you get it)
        FileSystem hdfs = FileSystem.get(conf);
        // Handle to the local file system
        FileSystem local = FileSystem.getLocal(conf);
        // List the files in a local directory
        FileStatus[] localinput = local.listStatus(new Path(
                "/home/hadoop/software/20/pig-0.10.0/docs/api"));
        // Print the length (in bytes) of each file
        for (int i = 0; i < localinput.length; i++) {
            System.out.println(localinput[i].getLen());
        }
    }
}
It just uses the Hadoop FileSystem API to get the length of each file present in the api directory.
The intention of this post is not to teach the Hadoop API or MapReduce programming, but simply how to run a Hadoop program you have written.
Change the path above (/home/hadoop/software/20/pig-0.10.0/docs/api) to a real path present on your computer.
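As a side note, the class is named HDFSCopyAll, and the same FileSystem API can also copy local files into HDFS. Below is a minimal sketch of that idea; the class name HDFSCopyToHdfs and the HDFS target path /user/hadoop/api are just assumptions for illustration, so adjust both paths to your setup.

package org.jagat.hdfs;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HDFSCopyToHdfs {

    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        FileSystem hdfs = FileSystem.get(conf);
        // Local source directory and HDFS target directory
        // (both paths are assumptions; change them for your machine)
        Path localDir = new Path("/home/hadoop/software/20/pig-0.10.0/docs/api");
        Path hdfsDir = new Path("/user/hadoop/api");
        // copyFromLocalFile(delSrc, src, dst): false keeps the local copy
        hdfs.copyFromLocalFile(false, localDir, hdfsDir);
        System.out.println("Copied " + localDir + " to " + hdfsDir);
    }
}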
Now it's time to package this as a JAR.
Go to File > Export.
Eclipse will show the export wizard; choose Java > JAR file.
On the last page of the wizard, choose the main class as the above class name HDFSCopyAll, and create the JAR somewhere on your computer.
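If you prefer the command line, the jar tool can do the same thing. This is a sketch assuming your compiled .class files are in bin (Eclipse's default output folder); the e option sets the Main-Class entry in the manifest.

$jar cfe learnHadoop.jar org.jagat.hdfs.HDFSCopyAll -C bin .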
Now it's time to run it.
Open a terminal, go to the directory where you created the JAR,
and invoke the JAR as follows
$hadoop jar learnHadoop.jar
This will run the program and print the length of each file in the directory.
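If the main class was not set in the JAR's manifest, you can also pass it explicitly on the command line:

$hadoop jar learnHadoop.jar org.jagat.hdfs.HDFSCopyAll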
If you are stuck somewhere, just post a message in the comments below.
Thanks for reading.