The following packages help you work with big data from R.
1. rmr2
2. rhdfs
3. HadoopStreamingR
4. Rhipe
5. h2o
6. SparkR
The links to documentation and tutorials for each of them are below.
Several of these packages, rmr2 and HadoopStreaming in particular, rely on Hadoop Streaming to run work on a cluster instead of a single R node; h2o and SparkR run on their own engines. If you are new to Hadoop, read the basics of Hadoop Streaming at https://hadoop.apache.org/docs/stable2/hadoop-streaming/HadoopStreaming.html and the short tutorial on writing streaming jobs in Python at http://www.michael-noll.com/tutorials/writing-an-hadoop-mapreduce-program-in-python/
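To make the Hadoop Streaming idea concrete, here is a minimal word-count sketch in Python, the language used by the tutorial linked above. It assumes the usual streaming convention of tab-separated key/value records that arrive at the reducer sorted by key. The function names (`mapper`, `reducer`, `run_local`) are made up for illustration, and `run_local` only imitates the cluster's sort step on one machine.

```python
from itertools import groupby


def mapper(lines):
    """Map step: emit one 'word<TAB>1' record per word, as a streaming mapper would print."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reducer(sorted_records):
    """Reduce step: records arrive sorted by key, so equal words are adjacent and can be summed."""
    parsed = (record.split("\t", 1) for record in sorted_records)
    for word, group in groupby(parsed, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


def run_local(lines):
    """Hypothetical local stand-in for the cluster: map, sort by key (the shuffle), reduce."""
    shuffled = sorted(mapper(lines), key=lambda record: record.split("\t", 1)[0])
    return list(reducer(shuffled))


if __name__ == "__main__":
    for record in run_local(["big data from R", "big cluster"]):
        print(record)
```

On a real cluster the map and reduce steps would normally be two separate scripts that read standard input and write standard output, handed to the streaming jar through its `-mapper` and `-reducer` options. An R script can follow the same stdin/stdout contract, which is roughly what the streaming-based R packages arrange for you.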
Package Name: Useful Tutorials / Readings

rmr2
  Web link: https://github.com/RevolutionAnalytics/rmr2/blob/master/docs/tutorial.md
  Book: R in a Nutshell, 2nd edition (Chapter 26), http://shop.oreilly.com/product/0636920022008.do

rhdfs
  Wiki: https://github.com/RevolutionAnalytics/RHadoop/wiki/user%3Erhdfs%3EHome

HadoopStreamingR
  CRAN package documentation: https://cran.r-project.org/web/packages/HadoopStreaming/HadoopStreaming.pdf

Rhipe
  Web link: http://tessera.io/docs-RHIPE/#install-and-push

h2o
  Documentation on using h2o from R: http://h2o-release.s3.amazonaws.com/h2o/rel-slater/1/docs-website/h2o-docs/index.html#%E2%80%A6%20From%20R
  R h2o package documentation (~140 pages): http://h2o-release.s3.amazonaws.com/h2o/rel-slater/1/docs-website/h2o-r/h2o_package.pdf

SparkR
  API: https://spark.apache.org/docs/latest/api/R/index.html
  Documentation: https://spark.apache.org/docs/latest/api/R/index.html