Process XML data in Hadoop

Two options for reading XML files in Hadoop:

Mahout includes an XmlInputFormat class. See the source and the blog post linked below to read more.

https://github.com/apache/mahout/blob/ad84344e4055b1e6adff5779339a33fa29e1265d/examples/src/main/java/org/apache/mahout/classifier/bayes/XmlInputFormat.java

http://xmlandhadoop.blogspot.com.au/2010/08/xml-processing-in-hadoop.html
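The idea behind an XML input format is to treat everything between a start tag and its matching end tag as one record, so that each mapper call receives a whole XML element rather than a single line. The sketch below shows that scanning idea in plain Java, with no Hadoop or Mahout dependency. The class name XmlRecordSplitter and its split method are made up for this illustration and are not part of Mahout. The real XmlInputFormat works on byte streams and input splits, and in the linked source it reads the tags from the job configuration (keys xmlinput.start and xmlinput.end).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not Mahout code: splits a string into records,
// where a record is the text from startTag through the next endTag.
final class XmlRecordSplitter {

    static List<String> split(String xml, String startTag, String endTag) {
        List<String> records = new ArrayList<>();
        int from = 0;
        while (true) {
            // Find the next record start at or after the current position.
            int start = xml.indexOf(startTag, from);
            if (start < 0) {
                break; // no more records
            }
            // Find the end tag that closes this record.
            int end = xml.indexOf(endTag, start + startTag.length());
            if (end < 0) {
                break; // unterminated record: drop it
            }
            end += endTag.length();
            records.add(xml.substring(start, end));
            from = end; // continue scanning after this record
        }
        return records;
    }

    public static void main(String[] args) {
        String xml = "<docs><doc><id>1</id></doc><doc><id>2</id></doc></docs>";
        // Each printed line is what a mapper would see as one input value.
        for (String record : split(xml, "<doc>", "</doc>")) {
            System.out.println(record);
        }
    }
}
```

This simple matching does not handle start tags with attributes or elements nested inside an element of the same name, so it only suits files with flat, repeated records.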

Pig has XMLLoader in Piggybank, which loads each element with a given tag as one record.

http://pig.apache.org/docs/r0.7.0/api/org/apache/pig/piggybank/storage/XMLLoader.html

