Recently, while working with one of our customers, we ran into an issue: the cluster showed only 2 cores per node.
$ cat /proc/cpuinfo
Whereas the processor in the system was an Intel Xeon E5620, which should have 4 cores and 8 threads.
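A quick way to sanity-check this (assuming a Linux box) is to count what the kernel actually sees in /proc/cpuinfo; the second command below is a sketch that derives the physical core count from the unique (physical id, core id) pairs:

```shell
# Logical processors (threads) the kernel has brought online
grep -c '^processor' /proc/cpuinfo

# Distinct physical cores: unique (physical id, core id) pairs.
# paste - - joins each pair of lines into one, sort -u deduplicates.
grep -E '^(physical id|core id)' /proc/cpuinfo | paste - - | sort -u | wc -l
```

On a healthy E5620 node the first command should print 8; in our broken state it printed 2.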
After analysis we found that the core count was reported wrongly because ACPI was disabled on all the nodes.
Adding acpi=ht to the kernel boot parameters on all 12 nodes
made Red Hat detect all the threads in the system, since that detection had been suppressed earlier.
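The change itself is a one-line edit to the kernel line in GRUB. On older Red Hat releases this lives in /boot/grub/grub.conf (GRUB legacy); the kernel version and root device below are illustrative, not from our actual nodes:

```
# /boot/grub/grub.conf (GRUB legacy; kernel version and root device are examples)
title Red Hat Enterprise Linux
    root (hd0,0)
    kernel /vmlinuz-2.6.18-274.el5 ro root=/dev/VolGroup00/LogVol00 acpi=ht
    initrd /initrd-2.6.18-274.el5.img
```

A reboot is needed for the parameter to take effect; after that, /proc/cpuinfo on an E5620 node should show all 8 logical CPUs.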
This made the Hadoop cluster perform far better: jobs processed a lot faster and the customer was happy. I am not sure who was at fault or why ACPI was off earlier; the happy part is that I found it and we fixed it.
How do you handle your installations so that you avoid such kinds of errors?
Just after that, I tuned the Hadoop map and reduce task slots, the first performance tuning step which we all do.
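On a Hadoop 1.x cluster the map and reduce slot counts are set per TaskTracker in mapred-site.xml. With 8 threads per node now visible, something like the split below is a common starting point; the exact slot numbers here are an assumption and should be tuned for your workload:

```xml
<!-- mapred-site.xml (Hadoop 1.x); slot counts below are illustrative -->
<configuration>
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>6</value> <!-- map slots per node -->
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>2</value> <!-- reduce slots per node -->
  </property>
</configuration>
```

Restart the TaskTrackers after the change; the rule of thumb is to leave a thread or two free for the DataNode and TaskTracker daemons themselves.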