Cloudera Hadoop certification now available worldwide 1 May 2012

At last, it's 1 May 2012!

Cloudera has opened its Hadoop certification exams to candidates worldwide through Pearson VUE.

The details are as follows:


Developer Exam

Exam Name: Cloudera Certified Developer for Apache Hadoop
Current Version: CCD-333
Certification Requirement: Required for Cloudera Certified Developer for Apache Hadoop (CCDH)
Number of Questions: 60
Time Limit: 90 minutes
Passing Score: 67%
Languages: English (Japanese forthcoming)


Administrator Exam


Exam Name: Cloudera Certified Administrator for Apache Hadoop
Current Version: CCA-332
Certification Requirement: Required for Cloudera Certified Administrator for Apache Hadoop (CCAH)
Number of Questions: 30
Time Limit: 60 minutes
Passing Score: 67%
Languages: English (Japanese forthcoming)

You can register for the exams through Pearson VUE.

Questions may be single-answer or multiple-answer multiple-choice types.


Syllabus guidelines for Developer exam

    Core Hadoop Concepts
    Recognize and identify Apache Hadoop daemons and how they function both in data storage and processing. Understand how Apache Hadoop exploits data locality. Given a big data scenario, determine the challenges to large-scale computational models and how distributed systems attempt to overcome various challenges posed by the scenario.

    Storing Files in Hadoop
    Analyze the benefits and challenges of the HDFS architecture, including how HDFS implements file sizes, block sizes, and block abstraction. Understand default replication values and storage requirements for replication. Determine how HDFS stores, reads, and writes files. Given a sample architecture, determine how HDFS handles hardware failure.

    Job Configuration and Submission
    Construct proper job configuration parameters, including using JobConf and appropriate properties. Identify the correct procedures for MapReduce job submission, and know how to use the relevant commands (e.g. “hadoop jar”) to submit jobs.

    Job Execution Environment
    Given a MapReduce job, determine the lifecycle of a Mapper and the lifecycle of a Reducer. Understand the key fault tolerance principles at work in a MapReduce job. Identify the role of Apache Hadoop Classes, Interfaces, and Methods. Understand how speculative execution exploits differences in machine configurations and capabilities in a parallel environment and how and when it runs.

    Input and Output
    Given a sample job, analyze and determine the correct InputFormat and OutputFormat to select based on job requirements. Understand the role of the RecordReader, and of sequence files and compression.

    Job Lifecycle
    Analyze the order of operations in a MapReduce job, how data moves from place to place, how partitioners and combiners function, and the sort and shuffle process.
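To make that order of operations concrete, here is a minimal Python sketch of the map → partition → shuffle/sort → reduce flow, using word count as the example job. This is an illustration of the concept only, not Hadoop code; the function names and the single-process "shuffle" are my own simplifications:

```python
from collections import defaultdict

def mapper(line):
    # Map phase: emit a (word, 1) pair for every word in the input line
    for word in line.split():
        yield (word.lower(), 1)

def partition(key, num_reducers):
    # Hadoop-style default partitioning: hash of the key modulo reducer count
    return hash(key) % num_reducers

def reducer(key, values):
    # Reduce phase: sum all counts emitted for one word
    yield (key, sum(values))

def run_job(lines, num_reducers=2):
    # Shuffle: route each mapper output pair to a reducer partition,
    # grouping values by key within that partition
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for line in lines:
        for key, value in mapper(line):
            buckets[partition(key, num_reducers)][key].append(value)
    # Sort keys within each partition, then run the reducers
    results = {}
    for bucket in buckets:
        for key in sorted(bucket):
            for k, v in reducer(key, bucket[key]):
                results[k] = v
    return results

print(run_job(["the quick brown fox", "the lazy dog"]))
```

A combiner would slot in between `mapper` and the shuffle, pre-summing counts on the map side to cut the data moved across the network.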

    Data processing
    Analyze and determine the relationship of input keys to output keys in terms of both type and number, the sorting of keys, and the sorting of values. Given sample input data, identify the number, type, and value of emitted keys and values from the Mappers as well as the emitted data from each Reducer and the number and contents of the output file(s).

    Key and Value Types
    Given a scenario, analyze and determine which of Hadoop’s data types for keys and values are appropriate for the job. Understand common key and value types in the MapReduce framework and the interfaces they implement.

    Common Algorithms and Design Patterns
    Evaluate whether an algorithm is well-suited for expression in MapReduce. Understand implementation and limitations and strategies for joining datasets in MapReduce. Analyze the role of DistributedCache and Counters.
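As a sketch of one common join strategy, the map-side (replicated) join: the small dataset is made available to every mapper, which joins its own records against it with no reduce phase needed. In a real job the small table would be shipped to the mappers via DistributedCache; here it is just a plain dictionary, and all names are illustrative:

```python
# Small "dimension" table, standing in for a file shipped via DistributedCache
cities = {1: "Melbourne", 2: "Sydney"}

def map_side_join(records, lookup):
    # Each mapper joins its share of the big dataset against the
    # replicated lookup table during the map phase itself
    for user, city_id in records:
        yield (user, lookup.get(city_id, "unknown"))

rows = [("alice", 1), ("bob", 2), ("carol", 3)]
print(list(map_side_join(rows, cities)))
```

This only works when one side of the join is small enough to replicate to every node; otherwise a reduce-side join, grouping both datasets by the join key, is the usual fallback.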

    The Hadoop Ecosystem
    Analyze a workflow scenario and determine how and when to leverage ecosystems projects, including Apache Hive, Apache Pig, Sqoop and Oozie. Understand how Hadoop Streaming might apply to a job workflow.
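Hadoop Streaming lets you write the mapper and reducer as plain scripts that read lines from stdin and write tab-separated key/value lines to stdout, with the framework sorting the mapper output by key before the reducer sees it. The sketch below simulates that contract in pure Python (an in-memory sort stands in for the shuffle); the function names are my own, not part of any Hadoop API:

```python
import io
from itertools import groupby

def streaming_mapper(stdin, stdout):
    # Streaming mapper contract: read raw lines, emit "word\t1" pairs
    for line in stdin:
        for word in line.split():
            stdout.write(f"{word.lower()}\t1\n")

def streaming_reducer(stdin, stdout):
    # Streaming reducer contract: input arrives sorted by key,
    # so consecutive lines with the same word can be summed together
    pairs = (line.rstrip("\n").split("\t") for line in stdin)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        total = sum(int(count) for _, count in group)
        stdout.write(f"{word}\t{total}\n")

# Simulate the framework: map, sort (the shuffle), then reduce
map_out = io.StringIO()
streaming_mapper(io.StringIO("the quick fox\nthe dog\n"), map_out)
sorted_input = io.StringIO(
    "".join(sorted(map_out.getvalue().splitlines(keepends=True)))
)
reduce_out = io.StringIO()
streaming_reducer(sorted_input, reduce_out)
print(reduce_out.getvalue())
```

On a real cluster the same two functions would run as standalone scripts passed to the streaming jar, with HDFS files as input and output.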


Syllabus guidelines for Admin exam

    Apache Hadoop Cluster Core Technologies
    Daemons and normal operation of an Apache Hadoop cluster, both in data storage and in data processing. The current features of computing systems that motivate a system like Apache Hadoop.

    Apache Hadoop Cluster Planning
    Principal points to consider in choosing the hardware and operating systems to host an Apache Hadoop cluster.

    Apache Hadoop Cluster Management
    Cluster handling of disk and machine failures. Regular tools for monitoring and managing the Apache Hadoop file system.

    Job Scheduling
    How the default FIFO scheduler and the FairScheduler handle the tasks in a mix of jobs running on a cluster.

    Monitoring and Logging
    Functions and features of Apache Hadoop’s logging and monitoring systems.


For more details please see


If you want to form a study group with me for preparation, please message me at jagatsingh [at] gmail [dot] com

See more details at Hadoop Study Group


Please share your views and comments below.

Thank You.