Cloudera Certified Developer for Apache Hadoop (CCDH): syllabus, exam topics and contents

Cloudera Certified Developer for Apache Hadoop (CCDH)

Update: 6 April 2013

Cloudera has added exam learning resources to its website; please read the link below for the latest information.

http://university.cloudera.com/certification/prep/ccdh.html

http://jugnu-life.blogspot.in/2012/05/cloudera-hadoop-certification-now.html

Syllabus, exam contents

http://university.cloudera.com/certification.html

To earn a CCDH certification, candidates must pass an exam designed to test a candidate’s fluency with the concepts and skills required in the following areas:

If you are interested in the Administrator exam, you should read this other post:

http://jugnu-life.blogspot.in/2012/03/cloudera-certified-administrator-for.html

 

The Developer exam syllabus and study sources are listed below.

1. Core Hadoop Concepts (CCD-410: 25% | CCD-470: 33%)

Objectives
  • Recognize and identify Apache Hadoop daemons and how they function both in data storage and processing under both CDH3 and CDH4.
  • Understand how Apache Hadoop exploits data locality, including rack placement policy.
  • Given a big data scenario, determine the challenges to large-scale computational models and how distributed systems attempt to overcome various challenges posed by the scenario.
  • Identify the role and use of both MapReduce v1 (MRv1) and MapReduce v2 (MRv2 / YARN) daemons.
Section Study Resources

 

2. Storing Files in Hadoop (7%)

Objectives
  • Analyze the benefits and challenges of the HDFS architecture.
  • Analyze how HDFS implements file sizes, block sizes, and block abstraction.
  • Understand default replication values and storage requirements for replication.
  • Determine how HDFS stores, reads, and writes files.
  • Given a sample architecture, determine how HDFS handles hardware failure.
Section Study Resources
  • Hadoop: The Definitive Guide, 3rd edition: Chapter 3
  • Hadoop Operations: Chapter 2
  • Hadoop in Practice: Appendix C: HDFS Dissected
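
To make the replication objective above concrete, here is a rough Python sketch of the storage arithmetic. The function name hdfs_storage and the 64 MB block size are assumptions for illustration only (block size is configurable, so check your cluster's settings); a replication factor of 3 is the HDFS default.

```python
import math

def hdfs_storage(file_size_mb, block_size_mb=64, replication=3):
    """Return (number of blocks, raw MB consumed across the cluster)."""
    # A file is split into fixed-size blocks; the last block may be partial.
    num_blocks = math.ceil(file_size_mb / block_size_mb)
    # Every block is stored `replication` times. A partial last block only
    # occupies its actual size, so raw usage is file size times replication.
    raw_mb = file_size_mb * replication
    return num_blocks, raw_mb

blocks, raw = hdfs_storage(1000)  # a hypothetical 1000 MB file
print(blocks, raw)
```

With these assumed values, a 1000 MB file is split into 16 blocks and takes about 3000 MB of raw disk across the cluster.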

3. Job Configuration and Submission (7%)

Objectives
  • Construct proper job configuration parameters.
  • Identify the correct procedures for MapReduce job submission.
  • Understand how to use the various commands in job submission.
Section Study Resources
  • Hadoop: The Definitive Guide, 3rd Edition: Chapter 5

4. Job Execution Environment (10%)

Objectives
  • Given a MapReduce job, determine the lifecycle of a Mapper and the lifecycle of a Reducer.
  • Understand the key fault tolerance principles at work in a MapReduce job.
  • Identify the role of Apache Hadoop Classes, Interfaces, and Methods.
  • Understand how speculative execution exploits differences in machine configurations and capabilities in a parallel environment and how and when it runs.
Section Study Resources
  • Hadoop in Action: Chapter 3
  • Hadoop: The Definitive Guide, 3rd Edition: Chapter 6

5. Input and Output (6%)

Objectives
  • Given a sample job, analyze and determine the correct InputFormat and OutputFormat to select based on job requirements.
  • Understand the role of the RecordReader, and of sequence files and compression.
Section Study Resources
  • Hadoop: The Definitive Guide, 3rd Edition: Chapter 7
  • Hadoop in Action: Chapter 3
  • Hadoop in Practice: Chapter 3

6. Job Lifecycle (18%)

Objectives
  • Analyze the order of operations in a MapReduce job.
  • Analyze how data moves through a job.
  • Understand how partitioners and combiners function, and recognize appropriate use cases for each.
  • Recognize the processes and role of the sort and shuffle process.
Section Study Resources
  • Hadoop: The Definitive Guide, 3rd Edition: Chapter 6
  • Hadoop in Practice: Techniques in section 6.4
  • Two blog posts from Philippe Adjiman’s Hadoop Tutorial Series
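
For the partitioner and combiner objective above, a small Python sketch may help. It is not Hadoop code: java_string_hash mimics Java's String.hashCode() purely for illustration (Hadoop's own key types such as Text compute their hash differently), partition follows the usual "mask the sign bit, then modulo the number of reducers" idea behind hash partitioning, and combine does a map-side sum. All function names here are made up for this example.

```python
from collections import defaultdict

def java_string_hash(s):
    """Mimic Java's String.hashCode() as a signed 32-bit integer."""
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF  # keep only the low 32 bits
    return h - 0x100000000 if h >= 0x80000000 else h

def partition(key, num_reducers):
    """Hash partitioning: clear the sign bit, then take the modulus."""
    return (java_string_hash(key) & 0x7FFFFFFF) % num_reducers

def combine(map_output):
    """Sum counts per key on the map side, before anything is shuffled."""
    totals = defaultdict(int)
    for key, count in map_output:
        totals[key] += count
    return sorted(totals.items())

map_output = [("a", 1), ("b", 1), ("a", 1)]
for key, count in combine(map_output):
    print(key, count, "-> reducer", partition(key, 4))
```

The point to remember is that the partitioner decides which reducer receives a key, while the combiner only reduces how much data leaves the mapper. A combiner is safe here because addition is associative and commutative; it would not be a drop-in choice for something like an average.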

7. Data Processing (6%)

Objectives
  • Analyze and determine the relationship of input keys to output keys in terms of both type and number, the sorting of keys, and the sorting of values.
  • Given sample input data, identify the number, type, and value of emitted keys and values from the Mappers as well as the emitted data from each Reducer and the number and contents of the output file(s).
Section Study Resources
  • Hadoop: The Definitive Guide, 3rd Edition: Chapter 7 on Input Formats and Output Formats
  • Hadoop in Practice: Chapter 3
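
The second objective above (tracing emitted keys and values from sample input) is easiest to practice by hand on a word count. Below is a minimal Python simulation, assuming a single reducer and made-up function names (mapper, reducer, run_job); it only imitates the map, sort/group, and reduce steps and is not the Hadoop API.

```python
from itertools import groupby
from operator import itemgetter

def mapper(line):
    """Emit (word, 1) for every word in the input line."""
    for word in line.split():
        yield (word, 1)

def reducer(key, values):
    """Emit one (word, total) pair per distinct key."""
    yield (key, sum(values))

def run_job(lines):
    # Map phase: one mapper call per input record.
    map_output = [pair for line in lines for pair in mapper(line)]
    # Shuffle and sort: keys reach the reducer in sorted order.
    shuffled = sorted(map_output, key=itemgetter(0))
    # Reduce phase: one reducer call per distinct key.
    reduce_output = []
    for key, group in groupby(shuffled, key=itemgetter(0)):
        reduce_output.extend(reducer(key, (value for _, value in group)))
    return map_output, reduce_output

emitted, result = run_job(["b a", "a c a"])
print(emitted)
print(result)
```

For the two sample lines, the mappers emit five pairs in total (one per word), and the single reducer emits three pairs (one per distinct word) with keys in sorted order.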

8. Key and Value Types (6%)

Objectives
  • Given a scenario, analyze and determine which of Hadoop’s data types for keys and values are appropriate for the job.
  • Understand common key and value types in the MapReduce framework and the interfaces they implement.
Section Study Resources
  • Hadoop: The Definitive Guide, 3rd Edition: Chapter 4
  • Hadoop in Practice: Chapter 3

9. Common Algorithms and Design Patterns (7%)

Objectives
  • Evaluate whether an algorithm is well-suited for expression in MapReduce.
  • Understand the implementation, limitations, and strategies for joining datasets in MapReduce.
  • Analyze the role of DistributedCache and Counters.
Section Study Resources
  • Hadoop: The Definitive Guide, 3rd Edition: Chapter 8
  • Hadoop in Practice: Chapters 4, 5, 7
  • MapReduce Algorithms tutorial video. Note: uses the old API.
  • Hadoop in Action: Chapter 5.2
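
For the join objective above, here is a rough Python sketch of the reduce-side join idea: tag each record with its source in the map step, group by the join key, then pair up the two sides in the reduce step. The datasets, the "U"/"O" tags, and the function name reduce_side_join are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def reduce_side_join(users, orders):
    """Inner-join (user_id, name) records with (user_id, item) records."""
    # Map step: tag every record with the dataset it came from.
    tagged = [(uid, ("U", name)) for uid, name in users]
    tagged += [(uid, ("O", item)) for uid, item in orders]

    # Shuffle step: bring all values with the same join key together.
    groups = defaultdict(list)
    for key, value in tagged:
        groups[key].append(value)

    # Reduce step: for each key, pair every user record with every order.
    joined = []
    for key in sorted(groups):
        names = [v for tag, v in groups[key] if tag == "U"]
        items = [v for tag, v in groups[key] if tag == "O"]
        for name in names:
            for item in items:
                joined.append((key, name, item))
    return joined

users = [(1, "ann"), (2, "bob")]
orders = [(1, "book"), (1, "pen"), (3, "cup")]
print(reduce_side_join(users, orders))
```

Notice that this naive version holds every value for a key in memory during the reduce step, which is one of the limitations worth understanding; when one dataset is small enough to fit in memory, a map-side join that ships the small dataset to every mapper (the DistributedCache use case) avoids the shuffle altogether.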

10. The Hadoop Ecosystem (8%)

Objectives
  • Analyze a workflow scenario and determine how and when to leverage ecosystem projects, including Apache Hive, Apache Pig, Sqoop and Oozie.
  • Understand how Hadoop Streaming might apply to a job workflow.
Section Study Resources

18 comments:

  1. Hey, I would like to join in. I tried to find your mail id but was unable to.
    If you are still into Hadoop, please connect with me at
    sameersurjikar(at)gmail.com. I am just getting started. It will be of great help to have a person to share experience with :)

  2. Hi Sameer

    Great to hear from your side.

    I am using Hadoop: The Definitive Guide as my reference for preparation.

    My target is to take the exam in May.

    How are you planning to prepare?

    I am emailing you my contact details.

  3. Hi JJ,

    Great blog, I would say. I have been working as a senior project manager and have 17 years of IT experience. Currently I work at an e-commerce company and spend much of my time managing project issues, risk, and cost, but emerging technology has always been an interesting area for me. I followed GFS and BigTable for some time about 2 years ago but could not take it further. Recently there has been tremendous opportunity in the Big Data analysis space: Hadoop along with related projects such as HBase, Hive, Pig, Sqoop, Cassandra, and NoSQL stores (the count is endless), with Karmasphere and Cloudera in the IDE space for these technologies.


    Like you, I would also like to take the Hadoop Developer training and certification (Cloudera Certification); please let me know more about this.

    I have been following these books and resources:
    1. Hadoop: The Definitive Guide, second edition
    3. Hadoop - A Quick Introduction (IBM site)
    4. Hadoop in Action by Chuck Lam (sample chapter 10, on programming in Pig)
    5. Articles on Big Data analysis by Ravi Kalakota (search on Google; he is an excellent writer on predictive analysis and other forms of Big Data analysis)
    6. Storm - the big data analysis tool from Twitter.
    7. Understanding Big Data - Analytics for Enterprise Class Hadoop & Streaming Data.

    Please share the exam syllabus for the Cloudera exam at sanjaybsl@yahoo.com

  4. Hello Sanjay

    Thanks for your comment.

    With so much industry experience on your side, I would love to learn a lot of things from you.

    With reference to Hadoop, you have already covered all the groundwork by reading those books.

    I will also go through a few of the links you mentioned.

    A few days back I talked with people at Cloudera; they said they would release a few more details about the exam just before the VUE exams start.

    Otherwise, the syllabus I already mentioned should be enough:

    Hadoop Computing Environment
    Hadoop Distributed File System
    MapReduce
    Hadoop API
    Hadoop Ecosystem

    Regards

    Jagat Singh

  5. Cleared the exam on 27th April. One of the toughest exams I have ever faced.

  6. Hello BNM

    Congratulations :)

    Any tips on how to prepare for the exam?

    In what way would you say it was the toughest?

    Thanks

  7. Hi,
    Congratulations.
    Can you tell me how to prepare for the exam?
    Thanks
    Arjun

  8. Hi guys,
    I am planning to go for it too, but I am curious to know how it will help my career and how much weight this certification carries w.r.t. job opportunities in today's market. BTW, I don't have any domain experience in file systems or big data analytics, but I am interested to learn and pursue this further.

  9. I am planning to take the Hadoop exam. Could someone kindly provide inputs?

  10. I took the exam and am sure that I answered most of the questions correctly. Still, I could not clear the exam. Kindly let me know any tips on how to prepare for the exam.

    Replies
    1. Hello Roopa ji,

      No worries, better luck next time. You can check the Cloudera website to see whether they offer a free retake.

      Okay, coming to how to prepare.

      What have you read already? I would recommend reading the Definitive Guide at least twice. The answer options are confusing and very similar, so you might end up choosing the wrong one if you are not thorough with the concepts. Follow the topics given in the syllabus and plan your schedule accordingly.

      And don't worry about failure; it's very common :)

      Good luck

  11. Hi, I have read Hadoop: The Definitive Guide. I am pretty confident about my answers and am sure they are correct; I even verified them after the exam. But I am not sure on what basis they are evaluating.

  12. Hello Rohan ,

    I will soon share with you the topics to study from each section.

    When are you planning to write?

    Thanks,

  13. Hi JJ,

    I have a great interest in learning and building things with Hadoop. Kindly suggest which books and material I should follow. Please share the syllabus and recommended links/material at aleem.btech@gmail.com

    I want to have a complete, in-depth understanding of Hadoop; your help in this regard will be highly appreciated.

    Thanks,
    Mohammed Aleem

  14. Hi JJ,
    Right now I am working as a Linux admin at an MNC.
    Now I want to learn about Big Data and Hadoop.
    Can you mail me the study materials and syllabus?
    My email id is: erankitkhanduri@gmail.com
    Thanks
    ankit khanduri

    Replies
    1. Hello Ankit,

      Being an admin, you are already halfway towards Hadoop administration.

      I would suggest you grab Hadoop: The Definitive Guide and start reading it. Don't worry about certification for now; you can clear that easily.

  15. Hi JJ

    Thanks for your blog.

    I have one query. I would like to complete a Hadoop certification.
    To give you some background on my knowledge: I have worked on Apache Hadoop 1.0.4, Sqoop, Oozie, and Hive. I see lots of certifications in the market from Cloudera, Hortonworks, and Big Data University. Can you help me decide which one would be relevant in my case? Thanks in advance.

    ---
    Somi

    Replies
    1. Hi Somi,

      A few questions I would ask you.

      Is your company sponsoring your certification fees?

      If yes, then go for any of them; it doesn't matter.

      If no, go for Hortonworks.

      Last but not least, it is knowledge that matters most rather than certification. So just work hard and try to learn, and don't worry about the exam; it will be a cakewalk when you take it.

      Both Hortonworks and Cloudera have equal reputation.



Please share your views and comments below.

Thank You.