Talend Big Data Real Time v6 Certified Developer Exam

Talend certification exams measure candidates’ knowledge of product usage and of the underlying methods required to implement quality projects successfully. Preparation is critical to passing.

Certification Exam Details

Exam content is updated periodically. The number and difficulty of questions may change. The passing score is adjusted to maintain a consistent standard; for example, a new exam version with more difficult questions may have a lower passing score.

Approximate number of questions: 60
Time limit: 60 minutes
Types of questions:

  • Multiple choice
  • Multiple response

Recommended Experience

General knowledge of Hadoop: HDFS, MapReduce v1 and v2, Hive, Pig, HBase, Hue, Zookeeper, and Sqoop.

General knowledge of Spark and Kafka.

Experience with Talend Big Data Real Time 6.x solutions and Talend Studio, including metadata creation, configuration, and troubleshooting.


To prepare for the certification exam, Talend recommends:

  • Take the Big Data Basics and Advanced training courses
  • Study the training material
  • Read the product documentation
  • Acquire experience by using the product for at least 6 months

Certification Exam Topics

Big Data—general concepts

  • The different YARN daemons
  • How HDFS works
  • The Hadoop ecosystem: Pig, Hive, Hue, HBase, and Sqoop
  • The process for creating cluster metadata in Talend Studio
  • Different ways to test connectivity to a Hadoop cluster
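One simple way to test connectivity to a cluster, before configuring cluster metadata in the Studio, is to verify that the cluster's service ports are reachable over TCP. The sketch below is a generic check in Python; the port numbers shown are assumptions and vary by Hadoop distribution and configuration.

```python
import socket

# Assumed defaults for illustration only; check your distribution's actual ports.
DEFAULT_PORTS = {
    "NameNode (HDFS)": 8020,
    "ResourceManager (YARN)": 8032,
}

def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def check_cluster(host, ports=DEFAULT_PORTS):
    """Report reachability of each assumed cluster service port on one host."""
    return {name: port_is_open(host, port) for name, port in ports.items()}
```

A failed check here narrows a Studio connection problem down to the network or cluster side before you start troubleshooting component configuration.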


HDFS

  • What is HDFS?
  • Talend components dedicated to HDFS: names, how they work, how to configure them
  • Mandatory configuration to connect to HDFS
  • Troubleshooting common issues
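Alongside the dedicated Studio components, HDFS exposes a REST interface (WebHDFS) that is handy for troubleshooting path and connectivity issues. A minimal sketch of building a WebHDFS request URL; the port used here assumes the Hadoop 2.x NameNode HTTP default (50070; Hadoop 3.x moved it to 9870):

```python
from urllib.parse import quote, urlencode

def webhdfs_url(namenode_host, hdfs_path, op="OPEN", port=50070, **params):
    """Build a WebHDFS v1 REST URL for an HDFS path and operation.

    `op` is a WebHDFS operation name such as OPEN or LISTSTATUS; extra
    keyword arguments become additional query parameters.
    """
    query = urlencode({"op": op, **params})
    return f"http://{namenode_host}:{port}/webhdfs/v1{quote(hdfs_path)}?{query}"
```

For example, `webhdfs_url("namenode", "/tmp", op="LISTSTATUS")` yields a URL you can open in a browser or with curl to confirm the NameNode is reachable and the path exists.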


Hive

  • What is Hive?
  • Talend components dedicated to Hive: names, how they work, how to configure them
  • How to create, profile, and preview Hive tables
  • Troubleshooting common issues


Pig

  • What is Pig?
  • Talend components dedicated to Pig: names, how they work, how to configure them
  • Troubleshooting common issues


HBase

  • What is HBase?
  • Talend components dedicated to HBase: names, how they work, how to configure them
  • Troubleshooting common issues


Sqoop

  • What is Sqoop?
  • Talend components dedicated to Sqoop: names, how they work, how to configure them
  • Troubleshooting common issues


Spark

  • What is Spark?
  • Configuration for the Spark and Spark Streaming frameworks: execution modes, mandatory parameters, resource limitations, and memory tuning
  • Troubleshooting common issues
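When tuning executor memory, remember that the resource manager must grant each executor container its heap plus Spark's off-heap overhead. The sketch below encodes the commonly documented default rule of thumb (a 10% factor with a 384 MB floor); treat both values as assumptions and verify them against your Spark version's configuration reference.

```python
def executor_memory_overhead_mb(executor_memory_mb, factor=0.10, minimum_mb=384):
    """Off-heap overhead Spark requests per executor by default:
    the larger of `minimum_mb` and `factor` * requested executor memory."""
    return max(minimum_mb, int(executor_memory_mb * factor))

def yarn_container_size_mb(executor_memory_mb):
    """Total memory YARN must be able to grant for one executor container."""
    return executor_memory_mb + executor_memory_overhead_mb(executor_memory_mb)
```

This explains a classic troubleshooting scenario: a Job requesting 8 GB executors fails on a queue capped at exactly 8 GB per container, because the container actually needs the heap plus roughly 800 MB of overhead.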


Kafka

  • What is Kafka?
  • Talend components dedicated to Kafka: names, how they work, how to configure them
  • Troubleshooting common issues
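Kafka transports opaque byte arrays, which is why tKafkaOutput expects serialized byte arrays as its incoming data. A minimal sketch of one serialization choice (JSON over UTF-8 is an assumption here; Avro or plain strings are equally common in practice):

```python
import json

def serialize_record(record):
    """Encode a record as a byte array suitable for publishing to Kafka."""
    return json.dumps(record, sort_keys=True).encode("utf-8")

def deserialize_record(payload):
    """Decode a byte array read from Kafka back into a record."""
    return json.loads(payload.decode("utf-8"))
```

Whatever format you pick, producer and consumer must agree on it, since Kafka itself never interprets the payload.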

Sample Questions

1. You designed a Big Data batch Job using the MapReduce framework. You plan to execute it on a cluster running MapReduce v1. Which mandatory configurations must be specified on the Hadoop Configuration tab of the Run view? Choose all that apply.

a. Name Node

b. Data Node

c. Resource Manager

d. Job Tracker

2. What is HDFS?

a. A data warehouse infrastructure tool for processing structured data in Hadoop

b. A tool for importing/exporting tables from/to the Hadoop file system

c. A column-oriented key/value data store built to run atop the Hadoop file system

d. The primary storage system used by Hadoop applications

3. In which perspective of Studio can you run analysis on Hive table content?

a. Profiling

b. Integration

c. Big Data

d. Mediation

4. HDFS components can only be used in Big Data batch or Big Data streaming Jobs.

a. True

b. False

5. ZooKeeper service is mandatory to coordinate transactions between Talend Studio and HBase.

a. True

b. False

6. Which outgoing links are allowed for a tSqoopImport component?

a. Main

b. Iterate

c. OnSubjobOk

d. SqoopCombine

7. Which type of incoming data does a tKafkaOutput component expect?

a. Serialized byte arrays

b. Integers

c. Bytes

d. String

8. In Studio, the components in the palette are the same for Spark Jobs and MapReduce Jobs.

a. True

b. False


Answers

  1. a and d
  2. d
  3. a
  4. b
  5. a
  6. b and c
  7. a
  8. b