This repository contains all the documents related to the HDPCD certification. It currently includes the following files:

| File | Last commit message | Date |
| --- | --- | --- |
| 1_Sqoop_Import_Command.txt | Creating Sqoop Import Command File | Feb 14, 2017 |
| 2_example.conf | Renaming file to keep track of the sequence | Feb 24, 2017 |
| 3_pig_demo.txt | creating pig demo text file | Feb 24, 2017 |
| 4_pig_wordcount.pig | Creating pig_wordcount.pig file | Feb 24, 2017 |
| 5_Pig_Schema_Less_Relation.pig | Updating this file | Mar 9, 2017 |
| 6_input.csv | Creating input.csv | Mar 14, 2017 |
| 7_Pig_Relation_With_Schema.pig | Creating Pig Script | Mar 14, 2017 |
| 8_hive_to_pig.pig | Creating hive_to_pig.pig file | Mar 25, 2017 |
| 9_pig_transformation_input.txt | creating input file | Mar 30, 2017 |
| 10_pig_transformation_script.pig | creating pig script | Mar 30, 2017 |
| 11_transformation_for_hive.csv | updating file | Mar 31, 2017 |
| 12_transform_data_for_hive.pig | creating pig file | Mar 31, 2017 |
| 13_group_in_pig.pig | creating pig script for group | Apr 20, 2017 |
| 14_group_input_file.csv | creating input file | Apr 20, 2017 |
| 15_remove_NULL.pig | creating pig script | Apr 21, 2017 |
| 16_NULL_values_input.csv | creating input file | Apr 21, 2017 |
| 17_input_to_HDFS.txt | creating input file | Apr 21, 2017 |
| 18_pig_load_to_HDFS.pig | creating pig script | Apr 21, 2017 |
| 19_input_pig_to_hive.csv | creating input csv file | May 5, 2017 |
| 20_pig_to_hive.pig | creating pig script | May 5, 2017 |
| 21_hive_table_creation.hql | creating hive create table command | May 5, 2017 |
| 22_input_for_sort.csv | creating input file | May 7, 2017 |
| 23_sort_in_pig.pig | creating pig script | May 7, 2017 |
| 24_input_for_removing_duplicates.csv | input CSV file | May 9, 2017 |
| 25_removing_duplicates.pig | creating pig script | May 9, 2017 |
| 26_input_parallel_tasks.csv | creating input CSV file | May 11, 2017 |
| 27_SET_multiple_reducers.pig | creating pig script | May 11, 2017 |
| 28_PARALLEL_multiple_reducers.pig | creating pig script | May 11, 2017 |
| 29_customers_input.csv | creating input CSV file | May 13, 2017 |
| 30_orders_input.csv | creating orders input file | May 13, 2017 |
| 31_join_operation.pig | creating pig script for join operation | May 13, 2017 |
| 32_customers_input.csv | create input customers csv file | May 17, 2017 |
| 33_orders_input.csv | create orders input csv file | May 17, 2017 |
| 34_replicated_join.pig | creating pig script for replicated join | May 17, 2017 |
| 35_input_TEZ_mode.txt | renaming the file to keep sequence | May 18, 2017 |
| 36_pig_script_tez_mode.pig | creating pig script tez mode | May 18, 2017 |
| 37_input_UDF_invoke.csv | creating input csv file | May 18, 2017 |
| 38_UDF_invocation.pig | updating script | May 18, 2017 |
| 39_hive_query.sql | creating hive query file | Jun 10, 2017 |
| 40_hive_managed_table.sql | creating hive managed table creation schema | Jun 10, 2017 |
| 41_input_hive_external_table.csv | creating input csv file | Jun 20, 2017 |
| 42_input_partition_hive_table.csv | creating input csv file | Jun 21, 2017 |
| 43_hive_partitioned_table.sql | creating SQL file | Jun 21, 2017 |
| 44_hive_bucketed_table.sql | creating SQL file | Jun 26, 2017 |
| 45_hive_table_with_ORC.sql | creating SQL file | Jul 3, 2017 |
| 46_sequence_file_hive.sql | creating sql file | Jul 15, 2017 |
| 47_input_delimiter_hive.tsv | input TSV file | Jul 17, 2017 |
| 48_hive_table_tab_delimiter.sql | creating SQL file | Jul 19, 2017 |
| 49_input_to_load_from_local.csv | creating input csv file | Jul 25, 2017 |
| 50_create_hive_table_for_local_load.sql | creating sql file | Jul 25, 2017 |
| 51_input_to_load_from_hdfs.csv | creating input csv file | Jul 26, 2017 |
| 52_create_hive_table_for_hdfs_load.sql | creating SQL file | Jul 26, 2017 |
| 53_create_hive_table_for_SELECT_load.sql | creating SQL file | Aug 6, 2017 |
| 54_input_file_for_compressed_data.csv | creating input CSV file | Aug 7, 2017 |
| 55_hive_table_for_compressed_data.sql | creating SQL file | Aug 7, 2017 |
| 56_first_input_file_for_join.csv | creating input CSV file | Aug 24, 2017 |
| 57_second_input_file_for_join.csv | creating input CSV file | Aug 24, 2017 |
| 58_first_hive_table_for_join.sql | creating hive table | Aug 24, 2017 |
| 59_second_hive_table_for_join.sql | creating hive table | Aug 24, 2017 |
| 60_input_file_for_subquery.csv | creating input CSV file | Sep 10, 2017 |
| 61_hive_create_table_for_subquery.sql | creating Hive table | Sep 10, 2017 |
| 62_input_file_for_ordering_output.csv | creating input CSV file | Sep 11, 2017 |
| 63_hive_create_table_for_order_by.sql | creating SQL file | Sep 13, 2017 |
| 100_customers.csv | renaming to maintain the sequence | May 5, 2017 |
| 200_join.pig | renaming to keep the sequence | May 5, 2017 |
| 200_orders.csv | renaming to keep the sequence | May 5, 2017 |
| README.md | updating README.md | Sep 11, 2017 |
| _config.yml | Set theme jekyll-theme-cayman | Feb 14, 2017 |


# Welcome to the HDPCD Repository

You can use this repository to prepare for the Hortonworks Data Platform Certified Developer (HDPCD) certification. The exam objectives are listed at https://hortonworks.com/services/training/certification/exam-objectives/#hdpcd

The following objectives are tested by this certification:

## DATA INGESTION
- Import data from a table in a relational database into HDFS (see the Sqoop sketch after this list)
- Import the results of a query from a relational database into HDFS
- Import a table from a relational database into a new or existing Hive table
- Insert or update data from HDFS into a table in a relational database
- Given a Flume configuration file, start a Flume agent
- Given a configured sink and source, configure a Flume memory channel with a specified capacity (see the configuration sketch after this list)
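
As a taste of these ingestion objectives, here is a minimal Sqoop import sketch. The connection string, credentials, table name, and HDFS paths are hypothetical placeholders, not files from this repository:

```bash
# Import a single relational table into HDFS
# (database host, user, and paths below are placeholders)
sqoop import \
  --connect jdbc:mysql://dbserver.example.com/retail \
  --username retail_user \
  --password-file /user/hdpcd/.db_password \
  --table customers \
  --target-dir /user/hdpcd/customers \
  --num-mappers 4
```

For the Flume objective, a memory channel with a specified capacity might look like the snippet below, assuming an agent named `a1` whose source and sink are configured elsewhere in the same file:

```properties
# Buffer up to 10,000 events in memory,
# committing at most 1,000 events per transaction
a1.channels = c1
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000
a1.channels.c1.transactionCapacity = 1000
```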

## DATA TRANSFORMATION
- Write and execute a Pig script (a short example follows this list)
- Load data into a Pig relation without a schema
- Load data into a Pig relation with a schema
- Load data from a Hive table into a Pig relation
- Use Pig to transform data into a specified format
- Transform data to match a given Hive schema
- Group the data of one or more Pig relations
- Use Pig to remove records with null values from a relation
- Store the data from a Pig relation into a folder in HDFS
- Store the data from a Pig relation into a Hive table
- Sort the output of a Pig relation
- Remove the duplicate tuples of a Pig relation
- Specify the number of reduce tasks for a Pig MapReduce job
- Join two datasets using Pig
- Perform a replicated join using Pig
- Run a Pig job using Tez
- Within a Pig script, register a JAR file of User Defined Functions
- Within a Pig script, define an alias for a User Defined Function
- Within a Pig script, invoke a User Defined Function
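
To give these objectives some shape, here is a minimal Pig script sketch. The input path, schema, and output path are hypothetical; the numbered `.pig` files in this repository cover each objective individually:

```pig
-- Load CSV data into a relation with a schema
-- (the path and field names are placeholders)
customers = LOAD '/user/hdpcd/input/customers.csv'
            USING PigStorage(',')
            AS (id:int, name:chararray, city:chararray);

-- Remove records with null values
clean = FILTER customers BY city IS NOT NULL;

-- Group the relation and count the records in each group
grouped = GROUP clean BY city;
counts = FOREACH grouped GENERATE group AS city, COUNT(clean) AS total;

-- Sort the output with two reduce tasks and store it in HDFS
sorted = ORDER counts BY total DESC PARALLEL 2;
STORE sorted INTO '/user/hdpcd/output/customer_counts' USING PigStorage(',');
```

Running the same script with `pig -x tez` exercises the Tez objective as well.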

## DATA ANALYSIS
- Write and execute a Hive query (see the sketch after this list)
- Define a Hive-managed table
- Define a Hive external table
- Define a partitioned Hive table
- Define a bucketed Hive table
- Define a Hive table from a select query
- Define a Hive table that uses the ORCFile format
- Create a new ORCFile table from the data in an existing non-ORCFile Hive table
- Specify the storage format of a Hive table
- Specify the delimiter of a Hive table
- Load data into a Hive table from a local directory
- Load data into a Hive table from an HDFS directory
- Load data into a Hive table as the result of a query
- Load a compressed data file into a Hive table
- Update a row in a Hive table
- Delete a row from a Hive table
- Insert a new row into a Hive table
- Join two Hive tables
- Run a Hive query using Tez
- Run a Hive query using vectorization
- Output the execution plan for a Hive query
- Use a subquery within a Hive query
- Output data from a Hive query that is totally ordered across multiple reducers
- Set a Hadoop or Hive configuration property from within a Hive query
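
As a combined illustration, the following HiveQL sketch touches a few of these objectives at once. It assumes a hypothetical existing `customers` table; the numbered `.sql` files in this repository treat each objective separately:

```sql
-- Set a Hive configuration property from within the session
SET hive.execution.engine=tez;

-- Define an ORCFile table from the data in an existing non-ORCFile table
CREATE TABLE customers_orc STORED AS ORC
AS SELECT * FROM customers;

-- Output the execution plan for a query
EXPLAIN SELECT COUNT(*) FROM customers_orc;

-- ORDER BY produces output that is totally ordered across reducers
SELECT city, COUNT(*) AS total
FROM customers_orc
GROUP BY city
ORDER BY total DESC;
```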

I hope you find this repository helpful. You can visit my LinkedIn profile at https://www.linkedin.com/in/milindjagre/