
CCA 175 Preparation with these 6 practice questions - try to solve them in 1 hour

Since there are very few practice questions available on the internet for the CCA 175 - Hadoop & Spark Developer exam, I have set a 6-question exam, with the solutions provided in the comment section. If you can complete it in less than 1 hour, then and only then consider applying for the CCA 175 exam; otherwise you need more practice.

Question prerequisites:
- Import data from the orders table in Parquet file format and save it to the HDFS path /user/rj_example/parquet_data/orders
- Import data from the customers table and save it to the HDFS path /user/rj_example/data/customers
- Import data from the customers table in Avro file format and save it to the HDFS path /user/rj_example/avro_data/customers
- Import data from the categories table and save it to the HDFS path /user/rj_example/data/categories
- Import data from the products table and save it to the HDFS path /user/rj_example/data/products with '\t' as the field separator
- Create a local directory 'rj_example'
- Copy the data from /user/rj_example/data/products to the local directory
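A minimal sketch of the prerequisite imports is below. The host, port, database name, and credentials (jdbc:mysql://quickstart:3306/retail_db, retail_user/secret) are placeholders of my own; substitute your environment's connection details.

# Placeholder connection details throughout; replace with your own.
> sqoop import --connect jdbc:mysql://quickstart:3306/retail_db \
    --username retail_user --password secret \
    --table orders --as-parquetfile \
    --target-dir /user/rj_example/parquet_data/orders

> sqoop import --connect jdbc:mysql://quickstart:3306/retail_db \
    --username retail_user --password secret \
    --table customers \
    --target-dir /user/rj_example/data/customers

> sqoop import --connect jdbc:mysql://quickstart:3306/retail_db \
    --username retail_user --password secret \
    --table customers --as-avrodatafile \
    --target-dir /user/rj_example/avro_data/customers

> sqoop import --connect jdbc:mysql://quickstart:3306/retail_db \
    --username retail_user --password secret \
    --table categories \
    --target-dir /user/rj_example/data/categories

> sqoop import --connect jdbc:mysql://quickstart:3306/retail_db \
    --username retail_user --password secret \
    --table products --fields-terminated-by '\t' \
    --target-dir /user/rj_example/data/products

# Create the local directory and pull the products data down from HDFS
> mkdir rj_example
> hadoop fs -get /user/rj_example/data/products rj_example/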

Sqoop

User Guide: https://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html

To use Sqoop with MySQL, the MySQL connector jar must be present in the Sqoop binary location.
Sqoop binary location: /usr/hdp/current/sqoop

Sqoop commands:

# List databases
> sqoop list-databases --connect jdbc:mysql://<mysql-URL>:<port> --username <USERNAME> --password <PASSWORD>

# List tables from a database
> sqoop list-tables --connect jdbc:mysql://<mysql-URL>:<port>/<DBName> --username <USERNAME> --password <PASSWORD>

# Evaluate a SQL query
> sqoop eval --connect jdbc:mysql://<mysql-URL>:<port>/<DBName> --username <USERNAME> --password <PASSWORD> --query "<SQL QUERY>"
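For example, against a hypothetical retail_db database on localhost (all connection details below are placeholders):

# List all databases visible to this user
> sqoop list-databases --connect jdbc:mysql://localhost:3306 \
    --username retail_user --password secret

# List the tables in retail_db
> sqoop list-tables --connect jdbc:mysql://localhost:3306/retail_db \
    --username retail_user --password secret

# Run a quick row count on the database side, without launching a MapReduce job
> sqoop eval --connect jdbc:mysql://localhost:3306/retail_db \
    --username retail_user --password secret \
    --query "SELECT COUNT(*) FROM orders"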

Sqoop - Importing MySql data to/from Hive

Importing a MySql database to Hive

> hive              To enter the Hive shell
--hive-import       To enable Hive import in the sqoop import command
--hive-table        To specify the table into which the import will be done
--hive-database     To specify the Hive database. You can also use --hive-table to specify the database by passing <DBName>.<TableName>

The following command is common to every Hive import:

> sqoop-import --connect jdbc:mysql://<mysql-URL>:<port>/<DBName> \
    --username <USERNAME> --password <PASSWORD> \
    --table <table-name> \
    --hive-import \
    --hive-table <table-name> \
    --hive-database <db-name>

Internally, hive-import runs in the following steps:
1. Import the table into HDFS.
2. Copy the imported table into Hive.
3. Delete the directory from HDFS.

Note: If this operation fails partway through, the directory in HDFS will not be deleted.
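As a concrete sketch, here is a Hive import of a hypothetical customers table from retail_db into a Hive database named rj_example (the host, credentials, and database names are all placeholders):

> sqoop-import --connect jdbc:mysql://localhost:3306/retail_db \
    --username retail_user --password secret \
    --table customers \
    --hive-import \
    --hive-database rj_example \
    --hive-table customers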

Sqoop - Importing MySql data to/from HDFS

# Import a database table to HDFS

1. warehouse-dir
A warehouse-dir import creates a directory named after the table under the specified path and puts the data into it.

> sqoop-import --connect jdbc:mysql://<mysql-URL>:<port>/<DBName> \
    --username <USERNAME> --password <PASSWORD> \
    --table <table_name> --warehouse-dir <HDFS dir path>

2. target-dir
A target-dir import writes the data directly to the specified path.

> sqoop-import --connect jdbc:mysql://<mysql-URL>:<port>/<DBName> \
    --username <USERNAME> --password <PASSWORD> \
    --table <table_name> --target-dir <HDFS dir path>

Make sure that the target directory does not already exist. If it exists, it can be removed with

> hadoop fs -rm -R <dir-name>

or by appending --delete-target-dir to the sqoop import command.

NOTE: If your DB table doesn't have a primary key, Sqoop cannot split the import across mappers on its own; specify a split column with --split-by <column> or run a single mapper with -m 1.
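To see the difference between the two options, consider a hypothetical orders table (connection details are placeholders as before): with --warehouse-dir the files land in a table-named subdirectory under the given path, while with --target-dir they land in the given path itself.

# warehouse-dir: files end up under /user/rj_example/warehouse/orders/
> sqoop-import --connect jdbc:mysql://localhost:3306/retail_db \
    --username retail_user --password secret \
    --table orders --warehouse-dir /user/rj_example/warehouse

# target-dir: files end up directly under /user/rj_example/data/orders/,
# with --delete-target-dir clearing the directory if it already exists
> sqoop-import --connect jdbc:mysql://localhost:3306/retail_db \
    --username retail_user --password secret \
    --table orders --target-dir /user/rj_example/data/orders \
    --delete-target-dir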