Sqoop command to import a MySQL table into Hive

Here is a handy command you can run quickly without wading through Sqoop's long official help. It uses the minimum set of import control arguments for a faster transfer, and it assumes the same table does not already exist in Hive; otherwise the command fails with the error “AlreadyExistsException(message:Table struct_data already exists)”. Sqoop […]

Hive commands at your fingertips

CREATE TABLE Syntax.

CREATE [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name
  (col_name data_type [COMMENT 'col_comment'], …)
  [COMMENT 'table_comment']
  [WITH SERDEPROPERTIES ('key1'='value1', 'key2'='value2', …)]
  [ROW FORMAT row_format]
  [STORED AS file_format]
  [LOCATION 'hdfs_path']
  [TBLPROPERTIES ('key1'='value1', 'key2'='value2', …)]

CREATE TABLE Example.

CREATE EXTERNAL TABLE IF NOT EXISTS XYZ.CUSTOMER (
  Cust_no int COMMENT 'Customer Number ',
  Cust_name string COMMENT 'Customer Name ')
  COMMENT […]

Spark – Word count

Word counting with a Spark program from the Windows 7 command prompt. Create a file named “SPARK_WORD_COUNT” and save it on the C drive. Here we count the occurrences of the word “HADOOP” in the saved file. I added 17 “HADOOP” words to this file, and at the end of the steps the Spark program counts 17 “HADOOP” […]
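The counting step described above can be sketched roughly as follows. This is a minimal illustration in PySpark, not the program from the post itself: the helper names (`split_words`, `is_target`, `count_target_local`, `count_target_spark`) are made up for this example, and it assumes every “HADOOP” in the file is a separate, whitespace-delimited, upper-case word.

```python
TARGET = "HADOOP"  # the word counted in the post


def split_words(line):
    """Split one line of text into whitespace-separated words."""
    return line.split()


def is_target(word, target=TARGET):
    """True when the word matches the target exactly (case-sensitive)."""
    return word == target


def count_target_local(lines, target=TARGET):
    """Plain-Python version of the same pipeline, handy for a quick check."""
    return sum(
        1 for line in lines for word in split_words(line) if is_target(word, target)
    )


def count_target_spark(path, target=TARGET):
    """Same count on Spark; assumes PySpark is installed locally."""
    # Imported lazily so the helpers above also work without Spark.
    from pyspark import SparkContext

    sc = SparkContext("local[*]", "SparkWordCount")
    try:
        return (
            sc.textFile(path)                        # one record per line
            .flatMap(split_words)                    # one record per word
            .filter(lambda w: is_target(w, target))  # keep only the target word
            .count()                                 # number of matches
        )
    finally:
        sc.stop()
```

Calling `count_target_spark` with the path of the saved file (hypothetically something like `"C:/SPARK_WORD_COUNT"`) should give the same number as `count_target_local` run over the file's lines, as long as the assumptions above hold.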

Apache Spark – Tuning Spark jobs: optimal settings for executor, core and memory

Executor, memory and core settings for optimal performance on Spark. Spark has been adopted by tech giants to bring intelligence to their applications. Predictive analysis and machine learning, along with traditional data warehousing, use Spark as the execution engine behind the scenes. I have been exploring Spark since its incubation and I have used Spark core […]

Free Big Data Sandbox from Talend

We can use Big Data platforms in the Talend sandbox for free. Personally, I found this very helpful for understanding the Hadoop ecosystem from different angles. The sandbox is available for a 30-day trial period. Try the link below for the free download and go from zero to big data without coding in under 10 minutes. […]