Sqoop command to import a MySQL table into Hive

Here is a handy command you can run quickly without wading through Sqoop's long official help. It needs only the minimum import-control arguments for a fast import, and it assumes the same table does not already exist in Hive; otherwise the command fails with the error "AlreadyExistsException(message:Table struct_data already exists)". Sqoop […]
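The post's exact arguments are behind the link; as a minimal sketch, a MySQL-to-Hive import with Sqoop typically looks like the following. The host, database, and username below are illustrative placeholders, not values from the post (only the table name struct_data appears in the error message above):

```shell
# Minimal sketch of a MySQL-to-Hive import with Sqoop.
# Host, database, and username are illustrative placeholders.
sqoop import \
  --connect jdbc:mysql://dbhost:3306/salesdb \
  --username sqoop_user -P \
  --table struct_data \
  --hive-import \
  --hive-table struct_data \
  -m 1    # single mapper, so no --split-by column is needed
```

If struct_data already exists in Hive, this is the point where the AlreadyExistsException described above is thrown.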

Read More…

Hive commands at your fingertips

CREATE TABLE syntax:

CREATE [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name
  (col_name data_type [COMMENT 'col_comment'], …)
  [COMMENT 'table_comment']
  [WITH SERDEPROPERTIES ('key1'='value1', 'key2'='value2', …)]
  [ROW FORMAT row_format]
  [STORED AS file_format]
  [LOCATION 'hdfs_path']
  [TBLPROPERTIES ('key1'='value1', 'key2'='value2', …)]

CREATE TABLE example:

CREATE EXTERNAL TABLE IF NOT EXISTS XYZ.CUSTOMER (Cust_no int COMMENT 'Customer Number', Cust_name string COMMENT 'Customer Name') COMMENT […]
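The example above is cut off mid-statement; as a sketch of how a complete statement of this shape can be run from the command line with the hive CLI, assuming an XYZ database exists (the comment, row format, and HDFS path below are illustrative, not the post's actual values):

```shell
# Sketch: run a CREATE EXTERNAL TABLE from the shell with the hive CLI.
# The table comment, delimiter, and LOCATION path are illustrative.
hive -e "
CREATE EXTERNAL TABLE IF NOT EXISTS XYZ.CUSTOMER (
  Cust_no   int    COMMENT 'Customer Number',
  Cust_name string COMMENT 'Customer Name')
COMMENT 'Customer master data'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/user/hive/warehouse/customer';
"
```

Because the table is EXTERNAL, dropping it later removes only the metadata, not the files under the LOCATION path.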

Read More…

Spark – Word count

Word counting with a Spark program from the Windows 7 command prompt. Create a file named "SPARK_WORD_COUNT" and save it on the C: drive. Here we count occurrences of the word "HADOOP" in the saved file. I added 17 "HADOOP" words to this file, and at the end of the steps the Spark program counts 17 "HADOOP" […]
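The full steps are behind the link; the core logic is to split the file into words, keep the ones equal to "HADOOP", and count them. In spark-shell it would look roughly like the commented lines below (the split pattern is an assumption); the same filter-and-count logic is cross-checked here on a small sample file with standard tools:

```shell
# In spark-shell (launched from the command prompt), the count is roughly:
#   val words = sc.textFile("C:/SPARK_WORD_COUNT").flatMap(_.split("\\s+"))
#   words.filter(_ == "HADOOP").count()
#
# The same logic, cross-checked with grep/wc on a small sample file:
printf 'HADOOP spark HADOOP hive\nHADOOP\n' > /tmp/spark_word_count_sample.txt
grep -o 'HADOOP' /tmp/spark_word_count_sample.txt | wc -l    # prints 3
```

On the actual file from the post, which contains 17 occurrences, the Spark count returns 17.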

Read More…

Easy Analysis of Big Data with BigSheets

BigSheets: a spreadsheet-like interface to HDFS files. BigSheets is a browser-based tool, included in the BigInsights Data Scientist and Data Analyst packages, for analyzing and visualizing big data. BigSheets uses a spreadsheet-like interface that can model, filter, combine, and chart data collected from multiple sources, such as an application working on big data […]

Read More…

Apache Spark – tuning Spark jobs: optimal settings for executors, cores and memory

Executor, memory and core settings for optimal performance on Spark. Spark has been adopted by tech giants to bring intelligence to their applications. Predictive analytics and machine learning, along with traditional data warehousing, use Spark as the execution engine behind the scenes. I have been exploring Spark since incubation, and I have used Spark Core […]
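The post's recommended numbers are behind the link; as a generic sketch, the three knobs it discusses are set as spark-submit flags like this. The values and the jar name below are illustrative, not the post's recommendation (a commonly cited rule of thumb is around 5 cores per executor to keep HDFS throughput healthy):

```shell
# Sketch: executor count, cores, and memory as spark-submit flags.
# All values and my_spark_job.jar are illustrative placeholders.
spark-submit \
  --master yarn \
  --num-executors 10 \
  --executor-cores 5 \
  --executor-memory 8g \
  --driver-memory 4g \
  my_spark_job.jar
```

The right numbers depend on the cluster: total cores and memory per node, minus overhead for YARN and the OS, divided across executors.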

Read More…