Friday, January 1, 2016

HIVE Tutorial

There are two ways to load data into Hive tables: from a local path and from an HDFS path.
In this tutorial we load data into a Hive managed table.

This is my sample data, stored in a file named 'manageTableSampleData.txt':

10101|Sataya|Hyderabad|23232333
10102|ankur|Chennai  |3452454555
10102|Balaji|Tirupathi|2323121221
10103|chennapa|Chennai  |6767667677
10104|Danunjai|Cochin   |7676666666
10105|Estella|Adilabad |2323333333
10106|Feranandez|Bangaloe |7878777777
10107|Govind|Surat    |2323232323
10108|Hari|Bangalore|2323333333
10109|Kumar|Rajamundry|9029292992
10110|Narendra|Vijayawada|3343233333
10111|Narayana|Vizag     |9029292999
10112|Paramesh|Cochin    |8928282888
10113|Pavan|Kabul     |0202020200
10114|Nag|Kerala    |9292929299
10115|Adithya|Hyderabad|9290202002
10116|vedic|Bombay|8978399338
10117|Vinay|Hyderabad|9393919191
10118|Vamsi|Hyderabad|9292922222
10119|Santosh|Trichy   |9282828288
10120|Sahith|Delhi    |9303030202
10121|Sankar|Bombay   |8938339933
10122|Ramesh|Chennai  |4949494999
10123|Harish|Bangalore|9003399339

First way: load the data into the Hive managed table from a local path
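
The original command is not preserved here, so the following is only a minimal sketch of a local load in HiveQL. The table name `managedtable1`, the column names and types, and the directory `/home/hduser/` are assumptions made for illustration; only the `|` delimiter and the file name come from the sample data above.

```sql
-- Hypothetical managed table matching the four pipe-delimited fields of the sample file.
CREATE TABLE IF NOT EXISTS managedtable1 (
  id     INT,
  name   STRING,
  city   STRING,
  mobile BIGINT   -- values such as 3452454555 do not fit in INT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
STORED AS TEXTFILE;

-- LOCAL means the path is on the local file system; the file is copied into the table directory.
LOAD DATA LOCAL INPATH '/home/hduser/manageTableSampleData.txt'
INTO TABLE managedtable1;

-- Quick check of the loaded rows.
SELECT * FROM managedtable1 LIMIT 5;
```

Because the table is created without the EXTERNAL keyword, Hive treats it as a managed table and stores its files under the warehouse directory.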




Second way: load the data into the Hive managed table from an HDFS path

Two steps are required for this approach.
Step 1: First copy the file from the LFS (Local File System) into HDFS.
Command :
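
One way to do the copy without leaving the Hive CLI is its `dfs` command, which hands its arguments to the Hadoop file system shell (from an OS shell the equivalent would be `hadoop fs -put`). This is a sketch only; both paths below are assumed for illustration.

```sql
-- Run from the Hive CLI; the local and HDFS paths are hypothetical.
dfs -mkdir -p /user/hduser/input;
dfs -put /home/hduser/manageTableSampleData.txt /user/hduser/input/;

-- Confirm that the file now exists in HDFS.
dfs -ls /user/hduser/input;
```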


Step 2: Load the data from the HDFS input path into the Hive managed table.
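
A minimal sketch of this step follows. It assumes a managed table named `managedtable1` with a matching `|`-delimited layout already exists, and that the sample file was copied to the hypothetical HDFS directory `/user/hduser/input`.

```sql
-- No LOCAL keyword: the path is resolved in HDFS.
LOAD DATA INPATH '/user/hduser/input/manageTableSampleData.txt'
INTO TABLE managedtable1;

-- The source file has been moved into the table directory,
-- so it should no longer be listed at its original HDFS location.
dfs -ls /user/hduser/input;
```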








Note:
Differences between loading data from a local input path and from an HDFS path:

  1. The keyword LOCAL is not used in the command when loading data from an HDFS path.
  2. An input file loaded from a local path still exists at its source after it is loaded into the Hive managed table, whereas an input file loaded from an HDFS path is moved into the table's directory and no longer exists at its original location.

In simple words:

Loading from an HDFS path is a CUT-PASTE (move) operation,
whereas loading from a local path is a COPY-PASTE operation.

OVERWRITE keyword

The OVERWRITE keyword in the LOAD DATA statement tells Hive to delete any existing files in
the directory for the table before loading the new data.

Files in the warehouse directory before running the LOAD DATA statement with OVERWRITE.

Command to overwrite the table data.
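
As a sketch, an overwriting load could look like the statement below. The table name `managedtable1` and the local path are assumptions for illustration, not the names used in the original post.

```sql
-- OVERWRITE replaces whatever files are currently in the table directory.
-- Without OVERWRITE, the new file would be added alongside the existing ones.
LOAD DATA LOCAL INPATH '/home/hduser/manageTableSampleData.txt'
OVERWRITE INTO TABLE managedtable1;
```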


Files in Warehouse Directory after overwrite in load statement.
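
The table directory can be listed from the Hive CLI before and after the load to compare its contents. The sketch below assumes a hypothetical table `managedtable1` in the default database and the default warehouse location `/user/hive/warehouse`; your installation may be configured differently.

```sql
-- The "Location" row in this output shows where the table's files actually live.
DESCRIBE FORMATTED managedtable1;

-- List the files in the table directory (path assumed from the default configuration).
dfs -ls /user/hive/warehouse/managedtable1;
```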


How to view the log files in HIVE?

By default, the log file is located at /tmp/${USER}/hive.log, where ${USER} is the user running Hive.
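
A rough sketch of inspecting the log from inside the Hive CLI is shown below. It assumes the default logging configuration, in which the log directory is built from the Java temporary directory and the user name; `hduser` is a hypothetical user name that you would replace with your own.

```sql
-- Print the two system properties that the default log location is derived from.
set system:java.io.tmpdir;
set system:user.name;

-- "!" runs an OS command from the Hive CLI; the path assumes the user "hduser".
!tail -n 20 /tmp/hduser/hive.log;
```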


How to alter the table column data type in HIVE?

Below, the table managedtable4 has the mobile column defined with data type INT. I would like to change it from INT to BIGINT.









Command to change the column data type.
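
A minimal sketch of the change, assuming the column is named `mobile` as the description above suggests:

```sql
-- CHANGE takes the old column name, the new column name, and the new type.
-- Repeating the same name keeps the column name and changes only its type.
ALTER TABLE managedtable4 CHANGE mobile mobile BIGINT;

-- Verify the new column type.
DESCRIBE managedtable4;
```

This statement changes only the table metadata; the data files themselves are not rewritten.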














CREATE TABLE AS SELECT (CTAS)

A table can be created by selecting all fields, some fields, or only the rows that match a condition from another table.

Note: Only managed tables can be created with this feature.

Below is an example of creating the table ctas1 from managedtable5.
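
The sketch below shows the three variants. Only `ctas1` and `managedtable5` come from the text; the table names `ctas2` and `ctas3` and the column names `id`, `name`, and `city` are hypothetical, since the schema of managedtable5 is not shown here.

```sql
-- All fields.
CREATE TABLE ctas1 AS
SELECT * FROM managedtable5;

-- Some fields only (column names are assumed).
CREATE TABLE ctas2 AS
SELECT id, name FROM managedtable5;

-- Rows matching a condition (column name and value are assumed).
CREATE TABLE ctas3 AS
SELECT * FROM managedtable5
WHERE city = 'Hyderabad';
```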


