Big Data Developer

Creating UDF in PIG Hadoop

In this post, “Creating UDF in PIG Hadoop”, I am going to show you how to create your own UDF (User Defined Function) and integrate it with Pig.
1) Create a Java class, IsOfAge.java
2) Export the JAR to the machine where Pig is running
3) Register the JAR in Pig and use it in Pig statements[…]
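The excerpt does not show the body of IsOfAge, so what follows is only a rough sketch of the kind of check such a filter UDF might perform, written in plain Java so it runs without the Pig library. In a real UDF this logic would typically sit inside the exec(Tuple) method of a class extending Pig's FilterFunc. The class name IsOfAgeSketch, the helper isOfAge and the threshold of 18 are assumptions made for illustration and do not come from the original post.

```java
// Plain-Java sketch of the check an IsOfAge filter UDF might perform.
// Assumption: "of age" means 18 or older; the excerpt does not define it.
class IsOfAgeSketch {
    static final int ADULT_AGE = 18; // assumed threshold, not from the post

    // Stands in for what a FilterFunc's exec(Tuple) could return for an age field:
    // false for a missing value, otherwise whether the age meets the threshold.
    static boolean isOfAge(Integer age) {
        if (age == null) {
            return false;
        }
        return age >= ADULT_AGE;
    }

    public static void main(String[] args) {
        // Hypothetical sample values, including a missing one.
        Integer[] ages = {15, 18, 42, null};
        for (Integer age : ages) {
            System.out.println(age + " -> " + isOfAge(age));
        }
    }
}
```

Once such a class is wrapped as a Pig UDF and packaged into a JAR, it would be loaded with a REGISTER statement and called from a FILTER ... BY expression, matching steps 2 and 3 above.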

Creating user in Vertica

You can create a user and assign a password in Vertica as follows:

dbadmin=> create user jim identified by 'pluto';
CREATE USER
dbadmin=> \q
bash-3.2$ vsql -U jim -w pluto
SET
Welcome to vsql, the Vertica Analytic Database v5.0.11-0 interactive terminal.
Type:  \h for help with SQL commands
       \? for help[…]

Load file using Copy Command Vertica

Here is how you can load a file into Vertica using the COPY command.

1) From STDIN
cat /tmp/test.csv | vsql -c "copy customer from stdin direct delimiter ','"

2) From a source file
COPY public.Users (USERID, USERNAME, USERGENDER) FROM '/home/notroot/lab/data/USER.csv' SKIP 1 NULL AS 'null' ENCLOSED BY U&'\0027' DELIMITER '|' REJECTED DATA '/home/notroot/lab/data/reject.csv' EXCEPTIONS[…]

Sequencefile processing mapreduce hadoop

When we have a large number of small files, for example millions of small XML files, we can process them with Hadoop MapReduce using SequenceFileInputFormat. That is what I am going to show you in this post, “Sequencefile processing mapreduce hadoop”.
1) Create the Driver class
2) Create the Mapper class
Using this code you can process sequence files.