Big Data Developer

Creating UDF in PIG Hadoop

In this post, "Creating UDF in PIG Hadoop", I am going to show you how to create your own UDF (User Defined Function) and integrate it with Pig.
1) Create a Java class, IsOfAge.java
2) Export the JAR to the machine where Pig is running
3) Register the JAR in Pig and use it in Pig statements[…]
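As a rough illustration of step 1, here is a minimal plain-Java sketch of the kind of check an IsOfAge filter could perform. The excerpt does not show the actual class, so everything here is an assumption: the class name IsOfAgeSketch, the threshold of 18, and the use of a java.util.List to stand in for the input tuple. In a real Pig UDF this logic would normally live in a class extending org.apache.pig.FilterFunc and overriding exec(Tuple), which needs the Pig JAR on the classpath; the sketch leaves that wrapper out so it compiles on its own.

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java sketch of the check an IsOfAge filter UDF might perform.
// Assumptions (not from the original post): the age is the first field,
// and "of age" means 18 or older.
class IsOfAgeSketch {
    static final int ADULT_AGE = 18; // assumed threshold

    // Returns true only when the first field is a number >= ADULT_AGE.
    // Null, empty, or non-numeric input is treated as "not of age",
    // which is how a filter would typically drop bad rows.
    static boolean isOfAge(List<?> fields) {
        if (fields == null || fields.isEmpty()) {
            return false;
        }
        Object first = fields.get(0);
        if (!(first instanceof Number)) {
            return false;
        }
        return ((Number) first).intValue() >= ADULT_AGE;
    }

    public static void main(String[] args) {
        System.out.println(isOfAge(Arrays.asList(21)));
        System.out.println(isOfAge(Arrays.asList(12)));
    }
}
```

Once such a class is wrapped as a Pig filter function and packaged into a JAR, steps 2 and 3 would amount to a REGISTER statement for that JAR followed by a FILTER ... BY statement that calls the function.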

Creating user in Vertica

You can create a user and assign a password as shown in "Creating user in Vertica":
    dbadmin=> create user jim identified by 'pluto';
    CREATE USER
    dbadmin=> \q
    bash-3.2$ vsql -U jim -w pluto
    SET
    Welcome to vsql, the Vertica Analytic Database v5.0.11-0 interactive terminal.
    Type:  \h for help with SQL commands
           \? for help[…]

Load file using Copy Command Vertica

Here is how you can load a file into Vertica using the COPY command.
1) From STDIN:
    cat /tmp/test.csv | vsql -c "copy customer from stdin direct delimiter ','"
2) From a source file:
    COPY public.Users (USERID, USERNAME, USERGENDER) FROM '/home/notroot/lab/data/USER.csv' SKIP 1 NULL AS 'null' ENCLOSED BY U&'\0027' DELIMITER '|' REJECTED DATA '/home/notroot/lab/data/reject.csv' EXCEPTIONS[…]

Sequencefile processing mapreduce hadoop

When you have a large number of small files, for example millions of small XML files, you can process them with Hadoop MapReduce by using SequenceFileInputFormat. This post, "Sequencefile processing mapreduce hadoop", shows how:
1) Create the Driver class
2) Create the Mapper class
Using this code you can process sequence files.