Creating a UDF in Pig (Hadoop)

In this post I will show you how to create your own UDF (user-defined function) in Java and integrate it with Pig.

1) Create a Java class, IsOfAge.java


package functions.utils;

import java.io.IOException;

import org.apache.pig.FilterFunc;
import org.apache.pig.backend.executionengine.ExecException;
import org.apache.pig.data.Tuple;

// Filter UDF: returns true only when the first field is one of the accepted ages.
public class IsOfAge extends FilterFunc {
    @Override
    public Boolean exec(Tuple tuple) throws IOException {
        // An empty or missing tuple never passes the filter.
        if (tuple == null || tuple.size() == 0) {
            return false;
        }

        try {
            // The first field is expected to hold the age as an Integer.
            Object object = tuple.get(0);
            if (object == null) {
                return false;
            }
            int i = (Integer) object;
            return i == 18 || i == 19 || i == 21 || i == 23 || i == 27;
        } catch (ExecException e) {
            throw new IOException(e);
        }
    }
}
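
The chain of equality checks in exec can also be written as a set lookup, which is easier to extend when the list of ages changes. The sketch below shows the idea in plain Java with no Pig dependency, so it can be run on its own. The class name AgeCheck and the method isAccepted are hypothetical names chosen for illustration; they are not part of the Pig API.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper (not part of Pig): mirrors the age check in IsOfAge.exec.
final class AgeCheck {
    // The same ages that the UDF accepts.
    private static final Set<Integer> ACCEPTED_AGES =
            new HashSet<>(Arrays.asList(18, 19, 21, 23, 27));

    // Returns false for null, as the UDF does for a null field.
    static boolean isAccepted(Integer age) {
        return age != null && ACCEPTED_AGES.contains(age);
    }

    public static void main(String[] args) {
        System.out.println(isAccepted(21));   // true
        System.out.println(isAccepted(20));   // false
        System.out.println(isAccepted(null)); // false
    }
}
```

Inside the UDF, the same lookup could replace the long condition once the field has been read from the tuple.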

2) Export the class as a JAR and copy it to the machine where Pig is running

3) Register the JAR in Pig and use the UDF in Pig statements

REGISTER /home/notroot/lab/programs/HadoopTraining.jar;
student = LOAD 'input/student' USING PigStorage(',') AS (name:chararray,age:int,mark:double);
b = FILTER student BY functions.utils.IsOfAge($1);
DUMP b;
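
To make the effect of the FILTER statement concrete, here is a rough plain-Java sketch of what the script does to each input line: split on commas, read the second field as the age, and keep the row only if the check passes. This is not Pig code. The class FilterSketch, its methods, and the sample rows are all hypothetical, and treating an unparseable age as null is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the Pig script above; for illustration only.
final class FilterSketch {
    // Same check as IsOfAge.exec; a null age is rejected.
    static boolean isOfAge(Integer age) {
        return age != null
                && (age == 18 || age == 19 || age == 21 || age == 23 || age == 27);
    }

    // Reads the second comma-separated field (position $1) as an int.
    // Assumption of this sketch: a missing or unparseable age becomes null.
    static Integer parseAge(String line) {
        String[] fields = line.split(",");
        if (fields.length < 2) {
            return null;
        }
        try {
            return Integer.valueOf(fields[1]);
        } catch (NumberFormatException e) {
            return null;
        }
    }

    // Keeps only the lines whose age passes the check, like FILTER ... BY.
    static List<String> filter(List<String> lines) {
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            if (isOfAge(parseAge(line))) {
                kept.add(line);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up sample rows in the (name,age,mark) layout.
        List<String> rows = Arrays.asList(
                "alice,21,78.5", "bob,20,66.0", "carol,abc,90.0", "dave,27,55.5");
        System.out.println(filter(rows)); // [alice,21,78.5, dave,27,55.5]
    }
}
```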

Thanks, and have a great day!
