Tips from learning: January 2016

Saturday, January 16, 2016

Find Common Manager

Question:
The expected output (o/p) for the question below is FRED.

The code is attached here.

BinaryNode.java

public class BinaryNode {

    private String value;
    private BinaryNode left;
    private BinaryNode right;

    BinaryNode(String value){
        this.value = value;
        this.left = null;
        this.right = null;
    }

    public String getValue() {
        return value;
    }

    public void setValue(String value) {
        this.value = value;
    }

    public BinaryNode getLeft() {
        return left;
    }

    public void setLeft(BinaryNode left) {
        this.left = left;
    }

    public BinaryNode getRight() {
        return right;
    }

    public void setRight(BinaryNode right) {
        this.right = right;
    }

}

ManagerRelation.java

import java.util.*;


public class ManagerRelation {

    private static Set<BinaryNode> treeNodes = new HashSet<BinaryNode>();

    private static BinaryNode root;

    public static void main(String[] args) {
        try {
            ManagerRelation mr = new ManagerRelation();
            Scanner in = new Scanner(System.in);
            int _count;
            _count = Integer.parseInt(in.nextLine());
            mr.readInput(_count, in);

        } catch (NumberFormatException ne) {
            ne.printStackTrace();
        } catch (Exception e) {
            e.printStackTrace();
        }

    }


    private BinaryNode commonManager(BinaryNode root, String s1, String s2) {
        if (root == null) {
            return null;
        }
        if (root.getValue().equalsIgnoreCase(s1) || root.getValue().equalsIgnoreCase(s2)) {
            return root;
        }

        BinaryNode left = commonManager(root.getLeft(), s1, s2);
        BinaryNode right = commonManager(root.getRight(), s1, s2);
        if (left != null && right != null) {
            return root;
        }
        if (left == null) {
            return right;
        } else {
            return left;
        }
    }

    private boolean isNodeExists(String value) {
        Iterator<BinaryNode> treeNodesIterator = treeNodes.iterator();

        while (treeNodesIterator.hasNext()) {
            BinaryNode n1 = treeNodesIterator.next();
            if (value.equalsIgnoreCase(n1.getValue())) {
                return true;
            }
        }
        return false;
    }

    private BinaryNode getNode(String value) {
        for (BinaryNode node : treeNodes) {
            if (value.equalsIgnoreCase(node.getValue())) {
                return node;
            }
        }
        // Not found: return null rather than whichever node was visited last.
        return null;
    }

    public void readInput(int count, Scanner in) {
        // Read the first names of the two employees whose common manager we need.
        String employee1 = in.next();
        String employee2 = in.next();

        // Set of unique employee names; its size should not exceed count,
        // the declared number of unique employees.
        Set<String> uniqueEmployees = new HashSet<String>(count);
        uniqueEmployees.add(employee1);
        uniqueEmployees.add(employee2);

        // Each relation is a pair of whitespace-separated names: manager, then report.
        // Stop at end of input; waiting for the name count to overflow made
        // in.next() throw NoSuchElementException on well-formed input.
        while (in.hasNext()) {
            String manager = in.next();
            if (!in.hasNext()) {
                break; // dangling name without a partner
            }
            String report = in.next();
            uniqueEmployees.add(manager);
            uniqueEmployees.add(report);
            if (uniqueEmployees.size() > count) {
                break; // more names than declared: ignore the rest
            }
            constructRelationTree(manager, report);
        }

        BinaryNode node = commonManager(root, employee1, employee2);
        if (node == null) {
            System.out.println("No common manager found");
        } else {
            System.out.println("Found " + node.getValue());
        }
    }

    private void printRelationTree(BinaryNode root) {
        if (root == null) {
            return;
        }
        System.out.print(root.getValue());
        printRelationTree(root.getLeft());
        printRelationTree(root.getRight());
    }

    // Attaches s1 as a direct report of s. Assumes relations arrive top-down
    // (the first manager read becomes the root) and that each manager has at
    // most two reports; a third report would overwrite the right child.
    private BinaryNode constructRelationTree(String s, String s1) {
        BinaryNode node;
        if (isNodeExists(s)) {
            node = getNode(s);
        } else {
            node = new BinaryNode(s);
            treeNodes.add(node);
            if (root == null) {
                root = node;
            }
        }
        BinaryNode report = new BinaryNode(s1);
        treeNodes.add(report);
        if (node.getLeft() == null) {
            node.setLeft(report);
        } else {
            node.setRight(report);
        }
        return node;
    }


}
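
The recursion in commonManager can be tried in isolation. What follows is a minimal, self-contained sketch rather than the full program above: the Node class, the buildSample helper and the employee names are hypothetical and chosen only for illustration, and the hierarchy is hard-coded instead of being read from standard input.

```java
public class CommonManagerSketch {

    // Hypothetical minimal node: one name and at most two direct reports.
    static final class Node {
        final String name;
        Node left;
        Node right;

        Node(String name) {
            this.name = name;
        }
    }

    // Returns the lowest node whose subtree contains both names.
    // If one employee manages the other, that manager's own node is returned.
    // Assumes both names are present in the tree.
    static Node commonManager(Node root, String a, String b) {
        if (root == null) {
            return null;
        }
        if (root.name.equalsIgnoreCase(a) || root.name.equalsIgnoreCase(b)) {
            return root;
        }
        Node left = commonManager(root.left, a, b);
        Node right = commonManager(root.right, a, b);
        if (left != null && right != null) {
            return root; // one name was found on each side
        }
        return left != null ? left : right;
    }

    // Hard-coded sample hierarchy (made-up names):
    //         Ann
    //        /   \
    //      Bob   Cid
    //     /   \
    //   Dan   Eve
    static Node buildSample() {
        Node ann = new Node("Ann");
        ann.left = new Node("Bob");
        ann.right = new Node("Cid");
        ann.left.left = new Node("Dan");
        ann.left.right = new Node("Eve");
        return ann;
    }

    public static void main(String[] args) {
        Node root = buildSample();
        System.out.println(commonManager(root, "Dan", "Eve").name); // Bob
        System.out.println(commonManager(root, "Dan", "Cid").name); // Ann
    }
}
```

Each node is visited at most once, so the search is linear in the number of employees.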

Wednesday, January 13, 2016

1) How do you write the output of a Pig script to a JSON file?

2) Which data ingestion tool did you use? How do you increase the limit?
How do parallel tasks run during data ingestion, and what exactly do they do?

Tuesday, January 12, 2016

Print only the numbers from a given String

Question:
Your function should take a string and print only the numbers contained in the string.
i/p : String inputString = "Sekhar124hekkkoer9239939siiipere23+++33!!!!!!";
o/p: [124, 9239939, 23, 33]

Code:

import java.util.ArrayList;
import java.util.List;

public class OnlyIntegers {

    public static void main(String[] args) {

        String inputString = "Sekhar124hekkkoer9239939siiipere23+++33!!!!!!";

        List<String> numbers = new ArrayList<String>();

        // Collects the digits of the number currently being read.
        StringBuilder current = new StringBuilder();

        for (char c : inputString.toCharArray()) {
            if (c >= '0' && c <= '9') {
                current.append(c);
            } else if (current.length() > 0) {
                // A non-digit ends the current run of digits.
                numbers.add(current.toString());
                current.setLength(0);
            }
        }
        // Flush a number that runs to the end of the string.
        if (current.length() > 0) {
            numbers.add(current.toString());
        }
        System.out.println(numbers);
    }
}
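
The same result can also be obtained with a regular expression. This is an alternative sketch, not the approach used above; the class name OnlyIntegersRegex and the helper extractNumbers are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OnlyIntegersRegex {

    // Matches one or more consecutive ASCII digits.
    private static final Pattern DIGITS = Pattern.compile("[0-9]+");

    // Returns every maximal run of digits in the input, in order of appearance.
    static List<String> extractNumbers(String input) {
        List<String> numbers = new ArrayList<>();
        Matcher m = DIGITS.matcher(input);
        while (m.find()) {
            numbers.add(m.group());
        }
        return numbers;
    }

    public static void main(String[] args) {
        String inputString = "Sekhar124hekkkoer9239939siiipere23+++33!!!!!!";
        System.out.println(extractNumbers(inputString)); // [124, 9239939, 23, 33]
    }
}
```

The numbers are kept as strings, so a run of digits too long for int or long is still returned intact.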

Saturday, January 9, 2016

Difference between Text and String DataTypes in Hadoop

Here I am going to discuss some of the differences between Hadoop's Text class and Java's String class.

The Text class is in the package org.apache.hadoop.io (import org.apache.hadoop.io.*;).

Difference 1:

Text is mutable; String is immutable.

Text t = new Text("hadoop");
t.set("BigData");
System.out.println(t); // prints BigData

Difference 2 :
Text stores the string as a UTF-8 encoded byte array.

Example : Text t = new Text("hadoop");

The string is converted to UTF-8 bytes through a ByteBuffer, and Text keeps them in a byte[] array.

So the string "hadoop" is stored like this: [UTF-8 code(h), UTF-8 code(a), ..., UTF-8 code(p)]

That is, the byte[] array representation of the string "hadoop" is
[104,97,100,111,111,112]
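
The byte values above can be checked without a Hadoop installation. The sketch below uses only the JDK, not the Text class itself, and assumes that Text holds exactly the UTF-8 encoding of the string; the class name Utf8BytesSketch and the helper utf8 are made up for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Utf8BytesSketch {

    // Encodes a string as UTF-8, the encoding Text uses internally.
    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(utf8("hadoop"))); // [104, 97, 100, 111, 111, 112]
        // A non-ASCII character such as U+00E9 takes more than one byte.
        System.out.println(utf8("\u00e9").length); // 2
    }
}
```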

Why ?

Text uses standard UTF-8 which makes it potentially easier to inter-operate with other tools that
understand UTF-8.

Difference 3 :
charAt(int index) in String returns the char at the specified index.

charAt(int position) in Text returns the Unicode code point (an int) at the given byte position; for "hadoop", t.charAt(2) returns 100, the code point of 'd'.

Difference 4 : Text lacks a rich API for manipulating strings, so in many cases we convert it
to a String first.

Difference 5 : Iterating over the characters of a Text is tedious compared to a String.
Example of iterating over the characters in a Text (needs import java.nio.ByteBuffer;):

Text t = new Text("hadoop");
ByteBuffer bf = ByteBuffer.wrap(t.getBytes(), 0, t.getLength());
int cp;
while (bf.hasRemaining()) {
    cp = Text.bytesToCodePoint(bf);
    System.out.print((char) cp);
}

Similarity 1 : find() in Text is the counterpart of indexOf() in String.

Text t = new Text("hadoop");
String s = new String("hadoop");

System.out.println(" >>> "+t.find("o"));
System.out.println(" >>> "+s.indexOf("o"));
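
Both calls print 3 for "hadoop", but the two results are not the same kind of index: Text.find returns a byte offset into the UTF-8 data, while String.indexOf returns a char index. They agree only while every preceding character is a single byte. The JDK-only sketch below illustrates the difference; utf8Offset is a hypothetical helper that mimics the byte-offset view and is not a Hadoop API.

```java
import java.nio.charset.StandardCharsets;

public class ByteOffsetSketch {

    // Hypothetical helper: byte offset of the first occurrence of 'what'
    // in the UTF-8 encoding of 's', or -1 if it does not occur.
    static int utf8Offset(String s, String what) {
        int charIndex = s.indexOf(what);
        if (charIndex < 0) {
            return -1;
        }
        // Count the bytes needed to encode everything before the match.
        return s.substring(0, charIndex).getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println("hadoop".indexOf("o"));      // 3
        System.out.println(utf8Offset("hadoop", "o"));  // 3

        String s = "h\u00e9doop"; // U+00E9 occupies two bytes in UTF-8
        System.out.println(s.indexOf("o"));             // 3
        System.out.println(utf8Offset(s, "o"));         // 4
    }
}
```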

Friday, January 1, 2016

HIVE Tutorial

Second way to load data into a Hive managed table: from an HDFS path

This approach takes 2 steps.
Step 1: Copy the file from the LFS (Local File System) into HDFS.
Command :

Step 2: Issue the command to load the data from the HDFS input path into the Hive managed table.

Note:
Differences between loading data from a local input path and from an HDFS path:

The LOCAL keyword is not used in the command when loading data from an HDFS path.
Input data loaded from a local path still exists at its original location after the load, whereas input data loaded from an HDFS path no longer exists at its source path, because Hive moves it into the table's directory.

In simple words:

Loading from an HDFS path is a CUT-PASTE (move) operation,
whereas loading from a local path is COPY-PASTE.
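
The copy versus move distinction can be pictured with ordinary files. The sketch below is only an analogy on the local file system using java.nio.file; it does not talk to Hive or HDFS, and the directory and file names are made up for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class MoveVsCopySketch {

    // Returns {sourceExistsAfterCopy, sourceExistsAfterMove}.
    static boolean[] demo() {
        try {
            Path dir = Files.createTempDirectory("load-demo");
            // Stands in for the table's warehouse directory.
            Path table = Files.createDirectory(dir.resolve("table"));

            // Like a load from a local path: the source file is copied.
            Path localInput = Files.write(dir.resolve("local.txt"), "1,a\n".getBytes());
            Files.copy(localInput, table.resolve("local.txt"));
            boolean afterCopy = Files.exists(localInput);

            // Like a load from an HDFS path: the source file is moved.
            Path hdfsInput = Files.write(dir.resolve("hdfs.txt"), "2,b\n".getBytes());
            Files.move(hdfsInput, table.resolve("hdfs.txt"));
            boolean afterMove = Files.exists(hdfsInput);

            return new boolean[] {afterCopy, afterMove};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] result = demo();
        System.out.println("source exists after copy: " + result[0]); // true
        System.out.println("source exists after move: " + result[1]); // false
    }
}
```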

OVERWRITE keyword

The OVERWRITE keyword in the LOAD DATA statement tells Hive to delete any existing files in
the directory for the table.

Files in the Name node before the Overwrite command in load data statement.

Command to overwrite the table data.

Files in Warehouse Directory after overwrite in load statement.

How to view the log files in HIVE?

By default, the log file is located at /tmp/$USER/hive.log, where $USER is the name of the user running Hive.

How to alter the table column data type in HIVE?

In the table managedtable4 below, the mobile column has the data type int; I would like to change it from int to bigint.

Command to change the column data type.

CREATE TABLE AS SELECT (CTAS)
A table can be created by selecting all fields, some fields, or only the rows that match a condition from another table.

Note: Only managed tables can be created with this feature.

Below is an example of creating the table ctas1 from managedtable5.
