Wednesday, December 04, 2013

Hadoop host identification process

In Hadoop, when a datanode starts up, it goes through three steps to discover its own name. The first step is to call Java's InetAddress.getLocalHost() to determine the hostname of the server. The second step is to canonicalize that hostname by calling Java's InetAddress.getCanonicalHostName(). The third step is to store the canonicalized name internally and use it as the official name sent to the namenode or jobtracker.

The results of getLocalHost() and getCanonicalHostName() are platform-specific. On Linux, getLocalHost() starts from gethostname(), which is backed by the uname() system call and has nothing to do with the /etc/hosts file or the DNS server, although the name it returns is usually similar or even identical to what resolution would give.

The hostname command, for instance, exclusively uses gethostname() and sethostname(), whereas the host and dig commands use gethostbyname() and gethostbyaddr(). gethostname() and sethostname() reflect the hostname as the Linux kernel sees it. gethostbyname() and gethostbyaddr() use name resolution (i.e. gethostbyname() is network aware, so it consults /etc/nsswitch.conf and /etc/host.conf to decide whether to read /etc/sysconfig/network or /etc/hosts). So the results of getLocalHost() and gethostbyname() can differ if the hostname doesn't resolve to an IP address, and then you will see issues. In most cases you won't, because there is usually at least one entry in /etc/hosts. For example, on the datanode you will have at least:
10.6.70.30 datanode1.mycompany.com datanode1
in /etc/hosts.
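
If you want to check whether the name the kernel reports actually resolves, the following is a minimal sketch (the class name and the args[0] convention are mine, not from the book): it takes a hostname on the command line, e.g. the output of the hostname command, and runs it through the same resolver that gethostbyname() would use.

import java.net.InetAddress;
import java.net.UnknownHostException;

public class ResolveCheck {
    public static void main(String[] args) {
        String kernelName = args[0];  // e.g. what `hostname` printed
        try {
            // getByName() goes through the OS resolver (/etc/nsswitch.conf,
            // /etc/hosts, DNS), just like gethostbyname().
            InetAddress addr = InetAddress.getByName(kernelName);
            System.out.println(kernelName + " resolves to " + addr.getHostAddress());
        } catch (UnknownHostException e) {
            System.out.println(kernelName
                + " does not resolve -- add it to /etc/hosts or DNS");
        }
    }
}

A quick run would look like: java ResolveCheck $(hostname)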

Assume the first step went okay; now we are at the second step, calling getCanonicalHostName() to canonicalize the hostname. Hostname canonicalization is the process of finding the complete, official hostname according to the hostname resolution system (also called finding the fully qualified domain name). getCanonicalHostName() calls the internal method InetAddress.getHostFromNameService(), which looks up the hostname by address via the OS resolver.
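To see what that reverse lookup does on its own, you can trigger it manually. The snippet below is only a sketch (the IP address is just the example one from above): building an InetAddress from a literal address does no forward lookup, so getCanonicalHostName() has to go to the resolver (PTR record or /etc/hosts) to find the name, much like gethostbyaddr() would.

import java.net.InetAddress;
import java.net.UnknownHostException;

public class ReverseLookup {
    public static void main(String[] args) throws UnknownHostException {
        // Build an InetAddress from a literal IP (no forward lookup happens here).
        InetAddress addr = InetAddress.getByName("10.6.70.30");
        // getCanonicalHostName() now performs a reverse lookup via the OS resolver;
        // if nothing resolves, it simply returns the IP literal back.
        System.out.println(addr.getCanonicalHostName());
    }
}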

A simple program from "Hadoop operations" can help you determine the hostname and canonicalName:

import java.net.InetAddress;
import java.net.UnknownHostException;

public class dns {
    public static void main(String[] args) throws UnknownHostException {
        InetAddress addr = InetAddress.getLocalHost();
        System.out.println(
            String.format(
                "IP:%s hostname:%s canonicalName:%s",
                addr.getHostAddress(),         // The "default" IP address
                addr.getHostName(),            // The hostname (from gethostname())
                addr.getCanonicalHostName()    // The canonicalized hostname (from resolver)
            )
        );
    }
}

$ java dns
IP:10.6.70.30 hostname:datanode1.mycompany.com canonicalName:datanode1.mycompany.com

Please check your datanodes' hostname and canonicalized hostname first; you absolutely don't want a datanode to report itself to the namenode as "localhost". That can do real damage to a client trying to write data to the datanode. For example, a client tries to write data to datanode1, but datanode1 has reported itself to the namenode as "localhost", so the client ends up writing to itself!
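
As a quick safeguard, a variation of the program above can be run as a sanity check before (re)starting a datanode. This is only a sketch, with a class name and exit-code convention I made up, not anything Hadoop ships:

import java.net.InetAddress;
import java.net.UnknownHostException;

public class HostSanityCheck {
    public static void main(String[] args) throws UnknownHostException {
        InetAddress addr = InetAddress.getLocalHost();
        String canonical = addr.getCanonicalHostName();
        String ip = addr.getHostAddress();
        // Fail loudly if this host would identify itself as localhost/loopback,
        // which is what makes a datanode register under a useless name.
        if (canonical.equals("localhost") || ip.startsWith("127.")) {
            System.err.println("Bad host identity: " + ip + " / " + canonical
                + " -- fix /etc/hosts or DNS before starting the datanode");
            System.exit(1);
        }
        System.out.println("Host identity looks sane: " + ip + " / " + canonical);
    }
}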
