Tips from learning: HOW TO DEBUG DATANODE NOT RUNNING ISSUES IN HADOOP

Hi,
I recently ran into the issue below.
Problem:
I tried to copy a file from my local file system into the Hadoop file system (HDFS) with this command:

bin/hadoop dfs -put /home/user/sample.txt /user/username/.
But then I got this error:

15/11/26 08:36:57 WARN hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/username/Sample.txt could only be replicated to 0 nodes, instead of 1
(rest of the output omitted)

From this I understood that HDFS could not replicate the file to any DataNode.
I checked the cluster health with the jps command,
which gave me the following output:

user@localhost:~$ jps
5798 SecondaryNameNode
7648 Jps
6050 TaskTracker
5896 JobTracker
5474 NameNode

Notice that DataNode is missing from the output above, so the DataNode is not running.
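
The same check can be scripted. Here is a minimal sketch, assuming jps prints one "PID Name" pair per line as in the output above. The function name daemon_running is my own, not part of Hadoop or the JDK:

```shell
#!/bin/sh
# daemon_running NAME
# Reads jps-style output ("PID Name" per line) on stdin and succeeds only
# if a process with exactly that name is listed. The name is compared
# exactly, so "SecondaryNameNode" does not count as "NameNode".
daemon_running() {
    awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}

# Hypothetical usage on a machine where jps is on the PATH:
#   jps | daemon_running DataNode || echo "DataNode is not running"
```

The exact comparison matters here because a substring search for "NameNode" would also match the SecondaryNameNode line.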

Next I opened the Hadoop health status URL: http://localhost:50070/dfshealth.jsp

There I clicked on the link "Namenodelogs" and looked for the most recent DataNode log, which was named:

hadoop-user-datanode-localhost.log (25542 bytes)

Near the end of the log file, I searched for the string "STARTUP_MSG: Starting DataNode".

In the lines following that startup message I found this error:

"ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Incompatible namespaceIDs in /app/hadoop/tmp/dfs/data: namenode namespaceID = 429859175; datanode namespaceID = 26135836"
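
If you prefer the command line to scrolling through the log in the browser, the NameNode's ID can be pulled out of the log with sed. This is only a sketch: it assumes the error line has exactly the wording shown above, and the function name namenode_namespace_id is hypothetical:

```shell
#!/bin/sh
# namenode_namespace_id
# Reads DataNode log text on stdin and prints the NameNode namespaceID taken
# from the last "Incompatible namespaceIDs" error line. Prints nothing if
# the log contains no such line.
namenode_namespace_id() {
    sed -n 's/.*Incompatible namespaceIDs.*namenode namespaceID = \([0-9][0-9]*\);.*/\1/p' |
        tail -n 1
}

# Hypothetical usage, run from the directory that holds the log file:
#   namenode_namespace_id < hadoop-user-datanode-localhost.log
```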

Then, in the local file system, I changed to the data directory named in the error and listed its contents:

> cd /app/hadoop/tmp/dfs/data
> ll
total 28
drwxr-xr-x 6 sekhar sekhar 4096 Nov 26 08:34 ./
drwxrwxr-x 5 sekhar sekhar 4096 Nov 21 18:57 ../
drwxrwxr-x 2 sekhar sekhar 4096 Nov 23 23:14 blocksBeingWritten/
drwxrwxr-x 2 sekhar sekhar 4096 Nov 23 23:14 current/
drwxrwxr-x 2 sekhar sekhar 4096 Nov 21 18:57 detach/
-rw-rw-r-- 1 sekhar sekhar 157 Nov 21 18:57 storage
drwxrwxr-x 2 sekhar sekhar 4096 Nov 26 07:40 tmp/

> cd current
> ll
total 24
drwxrwxr-x 2 sekhar sekhar 4096 Nov 23 23:14 ./
drwxr-xr-x 6 sekhar sekhar 4096 Nov 26 08:34 ../
-rw-rw-r-- 1 sekhar sekhar 4 Nov 23 23:14 blk_-2585174188513577469
-rw-rw-r-- 1 sekhar sekhar 11 Nov 23 23:14 blk_-2585174188513577469_1002.meta
-rw-rw-r-- 1 sekhar sekhar 193 Nov 23 23:19 dncp_block_verification.log.curr
-rw-rw-r-- 1 sekhar sekhar 154 Nov 26 07:40 VERSION

> vi VERSION

Change the namespaceID in this file to the namespaceID of the NameNode. Before the change, the file looked like this:

#Thu Nov 26 08:50:16 IST 2015
namespaceID=26135836
storageID=DS-1674635271-127.0.0.1-50010-1448112482162
cTime=0
storageType=DATA_NODE
layoutVersion=-41

The namespaceID must be changed to the NameNode's namespaceID, which you can read from the error in the log.
In this case it is 429859175.
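
The edit can also be made without opening vi. Below is a minimal sketch, assuming the VERSION file has a single namespaceID= line as shown above. The function name set_namespace_id is hypothetical, and the backup copy is my own addition, not something Hadoop requires:

```shell
#!/bin/sh
# set_namespace_id VERSION_FILE NEW_ID
# Rewrites the namespaceID line of a DataNode VERSION file to NEW_ID.
# The original file is first copied to VERSION_FILE.bak so the change
# can be undone; all other lines are left untouched.
set_namespace_id() {
    version_file=$1
    new_id=$2
    cp "$version_file" "$version_file.bak" &&
        sed "s/^namespaceID=.*/namespaceID=$new_id/" "$version_file.bak" > "$version_file"
}

# Hypothetical usage with the path and ID from this post:
#   set_namespace_id /app/hadoop/tmp/dfs/data/current/VERSION 429859175
```

I avoided sed -i here because its syntax differs between GNU and BSD sed; writing from the backup copy works with both.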

That's it. Now start the cluster and check jps again:
sekhar@localhost:~$ jps
15278 JobTracker
14828 NameNode
15556 Jps
15172 SecondaryNameNode
15001 DataNode
15440 TaskTracker

Wow, the DataNode is back.

Thursday, November 26, 2015
