Hadoop 2.7.2 distributed deployment: the Live Nodes 0 problem

Problem description

After starting the cluster with sbin/start-all.sh, the namenode log reports the following error:

2016-11-15 15:10:13,140 WARN org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Unresolved datanode registration: hostname cannot be resolved (ip=10.104.206.27, hostname=10.104.206.27)  
2016-11-15 15:10:13,141 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 9000, call org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol.registerDatanode from 10.104.206.27:48015 Call#651 Retry#0  
org.apache.hadoop.hdfs.server.protocol.DisallowedDatanodeException: Datanode denied communication with namenode because hostname cannot be resolved (ip=10.104.206.27, hostname=10.104.206.27): DatanodeRegistration(0.0.0.0:50010, datanodeUuid=254ce00f-2562-418d-9f33-bbf4e94fd137, infoPort=50075, infoSecurePort=0, ipcPort=50020, storageInfo=lv=-56;cid=CID-0049bd29-b8ea-4dfd-8faa-71117f93c0ac;nsid=1427745989;c=0)  
    at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.registerDatanode(DatanodeManager.java:863)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerDatanode(FSNamesystem.java:4528)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.registerDatanode(NameNodeRpcServer.java:1285)
    at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.registerDatanode(DatanodeProtocolServerSideTranslatorPB.java:96)
    at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:28752)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)

Solution

Add the following property to hdfs-site.xml:

<property>  
   <name>dfs.namenode.datanode.registration.ip-hostname-check</name>
   <value>false</value>
</property>  

Cause

The Hadoop documentation describes dfs.namenode.datanode.registration.ip-hostname-check as follows:

If true (the default), then the namenode requires that a connecting datanode's address must be resolved to a hostname. If necessary, a reverse DNS lookup is performed. All attempts to register a datanode from an unresolvable address are rejected. It is recommended that this setting be left on to prevent accidental registration of datanodes listed by hostname in the excludes file during a DNS outage. Only set this to false in environments where there is no infrastructure to support reverse DNS lookup.
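In other words, with the check enabled the namenode must be able to reverse-resolve the datanode's IP to a hostname. A quick way to see whether that lookup would succeed for the address in the error above is to query the system resolver (DNS plus /etc/hosts) with the standard getent tool; the IP is the one from the log:

```shell
# Ask the resolver (DNS and /etc/hosts) for a hostname for the datanode's IP.
# getent prints the mapping and exits 0 on success; otherwise it prints nothing.
if getent hosts 10.104.206.27; then
    echo "reverse lookup OK - registration should be accepted"
else
    echo "no reverse mapping - registration will be rejected"
fi
```

If this prints nothing before the status line on the namenode host, the reverse lookup is failing there, which matches the DisallowedDatanodeException above.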

So if, instead of giving each datanode a hostname that resolves via DNS or the hosts file, we put raw IP addresses directly in the slaves file, the namenode's reverse lookup fails and this error occurs.
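The cleaner fix, where the environment allows it, is therefore to configure the cluster by hostname: map every node in /etc/hosts (or DNS) on all machines, and list those hostnames in the slaves file. A sketch of that setup, in which the hostnames master and slave1 and the master's IP are made-up examples and only 10.104.206.27 comes from the log above:

```
# /etc/hosts on every node in the cluster
# (hostnames and the master IP are hypothetical; use your own)
10.104.206.26  master
10.104.206.27  slave1

# etc/hadoop/slaves on the namenode: hostnames, not IPs
slave1
```

With the IP-to-hostname mapping in place, the reverse lookup succeeds and dfs.namenode.datanode.registration.ip-hostname-check can stay at its recommended default of true.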