core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://mycluster</value>
    <!-- In an HA setup, fs.defaultFS points to the nameservice ID, not a single NameNode host -->
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>master:2181,slave1:2181,slave2:2181</value>
  </property>
</configuration>
hdfs-site.xml
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///usr/local/hadoop-2.6.0/dfs/name</value>
    <!-- NameNode metadata directory. Create it in advance with write permission; if the directory does not exist, this entry is ignored -->
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///usr/local/hadoop-2.6.0/dfs/data</value>
    <!-- DataNode data directory. Create it in advance with write permission; if the directory does not exist, this entry is ignored. With multiple disks, configure one directory per disk to improve read/write throughput -->
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.hosts.exclude</name>
    <value>/usr/local/hadoop-2.6.0/exclude</value>
  </property>
  <property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
    <description>Comma-separated list of nameservices.</description>
  </property>
  <property>
    <name>dfs.ha.namenodes.mycluster</name>
    <value>nn1,nn2</value>
    <description>The prefix for a given nameservice, contains a comma-separated
      list of namenodes for a given nameservice (eg EXAMPLENAMESERVICE).</description>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn1</name>
    <value>master:8020</value>
    <description>RPC address for namenode1 of mycluster</description>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn2</name>
    <value>slave1:8020</value>
    <description>RPC address for namenode2 of mycluster</description>
  </property>
  <property>
    <name>dfs.namenode.http-address.mycluster.nn1</name>
    <value>master:50070</value>
    <description>The address and the base port where the dfs namenode1 web ui will listen on.</description>
  </property>
  <property>
    <name>dfs.namenode.http-address.mycluster.nn2</name>
    <value>slave1:50070</value>
    <description>The address and the base port where the dfs namenode2 web ui will listen on.</description>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://master:8485;slave1:8485;slave2:8485/mycluster</value>
    <description>A directory on shared storage between the multiple namenodes in an HA cluster.
      This directory will be written by the active and read by the standby in order to keep
      the namespaces synchronized. This directory does not need to be listed in
      dfs.namenode.edits.dir above. It should be left empty in a non-HA cluster.</description>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
    <description>Whether automatic failover is enabled. See the HDFS High Availability
      documentation for details on automatic HA configuration.</description>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/usr/local/hadoop-2.6.0/dfs/journal/</value>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence(hadoop)</value>
    <description>A list of scripts or Java classes used to fence the active NameNode during
      a failover. Although the JournalNodes ensure that only one NameNode can write edits,
      which is important for edit-log consistency, during a failover the old active NameNode
      may still be alive, and clients still connected to it could be served stale data. This
      setting specifies a shell script or Java program that SSHes to the active NameNode and
      kills the NameNode process. Two built-in options are available (see the official
      documentation for details):
      1) sshfence: SSH to the active NameNode and kill the process. The local machine must
         already be authorized (RSA key) to log in to the remote host via SSH.
      2) shell: run an arbitrary shell command to fence the active NameNode.</description>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/home/hadoop/.ssh/id_rsa</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.mycluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
</configuration>
yarn-site.xml
<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master</value>
  </property>
</configuration>
mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
    <!-- Run the MapReduce framework on YARN -->
    <description>Execution framework set to Hadoop YARN.</description>
  </property>
  <property>
    <name>mapreduce.task.io.sort.mb</name>
    <value>2000</value>
    <!-- Buffer memory used when sorting map output, in megabytes -->
    <description>The total amount of buffer memory to use while sorting files, in megabytes.</description>
  </property>
</configuration>
For more configuration options and their meanings, see the Hadoop HA documentation.
To test failover: on either NameNode machine, find the NameNode process ID with the jps command, kill the process with kill -9, and then check whether the other NameNode transitions from standby to active.
jps            # find the NameNode pid
kill -9 <pid>  # kill it
Then check the web UIs:
http://master:50070/
http://slave1:50070/
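Instead of eyeballing the two web UIs, the HA state can also be read programmatically: each NameNode exposes it over JMX, e.g. at http://master:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem, where the `tag.HAState` field reads "active" or "standby" (the bean and field names are what Hadoop 2.x reports; verify against your version, and note that `hdfs haadmin -getServiceState nn1` prints the same state). A minimal Python sketch; the sample payload is illustrative, and in practice the JSON would be fetched from the NameNode with urllib:

```python
import json

def ha_state(jmx_json):
    """Extract the HA state from an FSNamesystem JMX payload, or None if absent."""
    doc = json.loads(jmx_json)
    for bean in doc.get("beans", []):
        if bean.get("name") == "Hadoop:service=NameNode,name=FSNamesystem":
            return bean.get("tag.HAState")
    return None

if __name__ == "__main__":
    # Illustrative payload; a real one comes from
    # http://master:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem
    sample = json.dumps({"beans": [{
        "name": "Hadoop:service=NameNode,name=FSNamesystem",
        "tag.HAState": "active"}]})
    print(ha_state(sample))  # -> active
```

Polling both NameNodes with this before and after the kill -9 confirms the standby really took over.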