==== Installation Environment ====

Virtual machines:

  * 192.168.0.2 master
  * 192.168.0.3 slave1
  * 192.168.0.4 slave2
  * Java version: "1.7.0_80" (downloaded from Oracle)
  * Hadoop version: 2.6 (downloaded from Apache)

==== Installation ====

  - Create a new user with ''useradd hadoop'', give it passwordless sudo (''vim /etc/sudoers''), and hand ''/usr/local'' over to it: <code bash>
useradd hadoop
# line to add in /etc/sudoers:
hadoop ALL=(ALL) NOPASSWD: ALL
sudo chown hadoop:hadoop -R /usr/local
</code>
  - Download the JDK and extract it to ''/usr/local''
  - Download Hadoop and extract it to ''/usr/local''
  - Set the environment variables: <code bash>
export JAVA_HOME=/usr/local/jdk1.7.0_80
export HADOOP_HOME=/usr/local/hadoop-2.6.0
export PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
</code>
  - Set up passwordless SSH login: <code bash>
ssh-keygen -t rsa
cat /home/hadoop/.ssh/id_rsa.pub >> /home/hadoop/.ssh/authorized_keys
</code> To avoid the first-connection prompt "Are you sure you want to continue connecting (yes/no)?", edit ''/etc/ssh/ssh_config'' and change ''StrictHostKeyChecking ask'' to ''StrictHostKeyChecking no''
  - Set the hostname: <code bash>
cat /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=master
</code>
  - Set up local name resolution by adding the following records to ''/etc/hosts'': <code>
192.168.0.2 master
192.168.0.3 slave1
192.168.0.4 slave2
</code>
  - Disable SELinux in ''/etc/selinux/config'': <code>
SELINUX=disabled
</code>
  - Turn off the firewall: <code bash>
service iptables stop
chkconfig iptables off
</code>

==== Configuring the master host ====

**core-site.xml**
<code xml>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <!-- NameNode URI, the entry point of the cluster -->
    <value>hdfs://master:9000</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
</configuration>
</code>

**hdfs-site.xml**
<code xml>
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <!-- NameNode directory; create it in advance with write permission,
         otherwise this setting is ignored -->
    <value>file:///usr/local/hadoop_data/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <!-- DataNode directory; create it in advance with write permission,
         otherwise this setting is ignored -->
    <value>file:///usr/local/hadoop_data/dfs/data</value>
  </property>
</configuration>
</code>

**yarn-site.xml**
<code xml>
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master</value>
  </property>
</configuration>
</code>

**mapred-site.xml**
<code xml>
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <!-- execution framework set to Hadoop YARN -->
    <value>yarn</value>
  </property>
</configuration>
</code>

**slaves**
<code>
slave1
slave2
</code>

==== Configuring the slave hosts ====

  * The configuration is largely the same as on master; the differences are:
    - a different hostname
    - a different IP address

If you are building a test environment on virtual machines, configure master first, then clone it twice for slave1 and slave2 and adjust those two settings.

==== Starting Hadoop ====

  * ''hdfs namenode -format'' (run on the master node to format HDFS)
  * ''start-dfs.sh''
  * ''start-yarn.sh''
  * ''jps'' to list the started Java processes

Check cluster status in the web UIs:

  * http://master:8088 (YARN ResourceManager)
  * http://master:50070 (HDFS NameNode)

==== Testing ====

**Hadoop's bundled example**
<code bash>
hadoop jar /usr/local/hadoop-2.6.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar wordcount /input /output
</code>

**Compiling your own Hadoop program**
<code bash>
# clean up any leftovers from earlier runs
hadoop fs -rmr /input
hadoop fs -rmr /output
hadoop fs -rmr /tmp

# prepare the input data
echo "hello world" > input.txt
hadoop fs -mkdir /input
hadoop fs -put input.txt /input/
rm -rf input.txt

# compile against the required Hadoop jars; WordCount.java is the standard
# word-count example (a sketch is included at the end of this page)
javac -classpath $HADOOP_HOME/share/hadoop/common/lib/commons-cli-1.2.jar:$HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.6.0.jar:$HADOOP_HOME/share/hadoop/common/hadoop-common-2.6.0.jar WordCount.java
jar cf wc.jar WordCount*.class
hdfs dfs -rmr /out*
hadoop jar wc.jar WordCount /input /output
</code>
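**WordCount.java (reference sketch)**

The compile step above assumes a ''WordCount.java'' you find yourself. For reference, here is the classic word-count program from the Hadoop MapReduce tutorial against the Hadoop 2.6 API; it is a standard sketch, not code specific to this cluster:

<code java>
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: emit (word, 1) for every token in the input line
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reducer (also used as combiner): sum the counts for each word
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {

    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // /input
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // /output
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
</code>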
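**Reading the result (optional)**

Besides ''hadoop fs -cat /output/part-r-00000'', the job output can be read through the HDFS Java API. A minimal sketch, assuming the ''fs.defaultFS'' address configured above; the class name ''ReadOutput'' and the single-reducer output file ''part-r-00000'' are illustrative assumptions, not part of the original write-up:

<code java>
import java.io.BufferedReader;
import java.io.InputStreamReader;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical helper for this guide, not code from the original author.
public class ReadOutput {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Assumes the NameNode address from core-site.xml above.
    conf.set("fs.defaultFS", "hdfs://master:9000");
    FileSystem fs = FileSystem.get(conf);

    // With a single reducer, WordCount writes its result to part-r-00000.
    Path result = new Path("/output/part-r-00000");
    try (BufferedReader in = new BufferedReader(
             new InputStreamReader(fs.open(result)))) {
      String line;
      while ((line = in.readLine()) != null) {
        System.out.println(line); // e.g. "hello\t1"
      }
    }
    fs.close();
  }
}
</code>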