Windows下使用hadoop

1、从apache下载hadoop,并解压缩,例如hadoop-2.7.3.tar.gz

2、在hadoop-env.cmd里修改设置JAVA_HOME和HADOOP_HOME

set JAVA_HOME="C:\Program Files\Java\jdk1.7.0_79"
set HADOOP_HOME="C:\install\hadoop-2.7.3"

注意如果JAVA_HOME有空格,要用双引号,否则提示JAVA_HOME incorrect set。

3、下载winutils

winutils包含winutils.exe和hadoop.dll,将这两个文件复制到hadoop/bin目录下,否则执行hadoop命令会提示如下错误:

Exception in thread "main" java.lang.UnsatisfiedLinkError: org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSumsByteArray(II[BI[BIILjava/lang/St ring;JZ)V at org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSumsByteArray(Native Method) at org.apache.hadoop.util.NativeCrc32.calculateChunkedSumsByteArray(NativeCrc32.java:86) at org.apache.hadoop.util.DataChecksum.calculateChunkedSums(DataChecksum.java:430) at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunks(FSOutputSummer.java:202)

下载地址:https://github.com/steveloughran/winutils

注意与所使用的hadoop版本要匹配。参考链接

4、查看远程hdfs文件列表(根目录)

hadoop fs -ls hdfs://192.168.130.100/

其他文件操作类似,可以执行hadoop fs命令查看。

5、在Eclipse里调试mapreduce程序

在eclipse里直接运行mapreduce程序时可能会提示:

ERROR [main] util.Shell (Shell.java:getWinUtilsPath(303)) - Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
    at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:278)
    at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:300)
    at org.apache.hadoop.util.Shell.<clinit>(Shell.java:293)

原因是没有配置HADOOP_HOME环境变量,在run configuration里加上,或者在windows系统环境变量里加上这个环境变量即可。参考