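When connecting to HDFS through the Java API, we need the address of the NameNode. Once HA is enabled, the two NameNodes may switch between active and standby, and if the client is pointed at one NameNode host and that NameNode goes down, the connection fails. HDFS offers access through nameservices instead, so access keeps working as long as one NameNode is alive.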
HDFS NameNode HA
In an environment without HA, HDFS is usually accessed with a URL built from the NameNode hostname:
hdfs://ochadoop111.jcloud.local:8020
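To keep the HDFS service highly available, a production environment must enable NameNode HA. In that case the nameservice should be used as the single logical name for connecting to HDFS.
After running Enable NameNode HA in Ambari, my cluster gained a second NameNode on ochadoop112.jcloud.local, and the HA-related configuration was generated automatically.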
URL for accessing HDFS through the nameservice (a logical name is not a host, so it does not use port information):
hdfs://mycluster
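To make the difference between the two URL forms concrete, here is a minimal sketch that uses only `java.net.URI`. The class `HdfsUriKind` and its method `isLogical` are hypothetical helpers written for this illustration, not Hadoop APIs, and the set of nameservice IDs is assumed to be supplied by the caller (in a real client it would correspond to the value of `dfs.nameservices`).

```java
import java.net.URI;
import java.util.Collections;
import java.util.Set;

public class HdfsUriKind {
    /**
     * Hypothetical helper: returns true when the URI's host part is one of
     * the known nameservice IDs rather than a concrete NameNode hostname.
     */
    public static boolean isLogical(URI uri, Set<String> nameservices) {
        return "hdfs".equals(uri.getScheme())
                && uri.getHost() != null
                && nameservices.contains(uri.getHost());
    }

    public static void main(String[] args) {
        // Assumption: "mycluster" is the only configured nameservice ID.
        Set<String> nameservices = Collections.singleton("mycluster");

        URI direct = URI.create("hdfs://ochadoop111.jcloud.local:8020");
        URI logical = URI.create("hdfs://mycluster");

        // A direct URI names one machine and one RPC port.
        System.out.println(direct.getHost() + " port=" + direct.getPort()
                + " logical=" + isLogical(direct, nameservices));
        // A logical URI names the nameservice; the client maps it to the
        // configured NameNodes at connection time.
        System.out.println(logical.getHost() + " port=" + logical.getPort()
                + " logical=" + isLogical(logical, nameservices));
    }
}
```

The point of the sketch is that the host part of an HA URL is only a lookup key into the client configuration, which is why that configuration has to list the RPC address of every NameNode.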
The Java API code for connecting to HDFS with HA is as follows:
import java.io.IOException;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class HdfsHaClient {
    public static void main(String[] args) {
        // Empty Configuration: no default resources are loaded.
        Configuration conf = new Configuration(false);
        String nameservices = "mycluster";
        String[] namenodesAddr = {"ochadoop111.jcloud.local:8020", "ochadoop112.jcloud.local:8020"};
        String[] namenodes = {"nn1", "nn2"};

        conf.set("fs.defaultFS", "hdfs://" + nameservices);
        conf.set("dfs.nameservices", nameservices);
        conf.set("dfs.ha.namenodes." + nameservices, namenodes[0] + "," + namenodes[1]);
        conf.set("dfs.namenode.rpc-address." + nameservices + "." + namenodes[0], namenodesAddr[0]);
        conf.set("dfs.namenode.rpc-address." + nameservices + "." + namenodes[1], namenodesAddr[1]);
        conf.set("dfs.client.failover.proxy.provider." + nameservices,
                "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");

        // The logical URI carries no port; the RPC addresses come from conf.
        String hdfsRPCUrl = "hdfs://" + nameservices;

        // try-with-resources closes the file system on exit.
        try (DistributedFileSystem dfs = new DistributedFileSystem()) {
            dfs.initialize(URI.create(hdfsRPCUrl), conf);
            Path tmpPath2 = new Path("/tmp2");
            dfs.mkdir(tmpPath2, new FsPermission("777"));
            FileStatus[] list = dfs.listStatus(new Path("/"));
            for (FileStatus file : list) {
                System.out.println(file.getPath());
            }
            // Namespace quota 100, storage space quota 1000 bytes.
            dfs.setQuota(tmpPath2, 100, 1000);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
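The code above hard-codes exactly two NameNodes. As a rough sketch of how the same client-side keys could be generated for any list of RPC addresses, the following uses only the Java standard library. The class `HaClientProperties`, the method `build`, and the generated IDs `nn1`, `nn2`, ... are assumptions made for this illustration; in a real cluster the IDs must match those in the cluster's own configuration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class HaClientProperties {
    /**
     * Hypothetical helper: builds the client-side HA properties for one
     * nameservice. NameNode IDs are generated as nn1, nn2, ... (assumption).
     */
    public static Map<String, String> build(String nameservice, List<String> rpcAddresses) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("fs.defaultFS", "hdfs://" + nameservice);
        props.put("dfs.nameservices", nameservice);

        StringBuilder ids = new StringBuilder();
        for (int i = 0; i < rpcAddresses.size(); i++) {
            String id = "nn" + (i + 1);
            if (i > 0) {
                ids.append(',');
            }
            ids.append(id);
            // One rpc-address key per NameNode ID.
            props.put("dfs.namenode.rpc-address." + nameservice + "." + id, rpcAddresses.get(i));
        }
        props.put("dfs.ha.namenodes." + nameservice, ids.toString());
        props.put("dfs.client.failover.proxy.provider." + nameservice,
                "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");
        return props;
    }

    public static void main(String[] args) {
        Map<String, String> props = build("mycluster",
                Arrays.asList("ochadoop111.jcloud.local:8020", "ochadoop112.jcloud.local:8020"));
        // Print the key/value pairs in insertion order.
        props.forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

Each entry of the returned map can then be copied into a Hadoop `Configuration` with `conf.set(key, value)`, the same call used in the code above.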